This talk was originally presented by Thomas Winters on the 20th of November 2020 at the 29th Belgian Dutch Conference on Machine Learning (Benelearn 2020). The conference awarded this presentation the "Best Video Award".
A video of this talk is available at https://www.youtube.com/watch?v=U1cShms67ec
For more information, see https://thomaswinters.be/talk/2020benelearn
3. 3
Incongruity-Resolution Theory
Based on: Ritchie, G. (1999). Developing the incongruity-resolution theory.
Two fish are in a tank. Says one to the other:
“Do you know how to drive this thing?”
7. 7
Incongruity-Resolution Theory
Based on: Ritchie, G. (1999). Developing the incongruity-resolution theory.
Setup: “Two fish are in a tank. Says one to the other:” → evokes the obvious interpretation (an aquarium)
Punchline: “Do you know how to drive this thing?” → reveals the hidden interpretation (a military tank)
8. 8
Human-focused definition!
A machine should not only spot the two mental images (the obvious and the hidden interpretation),
but also that resolving them is not too hard or too easy for a human!
9. 9
Transformer models
Large language models, pretrained on large corpora
Outperform previous neural architectures on most language tasks
GPT-2 & GPT-3: complete any textual prompt
BERT: classifies any text sequence / token
Brown, T. B., et al. (2020). Language models are few-shot learners.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding.
10. 10
Not just for English: Dutch RobBERT!
RobBERT is a Dutch RoBERTa-based language model.
It vastly outperforms other architectures on a large range of Dutch NLP tasks & generally outperforms other BERT models, especially on small datasets.
Easy to use: just import & fine-tune on your task.
But can it learn to recognise humor?
Delobelle, P., Winters, T., & Berendt, B. (2020). RobBERT: a Dutch RoBERTa-based language model.
RobBERT, our Dutch BERT-like model:
from transformers import RobertaTokenizer, RobertaForSequenceClassification

# Load the pretrained Dutch RobBERT v2 model with a sequence-classification head
tokenizer = RobertaTokenizer.from_pretrained("pdelobelle/robbert-v2-dutch-base")
model = RobertaForSequenceClassification.from_pretrained("pdelobelle/robbert-v2-dutch-base")
11. 11
Early Humor Detector
• Designed humor features, e.g. alliteration, antonymy, adult slang...
• Used Naive Bayes and Support Vector Machines
• Task: one-liners vs news, neutral corpus & proverbs
Mihalcea, R., & Strapparava, C. (2005). Making computers laugh: Investigations in automatic humor recognition.
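The kind of handcrafted features listed above can be sketched in a few lines. The definitions below are simplified illustrations (hypothetical helper names, not Mihalcea & Strapparava's exact feature extractors):

```python
import re


def alliteration_score(text):
    """Fraction of adjacent word pairs that start with the same letter."""
    words = [w.lower() for w in re.findall(r"[a-zA-Z']+", text)]
    if len(words) < 2:
        return 0.0
    pairs = sum(1 for a, b in zip(words, words[1:]) if a[0] == b[0])
    return pairs / (len(words) - 1)


def slang_count(text, slang_lexicon):
    """Number of distinct words from a (hypothetical) adult-slang lexicon."""
    words = set(re.findall(r"[a-zA-Z']+", text.lower()))
    return len(words & slang_lexicon)


def extract_features(text, slang_lexicon):
    """Feature vector for one text: [alliteration, slang hits]."""
    return [alliteration_score(text), slang_count(text, slang_lexicon)]
```

Such feature vectors would then be fed to a Naive Bayes or SVM classifier, as in the paper.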
12. 12
But is this a good dataset?
News & proverbs have completely different types of words than jokes!
Looking at word frequencies is often already “enough”!
Is this really humor detection?
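Why word frequencies are often already “enough” can be illustrated with a toy unigram classifier, effectively an add-one-smoothed Naive Bayes on word counts. All corpora and names below are made up for illustration:

```python
import math
import re
from collections import Counter


def word_counts(texts):
    """Unigram counts over a list of documents."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"\w+", text.lower()))
    return counts


def classify(text, joke_counts, other_counts):
    """Label a text by which corpus its words are more typical of."""
    joke_total = sum(joke_counts.values()) + 1
    other_total = sum(other_counts.values()) + 1
    # Smoothed log-likelihood ratio of the text's words under both corpora.
    score = sum(
        math.log((joke_counts[w] + 1) / joke_total)
        - math.log((other_counts[w] + 1) / other_total)
        for w in re.findall(r"\w+", text.lower())
    )
    return "joke" if score > 0 else "other"
```

Because jokes and news share so little vocabulary, even this crude model separates them, without modeling anything humor-specific.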
13. 13
Jokes are fragile!
Two fish are in a tank. Says one to the other:
“Do you know how to drive this thing?”
Winters, T. (2019). Generating philosophical statements using interpolated markov models and dynamic templates.
17. 17
Jokes are fragile!
Two fish are in a tank. Says one to the other: “Do you know how to drive this thing?”
→ Two men are in a bar. Says one to the other: “Do you know how to drive this thing?”
Generate non-jokes using dynamic templates! (@TorfsBot)
Word-based features won’t work anymore!
Winters, T. (2019). Generating philosophical statements using interpolated markov models and dynamic templates.
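A heavily simplified sketch of this negative-example generation: the actual dynamic-templates method (Winters, 2019) chooses substitution slots and replacement words far more carefully, whereas the illustrative `swap_words` below just swaps longer words at random:

```python
import random
import re


def swap_words(joke, donor, n=2, seed=0):
    """Turn a joke into a non-joke by replacing n of its longer words
    with words taken from another joke (crude stand-in for dynamic templates)."""
    rng = random.Random(seed)
    words = joke.split()
    donor_words = [w for w in re.findall(r"\w+", donor) if len(w) > 3]
    if not donor_words:
        return joke
    # Only replace "content-like" words, here crudely: longer than 3 letters.
    positions = [i for i, w in enumerate(words) if len(re.sub(r"\W", "", w)) > 3]
    for i in rng.sample(positions, min(n, len(positions))):
        words[i] = rng.choice(donor_words)
    return " ".join(words)
```

Swapping “fish” → “men” and “tank” → “bar”, as on the slide, yields a fluent sentence that is no longer a joke yet shares almost all of its words, so word-based features can no longer separate the classes.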
18. 18
Examples of generated Dutch non-jokes
Het is groen en het is een mummie? Kermit de Waterkant
Wat is het toppunt van principe? 1) Wachten totdat een Nederlander gaat twijfelen 2) Een Zuster met een autoladder 3) Een brandwacht brandmeester met een brandmeester van 9 maanden
“Ober, kunt u die schrik uit mijn politieman halen? Want ik eet liever alleen.”
“Mijn hond is heel vreselijk: Hij schreeuwt mij iedere zus de broer.” “Maar dat is toch niet zo heel vreselijk?” “Jawel, want ik heb geen rapport!”
Wat staat er midden in het bos? De kapper.
Er loopt een super vriendelijk blondje langs een armband. Last er een toonbank: “zo, waargaan die mooie mannen heen?” Blondje: “naar de barkeeper als er niets tussen komt…”
Hoe heet de vrouw van Sinterklaas? Keukentafel.
“Twee tanden zwemmen in de zee en ze zien een stamgast op een stamgast. De ene raad zegt tegen de andere raad: ‘Hé kijk! Ons eten op een bord!’”
19. 19
Binary classification accuracy of Dutch jokes versus texts from other domains:
Model        Jokes vs News   Jokes vs Proverbs   Jokes vs Generated Jokes
Naive Bayes  51%             60%                 50%
LSTM         94%             94%                 47%
CNN          94%             94%                 47%
RobBERT      99%             96%                 89%
Jokes vs generated jokes: a much more challenging dataset!
More truthful humor detection?
20. 20
Conclusion
• Novel joke-detection dataset creation method
• Easily scales to other languages
• Illustrated humor insights of transformer models
• Strongly outperforms previous neural networks
• Created the first Dutch humor detectors
https://github.com/twinters/dutch-humor-detection
21. 21
Dutch Humor Detection by Generating Negative Examples
Thomas Winters & Pieter Delobelle
PhD students at DTAI, KU Leuven
firstname.lastname@kuleuven.be
@thomas_wint / thomaswinters.be
@pieterdelobelle / people.cs.kuleuven.be/~pieter.delobelle
Some images based on the works of dooder & alekksall on freepik.com