Project overview
• Started as my bachelor’s thesis in 2011
• Has been running ever since with few disruptions
• https://tvitediens.tk/ - go check it out
• Has its own Twitter account https://twitter.com/Twitediens
• Every day it tweets
• 5 most mentioned foods of the last 24 hours
• 5 most active users of the last 24 hours
• A random recommendation for lunch
• Twitter users occasionally interact with it
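The daily "most mentioned foods" tweet could be computed roughly as in the sketch below. This is only an illustration, not the bot's actual code: the function name `top_mentions`, the `(created_at, text)` tuple layout, and exact whitespace-token matching are all assumptions (real Latvian text is inflected, so the deployed system presumably matches keywords more loosely).

```python
from collections import Counter
from datetime import datetime, timedelta

def top_mentions(tweets, keywords, now, hours=24, n=5):
    """Return the n most mentioned keywords among tweets from the last `hours`."""
    cutoff = now - timedelta(hours=hours)
    counts = Counter()
    for created_at, text in tweets:
        if created_at < cutoff:
            continue  # older than the reporting window
        words = set(text.lower().split())
        # Count each keyword at most once per tweet.
        counts.update(k for k in keywords if k in words)
    return counts.most_common(n)

# Hypothetical sample data for illustration.
now = datetime(2021, 5, 1, 12, 0)
sample = [
    (datetime(2021, 5, 1, 9, 0), "Kafija un pankūkas"),
    (datetime(2021, 5, 1, 10, 0), "vēl viena kafija"),
    (datetime(2021, 4, 29, 10, 0), "kafija vakar"),  # outside the window
]
top = top_mentions(sample, ["kafija", "pankūkas"], now)
```

The same counting pattern would apply to the "most active users" list by counting author names instead of keywords.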
Dataset overview
https://github.com/Usprogis/Latvian-Twitter-Eater-Corpus
Domain-specific: tweets about food and eating, written in Latvian
2.4M+ tweets
• ~5,500 + 744 tweets with manually annotated sentiment (positive, neutral,
negative) for training and testing
• 744 tweets with manually annotated named entity classes of person names,
locations, organizations, food and drinks, and miscellaneous named entities, like
• ~43,000 automatically aggregated question-answer tweet pairs
• ~155,000 tweets have images
• ~165,000 have location info
Data collection
• A script uses the Twitter streaming API to listen for any of the 363
pre-defined keywords
• Records the latest tweets in a large MySQL database for storage and a
separate database for displaying data from the last 3 months online
• At the beginning of each month a scheduled script converts data from
the database into JSON format for easier processing
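The monthly database-to-JSON step can be sketched as follows. This is a minimal illustration under stated assumptions: the table name `tweets` and its columns are hypothetical, and Python's built-in `sqlite3` stands in for the MySQL database so that the example is self-contained.

```python
import json
import sqlite3

def export_month(conn, year, month):
    """Serialize all tweets from one calendar month as a JSON string."""
    prefix = f"{year:04d}-{month:02d}"
    rows = conn.execute(
        "SELECT id, created_at, text FROM tweets "
        "WHERE substr(created_at, 1, 7) = ? ORDER BY id",
        (prefix,),
    ).fetchall()
    records = [{"id": r[0], "created_at": r[1], "text": r[2]} for r in rows]
    # ensure_ascii=False keeps Latvian diacritics readable in the output.
    return json.dumps(records, ensure_ascii=False, indent=2)

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets (id INTEGER PRIMARY KEY, created_at TEXT, text TEXT)"
)
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?, ?)",
    [
        (1, "2021-03-30 18:05:00", "Šodien pankūkas"),
        (2, "2021-04-02 08:10:00", "Rīta kafija"),
    ],
)
march = export_month(conn, 2021, 3)
```

A scheduled job would call such a function once at the start of each month and write the returned string to a file.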
Experiments
• Sentiment analysis – about 5,500 tweets annotated for training and
744 as a test dataset
• Named entity recognition – the same 744 tweets annotated with
place, person, food, time, and misc entities
• Question answering – about 19,000 tweets that express questions,
along with any replies to them, make up about 43,000 question-answer
tweet pairs
• Multimodal experiments – about 155,000 tweets have images, but
experiments are still in progress with no outstanding results just yet
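One simple way to aggregate question-answer pairs like those described above is sketched below. It is an assumption-laden illustration, not the corpus pipeline itself: the field names `id`, `text`, and `in_reply_to` are hypothetical, and a tweet is treated as a question simply if it contains a question mark.

```python
def build_qa_pairs(tweets):
    """Pair each question tweet with every direct reply to it."""
    by_id = {t["id"]: t for t in tweets}
    pairs = []
    for t in tweets:
        parent = by_id.get(t.get("in_reply_to"))
        # Assumed heuristic: a tweet is a question if it contains "?".
        if parent is not None and "?" in parent["text"]:
            pairs.append((parent["text"], t["text"]))
    return pairs

# Hypothetical sample thread for illustration.
thread = [
    {"id": 1, "text": "Kur paēst pusdienas?", "in_reply_to": None},
    {"id": 2, "text": "Pamēģini zupu", "in_reply_to": 1},
    {"id": 3, "text": "Garšīgi", "in_reply_to": 2},  # reply to a non-question
]
qa_pairs = build_qa_pairs(thread)
```

Because one question can receive several replies, the number of pairs can exceed the number of question tweets, which is consistent with about 19,000 questions yielding about 43,000 pairs.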
How to determine sentiment?
It was difficult to agree on the sentiment of some tweets
Consider these:
• “Batars tak arī viņus ēda paļube tgd mums no 9 izlabos uz 3 :D”
• “Batars was also eating them and now our grades will be marked from 9 to 3 :D”
• “Ja vēlies pazaudēt pāris kilogramus, izrauj savus zobus! Tad arī turpmāk būs
grūti apēst parāk daudz”
• “If you want to lose weight, just pull out your teeth! Then it is going to be
difficult to eat too much”
Relations to smell, taste and temperature
Food/Drink     Pleasant smell     Food/Drink     Bad smell
Tea            598                Meat           156
Chocolate      408                Tea            96
Coffee         386                Fish           93
Gingerbread    293                Cheese         67
Tangerines     244                Garlic         65
Strawberries   227                Coffee         57
Apple          220                Potatoes       48
Meat           183                Egg            39
Potatoes       142                Onion          38
Pancakes       142                Chocolate      37
Question answering
• We performed a small-scale human evaluation by asking 5 annotators
to evaluate a random 10% of the evaluation set, marking each
generated answer as either OK or not good (NG).
• The evaluators marked 46.40% of answers as OK.
• The evaluators had an overall agreement of 66.27% (free-marginal
kappa 0.33), which indicates moderate agreement.
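The reported kappa can be checked against the overall agreement. Free-marginal kappa fixes chance agreement at 1/k for k categories; assuming the two categories here are OK and NG (k = 2), the figures above are consistent, as this short sketch shows. The function name is illustrative.

```python
def free_marginal_kappa(observed_agreement, n_categories):
    """Free-marginal kappa: chance agreement is fixed at 1 / n_categories."""
    chance = 1.0 / n_categories
    return (observed_agreement - chance) / (1.0 - chance)

# Overall agreement of 66.27% with two categories (OK / NG).
kappa = free_marginal_kappa(0.6627, 2)
print(round(kappa, 2))  # 0.33
```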
Pancakes during the week / time of day
Monday Tuesday Wednesday Thursday Friday Saturday Sunday
Morning 1,107 1,128 1,122 1,049 1,221 1,617 1,887
Afternoon 2,122 2,071 2,015 2,030 2,236 2,704 3,410
Evening 2,133 2,171 2,096 2,044 1,810 1,856 2,515
Night 615 603 609 601 588 583 668
Salad during the week / time of day
Monday Tuesday Wednesday Thursday Friday Saturday Sunday
Morning 3,613 3,679 3,521 3,399 3,265 1,148 1,146
Afternoon 3,628 3,219 3,071 3,057 2,658 2,630 2,970
Evening 2,838 2,852 2,682 2,619 2,187 2,241 2,696
Night 923 904 908 883 776 678 899
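Tables like the two above can be produced by bucketing tweet timestamps by weekday and part of day. The sketch below shows one way to do that; the hour boundaries for morning, afternoon, evening, and night are assumptions chosen for illustration, since the slides do not state the ones actually used.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def part_of_day(hour):
    # Boundaries are assumptions for illustration only.
    if 6 <= hour < 12:
        return "Morning"
    if 12 <= hour < 18:
        return "Afternoon"
    if 18 <= hour < 24:
        return "Evening"
    return "Night"

def bucket_counts(timestamps):
    """Count timestamps per (weekday name, part of day)."""
    counts = Counter()
    for ts in timestamps:
        counts[(WEEKDAYS[ts.weekday()], part_of_day(ts.hour))] += 1
    return counts

# Hypothetical timestamps of tweets mentioning one food keyword.
stamps = [
    datetime(2021, 5, 2, 9, 0),    # Sunday morning
    datetime(2021, 5, 2, 10, 30),  # Sunday morning
    datetime(2021, 5, 3, 20, 0),   # Monday evening
    datetime(2021, 5, 4, 2, 15),   # Tuesday night
]
table = bucket_counts(stamps)
```

Running this once per food keyword gives one weekday-by-time-of-day grid per food.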
Some conclusions so far, more to come
Large-scale social network data can help us better understand the relationship between
people and food, and form strategies and tactics for nudging towards healthier (but not
necessarily less tasty) food behavior
By researching food-related behavior on social media we can move from fragmented and
valuable data to a better understanding of food choice and the sentiment
associated with it
Our research results reveal that negative sentiment expressed about meat on Twitter is
rising steadily, although a large share of tweets remain neutral
Neutrality has, however, decreased sharply since the beginning of the Covid-19 pandemic
Publications
• Sproģis, U., Rikters, M. (2020). What Can We Learn From Almost a Decade of
Food Tweets. In The 9th Conference on Human Language Technologies - the Baltic
Perspective.
• Kāle, M., Rikters, M. (2021). Fragmented and Valuable: Following Sentiment
Changes in Food Tweets. In Smell, Taste, and Temperature Interfaces. ACM CHI
2021 workshop.
• Kāle, M., Rikters, M., Šķilters, J. (2021). Tracing Multisensory Food Experience on
Twitter. In review for International Journal of Food Design.
All on GitHub
• Website - https://github.com/M4t1ss/TwitEdiens
• Main corpus - https://github.com/Usprogis/Latvian-Twitter-Eater-Corpus
• NER corpus - https://github.com/RinaldsViksna/Latvian-food-NER-corpus
• Sentiment analysis - https://github.com/M4t1ss/sentiment-analysis-toolkit
• Processing scripts - https://github.com/M4t1ss/Latvian-Twitter-Eater-Corpus-Processing