What makes a tweet relevant for a topic?
                                     #MSM2012 Lyon
                                     April 16th, 2012




     Ke Tao, Fabian Abel, Claudia Hauff, Geert-Jan Houben
                      Web Information Systems, TU Delft



Get information from Twitter
How do people use Twitter as a source of information?

• Twitter is used more like a news medium.


• How do people search Twitter?
   • Search on Twitter (Teevan et al.)



                              Web Search    Twitter Search
     Query length (chars)        18.80          12.00
     Query length (words)         3.08           1.64
     Is a celebrity name          3.11%         15.22%



Research Questions
What are the challenges we are facing?

 1. Given a topic, can we identify the relevant tweets
 based on the characteristics of the tweets?


 2. Are semantics meaningful for determining the tweets’
 relevance for a topic?




Search on Twitter
Traditional solution


• Twitter search interface

• Ordered by time

• Keyword-based match




Syntactical feature : hasHashtag
Is a tweet more relevant if it contains a #hashtag?

  Hypothesis 1: tweets that contain hashtags
  are more likely to be relevant than tweets
  that do not contain hashtags.




Syntactical feature : hasURL
Is a tweet that contains a URL more relevant?

  Hypothesis 2: tweets that contain a URL are
  more likely to be relevant than tweets that
  do not contain a URL.




Syntactical feature : isReply
Is a tweet which is a reply to @somebody more relevant?

  Hypothesis 3: tweets that are formulated as
  a reply to another tweet are less likely to be
  relevant than other tweets.




Syntactical feature : length
Does the length of a tweet influence its relevance for a topic?




  Hypothesis 4: the longer a tweet, the more
  likely it is to be relevant and interesting.
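The four syntactical features above can be read directly off the tweet text. A minimal sketch, assuming plain-text tweets: the function name `syntactic_features` and the regular expressions are illustrative assumptions, not the authors' implementation. Here a tweet counts as a reply when it starts with an @mention, and length is measured in characters.

```python
import re

# Illustrative patterns (assumptions, not taken from the slides).
HASHTAG_RE = re.compile(r"#\w+")
URL_RE = re.compile(r"https?://\S+")
REPLY_RE = re.compile(r"^\s*@\w+")


def syntactic_features(text: str) -> dict:
    """Return the four topic-insensitive syntactical features of a tweet."""
    return {
        "hasHashtag": bool(HASHTAG_RE.search(text)),
        "hasURL": bool(URL_RE.search(text)),
        "isReply": bool(REPLY_RE.match(text)),
        "length": len(text),
    }
```

None of these features looks at the query, which is why they sit on the topic-insensitive side.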



Overview of features
 Short summary




    Topic sensitive              Topic insensitive
     Keyword-based relevance      Syntactical features

   Are there further features that
   allow for estimating the relevance?


Semantic features
Find semantics in a tweet to estimate the relevance


  [Figure: example tweet annotated with the DBpedia entities
   dbp:Tim_Berners-Lee, dbp:World_Wide_Web, dbp:France, dbp:Lyon and
   dbp:International_World_Wide_Web_Conference]




Semantic features : #entity
Is a tweet with more entities more interesting?


• 5 entities extracted.




  Hypothesis 5: the more entities a tweet
  mentions, the more likely it is to be relevant
  and interesting.
Semantic features : #entity(type)
Do the types of semantics influence the relevance?


• Person: 1
• Artifact: 1
• Event: 1
• Location: 2




  Hypothesis 6: different types of entities are
  of different importance for estimating the
  relevance of a tweet.
Semantic features : diversity
How many types are there in the entities?


• 4 types of entities




  Hypothesis 7: the greater the diversity of
  concepts mentioned in a tweet, the more
  likely it is to be interesting and relevant.
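Hypotheses 5 to 7 become features once entities have been extracted. A minimal sketch, assuming an upstream entity extractor that returns (entity, type) pairs: the type assigned to each entity below is a guess chosen to match the counts on the slides (Person 1, Artifact 1, Event 1, Location 2), and `semantic_features` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical extractor output for the example tweet.
# The type assigned to each entity is an assumption for illustration.
ENTITIES = [
    ("dbp:Tim_Berners-Lee", "Person"),
    ("dbp:World_Wide_Web", "Artifact"),
    ("dbp:International_World_Wide_Web_Conference", "Event"),
    ("dbp:France", "Location"),
    ("dbp:Lyon", "Location"),
]


def semantic_features(entities):
    """Count entities, entities per type, and distinct types (diversity)."""
    per_type = Counter(entity_type for _, entity_type in entities)
    return {
        "#entities": len(entities),
        "#entities(type)": dict(per_type),
        "diversity": len(per_type),
    }
```

For the example, this gives 5 entities spread over 4 types.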
Semantic features : sentiment
Was the author of the tweet happy or not?


• Sentiment: Neutral




  Hypothesis 8: the likelihood of a tweet’s
  relevance is influenced by its sentiment
  polarity.
Overview of features
What do we have now?




   Topic sensitive             Topic insensitive
    Keyword-based                 Syntactical
          ?                       Semantics

   Semantics in the given topic?



Semantic-based relevance
Expand the queries to match more tweets.




                                      dbp:Lyon

                dbp:Tim_Berners-Lee



  The reformulated query is expected to yield a
  more accurate retrieval score.
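One way to picture the expansion step: terms linked to the entities found in the query are appended to the original keywords. This is only a sketch under stated assumptions. The lookup table `EXPANSION_TERMS` and the helper `expand_query` are hypothetical, and the slides do not say how the expansion terms are actually chosen.

```python
# Hypothetical lookup from an entity to extra search terms (assumption).
EXPANSION_TERMS = {
    "dbp:Tim_Berners-Lee": ["tim", "berners-lee"],
    "dbp:Lyon": ["lyon"],
}


def expand_query(keywords, query_entities, lookup=EXPANSION_TERMS):
    """Append terms linked to the query's entities, skipping terms already present."""
    expanded = [keyword.lower() for keyword in keywords]
    for entity in query_entities:
        for term in lookup.get(entity, []):
            if term not in expanded:
                expanded.append(term)
    return expanded
```

The expanded keyword list can then be scored by the same keyword-based retrieval function as the original query.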

Semantic-based relatedness
     Is there a semantic overlap between the query and the tweet?




                                         dbp:Lyon
dbp:Tim_Berners-Lee
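The overlap question can be sketched as a set intersection between the entities of the query and those of the tweet. The slides do not state how the overlap is scored, so counting shared entities is an assumption, and `semantic_relatedness` is a hypothetical name.

```python
def semantic_relatedness(query_entities, tweet_entities):
    """Return the number of entities shared by the query and the tweet."""
    return len(set(query_entities) & set(tweet_entities))
```

With the query entities dbp:Lyon and dbp:Tim_Berners-Lee and a tweet that mentions both, the score is 2; a tweet sharing neither scores 0.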




Overview of features
By now, we have 4 types of features.




    Topic sensitive              Topic insensitive
     Keyword-based                  Syntactical
    Semantic-based                  Semantics



       Can we utilize the contextual
          information of tweets?
Topic insensitive contextual features
Does the number of posts influence the relatedness?




    Hypothesis 9: the higher the number of
    tweets that have been published by the
    creator of a tweet, the more likely it is that
    the tweet is relevant.

Topic-sensitive contextual features
Is this tweet too old?

 Hypothesis 10: the lower the temporal
 distance between the query time and the
 creation time of a tweet, the more likely the
 tweet is to be relevant to the topic.




  [Timeline: tweet T created on March 31, query issued on April 16]
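The temporal feature is the gap between the query time and the tweet's creation time. A minimal sketch using the two dates from the timeline on this slide: the year 2012 and the choice of days as the unit are assumptions, and `temporal_distance_days` is a hypothetical helper.

```python
from datetime import datetime


def temporal_distance_days(tweet_time: datetime, query_time: datetime) -> float:
    """Days elapsed between tweet creation and the query (negative if the tweet is newer)."""
    return (query_time - tweet_time).total_seconds() / 86400.0


# Dates from the slide's timeline; the year 2012 is an assumption.
tweet_time = datetime(2012, 3, 31)
query_time = datetime(2012, 4, 16)
print(temporal_distance_days(tweet_time, query_time))  # 16.0
```

Under Hypothesis 10, a smaller value should make the tweet more likely to be relevant.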

Summary of Features
The features




    Topic sensitive             Topic insensitive
     Keyword-based                 Syntactical
    Semantic-based                 Semantics
       Contextual                  Contextual




Analysis

• Research Questions:
  1. Which features are more influential in predicting the
     relatedness of a tweet to a certain topic?
  2. Which types of features are more important? Are
     semantics meaningful?
  3. What’s the performance that we can achieve by utilizing
     these features?

• Experimental Setup
  • Consider the search problem as a classification task
  • Classification algorithm = Logistic Regression
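Treating search as classification means each (topic, tweet) pair becomes a feature vector with a relevant or non-relevant label. A minimal sketch in plain Python: the slides only name logistic regression, so the training procedure (batch gradient descent), the learning rate, the number of epochs and the toy data are all assumptions.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            z = sum(w * f for w, f in zip(weights, features)) + bias
            error = sigmoid(z) - label
            for j, f in enumerate(features):
                grad_w[j] += error * f
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


def predict(weights, bias, features, threshold=0.5):
    """Classify a (topic, tweet) feature vector as relevant (1) or not (0)."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if sigmoid(z) >= threshold else 0


# Made-up toy data: [keyword relevance score, hasURL]; not from the TREC corpus.
X = [[0.9, 1.0], [0.8, 1.0], [0.7, 0.0], [0.2, 0.0], [0.1, 1.0], [0.3, 0.0]]
y = [1, 1, 1, 0, 0, 0]
weights, bias = train_logistic_regression(X, y)
```

The learned weights play the same role as the per-feature weights analysed later in the deck: a positive weight pushes a pair towards "relevant", a negative one away from it.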



Dataset
From TREC 2011 Microblog Track


• Twitter corpus
   • 16 million tweets (Jan. 24th – Feb. 8th, 2011)
   • 4,766,901 tweets classified as English
   • 6.2 million entity-extractions (140k distinct entities)

• Relevance judgments
   • 49 topics
   • 40,855 (topic, tweet) pairs
   • 60.31 relevant tweets per topic (on average)




Results
Which type of features matters?


Features              Precision    Recall    F-measure
keyword relevance       0.3040     0.2924     0.2981
semantic relevance      0.3053     0.2931     0.2991
topic-sensitive         0.3017     0.3419     0.3206
topic-insensitive       0.1294     0.0170     0.0300
without semantics       0.3363     0.4828     0.3965
all features            0.3674     0.4736     0.4138

 Overall, applying all features yields a
 precision of over 35% and a recall of over
 45%.
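The F-measure column is consistent with the usual harmonic mean of precision and recall (F1). A short sketch, assuming that definition, checked against the "all features" row:

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall of the "all features" row in the results table.
print(round(f_measure(0.3674, 0.4736), 4))  # 0.4138
```

Because the inputs are already rounded to four decimals, recomputing other rows can differ from the table in the last digit.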

Weights of features
 Which feature matters?

  [Figure: bar charts of feature weights, y-axis from -1 to 2, one panel per feature type:
   Keyword-based: keyword-based relevance
   Semantic-based: relevance, relatedness
   Syntactical: hasHashtag, hasURL, isReply, length
   Semantics: #entities, diversity, sentiment
   Contextual: social context, temporal context]

Conclusions

1. We constructed 17 features, including keyword-based
   relevance, semantics-based relevance, syntactical features,
   semantic features and contextual features.

2. Semantic features and topic-sensitive features are
   meaningful.

3. The contextual features have little impact on the
   prediction.




Future work

• We plan to implement a search engine for Twitter based on
  the work done in this paper.
   • Twinder (Tweets Finder)

• Progress on this work can be found at:
        http://wis.ewi.tudelft.nl/twinder/




QUESTIONS?



April 16th, 2012       Slides : http://goo.gl/fQLaQ

k.tao@tudelft.nl       http://ktao.nl/


THANK YOU!



More Related Content

Similar to What makes a tweet relevant for a topic?

Conversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems DesignConversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems DesignCommunitySense
 
Vector Search for Data Scientists.pdf
Vector Search for Data Scientists.pdfVector Search for Data Scientists.pdf
Vector Search for Data Scientists.pdfConnorShorten2
 
The Actionable Guide to Doing Better Semantic Keyword Research #BrightonSEO (...
The Actionable Guide to Doing Better Semantic Keyword Research #BrightonSEO (...The Actionable Guide to Doing Better Semantic Keyword Research #BrightonSEO (...
The Actionable Guide to Doing Better Semantic Keyword Research #BrightonSEO (...Paul Shapiro
 
Twinder: A Search Engine for Twitter Streams
Twinder: A Search Engine for Twitter Streams Twinder: A Search Engine for Twitter Streams
Twinder: A Search Engine for Twitter Streams Ke Tao
 
Tutorial of Sentiment Analysis
Tutorial of Sentiment AnalysisTutorial of Sentiment Analysis
Tutorial of Sentiment AnalysisFabio Benedetti
 
Social Media Algorithms 2020 (Facebook, Twitter, LinkedIn, Instagram, and You...
Social Media Algorithms 2020 (Facebook, Twitter, LinkedIn, Instagram, and You...Social Media Algorithms 2020 (Facebook, Twitter, LinkedIn, Instagram, and You...
Social Media Algorithms 2020 (Facebook, Twitter, LinkedIn, Instagram, and You...David Clemen
 
Social Commerce Exchange Presentation
Social Commerce Exchange PresentationSocial Commerce Exchange Presentation
Social Commerce Exchange PresentationJordan Kasteler
 
Kevin Lee- The New Online Marketing
Kevin Lee- The New Online MarketingKevin Lee- The New Online Marketing
Kevin Lee- The New Online MarketingDidit Marketing
 
Twitter - Measuring Reach and Engagement
Twitter - Measuring Reach and EngagementTwitter - Measuring Reach and Engagement
Twitter - Measuring Reach and EngagementSarah Baughman
 
Beyond Hashtags: Collecting and Analysing Conversations on Twitter
Beyond Hashtags: Collecting and Analysing Conversations on TwitterBeyond Hashtags: Collecting and Analysing Conversations on Twitter
Beyond Hashtags: Collecting and Analysing Conversations on TwitterBrenda Moon
 
Make the Most of Your Station's Facebook and Twitter Pages
Make the Most of Your Station's Facebook and Twitter PagesMake the Most of Your Station's Facebook and Twitter Pages
Make the Most of Your Station's Facebook and Twitter PagesEric Athas
 
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...TAUS - The Language Data Network
 
Citation Management
Citation ManagementCitation Management
Citation ManagementJohn Pell
 
Unsexy, compliant and boring? The 5 missed opportunities of B2B content marke...
Unsexy, compliant and boring? The 5 missed opportunities of B2B content marke...Unsexy, compliant and boring? The 5 missed opportunities of B2B content marke...
Unsexy, compliant and boring? The 5 missed opportunities of B2B content marke...Sticky Content
 
BEST PRACTICE: Unsexy, compliant and boring? 5 missed opportunities of b2b co...
BEST PRACTICE: Unsexy, compliant and boring? 5 missed opportunities of b2b co...BEST PRACTICE: Unsexy, compliant and boring? 5 missed opportunities of b2b co...
BEST PRACTICE: Unsexy, compliant and boring? 5 missed opportunities of b2b co...B2B Marketing
 
Advanced Social Media by LightBox Collaborative
Advanced Social Media by LightBox CollaborativeAdvanced Social Media by LightBox Collaborative
Advanced Social Media by LightBox Collaborativepcgak
 
Twitter in efl classroom
Twitter in efl classroomTwitter in efl classroom
Twitter in efl classroomHala Fawzi
 
A2 revision guide section acopy
A2 revision guide section acopyA2 revision guide section acopy
A2 revision guide section acopyjphibbert1979
 
How to Find Link Love
How to Find Link LoveHow to Find Link Love
How to Find Link LoveMarqui CMS
 
Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Marieke van Erp
 

Similar to What makes a tweet relevant for a topic? (20)

Conversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems DesignConversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems Design
 
Vector Search for Data Scientists.pdf
Vector Search for Data Scientists.pdfVector Search for Data Scientists.pdf
Vector Search for Data Scientists.pdf
 
The Actionable Guide to Doing Better Semantic Keyword Research #BrightonSEO (...
The Actionable Guide to Doing Better Semantic Keyword Research #BrightonSEO (...The Actionable Guide to Doing Better Semantic Keyword Research #BrightonSEO (...
The Actionable Guide to Doing Better Semantic Keyword Research #BrightonSEO (...
 
Twinder: A Search Engine for Twitter Streams
Twinder: A Search Engine for Twitter Streams Twinder: A Search Engine for Twitter Streams
Twinder: A Search Engine for Twitter Streams
 
Tutorial of Sentiment Analysis
Tutorial of Sentiment AnalysisTutorial of Sentiment Analysis
Tutorial of Sentiment Analysis
 
Social Media Algorithms 2020 (Facebook, Twitter, LinkedIn, Instagram, and You...
Social Media Algorithms 2020 (Facebook, Twitter, LinkedIn, Instagram, and You...Social Media Algorithms 2020 (Facebook, Twitter, LinkedIn, Instagram, and You...
Social Media Algorithms 2020 (Facebook, Twitter, LinkedIn, Instagram, and You...
 
Social Commerce Exchange Presentation
Social Commerce Exchange PresentationSocial Commerce Exchange Presentation
Social Commerce Exchange Presentation
 
Kevin Lee- The New Online Marketing
Kevin Lee- The New Online MarketingKevin Lee- The New Online Marketing
Kevin Lee- The New Online Marketing
 
Twitter - Measuring Reach and Engagement
Twitter - Measuring Reach and EngagementTwitter - Measuring Reach and Engagement
Twitter - Measuring Reach and Engagement
 
Beyond Hashtags: Collecting and Analysing Conversations on Twitter
Beyond Hashtags: Collecting and Analysing Conversations on TwitterBeyond Hashtags: Collecting and Analysing Conversations on Twitter
Beyond Hashtags: Collecting and Analysing Conversations on Twitter
 
Make the Most of Your Station's Facebook and Twitter Pages
Make the Most of Your Station's Facebook and Twitter PagesMake the Most of Your Station's Facebook and Twitter Pages
Make the Most of Your Station's Facebook and Twitter Pages
 
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...
 
Citation Management
Citation ManagementCitation Management
Citation Management
 
Unsexy, compliant and boring? The 5 missed opportunities of B2B content marke...
Unsexy, compliant and boring? The 5 missed opportunities of B2B content marke...Unsexy, compliant and boring? The 5 missed opportunities of B2B content marke...
Unsexy, compliant and boring? The 5 missed opportunities of B2B content marke...
 
BEST PRACTICE: Unsexy, compliant and boring? 5 missed opportunities of b2b co...
BEST PRACTICE: Unsexy, compliant and boring? 5 missed opportunities of b2b co...BEST PRACTICE: Unsexy, compliant and boring? 5 missed opportunities of b2b co...
BEST PRACTICE: Unsexy, compliant and boring? 5 missed opportunities of b2b co...
 
Advanced Social Media by LightBox Collaborative
Advanced Social Media by LightBox CollaborativeAdvanced Social Media by LightBox Collaborative
Advanced Social Media by LightBox Collaborative
 
Twitter in efl classroom
Twitter in efl classroomTwitter in efl classroom
Twitter in efl classroom
 
A2 revision guide section acopy
A2 revision guide section acopyA2 revision guide section acopy
A2 revision guide section acopy
 
How to Find Link Love
How to Find Link LoveHow to Find Link Love
How to Find Link Love
 
Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia
 

Recently uploaded

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 

Recently uploaded (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

What makes a tweet relevant for a topic?

  • 1. What makes a tweet relevant for a topic? #MSM2012 Lyon April 16th, 2012 Ke Tao, Fabian Abel, Claudia Hauff, Geert-Jan Houben Web Information Systems, TU Delft What makes a tweet relevant for a topic? 1
  • 2. Get information from Twitter How do people use Twitter as a source of information? • Twitter is more like a news media. • How do people search Twitter? • Search on Twitter (Teevan et al.) Web Search Twitter Search Query length (chars) 18.80 12.00 Query length (words) 3.08 1.64 Is a celebrity name 3.11% 15.22% What makes a tweet relevant for a topic? 2
  • 3. Research Questions What are the challenges we are facing? 1. Given a topic, can we identify the relevant tweets based on the characteristics of the tweets? 2. Are semantics meaningful for determining the tweets’ relevance for a topic? What makes a tweet relevant for a topic? 3
  • 4. Search on Twitter Traditional solution • Twitter search interface • Ordered by time • Keyword-based match What makes a tweet relevant for a topic? 4
  • 5. Syntactical feature : Hashtag Is a tweet more relevant if it contains a #hashtag? Hypothesis 1: tweets that contain hashtags are more likely to be relevant than tweets that do not contain hashtags. What makes a tweet relevant for a topic? 5
  • 6. Syntactical feature : hasURL Is a tweet that contains a URL more relevant? Hypothesis 2: tweets that contain a URL are more likely to be relevant than tweets that do not contain a URL. What makes a tweet relevant for a topic? 6
  • 7. Syntactical feature : isReply Is a tweet which is a reply to @somebody more relevant? Hypothesis 3: tweets that are formulated as a reply to another tweet are less likely to be relevant than other tweets. What makes a tweet relevant for a topic? 7
  • 8. Syntactical feature : length Does the length of a tweet influence its relevance for a topic? Hypothesis 4: the longer a tweet, the more likely it is to be relevant and interesting. What makes a tweet relevant for a topic? 8
  • 9. Overview of features Short summary Topic sensitive Topic insensitive Keyword-based Syntactical features relevance Are there further features that allow for estimating the relevance? What makes a tweet relevant for a topic? 9
  • 10. Semantic features Find semantics in a tweet to estimate the relevance dbp:Tim_Berners-Lee dbp:World_Wide_Web dbp:France dbp:Lyon dbp:International_World_Wide_Web_Conference What makes a tweet relevant for a topic? 10
  • 11. Semantic features : #entity Is a tweet with more entities more interesting? •5 entities extracted. Hypothesis 5: the more entities a tweet mentions, the more likely it is to be relevant and interesting. What makes a tweet relevant for a topic? 11
  • 12. Semantic features : #entity(type) Do the types of semantics influence the relevance? •Person : 1 •Artifact : 1 •Event : 1 •Location : 2 Hypothesis 6: different types of entities are of different importance for estimating the relevance of a tweet. What makes a tweet relevant for a topic? 12
  • 13. Semantic features : diversity How many types are there in the entities? •4 types of entities Hypothesis 7: the greater the diversity of concepts mentioned in a tweet, the more likely it is to be interesting and relevant. What makes a tweet relevant for a topic? 13
  • 14. Semantic features : sentiment Was the author of the tweet happy or not? •Sentiment : Neutral Hypothesis 8: the likelihood of a tweet’s relevance is influenced by its sentiment polarity. What makes a tweet relevant for a topic? 14
  • 15. Overview of features What do we have now? Topic sensitive: keyword-based, ? Topic insensitive: syntactical, semantics. Semantics in the given topic? What makes a tweet relevant for a topic? 15
  • 16. Semantic-based relevance Expand the queries to match more tweets. dbp:Lyon dbp:Tim_Berners-Lee Reformulated query is expected to get a more accurate retrieval score. What makes a tweet relevant for a topic? 16
  • 17. Semantic-based relatedness Is there a semantic overlap between the query and the tweet? dbp:Lyon dbp:Tim_Berners-Lee What makes a tweet relevant for a topic? 17
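One simple way to test for a semantic overlap is to intersect the entity sets of the query and the tweet. The sketch below assumes that dbp:Lyon and dbp:Tim_Berners-Lee are the entities extracted from the query; the function name and the choice of a raw overlap count are illustrative and may differ from the measure used in the paper.

```python
def semantic_relatedness(query_entities, tweet_entities):
    """Count the entities shared by the query and the tweet."""
    shared = set(query_entities) & set(tweet_entities)
    return {"overlap": len(shared), "is_related": bool(shared)}


# Assumed query entities (taken from the slide) and tweet entities.
query = ["dbp:Lyon", "dbp:Tim_Berners-Lee"]
tweet = [
    "dbp:Tim_Berners-Lee",
    "dbp:World_Wide_Web",
    "dbp:France",
    "dbp:Lyon",
    "dbp:International_World_Wide_Web_Conference",
]

relatedness = semantic_relatedness(query, tweet)
```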
  • 18. Overview of features By now, we have 4 types of features. Topic sensitive: keyword-based, semantic-based. Topic insensitive: syntactical, semantics. Can we utilize the contextual information of tweets? What makes a tweet relevant for a topic? 18
  • 19. Topic insensitive contextual features Does the number of posts influence the relatedness? Hypothesis 9: the higher the number of tweets that have been published by the creator of a tweet, the more likely it is that the tweet is relevant. What makes a tweet relevant for a topic? 19
  • 20. Topic-sensitive contextual features Is this tweet too old? Hypothesis 10: the lower the temporal distance between the query time and the creation time of a tweet, the more likely the tweet is to be relevant to the topic. T Query March 31 April 16 What makes a tweet relevant for a topic? 20
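The temporal distance in Hypothesis 10 can be illustrated with the two dates on the slide. The sketch assumes the tweet (T) was created on March 31 and the query was issued on April 16; the year 2012 and the unit (days) are also assumptions.

```python
from datetime import datetime


def temporal_distance_days(query_time, tweet_time):
    """Temporal distance between query time and tweet creation time, in days."""
    return (query_time - tweet_time).total_seconds() / 86400.0


# Dates from the slide; the year is an assumption.
tweet_created = datetime(2012, 3, 31)
query_issued = datetime(2012, 4, 16)

distance = temporal_distance_days(query_issued, tweet_created)
```

Under these assumptions the distance is 16 days; a smaller value would make the tweet more likely to be relevant according to the hypothesis.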
  • 21. Summary of Features The features. Topic sensitive: keyword-based, semantic-based, contextual. Topic insensitive: syntactical, semantics, contextual. What makes a tweet relevant for a topic? 21
  • 22. Analysis • Research Questions: 1. Which features are more influential on predicting the relatedness of a tweet to a certain topic? 2. Which types of features are more important? Are semantics meaningful? 3. What’s the performance that we can achieve by utilizing these features? • Experimental Setup • Consider the search problem as a classification task • Classification algorithm = Logistic Regression What makes a tweet relevant for a topic? 22
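Treating search as a classification task means that each (topic, tweet) pair becomes a feature vector with a relevant/non-relevant label. The following is a minimal sketch of logistic regression trained by batch gradient descent in plain Python. The two features, the toy data, the learning rate, and the number of epochs are all invented for illustration; the study's actual tooling and settings are not given in the slides.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_relevance(w, b, x):
    """Estimated probability that a (topic, tweet) pair is relevant."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy data: [keyword-based relevance score, hasURL].
X = [[0.9, 1], [0.8, 1], [0.7, 0], [0.2, 0], [0.1, 1], [0.3, 0]]
y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic_regression(X, y)
p_relevant = predict_relevance(weights, bias, [0.9, 1])
p_irrelevant = predict_relevance(weights, bias, [0.1, 0])
```

A side benefit of logistic regression is that the learned weights can be inspected per feature, which is what the later "Weights of features" slide reports.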
  • 23. Dataset From TREC 2011 Microblog Track • Twitter corpus • 16 million tweets (Jan. 24th, 2011 – Feb. 8th) • 4,766,901 tweets classified as English • 6.2 million entity-extractions (140k distinct entities) • Relevance judgments • 49 topics • 40,855 (topic, tweet) pairs • 60.31 relevant tweets per topic (on average) What makes a tweet relevant for a topic? 23
  • 24. Results Which type of features matters?

        Features             Precision   Recall   F-measure
        keyword relevance    0.3040      0.2924   0.2981
        semantic relevance   0.3053      0.2931   0.2991
        topic-sensitive      0.3017      0.3419   0.3206
        topic-insensitive    0.1294      0.0170   0.0300
        without semantics    0.3363      0.4828   0.3965
        all features         0.3674      0.4736   0.4138

    Overall, we can achieve a precision of over 35% and a recall of over 45% by applying all the features. What makes a tweet relevant for a topic? 24
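The F-measure column can be cross-checked against the precision and recall columns, assuming it is the balanced F1 score (the harmonic mean of precision and recall). The slides do not state the formula, so this is an assumption; the numbers below are copied from the results slide.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) per feature set, from the slide.
results = {
    "keyword relevance": (0.3040, 0.2924, 0.2981),
    "semantic relevance": (0.3053, 0.2931, 0.2991),
    "topic-sensitive": (0.3017, 0.3419, 0.3206),
    "topic-insensitive": (0.1294, 0.0170, 0.0300),
    "without semantics": (0.3363, 0.4828, 0.3965),
    "all features": (0.3674, 0.4736, 0.4138),
}

# Absolute difference between recomputed and reported F-measure.
deviations = {
    name: abs(f_measure(p, r) - reported)
    for name, (p, r, reported) in results.items()
}
```

The recomputed values agree with the reported ones to within rounding of the four-digit inputs, which supports the F1 reading.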
  • 25. Weights of features Which feature matters? [Figure: bar charts of feature weights, vertical axis from -1 to 2, one panel per feature type. Keyword-based: keyword-based relevance. Syntactical: hasHashtag, hasURL, isReply, length. Semantic-based: relevance, relatedness. Semantics: #entities, diversity, sentiment. Contextual: social context, temporal context.] What makes a tweet relevant for a topic? 25
  • 26. Conclusions 1. We constructed 17 features, including keyword-based relevance, semantics-based relevance, syntactical features, semantic features and contextual features. 2. Semantic features and topic-sensitive features are meaningful. 3. The contextual features have little impact on the prediction. What makes a tweet relevant for a topic? 26
  • 27. Future work • We plan to implement a search engine for Twitter based on the work done in this paper. • Twinder (Tweets Finder) • The progress on this work can be found at: http://wis.ewi.tudelft.nl/twinder/ What makes a tweet relevant for a topic? 27
  • 28. QUESTIONS? April 16th, 2012 Slides : http://goo.gl/fQLaQ k.tao@tudelft.nl http://ktao.nl/ THANK YOU! What makes a tweet relevant for a topic? 28

Editor's Notes

  1. Greetings… Self-introduction: I’m a 2nd-year PhD student in the Web Information Systems group at TU Delft. Today, I will report on our work titled “What makes a tweet relevant for a topic?” The general purpose: we want to provide a better experience for users who are seeking information on Twitter.
  2. In 2010, a paper suggested that Twitter is more like a news medium than a social network, because 85% of the tweets are related to the news. Here is a picture which shows how many tweets were posted per second when several big events occurred, such as the Japanese earthquake, the U.K. royal wedding, the raid on Osama bin Laden, and the UEFA Champions League. Twitter is a valuable source of information, particularly for news. As a large amount of information is being generated on Twitter, how can we retrieve the interesting or relevant items for our needs? The straightforward solution is to search, just as you do several times per day on Google. But Teevan et al. tell us that search behavior on Twitter is quite different. For example: (i) queries are shorter; (ii) special expressions are used, such as hashtags; (iii) queries mention celebrities or names (how does this differ? Check the paper). Color of the table: black and white, green to highlight; list only the important rows. Therefore, we are curious how we can improve finding relevant or interesting tweets on Twitter.
  3. RQ: Topic -> determine whether the tweets are relevant based on their characteristics. If we consider semantics in the tweets, will the relevance estimation be better? Numbering. Challenge: amount of data. Make them explicit.
  4. Traditional Twitter search. Highlight what keyword matching means: the keywords in the search query and in the tweets.
  5. Title -> syntactical features. Box in the tweet. Green boxes for the hypotheses. Flow from keyword-based relevance to … Slides 5-8, flow.
  6. Subtitle -> question?
  7. Introduction to the usage of @, including mentions, and reply. Reply tweets frequently occur in private conversations. Therefore particularly, make a hypothesis about reply tweet.
  8. "The 21st International World Wide Web Conference #www2012 will take place in Lyon, France April 16-20 2012 @www2012Lyon www2012.wwwconference.org" Subtitle question. One short, one long: comparison.
  9. Fade in the question later.
  10. Fade in the entities one by one.
  11. Mention the name of the features.
  12. Consider a comparison between this and another tweet with only one type of entity. Title, subtitle.
  13. Sentiment. Raise a discussion -> relevant, interesting. Another example tweet could be…
  14. First summarize. Fade in the question later. How can we extract the semantic features that are topic-sensitive?
  15. Color of the box (query)
  16. Do not highlight www, lyon, france. "How can you infer that WWW in the query refers to the Conference and not to the World Wide Web?"
  17. (18) Can we utilize the contextual features?
  18. Titles,
  19. Timeline
  20. Number of features. ----- Meeting Notes (11/4/12 16:19) ----- The temporal context -> sensitive. WSDM <- Named Entity Recognition.
  21. Numbering research questions. Refers to the paper, section 5. Explain the setup given a topic…
  22. What is the relevance judgment? Number of topics.
  23. Comparison. Fade in the pairs. Highlight. Textbox -> Conclusion (precision).
  24. ----- Meeting Notes (11/4/12 16:19) ----- The Indri score <- relevance, not enough. Hardly visible <- point out. Title <- feature weights (influence)...
  25. Too much text. Number of features. Last -> future work, link to Twinder.
  26. Too much text. Number of features. Last -> future work, link to Twinder.
  27. ----- Meeting Notes (11/4/12 16:13) ----- The example for explaining what I mean by relevant or interesting. Unify the style of animation.