Leveraging the Semantics of Tweets for
  Adaptive Faceted Search on Twitter

  ISWC, Bonn, Germany, Oct 27th 2011



Fabian Abel (1), Ilknur Celik (1), Geert-Jan Houben, Patrick Siehndel (2)

(1) Web Information Systems, TU Delft, the Netherlands
(2) L3S Research Center, Hannover, Germany

Delft University of Technology
What we do: Science and Engineering
      for the Personal Web
domains: news, social media, cultural heritage, public data, e-learning

[Diagram, layers from top to bottom:]
   Personalized Recommendations | Personalized Search | Adaptive Systems
   Analysis and User Modeling
   Semantic Enrichment, Linkage and Alignment
   user/usage data
   Social Web
                       Adaptive Faceted Search on Twitter                      2
200,000,000
 number of tweets published per day



1
number of tweets that are interesting for me now



Searching on Twitter




Issues with Multiple Keywords Search




Let’s try to search with One Keyword




Page 1


Page 2


Page 3


Page 60!!

The tweet I was looking for:
    Next Saturday @thatsimpsonguy aka Guilty Simpson will be performing at
    Area51 in my hometwon Eindhoven. #realliveshit #iwillspinrecords
    about 9 hours ago via Blackberry

Highlighted in the tweet: Music Artist, Locations

Is there an easier way?
   Faceted Search can help (hypothesis)

   Current Query: Eindhoven, Music

   Expand Query:
   Locations more...
   Events more...
   Music Artists:
   + Guilty Simpson
   + Bryan Adams
   + Elton John
   + Golden Earring
   + Rihanna
   + The eagles
   + 3 Doors Down
   more...

   Results:
   1. Yskiddd: Next saturday @thatsimpsonguy aka Guilty Simpson will be
      performing at Area51 in my homeytown Eindhoven.
      #realliveshit #iwillspinrecords2
   2. Usee123: Cool #EV3door7980 !!! http://bit.ly/igyyRhL
   3. sanmiquelmusic: This Saturday I'm joining @KrusadersMusic to Intents

Challenges


Facets of a Tweet

      @bob: Julian Assange got arrested

       Facet type    Facet Value
          Creator    @bob
         Location    Delft, the Netherlands
    Creation time    Nov 29th 2011

Challenge 1: How to infer facets that describe the content of a tweet?

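A facet-value pair can be pictured as a small immutable record, and a faceted query as a set of such pairs that a tweet must carry. The sketch below is illustrative only: the names `FacetValuePair` and `matches` and the lower-case facet type strings are assumptions made for this example and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FacetValuePair:
    """One (facet type, facet value) pair describing a tweet (hypothetical name)."""
    facet_type: str
    value: str


def matches(query, tweet_facets):
    """A tweet matches a faceted query if it carries every selected pair."""
    return set(query) <= set(tweet_facets)


# Metadata facets from the example above. Content facets, such as the person
# mentioned in the tweet, are exactly what Challenge 1 asks how to infer.
tweet_facets = {
    FacetValuePair("creator", "@bob"),
    FacetValuePair("location", "Delft, the Netherlands"),
    FacetValuePair("creation time", "Nov 29th 2011"),
}

query = {FacetValuePair("creator", "@bob")}
print(matches(query, tweet_facets))
```

With only metadata facets available, a query for the person mentioned in the tweet would not match, which is the gap that facet extraction is meant to close.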
Faceted Search:
   selecting facet-value pairs

   Current Query: Music

   Expand Query:
   Locations
   + Aachen
   + Aalborg
   + Aalesund
   + Aarhus
   + Aasiaat
   + Abaiang
   + Abakan
   more...
   Events more...
   Music Artists more…

   Number of selectable facet values may be very high!

   Results:
   1. Yskiddd: Next saturday @thatsimpsonguy aka Guilty Simpson will be
      performing at Area51 in my homeytown Eindhoven.
      #realliveshit #iwillspinrecords2
   2. Usee123: Cool #EV3door7980 !!! http://bit.ly/igyyRhL
   3. sanmiquelmusic: This Saturday I'm joining @KrusadersMusic to Intents

Challenge 2: How to adapt the faceted search interface to the current
demands of a user?

Adaptive Faceted Search Framework


Adaptive Faceted Search Framework

   [Diagram, layers from top to bottom:]
      user
      Adaptive Faceted Search
      User and Context Modeling
      Semantic Enrichment
      Twitter posts

   Facet extraction: How to represent the content of a tweet?
   How to adapt the facet-value pair ranking to the current demands
   of the user?

Facet Extraction and Semantic Enrichment
   powered by [logo not preserved in this transcript]

   Tweet-based enrichment:
      tweet: @bob: Julian Assange got arrested
      extracted entity: Julian Assange

   Link-based enrichment:
      linked article: "Julian Assange arrested. Julian Assange, the founder of
      WikiLeaks, is under arrest in London…"
      extracted entities: Julian Assange, WikiLeaks, London

Impact of Link-based enrichment

   Representation of tweets: significantly more facets per tweet with
   link-based enrichment

Faceted Search Strategies

• Challenge: most-relevant facet-value pair should appear at the
  top of the ranking
• Baseline: hashtag-based keyword search

   Locations            Locations
   1. Aachen            1. Eindhoven
   2. Aalborg           2. Delft
   3. Aalesund          3. Amsterdam
   4. Aarhus            4. Rotterdam
   …                    5. London
   2145. Eindhoven      …

Faceted Search Strategies

   Occurrence frequency of a facet-value pair (FVP) = number of tweets in the
   current hit list of matching tweets that contain the FVP

• Faceted Search Strategies:
   1. Occurrence frequency: count occurrence frequencies of FVP (baseline)

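The occurrence-frequency strategy can be sketched in a few lines. This is a minimal illustration, assuming each tweet in the current hit list is already represented as a set of (facet type, value) tuples; the function name `rank_by_frequency`, the sample hits, and the alphabetical tie-breaking are assumptions of this example, not details taken from the slides.

```python
from collections import Counter


def rank_by_frequency(hit_list):
    """Rank FVPs by the number of tweets in the hit list that contain them."""
    counts = Counter()
    for tweet_fvps in hit_list:
        counts.update(set(tweet_fvps))  # count each FVP at most once per tweet
    # Highest count first; ties are broken alphabetically (an assumption).
    return sorted(counts, key=lambda fvp: (-counts[fvp], fvp))


# Hypothetical hit list: one set of FVPs per matching tweet.
hits = [
    {("location", "Eindhoven"), ("person", "Guilty Simpson")},
    {("location", "Eindhoven")},
    {("location", "Delft")},
]

ranking = rank_by_frequency(hits)
print(ranking[0])  # ('location', 'Eindhoven')
```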
Faceted Search Strategies

   Personalized FVP ranking: rank of the FVP = weight in user profile

   User Profile
   FVP                     weight
   (location, Delft)       6
   (event, JazzBaltica)    4
   (person, Chet Baker)    3

   [Figure: timeline of the user, marked June 27 and July 4]

• Faceted Search Strategies:
   1. Occurrence frequency: count occurrence frequencies of FVP (baseline)
   2. Personalization: adapt ranking to user profile (different user
      modeling strategies possible; here: entire tweeting history of the user)

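The personalized strategy orders candidate FVPs by their weight in the user profile. The following sketch assumes the profile is a plain dictionary from FVP to weight, using the weights shown on the slide; the function name `rank_by_profile`, the candidate list, and the treatment of FVPs missing from the profile (weight 0, then alphabetical order) are assumptions made for illustration.

```python
def rank_by_profile(candidate_fvps, user_profile):
    """Order FVPs by descending profile weight; unknown FVPs get weight 0."""
    return sorted(candidate_fvps, key=lambda fvp: (-user_profile.get(fvp, 0), fvp))


# Profile weights as shown in the user profile example.
profile = {
    ("location", "Delft"): 6,
    ("event", "JazzBaltica"): 4,
    ("person", "Chet Baker"): 3,
}

# Hypothetical FVPs offered for the current hit list.
candidates = [
    ("location", "Aachen"),
    ("event", "JazzBaltica"),
    ("location", "Delft"),
]

personalized = rank_by_profile(candidates, profile)
print(personalized)
```

Here (location, Delft) moves to the top because it has the highest weight in the profile, while (location, Aachen), which the user never mentioned, drops to the end.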
Faceted Search Strategies

   Diversification: minimize overlaps

   Genre            Genre
   + Blues          + Blues
   + Jazz           + Jazz
   + JazzMusic      + Rock
   + Rock           + Classic
   more...          more...

• Faceted Search Strategies:
   1. Occurrence frequency: count occurrence frequencies of FVP (baseline)
   2. Personalization: adapt ranking to user profile (different user
      modeling strategies possible; here: entire tweeting history of the user)
   3. Diversification: increase variety among the top-ranked FVPs

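The slides do not say how overlaps are minimized. One possible reading, assumed here purely for illustration, is a greedy selection that repeatedly picks the FVP whose matching tweets add the most tweets not yet covered by the FVPs already chosen. The function name `diversify`, the tweet ids, and the tie-breaking rule are hypothetical.

```python
def diversify(fvp_to_tweets, k):
    """Greedily pick up to k FVPs, each adding the most not-yet-covered tweets."""
    remaining = dict(fvp_to_tweets)
    covered = set()
    selected = []
    while remaining and len(selected) < k:
        # Iterating over sorted keys makes ties resolve alphabetically.
        best = max(sorted(remaining), key=lambda fvp: len(remaining[fvp] - covered))
        selected.append(best)
        covered |= remaining.pop(best)
    return selected


# Hypothetical mapping from FVP to the ids of the tweets that contain it.
matching_tweets = {
    ("genre", "Jazz"): {1, 2, 3, 4},
    ("genre", "JazzMusic"): {1, 2, 3},
    ("genre", "Rock"): {5, 6},
    ("genre", "Blues"): {7},
}

top = diversify(matching_tweets, 3)
print(top)
```

In this toy data (genre, JazzMusic) is frequent but covers only tweets already reachable through (genre, Jazz), so it is left out of the top three in favour of Rock and Blues.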
Faceted Search Strategies

   Time-sensitivity: occurrence frequency of FVP over time

   [Figure: timeline marked June 20, June 27 and July 4, showing the time of
   the search and the FVPs (event, JazzBaltica) and (event, FrenchOpen)]

   Event
   + JazzBaltica
   + FrenchOpen
   more...

• Faceted Search Strategies:
    1. Occurrence frequency: count occurrence frequencies of FVP (baseline)
    2. Personalization: adapt ranking to user profile (different user
       modeling strategies possible; here: entire tweeting history of the user)
    3. Diversification: increase variety among the top-ranked FVPs
    4. Time-sensitivity: adapt FVP ranking to temporal context
• Semantic enrichment: (i) tweet-based and (ii) link-based enrichment

Research Questions

1. How well does faceted search that is supported by the
   semantic enrichment perform in comparison to
   keyword search?

2. What strategy performs best in ranking facet-value
   pairs that allow users to find relevant tweets on Twitter?

3. How do the different building blocks of the faceted
   search framework influence the performance?




Dataset
      more than:

    20,000         Twitter users
         4         months
30,000,000         tweets

   Timeline: Nov 15, Dec 15, Jan 15, Feb 15 (Jan 25: Egyptian revolution)

Evaluation Framework
• User Simulation Model [cf. Koren et al., WWW’08]:
  • Input: search settings = { (user who searches, relevant target tweet) }
  • Drill down the search result list until no more FVPs can be applied or
    fewer than 10 tweets match the query
  • Simulating click behavior: the first matching FVP is selected (the user
    knows the target resource)
• Ground truth: relevant target tweet = tweet that has been
  re-tweeted by the user
• Metrics:
  • Success@k: probability that a relevant FVP appears in the top k (the higher
    the Success@k, the faster the search and the lower the user effort)
  • MRR: mean reciprocal rank of the target tweet when the user selected it

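Both metrics can be computed from the rank at which the relevant item appeared in each simulated search. The sketch below assumes one 1-based rank per search, with `None` when the relevant item never appeared; the function names and the sample ranks are hypothetical and only illustrate the arithmetic.

```python
def success_at_k(ranks, k):
    """Fraction of searches where the relevant item was ranked within the top k."""
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all searches; a missing item contributes 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Hypothetical outcome of four simulated searches.
ranks = [1, 2, None, 4]

s_at_2 = success_at_k(ranks, 2)      # 2 of 4 searches succeed within the top 2
mrr = mean_reciprocal_rank(ranks)    # (1 + 1/2 + 0 + 1/4) / 4
print(s_at_2, mrr)
```

For these sample ranks, Success@2 is 0.5 and MRR is 0.4375.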
Faceted search vs. hashtag-based (keyword) search

   Faceted search based on semantic enrichment of tweets outperforms
   hashtag-based search significantly.

Results: Overview

   Personalized strategy achieves ~12% better performance than other
   semantic strategies (and 2x better than hashtag-based)

Impact of link-based enrichment

   Personalized strategy outperforms baseline significantly
   Link-based enrichment improves quality for both strategies

Impact of time-sensitivity

   Time-sensitivity based ranking improves quality for both frequency and
   diversification strategies

Application of the Faceted Search
  Framework

Twitcident.com
   Twitter-based crisis management system

   Semantic enrichment allows for:
   1. Grouping tweets into incidents
   2. Faceted search
   3. Thematic Views
   4. Analysis

Conclusions
What we did:
• Adaptive Faceted Search on Twitter + Evaluation Framework
• Analysis and Evaluation (+ Application in Twitcident)
Findings:
1. Semantic Enrichment allows for structured representation of the
   content of tweets → basis for faceted search
2. Faceted search performs significantly better than hashtag-based
   keyword search
3. Different building blocks for making faceted search on Twitter
   adaptive improve the search quality:
  a) Link-based enrichment: more discoverable tweets, better search performance
  b) Personalization leads to significant improvements
  c) Time-sensitivity improves performance as well

Thank you!


Twitter: @fabianabel
http://wis.ewi.tudelft.nl/iswc2011/


Mais conteúdo relacionado

Mais de Web Information Systems, TU Delft

GeniUS: Generic User Modeling Library for the Social Semantic Web
GeniUS: Generic User Modeling Library for the Social Semantic WebGeniUS: Generic User Modeling Library for the Social Semantic Web
GeniUS: Generic User Modeling Library for the Social Semantic WebWeb Information Systems, TU Delft
 
Generating Resource Profiles by Exploiting the Context of Social Annotations
Generating Resource Profiles by Exploiting the Context of Social AnnotationsGenerating Resource Profiles by Exploiting the Context of Social Annotations
Generating Resource Profiles by Exploiting the Context of Social AnnotationsWeb Information Systems, TU Delft
 
#SDoW2011 Keynote: User Modeling and Personalization on Twitter
#SDoW2011 Keynote: User Modeling and Personalization on Twitter#SDoW2011 Keynote: User Modeling and Personalization on Twitter
#SDoW2011 Keynote: User Modeling and Personalization on TwitterWeb Information Systems, TU Delft
 
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...Web Information Systems, TU Delft
 
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...Web Information Systems, TU Delft
 

Mais de Web Information Systems, TU Delft (8)

GeniUS: Generic User Modeling Library for the Social Semantic Web
GeniUS: Generic User Modeling Library for the Social Semantic WebGeniUS: Generic User Modeling Library for the Social Semantic Web
GeniUS: Generic User Modeling Library for the Social Semantic Web
 
Generating Resource Profiles by Exploiting the Context of Social Annotations
Generating Resource Profiles by Exploiting the Context of Social AnnotationsGenerating Resource Profiles by Exploiting the Context of Social Annotations
Generating Resource Profiles by Exploiting the Context of Social Annotations
 
Payday on the Social Semantic Web
Payday on the Social Semantic WebPayday on the Social Semantic Web
Payday on the Social Semantic Web
 
#SDoW2011 Keynote: User Modeling and Personalization on Twitter
#SDoW2011 Keynote: User Modeling and Personalization on Twitter#SDoW2011 Keynote: User Modeling and Personalization on Twitter
#SDoW2011 Keynote: User Modeling and Personalization on Twitter
 
About the Social Semantic Web
About the Social Semantic WebAbout the Social Semantic Web
About the Social Semantic Web
 
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...
 
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...
UMAP 2011: Analyzing User Modeling on Twitter for Personalized News Recommend...
 
Analyzing Cross-System User Modeling on the Social Web
Analyzing Cross-System User Modeling on the Social WebAnalyzing Cross-System User Modeling on the Social Web
Analyzing Cross-System User Modeling on the Social Web
 

Último

Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 

Último (20)

Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

Leveraging Semantics of Tweets for Adaptive Faceted Search

  • 1. Leveraging the Semantics of Tweets for Adaptive Faceted Search on Twitter. ISWC, Bonn, Germany, Oct 27th 2011. Fabian Abel¹, Ilknur Celik¹, Geert-Jan Houben, Patrick Siehndel². ¹Web Information Systems, TU Delft, the Netherlands; ²L3S Research Center, Hannover, Germany. Delft University of Technology
  • 2. What we do: Science and Engineering for the Personal Web. Domains: news, social media, cultural heritage, public data, e-learning. Layers (top to bottom): Personalized Recommendations, Personalized Search, Adaptive Systems; Analysis and User Modeling; Semantic Enrichment, Linkage and Alignment; user/usage data; Social Web
  • 3. 200,000,000: number of tweets published per day
  • 4. 1: number of tweets that are interesting for me now
  • 5. Searching on Twitter
  • 6. Issues with Multiple Keywords Search
  • 7. Let’s try to search with One Keyword
  • 8. Page 1
  • 9. Page 2
  • 10. Page 3
  • 11. Page 60!! The tweet I was looking for: "Next Saturday @thatsimpsonguy aka Guilty Simpson will be performing at Area51 in my hometwon Eindhoven. #realliveshit #iwillspinrecords" (about 9 hours ago via Blackberry). Highlighted annotations: Music Artist, Locations
  • 12. Is there an easier way? Faceted Search can help (hypothesis). Current Query: Eindhoven, Music. Expand Query: Locations: more...; Events: more...; Music Artists: + Guilty Simpson, + Bryan Adams, + Elton John, + Golden Earring, + Rihanna, + The eagles, + 3 Doors Down; Intents: more... Results: 1. Yskiddd: Next saturday @thatsimpsonguy aka Guilty Simpson will be performing at Area51 in my homeytown Eindhoven. #realliveshit #iwillspinrecords2 2. Usee123: Cool #EV3door7980 !!! http://bit.ly/igyyRhL 3. sanmiquelmusic: This Saturday I'm joining @KrusadersMusic to
  • 13. Challenges
  • 14. Facets of a Tweet. Example tweet: "@bob: Julian Assange got arrested". Facet type: facet value. Creator: @bob; Location: Delft, the Netherlands; Creation time: Nov 29th 2011. Challenge 1: How to infer facets that describe the content of a tweet?
  • 15. Faceted Search: selecting facet-value pairs. Current Query: Music. Expand Query: Locations: + Aachen, + Aalborg, + Aalesund, + Aarhus, + Aasiaat, + Abaiang, + Abakan, more...; Events: more...; Intents: more...; Music Artists: more… Number of selectable facet values may be very high! Results: 1. Yskiddd: Next saturday @thatsimpsonguy aka Guilty Simpson will be performing at Area51 in my homeytown Eindhoven. #realliveshit #iwillspinrecords2 2. Usee123: Cool #EV3door7980 !!! http://bit.ly/igyyRhL 3. sanmiquelmusic: This Saturday I'm joining @KrusadersMusic to Challenge 2: How to adapt the faceted search interface to the current demands of a user?
  • 16. Adaptive Faceted Search Framework
  • 17. Adaptive Faceted Search Framework. Components: user; Adaptive Faceted Search; User and Context Modeling; Semantic Enrichment; Twitter posts. Questions: How to represent the content of a tweet? → facet extraction. How to adapt the facet-value pair ranking to the current demands of the user?
  • 18. Facet Extraction and Semantic Enrichment. Tweet-based enrichment: "@bob: Julian Assange got arrested" → Julian Assange. Link-based enrichment: linked article "Julian Assange, the founder of WikiLeaks, is under arrest in London…" → Julian Assange, WikiLeaks, London
  • 19. Impact of Link-based enrichment. Representation of tweets: significantly more facets per tweet with link-based enrichment
  • 20. Faceted Search Strategies. Challenge: the most relevant facet-value pair should appear at the top of the ranking. Example (Locations): 1. Aachen, 2. Aalborg, 3. Aalesund, 4. Aarhus, …, 2145. Eindhoven versus 1. Eindhoven, 2. Delft, 3. Amsterdam, 4. Rotterdam, 5. London, … Baseline: hashtag-based keyword search
  • 21. Faceted Search Strategies. Challenge and baseline as on the previous slide. Strategies: 1. Occurrence frequency: count occurrence frequencies of FVPs (baseline). Annotations: FVP = facet-value pair; number of tweets in the current hit list of matching tweets that contain the FVP
  • 22. Faceted Search Strategies. Strategies: 1. Occurrence frequency: count occurrence frequencies of FVPs (baseline). 2. Personalization: adapt ranking to user profile (different user modeling strategies possible; here: entire tweeting history of the user). Annotations: personalized FVP ranking by weight in user profile. Example user profile (FVP: weight): (location, Delft): 6; (event, JazzBaltica): 4; (person, ChetBaker): 3. Timeline labels: June 27, July 4
  • 23. Faceted Search Strategies. Strategies: 1. Occurrence frequency: count occurrence frequencies of FVPs (baseline). 2. Personalization: adapt ranking to user profile (different user modeling strategies possible; here: entire tweeting history of the user). 3. Diversification: increase variety among the top-ranked FVPs. Example (Genre): + Blues, + Jazz, + JazzMusic, + Rock, more... versus + Blues, + Jazz, + Rock, + Classic, more... (minimize overlaps)
  • 24. Faceted Search Strategies. Strategies: 1. Occurrence frequency: count occurrence frequencies of FVPs (baseline). 2. Personalization: adapt ranking to user profile (different user modeling strategies possible; here: entire tweeting history of the user). 3. Diversification: increase variety among the top-ranked FVPs. 4. Time-sensitivity: adapt FVP ranking to temporal context. Example (Event): + JazzBaltica, + FrenchOpen, more...; occurrence frequency of the FVPs (event, FrenchOpen) and (event, JazzBaltica) over time (June 20, June 27, July 4), with the time of the user's search marked. Semantic enrichment: (i) tweet-based and (ii) link-based enrichment
  • 25. Research Questions. 1. How well does faceted search that is supported by the semantic enrichment perform in comparison to keyword search? 2. What strategy performs best in ranking facet-value pairs that allow users to find relevant tweets on Twitter? 3. How do the different building blocks of the faceted search framework influence the performance?
  • 26. Dataset. More than 20,000 Twitter users, 4 months, 30,000,000 tweets. Timeline: Nov 15, Dec 15, Jan 15, Feb 15; Egyptian revolution marked at Jan 25
  • 27. Evaluation Framework. User Simulation Model [cf. Koren et al., WWW’08]: Input: search settings = { (user who searches, relevant target tweet) }. Drill down the search result list until no more FVPs can be applied or fewer than 10 tweets match the query. Simulating click behavior: the first matching FVP is selected (→ user knows target resource). Ground truth: relevant target tweet = tweet that has been re-tweeted by the user. Metrics: Success@k: probability that a relevant FVP appears in the top k (the higher the Success@k, the faster the search and the lower the user effort). MRR: mean reciprocal rank of the target tweet when the user selected it
  • 28. Faceted search vs. hashtag-based (keyword) search. Faceted search based on semantic enrichment of tweets outperforms hashtag-based search significantly.
  • 29. Results: Overview. Personalized strategy achieves ~12% better performance than other semantic strategies (and 2x better than hashtag-based)
  • 30. Impact of link-based enrichment. Personalized strategy outperforms baseline significantly. Link-based enrichment improves quality for both strategies
  • 31. Impact of time-sensitivity. Time-sensitivity based ranking improves quality for both frequency and diversification strategies
  • 32. Application of the Faceted Search Framework
  • 33. Twitcident.com: Twitter-based crisis management system. Semantic enrichment allows for: 1. Grouping tweets into incidents 2. Faceted search 3. Thematic Views 4. Analysis
  • 34. Conclusions. What we did: Adaptive Faceted Search on Twitter + Evaluation Framework; Analysis and Evaluation (+ Application in Twitcident). Findings: 1. Semantic Enrichment allows for structured representation of the content of tweets → basis for faceted search. 2. Faceted search performs significantly better than hashtag-based keyword search. 3. Different building blocks for making faceted search on Twitter adaptive improve the search quality: a) Link-based enrichment: more discoverable tweets, better search performance; b) Personalization leads to significant improvements; c) Time-sensitivity improves performance as well
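
The occurrence-frequency and personalization strategies from slides 21 and 22 can be illustrated with a minimal Python sketch. It is not the authors' implementation: modelling a tweet as a set of (facet type, value) tuples, the function names, the alphabetical tie-break, and the choice to sort by profile weight first and frequency second are all assumptions made for illustration. The example profile reuses the weights shown on slide 22.

```python
from collections import Counter

# Assumption: a facet-value pair (FVP) is a (facet_type, value) tuple and a
# tweet is the set of FVPs extracted from it, so each FVP counts once per tweet.

def rank_by_frequency(hit_list):
    """Rank FVPs by the number of tweets in the hit list that contain them."""
    counts = Counter(fvp for tweet in hit_list for fvp in tweet)
    # Descending count; alphabetical order is a hypothetical tie-break.
    return sorted(counts, key=lambda fvp: (-counts[fvp], fvp))

def rank_personalized(hit_list, profile):
    """Rank FVPs by their weight in the user profile, then by frequency."""
    counts = Counter(fvp for tweet in hit_list for fvp in tweet)
    # FVPs missing from the profile get weight 0 (an assumption).
    return sorted(
        counts,
        key=lambda fvp: (-profile.get(fvp, 0), -counts[fvp], fvp),
    )

# Hypothetical hit list of three tweets matching the current query.
hits = [
    {("location", "Eindhoven"), ("person", "Guilty Simpson")},
    {("location", "Eindhoven"), ("location", "Delft")},
    {("location", "Eindhoven"), ("event", "JazzBaltica")},
]
# User profile weights as shown on slide 22.
profile = {
    ("location", "Delft"): 6,
    ("event", "JazzBaltica"): 4,
    ("person", "ChetBaker"): 3,
}

frequency_ranking = rank_by_frequency(hits)
personalized_ranking = rank_personalized(hits, profile)
print(frequency_ranking)
print(personalized_ranking)
```

In this toy hit list the frequency baseline puts (location, Eindhoven) first because it occurs in every tweet, while the personalized ranking promotes (location, Delft) and (event, JazzBaltica) because they carry weight in the profile.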

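The two metrics named on slide 27 can be computed as in the short Python sketch below. It assumes each simulated search is summarised by a single 1-based rank, with None standing for a search in which the relevant item was never reached; that encoding, the function names, and the sample ranks are hypothetical and not taken from the slides.

```python
def success_at_k(ranks, k):
    """Fraction of searches whose relevant item was ranked within the top k."""
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    return hits / len(ranks)

def mean_reciprocal_rank(ranks):
    """Mean of 1/rank; searches that never found the target contribute 0."""
    total = sum(1.0 / rank for rank in ranks if rank is not None)
    return total / len(ranks)

# Hypothetical ranks from four simulated searches (None = target not found).
ranks = [1, 2, None, 4]

s_at_1 = success_at_k(ranks, 1)
s_at_2 = success_at_k(ranks, 2)
mrr = mean_reciprocal_rank(ranks)
print(s_at_1, s_at_2, mrr)
```

Both values lie between 0 and 1, and higher is better: a higher Success@k means the relevant facet-value pair is more often visible without scrolling, and a higher MRR means the target tweet sits closer to the top of the result list.
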
Editor's Notes

  1. Motivation: information overload; personalised “better” search
  2. Why do people search on Twitter rather than Google? Real-time info & opinion about almost anything
  3. Example: HT’11 @Eindhoven, looking for some entertainment events... http://search.twitter.com/ and http://search.twitter.com/advanced
  4. Space limitation + selecting keywords (abbreviations, shorthand notations + colloquial expressions)
  5. Highlight 60
  6. Very time consuming and overwhelming indeed!
  7. Very time consuming and overwhelming indeed!
  8. entity extraction and semantic enrichment and relation discovery.
  9. Might be better to remove the Costs column...?
  10. Our framework extracts typed entities from enriched tweets/news and provides strategies for detecting semantic (trending) relationships between entities. We: investigated the precision and recall of the relation detection strategies; analyzed how the strategies perform for each type of relationship; and evaluated the quality and speed for discovering trending relationships that possibly have a limited temporal validity. Questions: Which strategy performs best in detecting relationships between entities? Does the accuracy depend on the type of entities which are involved in a relation? How do the strategies perform for discovering relationships which have temporal constraints, and how fast can the strategies detect (trending) relationships?