SEMANTIC RECOMMENDATION SYSTEMS FOR RESEARCH 2.0
or
A Conceptual Prototype for a Twitter-based Recommender System for Research 2.0

by Patrick Thonhauser


Thursday, October 11, 12
OUTLINE

    • Motivation

    • Basics (Semantic Web, Recommender Systems, Natural
        Language Processing)

    • Conceptual Prototype

    • Test Results and Discussion

    • Questions


MOTIVATION

    • Is Twitter useful for discovering new connections between
        researchers in similar subject areas (and why Twitter)?

    • How much information can we extract from 140-character
        strings?

    • Is it possible to separate useful information from noise?

    • Are there any appropriate classifiers and metrics to measure
        the significance of Twitter users and Tweets?

SEMANTIC WEB

    • Additional layer of information

    • Linked Data (use URIs as names, use HTTP URIs, use
        standards to provide information, include links to other URIs)

    • RDF (based on triples -> subject, predicate, object) is to the
        Semantic Web what HTML is to the classic web

    • Nearly all Semantic Web standards are based on RDF (like
        FOAF - the Friend of a Friend project)
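
As a minimal sketch of the triple idea, the statements below are modelled as plain Python tuples rather than with an RDF library. The `example.org` URIs and the helper names `objects` and `friends_of_friends` are assumptions made for illustration; `foaf:name` and `foaf:knows` are terms of the FOAF vocabulary.

```python
# RDF-style statements as (subject, predicate, object) tuples.
# The example.org URIs are hypothetical; the FOAF namespace is the real one.
FOAF = "http://xmlns.com/foaf/0.1/"

triples = [
    ("http://example.org/alice", FOAF + "name", "Alice"),
    ("http://example.org/alice", FOAF + "knows", "http://example.org/bob"),
    ("http://example.org/bob", FOAF + "name", "Bob"),
    ("http://example.org/bob", FOAF + "knows", "http://example.org/carol"),
]


def objects(subject, predicate):
    """Return every object of the triples matching (subject, predicate, ?)."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def friends_of_friends(person):
    """Follow foaf:knows two hops out, excluding the person and direct contacts."""
    direct = set(objects(person, FOAF + "knows"))
    second = {fof for friend in direct for fof in objects(friend, FOAF + "knows")}
    return second - direct - {person}


print(friends_of_friends("http://example.org/alice"))
```

Because subjects and objects are URIs, the object of one triple can be the subject of another, which is what makes the two-hop lookup possible.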

RECOMMENDER SYSTEMS


    • Collaborative Filtering (user-based / item-based)

    • Content-Based Recommendation

    • Knowledge-Based Recommendation

    • Hybrid Recommendations
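
To make the first approach concrete, here is a small sketch of user-based collaborative filtering. The ratings, the user names, and the choice of cosine similarity are assumptions for illustration and are not taken from the slides.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "ben": {"a": 4, "b": 3, "c": 5, "d": 5},
    "cat": {"a": 1, "b": 5, "d": 2},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(user, item):
    """Similarity-weighted average of the other users' ratings for the item."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = cosine(ratings[user], theirs)
        num += sim * theirs[item]
        den += sim
    return num / den if den else None


print(predict("ann", "d"))
```

The prediction for "ann" leans toward the rating given by "ben" because their existing ratings are more alike; an item-based variant would compare item columns instead of user rows.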



NATURAL LANGUAGE PROCESSING (NLP)

    • Classification of Microtext Artefacts (This presentation is killer!)

    • Applying NLP Pipelines
        • End-of-Sentence Detection
        • Tokenization
        • POS Tagging
        • Chunking
        • Extraction
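
The pipeline stages listed above can be sketched in plain Python. This is a toy illustration, not the toolkit used in the prototype: the regular expressions, the tiny `LEXICON` tagger, and the noun-phrase rule in `chunk` are all assumptions, and a real system would rely on a trained tagger and chunker.

```python
import re

# Toy lexicon using Brown-style tags; unknown words default to NN.
LEXICON = {
    "the": "AT", "a": "AT", "grand": "JJ", "jury": "NN",
    "commented": "VBD", "on": "IN", "number": "NN",
}


def split_sentences(text):
    """End-of-sentence detection: split after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Tokenization: words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def pos_tag(tokens):
    """POS tagging by lexicon lookup; punctuation is tagged '.'."""
    return [
        (t, LEXICON.get(t.lower(), "NN" if t.isalnum() else "."))
        for t in tokens
    ]


def chunk(tagged):
    """Chunking and extraction: runs of JJ/NN tokens that end in a noun."""
    phrases, run = [], []
    for word, tag in tagged + [("", "END")]:
        if tag in ("JJ", "NN"):
            run.append((word, tag))
            continue
        while run and run[-1][1] != "NN":
            run.pop()  # drop trailing adjectives
        if run:
            phrases.append(" ".join(w for w, _ in run))
        run = []
    return phrases


for sentence in split_sentences("The grand jury commented on a number."):
    print(chunk(pos_tag(tokenize(sentence))))
```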


THE CONCEPT OF THOUGHT BUBBLES

    Let’s imagine every Twitter user belongs to several
    different topic-related bubbles.



LET’S SUMMARIZE

    • A user is part of topic-related bubbles

    • Twitter users within topic-related bubbles don’t necessarily
        know each other

    • Connections of the service user’s existing connections
        lead to new information

    • Non-bidirectional connections are preferred

    So how can we find such potentially interesting users?

PROOF OF CONCEPT SYSTEM

    (1) Pre-select the set of users that will be analyzed in depth

    (2) Apply the NLP pipeline to measure user similarity

    (3) Categorize the top-n best-scoring users according to the idea of
        Thought Bubbles

    (4) Recommend the top-n best-scoring users of a category to the user

    (5) Analyze the acceptance of recommendations

    [Figure: system architecture. Labels: a user's thought bubbles (Sports,
    iOS Dev, Social Media), Service User, Twitter REST API, Thought Bubbles
    API, and a Server containing Pre-Filtering, NLP, Clustering,
    Categorisation, Analyze Recs and DB.]

(1) PRE-SELECTION/FILTERING

    Friends-of-friends Twitter accounts
      -> Filter accounts that are already connected to you
      -> Filter accounts where: follower_count < 300, status_count < 1000
      -> Filter non-English-speaking accounts
      -> Identify people by using a simple NLP pipeline
      -> Set of Twitter accounts for further processing

    • The set of friends-of-friends Twitter accounts changes from
        iteration to iteration

    • Filters are added after analyzing the acceptance of
        recommendations
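
The filter chain could look roughly like the sketch below. The `Account` fields, the `preselect` function, and the `is_person` flag (which stands in for the NLP-based people detection) are hypothetical. The slide does not say whether the two thresholds are combined with "and" or "or"; this sketch assumes that failing either one removes the account.

```python
from dataclasses import dataclass


@dataclass
class Account:
    handle: str
    follower_count: int
    status_count: int
    lang: str
    is_person: bool  # assumption: result of the "identify people" NLP step


def preselect(candidates, already_connected, min_followers=300, min_statuses=1000):
    """Apply the pre-selection filters in the order shown on the slide."""
    kept = []
    for acc in candidates:
        if acc.handle in already_connected:
            continue  # already connected to the service user
        if acc.follower_count < min_followers or acc.status_count < min_statuses:
            continue  # too few followers or tweets (assumed "or")
        if acc.lang != "en":
            continue  # non-English-speaking account
        if not acc.is_person:
            continue  # not identified as a person
        kept.append(acc)
    return kept


friends_of_friends = [
    Account("a", 500, 2000, "en", True),
    Account("b", 100, 2000, "en", True),
    Account("c", 500, 2000, "de", True),
    Account("d", 500, 2000, "en", False),
    Account("e", 500, 2000, "en", True),
]
print([acc.handle for acc in preselect(friends_of_friends, {"e"})])
```

Keeping each filter as a separate check makes it straightforward to append a new predicate after analyzing which recommendations were accepted.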

(2) NLP PIPELINE

    Raw Tweets, e.g. "@testuser The grand jury commented on a number of…"
      -> Tokenization and stripping @mentions and URLs
      -> POS tagging
      -> POS tagged Tweets:
           [('The', 'AT'), ('grand', 'JJ'), ('jury', 'NN'), ('commented', 'VBD'),
            ('on', 'IN'), ('a', 'AT'), ('number', 'NN'), ... ('.', '.')]
      -> Neglect 200 most used English words
      -> Chunking
      -> Mined nouns and phrases:
           [('jury', 'NN'), ('number', 'NN'), ('social dayly', 'NP'), ...]
      -> Frequency Distribution
      -> Filter top n words
      -> Set of frequency-distributed mined nouns and phrases:
           [('jury', 34), ('social', 23), ('test case', 16), ...]
      -> DB

    The 400 most recent Tweets of a potential recommendation are
    used for calculating the similarity measure.
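
A reduced sketch of this pipeline follows. It skips POS tagging and chunking and counts single words only, the `STOP_WORDS` set is a short stand-in for the 200 most used English words, and the input list is assumed to be ordered newest first. The slides do not name the similarity measure, so the cosine similarity between two frequency profiles is purely an assumption.

```python
import re
from collections import Counter
from math import sqrt

# Stand-in for the list of the 200 most used English words.
STOP_WORDS = {"the", "a", "on", "of", "and", "to", "in", "is", "it", "for"}


def clean(tweet):
    """Strip URLs and @mentions from a raw tweet."""
    tweet = re.sub(r"https?://\S+", " ", tweet)
    return re.sub(r"@\w+", " ", tweet)


def terms(tweet):
    """Lower-case word tokens of a cleaned tweet, without stop words."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", clean(tweet).lower())
    return [t for t in tokens if t not in STOP_WORDS]


def profile(tweets, n=50, limit=400):
    """Frequency distribution of the top n terms in the first `limit` tweets."""
    counts = Counter()
    for tweet in tweets[:limit]:
        counts.update(terms(tweet))
    return dict(counts.most_common(n))


def cosine(p, q):
    """Assumed similarity measure: cosine between two term profiles."""
    dot = sum(p[t] * q[t] for t in set(p) & set(q))
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


me = profile(["Reading about the semantic web", "semantic recommender systems"])
other = profile(["@testuser semantic web talk http://example.org/slides"])
print(cosine(me, other))
```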

    • Calculate the top-n users by applying single-linkage
        clustering

    • Categorize whether a user belongs to user-specific bubbles

    • Present recommendation lists to users

    • Analyze the acceptance of recommendations (connect
        user accounts with FOAF) and add a new filter
        predicate if necessary
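
For reference, single-linkage clustering merges the two clusters whose closest members are nearest to each other. The sketch below shows only that technique on one-dimensional numbers; the `threshold` stopping rule and the function signature are assumptions, and the slides do not state how the prototype derives the top-n users from the resulting clusters.

```python
def single_linkage(items, distance, threshold):
    """Agglomerative clustering using the minimum pairwise distance as linkage."""
    clusters = [[item] for item in items]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance of the closest pair across clusters.
                d = min(distance(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # the closest clusters are too far apart, stop merging
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


print(single_linkage([0, 1, 2, 10, 11], lambda a, b: abs(a - b), threshold=1))
```

In the recommender setting the items would be user profiles and the distance would be derived from the similarity measure, for example one minus the similarity.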


SUPERVISED TEST RUN

    [Figure: bar chart with one bar per Twitter account, axis ticks at
    0, 0.075, 0.150, 0.225 and 0.300; recommendations are framed.
    Account labels as extracted, asterisks as in the source:]

    @gargamit100*, @selvers*, @UpsideLearning*, @poposkidimitar*, @jkalten*,
    @cpappas*, @pfidalgo1*, @timbuckteeth*, @starsandrobots*, @TheJ Russ,
    @cliveshepherd*, @Microsoft, @jtcobb*, @MichaelPhelps, @SebastianThrun*,
    @elearning*, @elvaandrade, @BarackObama, @SteveVictor, @AnwarRichardson,
    @pabaker55*, @jamesmclynn, @DrEvanHarris, @mstrohm*, @AmyFrearson,
    @gekitz, @Hhaitch, @sclater*, @TheRock, @MCeraWeakBaby, @fatcharlesh,
    @FrankViola, @timbarker, @AnnaOscarsson, @WithDrake, sabrinaVanessa,
    @charliesheen, @WWEDanielBryan, @cmccosky, @kaitlyntrigger, @judithsei*,
    @atsc*, @melaniedaveid, @Emmadw*, @ladygaga, @marcusfairs, @lucyheartsTW,
    @PeterSmith, @MikeVick, @meadd cameron




UNSUPERVISED TEST RESULTS




              The probability that a recommended item is relevant is
              64.4% (standard deviation: 31.5%).
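
A figure of this kind can be computed as the mean share of relevant recommendations per test user, together with the spread across users. The sketch below uses made-up counts; the `runs` data, the `precision` helper, and the choice of the sample standard deviation are assumptions, not the evaluation procedure behind the reported numbers.

```python
from statistics import mean, stdev


def precision(relevant, recommended):
    """Share of recommended accounts that a test user judged relevant."""
    return relevant / recommended if recommended else 0.0


# Hypothetical (relevant, recommended) counts, one pair per test user.
runs = [(8, 10), (3, 10), (9, 10), (5, 10)]
scores = [precision(rel, rec) for rel, rec in runs]

print(f"mean precision: {mean(scores):.1%}")
print(f"standard deviation: {stdev(scores):.1%}")
```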

DISCUSSION
        Twitter IS useful for discovering new information in the sense of
        Research 2.0, but:

    • Recommendations reflect the Twitter behavior of the user

    • Automated tweets harm recommendation results (one
        sentence gets an enormous weight because it occurs very
        often)

    • Twitter’s request limitation is a show stopper

    • Comparison to similar systems (content-based and collaborative
        filtering)
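
The effect of automated tweets on a frequency-based profile is easy to reproduce. In the sketch below, the sample tweets and the `term_counts` helper are hypothetical, and counting each distinct tweet text only once is one possible mitigation that the slides do not propose.

```python
from collections import Counter


def term_counts(tweets, dedupe=False):
    """Count words across tweets; optionally count each distinct text once."""
    if dedupe:
        tweets = set(tweets)
    counts = Counter()
    for tweet in tweets:
        counts.update(tweet.lower().split())
    return counts


tweets = ["I posted a new photo"] * 50 + [
    "reading about recommender systems",
    "semantic web and recommender systems",
]

raw = term_counts(tweets)
deduped = term_counts(tweets, dedupe=True)
print(raw["photo"], raw["recommender"])
print(deduped["photo"], deduped["recommender"])
```

Without deduplication the words of the repeated sentence outweigh everything the user actually wrote about.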

THANK YOU!
ANY QUESTIONS?



