An Ensemble Model for Cross-Domain Polarity Classification on Twitter

  1. 1. An Ensemble Model for Cross-Domain Polarity Classification on Twitter Adam Tsakalidis, Symeon Papadopoulos, Yiannis Kompatsiaris 15th International Conference on Web Information Systems Engineering (WISE 2014) Thessaloniki, 14 October 2014 Information Technologies Institute (ITI) Center for Research & Technology Hellas (CERTH) #wiseconf2014
  2. 2. Overview • The Problem • Existing Approaches • Proposed Ensemble Model • Experimental Study • Summary – Future Work #wiseconf2014 #2
  3. 3. Polarity Classification motivation & definition #wiseconf2014 #3
  4. 4. Polarity Classification - Problem Determining whether a piece of text (e.g. a tweet) conveys a positive or negative sentiment about a topic or entity of interest (person, party, brand, etc.) #wiseconf2014 #4
  5. 5. Polarity Classification - Applications • Monitoring sentiment about a brand could help take appropriate measures when negative sentiment increases (e.g. apology, assistance, compensation). • Aggregating over many tweets could help form an overall impression of public opinion about the topic/entity of interest. #wiseconf2014 #5
  6. 6. The Cross-Domain Challenge • Depending on the domain (and sometimes even the topic) of interest, the vocabulary and phrasing of sentiment may vary wildly. • Most sentiment detection approaches require manually created ground truth from the same domain to achieve high performance → expensive! • We propose a framework that leverages an ensemble of sentiment detectors to tackle the cross-domain challenge. #wiseconf2014 #6
  7. 7. Related Work #wiseconf2014 #7
  8. 8. Background: Sentiment Analysis Two main problems: subjectivity & polarity [Diagram: text representation (unigrams, bigrams, POS, other) feeding sentiment learning, which can be supervised (NB, SVM), semi-supervised (SSL), unsupervised, or hybrid/ensemble] #wiseconf2014 #8
  9. 9. Hybrid Models Barbosa and Feng, COLING 2010 • Combines sentiment labels from three independent sources • Considers the quality of individual sources • Considers the different bias of individual sources Gonçalves et al., COSN 2013 • First evaluates 8 different methods with respect to coverage and agreement • Proposes a combined method that merges individual method outputs based on their performance (first in terms of coverage and then in terms of accuracy) on an independent dataset #wiseconf2014 #9
  10. 10. Ensemble Model for Cross-Domain Polarity Classification approach description #wiseconf2014 #10
  11. 11. Method Overview [Pipeline diagram: Tweet → representations (TBR, FBR, CR, LBR) → classifiers (CL_TBR, CL_FBR, CL_CR, CL_LBR) → ensembling (HC, LC → COMB); stages labelled representation, learning, ensembling] #wiseconf2014 #11
  12. 12. Representation • TBR – binary n-grams – n-grams with tf – n-grams with tf-idf – n = 1, 2, 3 → 9 representations • FBR • CR – TBR + POS tagging • LBR – SentiWordNet (5D): sum of scores for Nouns, Verbs, Adjectives, Adverbs, sum of keywords (#pos - #neg) – Opinion Lexicon: sum of keywords (#pos - #neg) • TBR, FBR and CR features with tf < 2 were removed! #wiseconf2014 #12
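A minimal sketch, in Python with scikit-learn, of how the TBR and lexicon features listed above could be built. The toy corpus, the min_df=2 setting (our reading of "tf < 2 were removed"), and the keyword sets standing in for the Opinion Lexicon are illustrative assumptions, not the authors' code.

    # TBR sketch: {binary, tf, tf-idf} weighting x {1,2,3}-grams gives the 9 variants.
    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

    tweets = [
        "love this phone",
        "love the new update",
        "worst update ever",
        "worst phone ever",
    ]  # toy corpus for illustration

    binary_vec = CountVectorizer(ngram_range=(1, 3), binary=True, min_df=2)
    tf_vec = CountVectorizer(ngram_range=(1, 3), min_df=2)
    tfidf_vec = TfidfVectorizer(ngram_range=(1, 3), min_df=2)
    X_binary = binary_vec.fit_transform(tweets)
    X_tf = tf_vec.fit_transform(tweets)
    X_tfidf = tfidf_vec.fit_transform(tweets)

    # Lexicon feature sketch: sum of keyword hits (#pos - #neg). The keyword sets
    # below are placeholders for the Opinion Lexicon entries.
    POS_WORDS = {"love", "great", "happy"}
    NEG_WORDS = {"worst", "bad", "sad"}

    def keyword_score(tweet):
        tokens = tweet.lower().split()
        return sum(t in POS_WORDS for t in tokens) - sum(t in NEG_WORDS for t in tokens)

    lexicon_scores = [keyword_score(t) for t in tweets]  # -> [1, 1, -1, -1]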
  13. 13. Ensemble Learning • Combine two types of classifiers: HC and LC • Hybrid classifier (HC) [domain-dependent] i: tweet, c: class (pos/neg), r ∈ {TBR, FBR, LR}, w_r: weights defined based on accuracy on the ED dataset, p_r: individual classifier output • Lexicon-based classifier (LC) [domain-independent] • If the outputs of the two classifiers agree, then assign the agreed class to the tweet. • If the outputs disagree, use a domain-adapted classifier (TBR) trained on the agreed tweets. #wiseconf2014 #13
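A minimal sketch of the ensemble step above, under stated assumptions: the hybrid classifier HC is taken to score each class as an accuracy-weighted sum of the individual classifier outputs, HC(i, c) = Σ_r w_r · p_r(i, c) (the slide's formula image is not reproduced here, so this form is an assumption); the lexicon-based classifier LC predicts by the sign of the keyword score; and logistic regression stands in for the domain-adapted TBR classifier. Function names are illustrative.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def hybrid_predict(prob_outputs, weights):
        # prob_outputs: one (n_tweets, 2) class-probability array per representation;
        # weights: accuracy-based weight per individual classifier (set on ED).
        combined = sum(w * p for w, p in zip(weights, prob_outputs))
        return combined.argmax(axis=1)  # 0 = negative, 1 = positive

    def lexicon_predict(keyword_scores):
        # Positive keyword balance -> positive class (domain-independent LC).
        return (np.asarray(keyword_scores) > 0).astype(int)

    def ensemble_predict(hc_pred, lc_pred, X_tbr):
        # Agreed tweets keep the agreed label; disagreed tweets are re-labelled by a
        # TBR classifier trained on the agreed (pseudo-labelled) tweets.
        agree = hc_pred == lc_pred
        final = hc_pred.copy()
        if agree.any() and (~agree).any() and len(np.unique(hc_pred[agree])) > 1:
            adapted = LogisticRegression(max_iter=1000)
            adapted.fit(X_tbr[agree], hc_pred[agree])
            final[~agree] = adapted.predict(X_tbr[~agree])
        return final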
  14. 14. Practical aspects • For the ensemble model to work well, the two types of classifiers need to be sufficiently accurate on their own, since their outputs are trusted as a new training set! • In a real-time scenario, we need to allocate some time for the domain adaptation process to take place; this should be defined on a per-case basis. [Timeline: adaptation phase, then ensemble model operation] #wiseconf2014 #14
  15. 15. Experimental Study #wiseconf2014 #15
  16. 16. Datasets • Emoticons Dataset (ED) – 250K tweets collected over 2 days (mid-March 2014) containing happy/sad emoticons (retweets removed) [English only!] – used for feature extraction, preliminary development • Development set (DEV) – 10K tweets (equally balanced) collected in the same way as ED • Stanford Twitter dataset (STS) – STS-1: 177 pos, 184 neg – STS-2: 108 pos, 75 neg • Obama Healthcare Reform (HCR) – dev-set: 135 pos, 328 neg – test-set: 116 pos, 383 neg • Obama-McCain Debate (OMD) – subset (agreement>50%): 707 pos, 1190 neg #wiseconf2014 #16
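A minimal sketch of the emoticon-based (distant supervision) labelling behind ED and DEV as described above; the emoticon lists and the retweet check are illustrative assumptions, not the authors' exact collection rules.

    # Label tweets by emoticons: happy -> positive, sad -> negative; drop retweets
    # and tweets with conflicting or no emoticons.
    HAPPY = {":)", ":-)", ":D", "=)"}  # assumed happy-emoticon list
    SAD = {":(", ":-(", ":'("}  # assumed sad-emoticon list

    def label_by_emoticon(tweet):
        if tweet.startswith("RT "):  # remove retweets
            return None
        has_happy = any(e in tweet for e in HAPPY)
        has_sad = any(e in tweet for e in SAD)
        if has_happy and not has_sad:
            return "pos"
        if has_sad and not has_happy:
            return "neg"
        return None  # ambiguous or no emoticon

    # e.g. label_by_emoticon("great game tonight :)") -> 'pos'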
  17. 17. Model building • Performance of TBR variants, role of POS (on 33% of ED) • Performance of LR features • Comparative performance of all classifiers (on DEV) #wiseconf2014 #17
  18. 18. Results [Results table with two annotations: a more realistic estimate of real-world performance, still far better than other out-of-domain methods; and an over-optimistic estimate of real-world performance, since 90% of the set is used for training] #wiseconf2014 #18
  19. 19. Role of classifier agreement • As expected, accuracy on the disagreed tweets is lower – results are produced by a classifier trained on “noisy” data – these are more challenging cases • A similar effect is observed when training on automatically collected, emoticon-labelled data #wiseconf2014 #19
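To make the analysis above concrete, a small evaluation helper (assuming ground-truth labels are available for the evaluation set) that measures accuracy separately on the tweets where HC and LC agree and where they disagree; names follow the ensemble sketch after slide 13 and are illustrative.

    import numpy as np

    def agreement_breakdown(hc_pred, lc_pred, final_pred, y_true):
        # Accuracy on tweets where HC and LC agree vs. disagree; the disagreed
        # subset is expected to score lower, as the slide notes.
        agree = hc_pred == lc_pred

        def acc(mask):
            return float((final_pred[mask] == y_true[mask]).mean()) if mask.any() else float("nan")

        return {
            "agreement_rate": float(agree.mean()),
            "accuracy_agreed": acc(agree),
            "accuracy_disagreed": acc(~agree),
        }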
  20. 20. Conclusions & Future Work #wiseconf2014 #20
  21. 21. Conclusions & Future Work Conclusion • An ensemble model in tandem with domain adaptation is an effective strategy for tackling the cross-domain challenge! Future Work • Incorporate the “neutral” class and evaluate • Test with more datasets • Check with additional classifiers as input #wiseconf2014 #21
  22. 22. Thank you! #wiseconf2014 #22 Questions? https://github.com/socialsensor/sentiment-analysis http://socialsensor.eu/ adam.tsakalidis@gmail.com / papadop@iti.gr a.tsakalidis@warwick.ac.uk
  23. 23. Key References (1/3) • Alina Andreevskaia and Sabine Bergler. When specialists and generalists work together: Overcoming domain dependence in sentiment tagging. In ACL, pages 290-298, 2008 • Luciano Barbosa and Junlan Feng. Robust sentiment detection on twitter from biased and noisy data. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, pages 36-44. ACL, 2010 • Adam Bermingham and Alan F Smeaton. Classifying sentiment in microblogs: is brevity an advantage? In Proceedings of the 19th ACM international conference on Information and knowledge management, pages 1833-1836. ACM, 2010 • Albert Bifet and Eibe Frank. Sentiment knowledge discovery in twitter streaming data. In Discovery Science, pages 1-15. Springer, 2010 • Johan Bollen, Huina Mao, and Alberto Pepe. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. In ICWSM, 2011 • Samuel Brody and Nicholas Diakopoulos. Cooooooooooooooollllllllllllll !!!!!!!!!!!!!!: using word lengthening to detect sentiment in microblogs. In Proceedings of the Conference on EMNLP, pages 562-570. ACL, 2011. #wiseconf2014 #23
  24. 24. Key References (2/3) • Andrea Esuli and Fabrizio Sebastiani. Sentiwordnet: A publicly available lexical resource for opinion mining. In Proceedings of LREC, vol. 6, pages 417-422, 2006 • Alec Go, Richa Bhayani, and Lei Huang. Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford, pages 1-12, 2009 • Pollyanna Goncalves, Matheus Araujo, Fabricio Benevenuto, and Meeyoung Cha. Comparing and combining sentiment analysis methods. In Proceedings of the first ACM COSN, pages 27-38. ACM, 2013 • Minqing Hu and Bing Liu. Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 168-177. ACM, 2004 • Long Jiang, Mo Yu, Ming Zhou, Xiaohua Liu, and Tiejun Zhao. Target-dependent twitter sentiment classification. In Proceedings of the 49th Annual Meeting of the ACL: HLT-Vol. 1, pages 151-160. ACL, 2011 • Jonathon Read. Using emoticons to reduce dependency in machine learning techniques for sentiment classification. In Proceedings of the ACL Student Research Workshop, pages 43-48. ACL, 2005 • Hassan Saif, Yulan He, and Harith Alani. Semantic sentiment analysis of twitter. In The Semantic Web-ISWC 2012, pages 508-524. Springer, 2012 #wiseconf2014 #24
  25. 25. Key References (3/3) • Emmanouil Schinas, Symeon Papadopoulos, Sotiris Diplaris, Yiannis Kompatsiaris, Yosi Mass, Jonathan Herzig, and Lazaros Boudakidis. Eventsense: Capturing the pulse of large-scale events by mining social media streams. In Proceedings of the 17th Panhellenic Conference on Informatics, pages 17-24. ACM, 2013 • Michael Speriosu, Nikita Sudan, Sid Upadhyay, and Jason Baldridge. Twitter polarity classification with label propagation over lexical links and the follower graph. In Proceedings of the First workshop on Unsupervised Learning in NLP, pages 53-63. ACL, 2011 • Kristina Toutanova, Dan Klein, Christopher D Manning, and Yoram Singer. Feature-rich part-of-speech tagging with a cyclic dependency network. In Proceedings of the 2003 Conference of the North American Chapter of the ACL on HLT-Vol. 1, pages 173-180. ACL, 2003 • Andranik Tumasjan, Timm Oliver Sprenger, Philipp Sandner, and Isabell Welpe. Predicting elections with twitter: What 140 characters reveal about political sentiment. In ICWSM 2010, 10:178-185, 2010 • Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the conference on HLT and EMNLP, pages 347-354. ACL, 2005 • Ley Zhang, Riddhiman Ghosh, Mohamed Dekhil, Meichun Hsu, and Bing Liu. Combining Lexicon-based and learning-based methods for twitter sentiment analysis. HP Laboratories, Technical Report HPL-2011, 89, 2011. #wiseconf2014 #25
