Question Classification & Sentiment
             Analysis



                      Kwok-Ping Chan

                    Dept. of Computer Science,
                     The Univ. of Hong Kong


                        March 5, 2010
The Knowledge Forum
  A forum for students to discuss interesting issues, so that they can
  learn during the discussion process.
  Monitor the progress of students participating in the forum.
  Forum articles can be categorized into four types:
  Argument, Statement, Information, and Question.
  Examples of Articles
          (Information) Alcohol is an other kind of energy that would not produce
          air-pollution and easy to use. In Brazil, alcohol energy is very popular and
          successful. The Brazil government co-operate with a bank and produce alcohol for
          drivers
          (Argument) but producing fossil fuel need a few million years or maybe more than
          it. So it will too late if we have to wait for a long time until its produced.
          (Question) is it the one using Changjiang River?
          (Statement) we are doing wind energy.

2 of 14
Article Classification
  The progress of a student is reflected by the different types of
  articles the student posted on the forum.
  We would like to use machine learning techniques to solve this
  problem.
  Two lines of work are related to this problem:
          Question Classification — Classify questions into different categories.
          Sentiment Analysis (Opinion Mining) — aims to determine the
          attitude of a writer with respect to some topic. The attitude may be
          their judgment or evaluation, their affective state (the emotional state
          of the author when writing) or the intended emotional communication
          (the emotional effect the author wishes to have on the reader). (from
          Wikipedia) This includes
             determining the polarity of a given text — positive, negative or neutral.
             subjectivity/objectivity identification
             determining the opinions expressed on different aspects of entities

3 of 14
Question Classification
  We have used a locally aligned tree kernel for Question
  Classification.
  Application: Question/Answering System.
  Based on the UIUC TREC database.
  Five training sets, containing 1,000 to 5,500 training questions, and a
  test set containing 500 questions (Li & Roth).
  The questions are divided into 6 coarse classes and 50 fine classes.
  We achieved 92.5% accuracy.




4 of 14
Question hierarchy
  ABBREVIATION – abbreviation and expression.
  DESCRIPTION – definition, description, manner, reason.
  ENTITY – animal, body, color, creative, currency,
  disease/medicine, event, food, instrument, lang, letter, other,
  plant, product, religion, sport, substance, symbol, technique, term,
  vehicle, word.
  HUMAN – description, group, individual, title
  LOCATION – city, country, mountain, state, other
  NUMERIC VALUE – code, count, date, distance, money, order,
  period, speed, percent, temp, vol/size, weight, other
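The hierarchy on this slide can be written down as a plain mapping. The sketch below only transcribes the class names listed above (the variable names are illustrative) and counts them, which agrees with the 6 coarse and 50 fine classes mentioned earlier.

```python
from typing import Dict, List

# Coarse class -> fine classes, transcribed from the slide above.
QUESTION_HIERARCHY: Dict[str, List[str]] = {
    "ABBREVIATION": ["abbreviation", "expression"],
    "DESCRIPTION": ["definition", "description", "manner", "reason"],
    "ENTITY": [
        "animal", "body", "color", "creative", "currency", "disease/medicine",
        "event", "food", "instrument", "lang", "letter", "other", "plant",
        "product", "religion", "sport", "substance", "symbol", "technique",
        "term", "vehicle", "word",
    ],
    "HUMAN": ["description", "group", "individual", "title"],
    "LOCATION": ["city", "country", "mountain", "state", "other"],
    "NUMERIC VALUE": [
        "code", "count", "date", "distance", "money", "order", "period",
        "speed", "percent", "temp", "vol/size", "weight", "other",
    ],
}

n_coarse = len(QUESTION_HIERARCHY)
n_fine = sum(len(fine) for fine in QUESTION_HIERARCHY.values())
```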




5 of 14
Example Questions
The following is the first question extracted from the training dataset
for each broad class:
   (ABBR, exp) What is the full form of .com ?
   (DESC, manner) How did serfdom develop in and then leave
   Russia ?
   (ENTY, animal) What fowl grabs the spotlight after the Chinese
   Year of the Monkey ?
   (HUM, title) What is the oldest profession ?
   (LOC, state) What sprawling U.S. state boasts the most airports ?
   (NUM, date) When was Ozzy Osbourne born ?
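Each example pairs a coarse class with a fine class. A minimal sketch of splitting such a labelled line, assuming a `COARSE:fine question text` layout for the data files (the layout, the class `LabeledQuestion` and the function `parse_labeled_question` are assumptions made for illustration, not taken from the slides):

```python
from typing import NamedTuple


class LabeledQuestion(NamedTuple):
    coarse: str  # e.g. "NUM"
    fine: str    # e.g. "date"
    text: str    # the question itself


def parse_labeled_question(line: str) -> LabeledQuestion:
    """Split 'COARSE:fine question ...' into its three parts."""
    label, _, text = line.strip().partition(" ")
    coarse, sep, fine = label.partition(":")
    if not sep or not text:
        raise ValueError(f"not a labelled question: {line!r}")
    return LabeledQuestion(coarse, fine, text)


example = parse_labeled_question("NUM:date When was Ozzy Osbourne born ?")
```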




 6 of 14
Syntactic Features
   words – words appearing in the question.
   POS tags – their corresponding POS tags.
   Chunks – non-overlapping phrases in the question.
   Head chunks – the first noun/verb chunk in the question.
Examples: (from Li & Roth)
   (Question) : Who was the first woman killed in the Vietnam War?
   (POS Tagging) : [Who WP] [was VBD] [the DT] [first JJ]
   [woman NN] [killed VBN] [in IN] [the DT] [Vietnam NNP] [War
   NNP] [? .]
   (Chunking) : [NP Who] [VP was] [NP the first woman] [VP
   killed] [PP in] [NP the Vietnam War] ?
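Given chunker output like the above, head chunks can be picked out with a few lines. This is a sketch that follows the slide's definition literally (first noun chunk and first verb chunk); the `(tag, text)` pair representation and the function name `head_chunks` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Chunk = Tuple[str, str]  # (chunk type, chunk text)


def head_chunks(chunks: List[Chunk]) -> Tuple[Optional[str], Optional[str]]:
    """Return the first noun chunk and the first verb chunk, if any."""
    first_np = next((text for tag, text in chunks if tag == "NP"), None)
    first_vp = next((text for tag, text in chunks if tag == "VP"), None)
    return first_np, first_vp


chunks = [
    ("NP", "Who"), ("VP", "was"), ("NP", "the first woman"),
    ("VP", "killed"), ("PP", "in"), ("NP", "the Vietnam War"),
]
noun_head, verb_head = head_chunks(chunks)
```

Under this literal reading the question word itself is returned as the noun head; a variant could skip wh-words before taking the first noun chunk.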


 7 of 14
Semantic Features
  Named Entities – noun phrases are categorized into semantic
  categories of varying specificity.
          e.g. for the question on the previous slide, we obtain the named
          entities [Num first] and [Event Vietnam War].
  WordNet Senses – words are organized into senses in WordNet,
  which are organized in a hierarchy. All senses of a word are used as
  features.
          We use the Wu & Palmer metric to measure the similarity between
          words.
  Class-specific related words – some words are related to a specific
  question class, e.g. alcohol, lunch and orange are related to the food
  class.
  Distributional Similarity
          words occurring in similar syntactic structure are similar to each other.
          words can be grouped into semantic categories accordingly.
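The Wu & Palmer metric scores two concepts by the depth of their lowest common ancestor: sim(a, b) = 2 * depth(lcs) / (depth(a) + depth(b)). Below is a sketch on a hypothetical toy hierarchy; the `PARENT` table is invented for illustration and is not WordNet data, and the root is counted as depth 1.

```python
from typing import Dict, List, Optional

# Hypothetical toy is-a hierarchy (child -> parent); not real WordNet data.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "food": "entity",
    "fruit": "food",
    "orange": "fruit",
    "apple": "fruit",
    "beverage": "food",
    "alcohol": "beverage",
}


def path_to_root(node: str) -> List[str]:
    """Nodes from `node` up to the root, inclusive."""
    path = [node]
    parent = PARENT[node]
    while parent is not None:
        path.append(parent)
        parent = PARENT[parent]
    return path


def depth(node: str) -> int:
    """Depth with the root counted as 1."""
    return len(path_to_root(node))


def wu_palmer(a: str, b: str) -> float:
    """2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # The first ancestor of b that is also an ancestor of a is the deepest one.
    lcs = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

In this toy tree, `orange` and `apple` share the parent `fruit` and so score higher than `orange` and `alcohol`, which only meet at `food`.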
8 of 14
Classifiers
  Li & Roth used a hierarchical classifier.
          A two-level classifier.
          Coarse classifier – assigns one of the coarse classes.
          Fine classifier – assigns one of the fine classes.
          Uses the Winnow algorithm.
  Zhang & Chan
          Use convolution tree kernels with local alignment.
          The tree kernel is semantically enriched: it measures the semantic
          similarity of two parse trees, based on WordNet with the Wu & Palmer
          metric and on distributional similarity.
          Classification was done with a Support Vector Machine (SVM).
  We believe article classification can be done similarly, using both
  general features (for example, all POS tags and WordNet senses)
  and expert features (Class-specific related words).
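For reference, the Winnow update mentioned above is mistake-driven and multiplicative. The sketch below is the textbook Winnow2 variant for binary features, not the specific configuration used by Li & Roth; the class name, the threshold choice (number of features) and the toy data are assumptions made for illustration.

```python
from typing import Sequence


class Winnow:
    """Textbook Winnow2 for binary features and binary labels."""

    def __init__(self, n_features: int, alpha: float = 2.0) -> None:
        self.weights = [1.0] * n_features
        self.threshold = float(n_features)
        self.alpha = alpha

    def predict(self, x: Sequence[int]) -> int:
        score = sum(w for w, xi in zip(self.weights, x) if xi)
        return 1 if score >= self.threshold else 0

    def update(self, x: Sequence[int], y: int) -> None:
        """On a mistake, promote or demote the weights of active features."""
        if self.predict(x) == y:
            return
        factor = self.alpha if y == 1 else 1.0 / self.alpha
        for i, xi in enumerate(x):
            if xi:
                self.weights[i] *= factor


# Toy task (hypothetical data): the label is 1 exactly when feature 0 is on.
data = [([1, 0, 1, 0], 1), ([0, 1, 1, 0], 0), ([1, 1, 0, 0], 1),
        ([0, 0, 1, 1], 0), ([1, 0, 0, 1], 1), ([0, 1, 0, 1], 0)]
model = Winnow(n_features=4)
for _ in range(10):
    for x, y in data:
        model.update(x, y)
predictions = [model.predict(x) for x, _ in data]
```

In a two-level setup, one such binary learner per class could be trained first over the coarse classes and then over the fine classes within the predicted coarse class.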

9 of 14
Sentiment Analysis & Opinion Mining
It involves the following problems (Pang & Lee):
    Sentiment polarity and degree of positivity
           classify the position of the opinion in a continuum between two
           polarities.
           for example, in the context of reviews or political speech.
           determine whether a piece of objective information is good or bad.
           more difficult task: rating inference, “pro and con” instead of positive
           or negative.
   Subjectivity detection and opinion identification
           whether an article contains subjective or objective information.
           determining opinion strength (different from rating).
           for example, use adjectives in the sentences.




10 of 14
Features
The following features can be used for sentiment analysis:
  Term presence & frequency
           Although term frequency is commonly used in information retrieval,
           term presence was found to give better performance.
           Binary features vs. numerical features.
           A topic tends to be emphasized by frequent occurrences of keywords,
           but overall sentiment may not be.
           Sometimes a single occurrence of a word already indicates subjectivity.
   Term-based features
           position of a term within a textual unit.
           use of unigrams, bigrams or trigrams.
           high-contrast pairs of words, such as “delicious and dirty”.
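The difference between term presence and term frequency, and the extraction of n-grams, can be sketched as follows. The function names and the example sentence are illustrative assumptions, and tokenization is a plain whitespace split.

```python
from collections import Counter
from typing import Dict, List, Tuple

Gram = Tuple[str, ...]


def ngrams(tokens: List[str], n: int) -> List[Gram]:
    """All contiguous n-grams of the token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequency_features(tokens: List[str], n: int = 1) -> Dict[Gram, int]:
    """Numerical features: how often each n-gram occurs."""
    return dict(Counter(ngrams(tokens, n)))


def presence_features(tokens: List[str], n: int = 1) -> Dict[Gram, int]:
    """Binary features: 1 for every n-gram that occurs at least once."""
    return {gram: 1 for gram in set(ngrams(tokens, n))}


tokens = "the food was delicious but the room was dirty".split()
freq = frequency_features(tokens)
pres = presence_features(tokens)
bigrams = ngrams(tokens, 2)
```

Here `freq` records that "the" occurs twice, while `pres` only records that it occurs.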




11 of 14
Features
   Parts of Speech
           Adjectives are particularly important in sentiment analysis.
           for example, certain adjectives are good indicators.
           Use selected phrases, chosen via pre-specified POS
           patterns, most of which include an adjective or an adverb.
           Nouns and verbs can also be strong indicators (e.g. “gem” and
           “love”)
   Syntax
           sub-tree syntactic structures have been used.
           collocation and other complex syntactic patterns have also been found
           useful.
   Negation
           Positive and negative opinions sometimes differ in only one negation
           word (such as “not”, “don’t”).
           Negation can be expressed in subtle ways that are difficult to detect
           (such as sarcasm and irony).
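One simple heuristic for explicit negation words is to mark every token between a negation word and the next punctuation mark, so that "like" and "NOT_like" become different features. This is a sketch of that heuristic only; the word lists are small illustrative assumptions, and it does nothing for sarcasm or irony.

```python
from typing import List

# Small illustrative lists; a real system would need broader coverage.
NEGATION_WORDS = {"not", "no", "never", "don't", "doesn't", "didn't", "isn't"}
CLAUSE_END = {".", ",", ";", "!", "?"}


def mark_negation(tokens: List[str]) -> List[str]:
    """Prefix NOT_ to tokens between a negation word and the next punctuation."""
    marked = []
    negated = False
    for token in tokens:
        if token in CLAUSE_END:
            negated = False
            marked.append(token)
        elif token.lower() in NEGATION_WORDS:
            negated = True
            marked.append(token)
        else:
            marked.append("NOT_" + token if negated else token)
    return marked


tagged = mark_negation("I do not like this movie , but I love the book .".split())
```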
12 of 14
Features
   Topic-oriented features
           topic information should be incorporated into features.
           for example, good news about a rival can be bad news.
           may need to include indicators (“this work”) or party names so that
           the features can be attached to different entities.




13 of 14
Suggested Framework
   To apply machine learning, we need a labeled corpus with
   sufficient training data.
   Many different features are used. Some systems use more than
   200,000 features! (computer-generated, of course)
   Terms can be grouped into concepts to reduce the number of
   features.
   If we have enough training data, we can find the grouping best
   tailored to the topic involved.
   Features can also be the output of another machine learning program,
   such as sentiment analysis or topic-related keywords.
   Supervised classification can be employed, such as Support Vector
   Machines or Decision Trees with AdaBoost.
   With sufficient data, the entire process can be data-driven.
   Expert knowledge can be used to reduce the amount of training data
   needed.
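Grouping terms into concepts can be sketched as a lookup applied before counting features. The mapping below is a hypothetical, hand-written example of expert knowledge (the food words come from the earlier slide; the energy words are invented for illustration); with enough data such a grouping could instead be learned.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical expert-provided grouping of terms into concepts.
CONCEPT_OF: Dict[str, str] = {
    "alcohol": "FOOD", "lunch": "FOOD", "orange": "FOOD",
    "wind": "ENERGY", "solar": "ENERGY", "fuel": "ENERGY",
}


def concept_features(tokens: List[str]) -> Dict[str, int]:
    """Count concepts instead of raw terms; unmapped terms keep their own feature."""
    return dict(Counter(CONCEPT_OF.get(t.lower(), t.lower()) for t in tokens))


features = concept_features("we compare wind and solar with fossil fuel".split())
```

Eight tokens collapse to six features here, because three of them map to the single concept `ENERGY`.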
14 of 14

Mais conteúdo relacionado

Mais procurados

Taking into account communities of practice’s specific vocabularies in inform...
Taking into account communities of practice’s specific vocabularies in inform...Taking into account communities of practice’s specific vocabularies in inform...
Taking into account communities of practice’s specific vocabularies in inform...inscit2006
 
Text summarization
Text summarizationText summarization
Text summarizationkareemhashem
 
2015 07-tuto2-clus type
2015 07-tuto2-clus type2015 07-tuto2-clus type
2015 07-tuto2-clus typejins0618
 
Extraction of Socio-Semantic Data from Chat Conversations in Collaborative Le...
Extraction of Socio-Semantic Data from Chat Conversations in Collaborative Le...Extraction of Socio-Semantic Data from Chat Conversations in Collaborative Le...
Extraction of Socio-Semantic Data from Chat Conversations in Collaborative Le...Traian Rebedea
 
Topic Modeling - NLP
Topic Modeling - NLPTopic Modeling - NLP
Topic Modeling - NLPRupak Roy
 
Dbms Cluster 4
Dbms Cluster 4Dbms Cluster 4
Dbms Cluster 4out2sea5
 
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...Dissertation defense slides on "Semantic Analysis for Improved Multi-document...
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...Quinsulon Israel
 
Task oriented word embedding for text classification
Task oriented word embedding for text classificationTask oriented word embedding for text classification
Task oriented word embedding for text classificationPC LO
 
Ijarcet vol-3-issue-3-623-625 (1)
Ijarcet vol-3-issue-3-623-625 (1)Ijarcet vol-3-issue-3-623-625 (1)
Ijarcet vol-3-issue-3-623-625 (1)Dhabal Sethi
 
Business intelligence analytics using sentiment analysis-a survey
Business intelligence analytics using sentiment analysis-a surveyBusiness intelligence analytics using sentiment analysis-a survey
Business intelligence analytics using sentiment analysis-a surveyIJECEIAES
 
Token classification using Bengali Tokenizer
Token classification using Bengali TokenizerToken classification using Bengali Tokenizer
Token classification using Bengali TokenizerJeet Das
 
Knowledge Representation in Artificial intelligence
Knowledge Representation in Artificial intelligence Knowledge Representation in Artificial intelligence
Knowledge Representation in Artificial intelligence Yasir Khan
 
ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015
ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015
ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015RIILP
 
RCOMM 2011 - Sentiment Classification with RapidMiner
RCOMM 2011 - Sentiment Classification with RapidMinerRCOMM 2011 - Sentiment Classification with RapidMiner
RCOMM 2011 - Sentiment Classification with RapidMinerbohanairl
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vecananth
 
Categorization of Semantic Roles for Dictionary Definitions
Categorization of Semantic Roles for Dictionary DefinitionsCategorization of Semantic Roles for Dictionary Definitions
Categorization of Semantic Roles for Dictionary DefinitionsAndre Freitas
 

Mais procurados (20)

NLP todo
NLP todoNLP todo
NLP todo
 
Taking into account communities of practice’s specific vocabularies in inform...
Taking into account communities of practice’s specific vocabularies in inform...Taking into account communities of practice’s specific vocabularies in inform...
Taking into account communities of practice’s specific vocabularies in inform...
 
Text summarization
Text summarizationText summarization
Text summarization
 
2015 07-tuto2-clus type
2015 07-tuto2-clus type2015 07-tuto2-clus type
2015 07-tuto2-clus type
 
Extraction of Socio-Semantic Data from Chat Conversations in Collaborative Le...
Extraction of Socio-Semantic Data from Chat Conversations in Collaborative Le...Extraction of Socio-Semantic Data from Chat Conversations in Collaborative Le...
Extraction of Socio-Semantic Data from Chat Conversations in Collaborative Le...
 
Topic Modeling - NLP
Topic Modeling - NLPTopic Modeling - NLP
Topic Modeling - NLP
 
Dbms Cluster 4
Dbms Cluster 4Dbms Cluster 4
Dbms Cluster 4
 
PARCC-ELA
PARCC-ELAPARCC-ELA
PARCC-ELA
 
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...Dissertation defense slides on "Semantic Analysis for Improved Multi-document...
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...
 
Task oriented word embedding for text classification
Task oriented word embedding for text classificationTask oriented word embedding for text classification
Task oriented word embedding for text classification
 
Ijarcet vol-3-issue-3-623-625 (1)
Ijarcet vol-3-issue-3-623-625 (1)Ijarcet vol-3-issue-3-623-625 (1)
Ijarcet vol-3-issue-3-623-625 (1)
 
Business intelligence analytics using sentiment analysis-a survey
Business intelligence analytics using sentiment analysis-a surveyBusiness intelligence analytics using sentiment analysis-a survey
Business intelligence analytics using sentiment analysis-a survey
 
Token classification using Bengali Tokenizer
Token classification using Bengali TokenizerToken classification using Bengali Tokenizer
Token classification using Bengali Tokenizer
 
columbia-gwu
columbia-gwucolumbia-gwu
columbia-gwu
 
Knowledge Representation in Artificial intelligence
Knowledge Representation in Artificial intelligence Knowledge Representation in Artificial intelligence
Knowledge Representation in Artificial intelligence
 
ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015
ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015
ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015
 
RCOMM 2011 - Sentiment Classification with RapidMiner
RCOMM 2011 - Sentiment Classification with RapidMinerRCOMM 2011 - Sentiment Classification with RapidMiner
RCOMM 2011 - Sentiment Classification with RapidMiner
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vec
 
Text Mining Analytics 101
Text Mining Analytics 101Text Mining Analytics 101
Text Mining Analytics 101
 
Categorization of Semantic Roles for Dictionary Definitions
Categorization of Semantic Roles for Dictionary DefinitionsCategorization of Semantic Roles for Dictionary Definitions
Categorization of Semantic Roles for Dictionary Definitions
 

Destaque

Open & Collaborative Learning: How Social Networks Can Transform Learning
Open & Collaborative Learning: How Social Networks Can Transform LearningOpen & Collaborative Learning: How Social Networks Can Transform Learning
Open & Collaborative Learning: How Social Networks Can Transform LearningAlec Couros
 
Academic Learning & Collaboration & Learning
Academic Learning & Collaboration & LearningAcademic Learning & Collaboration & Learning
Academic Learning & Collaboration & LearningAlec Couros
 
eCollaboration a Focus on Wiki In The Workplace
eCollaboration a Focus on Wiki In The WorkplaceeCollaboration a Focus on Wiki In The Workplace
eCollaboration a Focus on Wiki In The WorkplacePhilippe Scheimann
 
Overview Web2.0 Tools For Collaborative Learning
Overview Web2.0 Tools For Collaborative LearningOverview Web2.0 Tools For Collaborative Learning
Overview Web2.0 Tools For Collaborative LearningDavid Brooks
 
E Collaboration Is...
E Collaboration Is...E Collaboration Is...
E Collaboration Is...rob johnstone
 
Creative Collaboration: The power of partnership in design
Creative Collaboration: The power of partnership in designCreative Collaboration: The power of partnership in design
Creative Collaboration: The power of partnership in designPrincess Lasertron
 
Collaborative learning areas
Collaborative learning areasCollaborative learning areas
Collaborative learning areasNigel Ross
 
Technology & Collaborative Learning: Scaffolding for Student Success
Technology & Collaborative Learning: Scaffolding for Student SuccessTechnology & Collaborative Learning: Scaffolding for Student Success
Technology & Collaborative Learning: Scaffolding for Student SuccessJulia Parra
 
12 Principles of Collaboration
12 Principles of Collaboration12 Principles of Collaboration
12 Principles of CollaborationJacob Morgan
 
Collaborative Learning and Communities #fdol131
Collaborative Learning and Communities #fdol131Collaborative Learning and Communities #fdol131
Collaborative Learning and Communities #fdol131Sue Beckingham
 
Collaboration Insights Webinar: The 9 Types of Collaborators
Collaboration Insights Webinar: The 9 Types of CollaboratorsCollaboration Insights Webinar: The 9 Types of Collaborators
Collaboration Insights Webinar: The 9 Types of CollaboratorsCentral Desktop
 

Destaque (11)

Open & Collaborative Learning: How Social Networks Can Transform Learning
Open & Collaborative Learning: How Social Networks Can Transform LearningOpen & Collaborative Learning: How Social Networks Can Transform Learning
Open & Collaborative Learning: How Social Networks Can Transform Learning
 
Academic Learning & Collaboration & Learning
Academic Learning & Collaboration & LearningAcademic Learning & Collaboration & Learning
Academic Learning & Collaboration & Learning
 
eCollaboration a Focus on Wiki In The Workplace
eCollaboration a Focus on Wiki In The WorkplaceeCollaboration a Focus on Wiki In The Workplace
eCollaboration a Focus on Wiki In The Workplace
 
Overview Web2.0 Tools For Collaborative Learning
Overview Web2.0 Tools For Collaborative LearningOverview Web2.0 Tools For Collaborative Learning
Overview Web2.0 Tools For Collaborative Learning
 
E Collaboration Is...
E Collaboration Is...E Collaboration Is...
E Collaboration Is...
 
Creative Collaboration: The power of partnership in design
Creative Collaboration: The power of partnership in designCreative Collaboration: The power of partnership in design
Creative Collaboration: The power of partnership in design
 
Collaborative learning areas
Collaborative learning areasCollaborative learning areas
Collaborative learning areas
 
Technology & Collaborative Learning: Scaffolding for Student Success
Technology & Collaborative Learning: Scaffolding for Student SuccessTechnology & Collaborative Learning: Scaffolding for Student Success
Technology & Collaborative Learning: Scaffolding for Student Success
 
12 Principles of Collaboration
12 Principles of Collaboration12 Principles of Collaboration
12 Principles of Collaboration
 
Collaborative Learning and Communities #fdol131
Collaborative Learning and Communities #fdol131Collaborative Learning and Communities #fdol131
Collaborative Learning and Communities #fdol131
 
Collaboration Insights Webinar: The 9 Types of Collaborators
Collaboration Insights Webinar: The 9 Types of CollaboratorsCollaboration Insights Webinar: The 9 Types of Collaborators
Collaboration Insights Webinar: The 9 Types of Collaborators
 

Semelhante a Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative Learning (CSCL) Data

SENTIMENT ANALYSIS-AN OBJECTIVE VIEW
SENTIMENT ANALYSIS-AN OBJECTIVE VIEWSENTIMENT ANALYSIS-AN OBJECTIVE VIEW
SENTIMENT ANALYSIS-AN OBJECTIVE VIEWJournal For Research
 
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESA SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESJournal For Research
 
Sentence level sentiment polarity calculation for customer reviews by conside...
Sentence level sentiment polarity calculation for customer reviews by conside...Sentence level sentiment polarity calculation for customer reviews by conside...
Sentence level sentiment polarity calculation for customer reviews by conside...eSAT Publishing House
 
Supervised Sentiment Classification using DTDP algorithm
Supervised Sentiment Classification using DTDP algorithmSupervised Sentiment Classification using DTDP algorithm
Supervised Sentiment Classification using DTDP algorithmIJSRD
 
Reflective Plan Examples
Reflective Plan ExamplesReflective Plan Examples
Reflective Plan ExamplesMonica Turner
 
Paper id 28201441
Paper id 28201441Paper id 28201441
Paper id 28201441IJRAT
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
The Role of Families and the Community Proposal Template (N.docx
The Role of Families and the Community Proposal Template  (N.docxThe Role of Families and the Community Proposal Template  (N.docx
The Role of Families and the Community Proposal Template (N.docxssusera34210
 
Sentiment+Analysis.ppt
Sentiment+Analysis.pptSentiment+Analysis.ppt
Sentiment+Analysis.pptvisheshs4
 
A review on sentiment analysis and emotion detection.pptx
A review on sentiment analysis and emotion detection.pptxA review on sentiment analysis and emotion detection.pptx
A review on sentiment analysis and emotion detection.pptxvoicemail1
 
Mining Opinion Features in Customer Reviews
Mining Opinion Features in Customer ReviewsMining Opinion Features in Customer Reviews
Mining Opinion Features in Customer ReviewsIJCERT JOURNAL
 
Communication Message Planning
Communication Message PlanningCommunication Message Planning
Communication Message Planningslpwendy
 

Semelhante a Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative Learning (CSCL) Data (20)

SENTIMENT ANALYSIS-AN OBJECTIVE VIEW
SENTIMENT ANALYSIS-AN OBJECTIVE VIEWSENTIMENT ANALYSIS-AN OBJECTIVE VIEW
SENTIMENT ANALYSIS-AN OBJECTIVE VIEW
 
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESA SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
 
Lac presentation
Lac presentationLac presentation
Lac presentation
 
Sentence level sentiment polarity calculation for customer reviews by conside...
Sentence level sentiment polarity calculation for customer reviews by conside...Sentence level sentiment polarity calculation for customer reviews by conside...
Sentence level sentiment polarity calculation for customer reviews by conside...
 
Supervised Sentiment Classification using DTDP algorithm
Supervised Sentiment Classification using DTDP algorithmSupervised Sentiment Classification using DTDP algorithm
Supervised Sentiment Classification using DTDP algorithm
 
Reflective Plan Examples
Reflective Plan ExamplesReflective Plan Examples
Reflective Plan Examples
 
Paper id 28201441
Paper id 28201441Paper id 28201441
Paper id 28201441
 
sent_analysis_report
sent_analysis_reportsent_analysis_report
sent_analysis_report
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
The Role of Families and the Community Proposal Template (N.docx
The Role of Families and the Community Proposal Template  (N.docxThe Role of Families and the Community Proposal Template  (N.docx
The Role of Families and the Community Proposal Template (N.docx
 
Ny3424442448
Ny3424442448Ny3424442448
Ny3424442448
 
Ontology
OntologyOntology
Ontology
 
Sentiment+Analysis.ppt
Sentiment+Analysis.pptSentiment+Analysis.ppt
Sentiment+Analysis.ppt
 
A review on sentiment analysis and emotion detection.pptx
A review on sentiment analysis and emotion detection.pptxA review on sentiment analysis and emotion detection.pptx
A review on sentiment analysis and emotion detection.pptx
 
Mining Opinion Features in Customer Reviews
Mining Opinion Features in Customer ReviewsMining Opinion Features in Customer Reviews
Mining Opinion Features in Customer Reviews
 
Ira 2013 presentation
Ira 2013 presentationIra 2013 presentation
Ira 2013 presentation
 
Web Opinion Mining
Web Opinion MiningWeb Opinion Mining
Web Opinion Mining
 
Communication Message Planning
Communication Message PlanningCommunication Message Planning
Communication Message Planning
 
Sentiment analysis on_unstructured_review-1
Sentiment analysis on_unstructured_review-1Sentiment analysis on_unstructured_review-1
Sentiment analysis on_unstructured_review-1
 
NLP Ecosystem
NLP EcosystemNLP Ecosystem
NLP Ecosystem
 

Mais de CITE

Keynote 1: Teaching and Learning Computational Thinking at Scale
Keynote 1: Teaching and Learning Computational Thinking at ScaleKeynote 1: Teaching and Learning Computational Thinking at Scale
Keynote 1: Teaching and Learning Computational Thinking at ScaleCITE
 
Keynote 2: Social Epistemic Cognition in Engineering Learning: Theory, Pedago...
Keynote 2: Social Epistemic Cognition in Engineering Learning: Theory, Pedago...Keynote 2: Social Epistemic Cognition in Engineering Learning: Theory, Pedago...
Keynote 2: Social Epistemic Cognition in Engineering Learning: Theory, Pedago...CITE
 
Changing Technology Changing Practice: Empowering Staff and Building Capabili...
Changing Technology Changing Practice: Empowering Staff and Building Capabili...Changing Technology Changing Practice: Empowering Staff and Building Capabili...
Changing Technology Changing Practice: Empowering Staff and Building Capabili...CITE
 
Traditional Large Scale Educational Assessment and the Incorporation of Digit...
Traditional Large Scale Educational Assessment and the Incorporation of Digit...Traditional Large Scale Educational Assessment and the Incorporation of Digit...
Traditional Large Scale Educational Assessment and the Incorporation of Digit...CITE
 
Scaling up Assessment for Learning
Scaling up Assessment for LearningScaling up Assessment for Learning
Scaling up Assessment for LearningCITE
 
Seminar on policy study on e-Learning in Informal Learning contexts
Seminar on policy study on e-Learning in Informal Learning contextsSeminar on policy study on e-Learning in Informal Learning contexts
Seminar on policy study on e-Learning in Informal Learning contextsCITE
 
Seminar on policy study on e-Learning in Formal & Open Learning contexts
Seminar on policy study on e-Learning in Formal & Open Learning contextsSeminar on policy study on e-Learning in Formal & Open Learning contexts
Seminar on policy study on e-Learning in Formal & Open Learning contextsCITE
 
Prof. Gerald KNEZEK: Implications of Digital Generations for a Learning Society
Prof. Gerald KNEZEK: Implications of Digital Generations for a Learning Society Prof. Gerald KNEZEK: Implications of Digital Generations for a Learning Society
Prof. Gerald KNEZEK: Implications of Digital Generations for a Learning Society CITE
 
G:\CITERS2015\29May2015\2 Invited-Talk-2-Sidorko-Fred
G:\CITERS2015\29May2015\2 Invited-Talk-2-Sidorko-FredG:\CITERS2015\29May2015\2 Invited-Talk-2-Sidorko-Fred
G:\CITERS2015\29May2015\2 Invited-Talk-2-Sidorko-FredCITE
 
Dr. David Gibson: Challenge-Based Learning
Dr. David Gibson: Challenge-Based LearningDr. David Gibson: Challenge-Based Learning
Dr. David Gibson: Challenge-Based LearningCITE
 
Analogy, Causality, and Discovery in Science: The engines of human thought
Analogy, Causality, and Discovery in Science: The engines of human thoughtAnalogy, Causality, and Discovery in Science: The engines of human thought
Analogy, Causality, and Discovery in Science: The engines of human thoughtCITE
 
Educating the Scientific Brain and Mind: Insights from The Science of Learnin...
Educating the Scientific Brain and Mind: Insights from The Science of Learnin...Educating the Scientific Brain and Mind: Insights from The Science of Learnin...
Educating the Scientific Brain and Mind: Insights from The Science of Learnin...CITE
 
Science of Learning — Why it matters to schools and families?
Science of Learning — Why it matters to schools and families?Science of Learning — Why it matters to schools and families?
Science of Learning — Why it matters to schools and families?CITE
 
Understanding the self through self bias
Understanding the self through self biasUnderstanding the self through self bias
Understanding the self through self biasCITE
 
The implementation of "Reading Battle" in Lam Tin Methodist Primary School
The implementation of "Reading Battle" in Lam Tin Methodist Primary SchoolThe implementation of "Reading Battle" in Lam Tin Methodist Primary School
The implementation of "Reading Battle" in Lam Tin Methodist Primary SchoolCITE
 
Strengthening students' reading comprehension ability (both Chinese and Engli...
Strengthening students' reading comprehension ability (both Chinese and Engli...Strengthening students' reading comprehension ability (both Chinese and Engli...
Strengthening students' reading comprehension ability (both Chinese and Engli...CITE
 
Gobert, Dede, Martin, Rose "Panel: Learning Analytics and Learning Sciences"
Gobert, Dede, Martin, Rose "Panel: Learning Analytics and Learning Sciences"Gobert, Dede, Martin, Rose "Panel: Learning Analytics and Learning Sciences"
Gobert, Dede, Martin, Rose "Panel: Learning Analytics and Learning Sciences"CITE
 
Xiao Hu "Learning Analytics Initiatives"
Xiao Hu "Learning Analytics Initiatives"Xiao Hu "Learning Analytics Initiatives"
Xiao Hu "Learning Analytics Initiatives"CITE
 
Tiffany Barnes "Making a meaningful difference: Leveraging data to improve le...
Tiffany Barnes "Making a meaningful difference: Leveraging data to improve le...Tiffany Barnes "Making a meaningful difference: Leveraging data to improve le...
Tiffany Barnes "Making a meaningful difference: Leveraging data to improve le...CITE
 
Phil Winne "Learning Analytics for Learning Science When N = me"
Phil Winne "Learning Analytics for Learning Science When N = me"Phil Winne "Learning Analytics for Learning Science When N = me"
Phil Winne "Learning Analytics for Learning Science When N = me"CITE
 

Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative Learning (CSCL) Data

  • 1. Question Classification & Sentiment Analysis
      Kwok-Ping Chan, Dept. of Computer Science, The Univ. of Hong Kong
      March 5, 2010
  • 2. The Knowledge Forum
      A forum for students to discuss interesting issues, so that they can learn during the discussion process.
      Monitor the progress of students participating in the forum.
      Forum articles can be categorized into four types: Argument, Statement, Information, and Question.
      Examples of articles (quoted as posted by students):
          (Information) Alcohol is an other kind of energy that would not produce air-pollution and easy to use. In Brazil, alcohol energy is very popular and successful. The Brazil government co-operate with a bank and produce alcohol for drivers
          (Argument) but producing fossil fuel need a few million years or maybe more than it. So it will too late if we have to wait for a long time until its produced.
          (Question) is it the one using Changjiang River?
          (Statement) we are doing wind energy.
  • 3. Article Classification
      The progress of a student is reflected by the different types of articles the student posts on the forum.
      We would like to use machine learning techniques to solve this problem.
      Two lines of work are related to this problem:
          Question Classification: classify questions into different categories.
          Sentiment Analysis (Opinion Mining): aims to determine the attitude of a writer with respect to some topic. The attitude may be their judgment or evaluation, their affective state (the emotional state of the author when writing) or the intended emotional communication (the emotional effect the author wishes to have on the reader). (from Wikipedia) This includes:
              determining the polarity of a given text: positive, negative or neutral.
              subjectivity/objectivity identification.
              determining the opinions expressed on different aspects of entities.
  • 4. Question Classification
      We have used a locally aligned tree kernel for question classification.
      Application: question answering systems.
      Based on the UIUC TREC database: five training sets, containing 1,000 to 5,500 training questions, and a test set containing 500 questions (Li & Roth).
      The questions are divided into 6 coarse classes and 50 fine classes.
      We achieved 92.5% accuracy.
  • 5. Question hierarchy
      ABBREVIATION: abbreviation and expression.
      DESCRIPTION: definition, description, manner, reason.
      ENTITY: animal, body, color, creative, currency, disease/medicine, event, food, instrument, lang, letter, other, plant, product, religion, sport, substance, symbol, technique, term, vehicle, word.
      HUMAN: description, group, individual, title.
      LOCATION: city, country, mountain, state, other.
      NUMERIC VALUE: code, count, date, distance, money, order, period, speed, percent, temp, vol/size, weight, other.
  • 6. Example Questions
      The following is the first question extracted from the training dataset for each coarse class:
          (ABBR, exp) What is the full form of .com ?
          (DESC, manner) How did serfdom develop in and then leave Russia ?
          (ENTY, animal) What fowl grabs the spotlight after the Chinese Year of the Monkey ?
          (HUM, title) What is the oldest profession ?
          (LOC, state) What sprawling U.S. state boasts the most airports ?
          (NUM, date) When was Ozzy Osbourne born ?
  • 7. Syntactic Features
      Words: the words appearing in the question.
      POS tags: their corresponding part-of-speech tags.
      Chunks: non-overlapping phrases in the question.
      Head chunks: the first noun/verb chunk in the question.
      Examples (from Li & Roth):
          (Question): Who was the first woman killed in the Vietnam War?
          (POS Tagging): [Who WP] [was VBD] [the DT] [first JJ] [woman NN] [killed VBN] [in IN] [the DT] [Vietnam NNP] [War NNP] [? .]
          (Chunking): [NP Who] [VP was] [NP the first woman] [VP killed] [PP in] [NP the Vietnam War] ?
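The bracketed notation on the slide above can be turned into feature lists with a few lines of code. The following is a minimal sketch, not the feature extractor used by Li & Roth: the function names are hypothetical, the tagged and chunked strings are copied from the slide rather than produced by a tagger, and "head chunks" is read literally as the first noun chunk and the first verb chunk.

```python
import re

# Strings copied from the slide; in practice a POS tagger and chunker would produce them.
POS_TAGGED = ("[Who WP] [was VBD] [the DT] [first JJ] [woman NN] [killed VBN] "
              "[in IN] [the DT] [Vietnam NNP] [War NNP] [? .]")
CHUNKED = "[NP Who] [VP was] [NP the first woman] [VP killed] [PP in] [NP the Vietnam War] ?"

BRACKET = re.compile(r"\[([^\]]+)\]")


def parse_pos(tagged):
    """Return (word, tag) pairs from '[word TAG]' groups."""
    return [tuple(group.rsplit(" ", 1)) for group in BRACKET.findall(tagged)]


def parse_chunks(chunked):
    """Return (label, phrase) pairs from '[LABEL phrase]' groups."""
    return [tuple(group.split(" ", 1)) for group in BRACKET.findall(chunked)]


def head_chunks(chunks):
    """Return the first noun chunk and the first verb chunk, keyed by label."""
    heads = {}
    for label, phrase in chunks:
        if label in ("NP", "VP") and label not in heads:
            heads[label] = phrase
    return heads


pos_pairs = parse_pos(POS_TAGGED)
chunks = parse_chunks(CHUNKED)
features = {
    "words": [word for word, _ in pos_pairs],
    "pos_tags": [tag for _, tag in pos_pairs],
    "chunks": chunks,
    "head_chunks": head_chunks(chunks),
}
```

Each entry of `features` corresponds to one of the four syntactic feature types listed on the slide.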
  • 8. Semantic Features
      Named entities: noun phrases are categorized into semantic categories of varying specificity. For example, from the question on the previous slide we get the named entities [Num first] and [Event Vietnam War].
      WordNet senses: words are organized into senses in WordNet, and the senses are organized in a hierarchy. All senses of a word are used as features. We use the Wu & Palmer metric to measure the similarity between words.
      Class-specific related words: some words are related to a specific question class, e.g. alcohol, lunch and orange are related to the food class.
      Distributional similarity:
          Words occurring in similar syntactic structures are similar to each other.
          Words can be grouped into semantic categories accordingly.
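To make the Wu & Palmer metric mentioned above concrete, here is a minimal sketch that computes it on a tiny hand-built taxonomy. The taxonomy is hypothetical and far smaller than WordNet; the formula used is the standard one, sim(a, b) = 2 · depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest common ancestor and the root is taken to have depth 1.

```python
# Hypothetical toy taxonomy: each node maps to its parent (the root maps to None).
PARENT = {
    "entity": None,
    "food": "entity",
    "animal": "entity",
    "fruit": "food",
    "beverage": "food",
    "orange": "fruit",
    "alcohol": "beverage",
    "dog": "animal",
}


def path_to_root(node):
    """Return the chain of nodes from `node` up to the root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward, so the first hit is the deepest
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes do not share a root")


def wu_palmer(a, b):
    """Wu & Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wu_palmer("orange", "alcohol"))  # both under "food"
print(wu_palmer("orange", "dog"))      # only share the root
```

Words that share a deep ancestor (here the two food terms) score higher than words that only meet at the root, which is the property a semantic-enriched kernel can exploit.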
  • 9. Classifiers
      Li & Roth used a hierarchical, two-level classifier:
          Coarse classifier: divides questions into the coarse classes.
          Fine classifier: assigns the fine classes.
          Both use the Winnow algorithm.
      Zhang & Chan:
          Use convolution tree kernels with local alignment.
          The tree kernel is semantically enriched: it measures the semantic similarity of two parse trees, based on WordNet and the Wu & Palmer metric, and on distributional similarity.
          Classification is done by a Support Vector Machine (SVM).
      We believe article classification can be done similarly, using both general features (for example, all POS tags and WordNet senses) and expert features (class-specific related words).
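As an illustration of the Winnow algorithm named above, the sketch below trains a single binary Winnow classifier over bag-of-words features. It shows only the basic multiplicative promote/demote update and is not necessarily the variant or feature set used by Li & Roth; the training questions, the threshold (half the vocabulary size) and the update factor `alpha = 2` are assumptions made for this example.

```python
def train_winnow(examples, alpha=2.0, epochs=10):
    """Train a binary Winnow classifier on (text, label) pairs with boolean labels."""
    vocab = sorted({word for text, _ in examples for word in text.split()})
    weights = {word: 1.0 for word in vocab}   # all weights start at 1
    theta = len(vocab) / 2                    # assumed threshold: half the vocabulary size
    for _ in range(epochs):
        mistakes = 0
        for text, label in examples:
            active = set(text.split())
            predicted = sum(weights[w] for w in active) >= theta
            if predicted == label:
                continue
            mistakes += 1
            # Promote active features on a missed positive, demote on a false positive.
            factor = alpha if label else 1.0 / alpha
            for w in active:
                weights[w] *= factor
        if mistakes == 0:
            break
    return weights, theta


def predict(weights, theta, text):
    """Return True if the summed weight of known active words reaches the threshold."""
    score = sum(weights.get(w, 0.0) for w in set(text.split()))
    return score >= theta


# Hypothetical toy data: True means "LOCATION question".
TRAIN = [
    ("what city hosted the olympics", True),
    ("what country borders france", True),
    ("who wrote hamlet", False),
    ("when was mozart born", False),
]

weights, theta = train_winnow(TRAIN)
print(predict(weights, theta, "what city borders spain"))
print(predict(weights, theta, "who was born"))
```

In a two-level setup, one such classifier per coarse class could be trained first, with a second set of classifiers restricted to the fine classes of the predicted coarse class.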
  • 10. Sentiment Analysis & Opinion Mining
      It involves the following problems (Pang & Lee):
      Sentiment polarity and degree of positivity
          Classify the position of an opinion on a continuum between two polarities, for example in the context of reviews or political speech.
          Determine whether a piece of objective information is good or bad.
          A more difficult task: rating inference, or "pro and con" instead of positive or negative.
      Subjectivity detection and opinion identification
          Determine whether an article contains subjective or objective information.
          Determine opinion strength (different from rating), for example by using the adjectives in the sentences.
  • 11. Features
      The following features can be used for sentiment analysis:
      Term presence & frequency
          Although term frequency is commonly used in information retrieval, term presence was found to give better performance.
          Binary features vs. numerical features.
          A topic is emphasized by frequent occurrences of keywords, but overall sentiment may not be.
          Sometimes a single occurrence of a word already indicates subjectivity.
      Term-based features
          Position of a term within a textual unit.
          Use of unigrams, bigrams or trigrams.
          High-contrast pairs of words, such as "delicious and dirty".
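The difference between term frequency and term presence, and the use of n-grams, can be sketched as follows. This is a minimal illustration with hypothetical function names and whitespace tokenization; a real system would use a proper tokenizer.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (joined by spaces) in a token sequence."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequency_features(tokens, max_n=2):
    """Numerical features: how often each unigram ... max_n-gram occurs."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return dict(counts)


def presence_features(tokens, max_n=2):
    """Binary features: 1 for every n-gram that occurs at least once."""
    return {term: 1 for term in frequency_features(tokens, max_n)}


tokens = "good good food but not good service".split()
freq = frequency_features(tokens)
pres = presence_features(tokens)
print(freq["good"], pres["good"])
print("not good" in pres)
```

With presence features, the repeated word "good" no longer outweighs the single bigram "not good", which is one intuition for why presence can work better than frequency for sentiment.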
  • 12. Features
      Parts of speech
          Adjectives are particularly important in sentiment analysis; certain adjectives are good indicators.
          Use selected phrases, chosen via pre-specified POS patterns, most of which include an adjective or an adverb.
          Nouns and verbs can also be strong indicators (e.g. "gem" and "love").
      Syntax
          Sub-tree syntactic structures have been used.
          Collocations and other complex syntactic patterns have also been found useful.
      Negation
          Positive and negative opinions sometimes differ in only one negation word (such as "not" or "don't").
          Negation can be expressed in subtle ways that are difficult to detect (such as sarcasm and irony).
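One commonly used heuristic for the explicit negation words mentioned above (it is not taken from these slides) is to prefix every token between a negation word and the next punctuation mark with a marker such as `NOT_`, so that "like" and "NOT_like" become different features. The sketch below assumes a small hypothetical negation list, ASCII apostrophes, and a simple regular-expression tokenizer; it does nothing for sarcasm or irony.

```python
import re

# Hypothetical, deliberately small list of negation words.
NEGATIONS = {"not", "no", "never", "don't", "didn't", "isn't", "wasn't", "can't"}
PUNCTUATION = set(".,!?;")
TOKEN = re.compile(r"[\w']+|[.,!?;]")


def mark_negation(text):
    """Prefix tokens following a negation word with NOT_ until the next punctuation."""
    marked = []
    negating = False
    for token in TOKEN.findall(text.lower()):
        if token in PUNCTUATION:
            negating = False          # punctuation ends the negation scope
            marked.append(token)
        elif token in NEGATIONS:
            negating = True           # start (or continue) a negation scope
            marked.append(token)
        else:
            marked.append("NOT_" + token if negating else token)
    return marked


print(mark_negation("I don't like this movie, but the music is good."))
```

The marked tokens can then be fed to the same presence or frequency feature extraction as ordinary words.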
  • 13. Features
      Topic-oriented features
          Topic information should be incorporated into the features. For example, a piece of good news about a rival can be bad news.
          Indicators (such as "this work") or party names may need to be included so that the features can be attached to different entities.
  • 14. Suggested Framework
      To apply machine learning, we need a labeled corpus with sufficient training data.
      Many different features are used. Some systems use more than 200,000 features (generated by computers, of course).
      Terms can be grouped together to form concepts, reducing the number of features. With enough training data, we can find the grouping best tailored to the topic involved.
      Features can also be the results of another machine learning program, such as sentiment analysis or topic-related keywords.
      Supervised classification can be employed, such as Support Vector Machines or Decision Trees with AdaBoost.
      With sufficient data, the entire process can be data-driven.
      Expert knowledge can be used to reduce the amount of training data needed.
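The idea of grouping terms into concepts to reduce the number of features can be sketched as below. The concept dictionary is hypothetical and hand-written (it stands in for expert knowledge); with enough training data, such a grouping could instead be learned.

```python
# Hypothetical expert-provided mapping from terms to concepts.
CONCEPTS = {
    "alcohol": "FOOD",
    "lunch": "FOOD",
    "orange": "FOOD",
    "wind": "ENERGY",
    "solar": "ENERGY",
    "fossil": "ENERGY",
}


def concept_features(tokens):
    """Replace known terms by their concept and return the sorted set of features."""
    return sorted({CONCEPTS.get(token, token) for token in tokens})


raw = "we eat orange at lunch".split()
grouped = concept_features(raw)
print(len(set(raw)), "term features ->", len(grouped), "concept features")
print(grouped)
```

Here "orange" and "lunch" collapse into the single feature `FOOD`, so a classifier needs fewer labeled examples to learn that food-related words behave alike.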