Xiao Hu
University of Hong Kong
CITE Research Symposium 2013
May 12, 2013
Towards Automatic Analysis of Online
Discussions Among Hong Kong
Students
Outline
 Goals and Purposes
 Data Mining and Applications to Online Discussions
 Classification
 Association Rule Mining
 Findings
 More questions to answer
 Bridging research and teaching
Goals and Purposes
 Online discussions are widely used in education
 Effective for communication and collaboration
 Need tools to monitor online discussions
 Data mining may help (semi-)automatically identify
various patterns in online discussions, for example:
 Threads that need interventions
 Outcome predictions
 Role identification (e.g., question raiser, answer
provider, etc.)
 Network analysis of student groups
 Assessment of discussion quality
 .....
This Study
 How effective is it to mine online discussions
among Hong Kong students?
 A case study on
 1,965 discussion posts
 on the subject of global warming
 collected from five primary or secondary schools in
Hong Kong between 2006 and 2009
 383 discussion threads involving 1 to 21
participants
 Two commonly used Data Mining techniques
 Classification
 Association rule mining
What is Data Mining?
 To identify patterns in a dataset (or to show that
none exist)
 DM is NOT querying databases
 Where you know what you are looking for
 E.g., total sales in the past three years
 DM is NOT statistical testing
 Where you know the hypotheses
 E.g., H0: the means of two groups are equal
 DM is discovery-based
 Find unknown patterns, generate hypotheses
 DM is iterative
 Exhaustively explore very large data sets
Data Mining – Classification
 Functionality: to assign one of a number of class
labels to each instance of your data
 Examples of classification tasks:
 Predicting tumor cells as benign or malignant
 Classifying credit card transactions as legitimate or
fraudulent
 Categorizing news stories as finance, weather,
entertainment, sports, etc.
 Categorizing library materials by catalogs
 Predicting whether a post in an online forum will get
replies or not
How Does Classification Work?
 Given a collection of data (training set)
 Each instance contains a set of attributes, one of
which is the class label.
 Find (calculate) a model for the class label as a
function of the values of the other attributes
 Goal: previously unseen data can then be fed to
the model, which assigns a class label
as accurately as possible
 Performance measure: accuracy
 The proportion of instances that are correctly classified
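The accuracy measure above can be sketched as follows. The function name and the toy labels are illustrative assumptions, not taken from the study.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of instances whose predicted label matches the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one instance")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Invented toy labels: 4 of 5 held-out instances are classified correctly.
truth = ["reply", "no reply", "reply", "reply", "no reply"]
predicted = ["reply", "no reply", "no reply", "reply", "no reply"]
print(accuracy(truth, predicted))  # 0.8
```

Accuracy is normally computed on data that was not used to build the model, since the goal is to label previously unseen instances.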
An Illustrative Example (1)
Training data:
NAME  RANK            YEARS  TENURED
Mike  Assistant Prof  3      no
Mary  Assistant Prof  7      yes
Bill  Professor       2      yes
Jim   Associate Prof  7      yes
Dave  Assistant Prof  6      no
Anne  Associate Prof  3      no
Training data -> classification algorithm -> classifier (model):
IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’
An Illustrative Example (2)
Classifier (model) produced by the classification algorithm:
IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’
Unseen data: (Jeff, Professor, 4)
Tenured?
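A minimal sketch of applying the rule from this example to the training data and to the unseen instance. The rule is hand-coded here rather than learned by an algorithm, and the "no" result when the rule does not fire is an assumption, since the slide states only the THEN branch.

```python
# Training data from the example: (name, rank, years, tenured)
TRAINING_DATA = [
    ("Mike", "Assistant Prof", 3, "no"),
    ("Mary", "Assistant Prof", 7, "yes"),
    ("Bill", "Professor", 2, "yes"),
    ("Jim", "Associate Prof", 7, "yes"),
    ("Dave", "Assistant Prof", 6, "no"),
    ("Anne", "Associate Prof", 3, "no"),
]


def predict_tenured(rank, years):
    """IF rank = 'professor' OR years > 6 THEN tenured = 'yes'."""
    if rank.lower() == "professor" or years > 6:
        return "yes"
    return "no"  # assumption: the example does not state the ELSE branch


# The rule reproduces every label in the training data.
training_hits = sum(
    predict_tenured(rank, years) == tenured
    for _, rank, years, tenured in TRAINING_DATA
)
print(training_hits, "of", len(TRAINING_DATA))  # 6 of 6

# Unseen instance from the example: (Jeff, Professor, 4)
print(predict_tenured("Professor", 4))  # yes
```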
Classifying Online Discussions (1)
 Task 1: threads with one vs. many participants
 To predict whether a post belongs to a thread
involving only one participant or a thread involving
many (> 14) participants
 Attributes used to build the classification model
 Words in the posts: individual words (unigrams) and
pairs of consecutive words (bigrams)
 Classification algorithm: Naive Bayes
 Empirically effective in text categorization
 Performance: 79.07% accuracy
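The setup on this slide can be sketched in plain Python as below. This is an illustration under stated assumptions, not the study's implementation: the slides do not name the Naive Bayes variant, the smoothing, or the tool, so the sketch assumes a multinomial model with add-one smoothing, and the toy posts are invented.

```python
import math
from collections import Counter, defaultdict


def ngram_features(text):
    """Lowercased unigrams plus bigrams (two consecutive words)."""
    words = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return words + bigrams


class NaiveBayesText:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (assumed variant)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            features = ngram_features(text)
            self.feature_counts[label].update(features)
            self.vocabulary.update(features)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denominator = sum(counts.values()) + vocab_size
            # log P(class) + sum of log P(feature | class)
            score = math.log(doc_count / total_docs)
            for feature in ngram_features(text):
                if feature in self.vocabulary:  # ignore features unseen in training
                    score += math.log((counts[feature] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy posts; labels mirror Task 1 (thread with one vs. many participants).
posts = [
    "carbon dioxide raises the temperature",
    "global warming is caused by carbon dioxide",
    "i agree with you",
    "yes i think so too",
]
labels = ["one", "one", "many", "many"]

model = NaiveBayesText().fit(posts, labels)
print(model.predict("i agree"))                         # many
print(model.predict("carbon dioxide and temperature"))  # one
```

Working in log space avoids numeric underflow when many word probabilities are multiplied together.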
Classifying Online Discussions (2)
 Task 2: initial posts with vs. without replies
 To predict whether an initial post is likely to get
replies or not
 Attributes used to build the classification model
 Words in the posts: individual words (unigrams) and
pairs of consecutive words (bigrams)
 Classification algorithm: Naive Bayes
 Empirically effective in text categorization
 Performance: 64% accuracy
Need to look deeper: mine patterns in each
category
Data Mining – Association Rules
 Functionality: to find associative relations
between patterns frequently occurring in your
data
 {Pattern A} => {Pattern B} with a certain probability
 Examples of association rule mining tasks:
 Basket (shopping cart) analysis: customers buying
product A often also buy product B
 Medical diagnosis: a patient with symptoms A is
likely to have disease B
 Protein sequences: the appearance of amino acids
A indicates a greater chance of also having amino
acids C
 Online discussions: a post with word or phrase A is
likely to be in class B
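A rule of the form {word A} => {class B} can be scored with support and confidence, the two measures commonly used in association rule mining. The sketch below is a minimal illustration: the slides do not say which measures or tool the study used, and the toy posts and the `class=` item naming are invented.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence of the rule {antecedent} => {consequent}.

    support    = fraction of all transactions containing both sides
    confidence = fraction of antecedent transactions that also contain the consequent
    """
    if not transactions:
        raise ValueError("need at least one transaction")
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= t]
    with_both = [t for t in with_antecedent if consequent <= t]
    support = len(with_both) / len(transactions)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Invented toy data: each post is a set of words plus one class-label item.
posts = [
    {"i", "agree", "class=many"},
    {"i", "think", "so", "class=many"},
    {"carbon", "dioxide", "class=one"},
    {"i", "like", "carbon", "class=one"},
]

# Rule: a post containing the word "i" is likely to be in class "many".
support, confidence = rule_metrics(posts, {"i"}, {"class=many"})
print(support, confidence)
```

Here the word "i" occurs in three posts, two of which are in class "many", so the confidence is 2/3 while the support is 2/4.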
Mining Association Rules from Online Discussions (1)
 Task 1: Words and phrases strongly associated
with threads with one or many participants
Rank  One participant  Many participants
1     dioxide          i agree
2     carbon dioxide   agree
3     carbon           i
4     temperature      greenhouse gases
5     global warming   i think
6     global           think
7     warming          yes
8     power            carbon dioxide
9     air              global warming
10    water            yeah
Mining Association Rules from Online Discussions (2)
 Task 2: Words and phrases strongly associated
with initial posts with or without replies
Rank  Has no reply       Has replies
1     global warming     protect
2     earth’s            melt
3     global             world
4     warming            warming
5     earth              sea
6     s                  i
7     greenhouse         ice
8     effect             rise
9     gases              global warming
10    greenhouse effect  global
Findings and Future Work
 Data mining techniques were able to find patterns
in online discussions among Hong Kong
students
 It was feasible to distinguish threads and posts in
contrasting categories
 The same techniques can be applied to distinguish
 Shallow and deep discussions (depth of threads)
 Confusion levels of posts (needs annotations on
training data)
 Speech acts of posts (needs annotations on training
data)
 Emotions in the posts (needs annotations on training
data)
Integrating Research and Teaching
 Both data mining techniques are discussed and
practiced in the Data Mining course in the
Bachelor of Science in Information Management
(BSIM 0018)
 The tool used in this project is also taught in the
course
 Projects like this can be students’ course projects
Thank you!
Questions, comments, and suggestions are
appreciated!
Xiao Hu: xiaoxhu@hku.hk

 

Towards Automatic Analysis of Online Discussions among Hong Kong Students

  • 1. Towards Automatic Analysis of Online Discussions Among Hong Kong Students
      Xiao Hu, University of Hong Kong
      CITE Research Symposium 2013, May 12, 2013
  • 2. Outline
      • Goals and Purposes
      • Data Mining and Applications to Online Discussions
          • Classification
          • Association Rule Mining
      • Findings
      • More questions to answer
      • Bridging research and teaching
  • 3. Goals and Purposes
      • Online discussions are widely used in education
          • Effective for communication and collaboration
      • Tools are needed to monitor online discussions
      • Data mining may help (semi-)automatically identify various patterns in online discussions, for example:
          • Threads that need interventions
          • Outcome predictions
          • Role identification (e.g., question raiser, answer provider, etc.)
          • Network analysis of student groups
          • Assessment of discussion quality
          • ...
  • 4. This Study
      • How effective is it to mine online discussions of HK students?
      • A case study on
          • 1,965 discussion posts
          • on the subject of global warming
          • collected from five primary or secondary schools in Hong Kong from 2006 to 2009
          • 383 discussion threads involving 1 to 21 participants
      • Two commonly used data mining techniques
          • Classification
          • Association rule mining
  • 5. What is Data Mining?
      • To identify patterns (or to prove no patterns) from a dataset
      • DM is NOT querying databases
          • Where you know what you are looking for
          • E.g., total sales in the past three years
      • DM is NOT statistical testing
          • Where you know the hypotheses
          • E.g., H0: the means of two groups are equal
      • DM is discovery-based
          • Find out unknown patterns, generate hypotheses
      • DM is iterative
          • Exhaustively explore very large data sets
  • 6. Data Mining – Classification
      • Functionality: to assign one of a number of class labels to each instance of your data
      • Examples of classification tasks:
          • Predicting tumor cells as benign or malignant
          • Classifying credit card transactions as legitimate or fraudulent
          • Categorizing news stories as finance, weather, entertainment, sports, etc.
          • Categorizing library materials by catalogs
          • Predicting whether a post in an online forum will get replies or not
  • 7. How Classification Works
      • Given a collection of data (training set)
          • Each instance contains a set of attributes; one of the attributes is the class label.
      • Find (calculate) a model for the class label as a function of the values of the other attributes
      • Goal: previously unseen data can then be fed to the model, and the model assigns a class label as accurately as possible
      • Performance measure: accuracy
          • How many instances are correctly classified
  • 8. An Illustrative Example (1)
      Training Data:
          NAME   RANK             YEARS   TENURED
          Mike   Assistant Prof   3       no
          Mary   Assistant Prof   7       yes
          Bill   Professor        2       yes
          Jim    Associate Prof   7       yes
          Dave   Assistant Prof   6       no
          Anne   Associate Prof   3       no
      Training Data -> Classification Algorithms -> Classifier (Model):
          IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
  • 9. An Illustrative Example (2)
      Classification Algorithms -> Classifier (Model):
          IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
      Unseen Data: (Jeff, Professor, 4) -> Classifier (Model) -> Tenured?
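
The two example slides above can be sketched in a few lines of Python. This is only an illustration: the rule is copied from the slide, but the function name `classify` and the fallback label 'no' when the rule does not fire are assumptions, since the slide states only the 'yes' branch.

```python
# Training data from slide 8: (name, rank, years, tenured)
training = [
    ("Mike", "Assistant Prof", 3, "no"),
    ("Mary", "Assistant Prof", 7, "yes"),
    ("Bill", "Professor", 2, "yes"),
    ("Jim", "Associate Prof", 7, "yes"),
    ("Dave", "Assistant Prof", 6, "no"),
    ("Anne", "Associate Prof", 3, "no"),
]


def classify(rank, years):
    """Rule from the slide: IF rank = 'professor' OR years > 6 THEN 'yes'.

    Returning 'no' otherwise is an assumption; the slide gives no else-branch.
    """
    return "yes" if rank.lower() == "professor" or years > 6 else "no"


# Accuracy: the share of instances whose predicted label matches the true one.
correct = sum(classify(rank, years) == tenured for _, rank, years, tenured in training)
accuracy = correct / len(training)

# Apply the model to the unseen instance from slide 9.
jeff_label = classify("Professor", 4)

print(f"training accuracy = {accuracy:.2f}, Jeff tenured? {jeff_label}")
```

Accuracy here is measured on the training data itself; a realistic evaluation would hold out unseen instances for testing.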
  • 10. Classifying Online Discussions (1)
      • Task 1: threads with one vs. many participants
          • To predict whether a post belongs to a thread involving only one participant or a thread involving many (> 14) participants
      • Attributes used to build the classification model
          • Words in the posts: individual words (unigrams) and two consecutive words (bigrams)
      • Classification algorithm: Naive Bayesian
          • Empirically effective in text categorization
      • Performance: 79.07%
  • 11. Classifying Online Discussions (2)
      • Task 2: initial posts with vs. without replies
          • To predict whether an initial post is likely to get replies or not
      • Attributes used to build the classification model
          • Words in the posts: individual words (unigrams) and two consecutive words (bigrams)
      • Classification algorithm: Naive Bayesian
          • Empirically effective in text categorization
      • Performance: 64%
      Need to look deeper: mine patterns in each category
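
The setup shared by both classification tasks (unigram and bigram attributes fed to a Naive Bayesian classifier) might look roughly like the sketch below. The slides do not name the tool or the exact variant used, so the multinomial model with add-one smoothing, the function names, and the four toy posts are all assumptions made for illustration; they are not the study's data or implementation.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Unigrams plus bigrams (two consecutive words) of a post."""
    words = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return words + bigrams


def train(examples):
    """Count features per class; examples is a list of (text, label)."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        for feat in features(text):
            feature_counts[label][feat] += 1
            vocabulary.add(feat)
    return class_counts, feature_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    class_counts, feature_counts, vocabulary = model
    total_posts = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_posts)
        denominator = sum(feature_counts[label].values()) + len(vocabulary)
        for feat in features(text):
            if feat in vocabulary:  # features never seen in training are ignored
                score += math.log((feature_counts[label][feat] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy posts, labelled by thread type ("one" vs. "many" participants).
toy_posts = [
    ("carbon dioxide raises the temperature", "one"),
    ("global warming is caused by carbon dioxide", "one"),
    ("i agree with you", "many"),
    ("yes i think so too", "many"),
]
model = train(toy_posts)
print(predict(model, "i agree"))
print(predict(model, "carbon dioxide"))
```

With real data, accuracy would be estimated on posts held out from training rather than on the training posts themselves.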
  • 12. Data Mining – Association Rules
      • Functionality: to find associative relations between patterns frequently occurring in your data
          • {Pattern A} => {Pattern B} with a certain probability
      • Examples of association rule mining tasks:
          • Basket (shopping cart) analysis: customers buying product A often also buy product B
          • Medical diagnosis: a patient with symptoms A is likely to have disease B
          • Protein sequences: the appearance of amino acid A indicates a greater chance of also having amino acid C
          • Online discussions: a post with word or phrase A is likely to be in class B
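
The "certain probability" attached to a rule {Pattern A} => {Pattern B} is commonly expressed with two standard measures, support and confidence. The slides do not name the measures used in the study, so treat the sketch below as an assumption-laden illustration; the function `rule_stats` and the toy posts are hypothetical.

```python
def rule_stats(posts, pattern, target_class):
    """Support and confidence of the rule {pattern} => {target_class}.

    posts is a list of (set_of_words, class_label) pairs.
    support    = share of all posts containing the pattern AND in the class
    confidence = share of posts containing the pattern that are in the class
    """
    classes_with_pattern = [label for words, label in posts if pattern in words]
    both = sum(1 for label in classes_with_pattern if label == target_class)
    support = both / len(posts)
    confidence = both / len(classes_with_pattern) if classes_with_pattern else 0.0
    return support, confidence


# Hypothetical toy posts: the words in each post and the class of its thread.
toy_posts = [
    ({"i", "agree"}, "many"),
    ({"i", "think", "so"}, "many"),
    ({"carbon", "dioxide"}, "one"),
    ({"i", "like", "carbon"}, "one"),
]

support, confidence = rule_stats(toy_posts, "i", "many")
print(f"{{i}} => {{many}}: support = {support:.2f}, confidence = {confidence:.2f}")
```

Ranking words and phrases by such a measure within each class is one plausible way to produce top-10 lists of strongly associated terms.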
  • 13. Mining Association Rules from Online Discussions (1)
      • Task 1: Words and phrases strongly associated with threads with one or many participants
          Rank   One participant   Many participants
          1      dioxide           i agree
          2      carbon dioxide    agree
          3      carbon            i
          4      temperature       greenhouse gases
          5      global warming    i think
          6      global            think
          7      warming           yes
          8      power             carbon dioxide
          9      air               global warming
          10     water             yeah
  • 14. Mining Association Rules from Online Discussions (2)
      • Task 2: Words and phrases strongly associated with initial posts with or without replies
          Rank   Has no reply        Has replies
          1      global warming      protect
          2      earth’s             melt
          3      global              world
          4      warming             warming
          5      earth               sea
          6      s                   i
          7      greenhouse          ice
          8      effect              rise
          9      gases               global warming
          10     greenhouse effect   global
  • 15. Findings and future work
      • Data mining techniques were able to find patterns in online discussions among Hong Kong students
          • It was feasible to distinguish threads and posts in contrasting categories
      • The same techniques can be applied to distinguish
          • Shallow and deep discussions (depth of threads)
          • Confusion level of posts (needs annotations on training data)
          • Speech acts of posts (needs annotations on training data)
          • Emotions in the posts (needs annotations on training data)
  • 16. Integrating Research and Teaching
      • Both data mining techniques are discussed and practiced in the Data Mining course in the Bachelor of Science in Information Management (BSIM 0018)
      • The tool used in this project is also taught in the course
      • Projects like this can be students’ course projects,
  • 17. Thank you! Questions, comments, and suggestions are appreciated! Xiao Hu: xiaoxhu@hku.hk