GEORGE GKOTSIS 1, MARIA LIAKATA 2,
CARLOS PEDRINACI 3, JOHN DOMINGUE 3
Leveraging Textual Features for Best
Answer Prediction in Community-based
Question Answering
1King’s College London
2Department of Computer Science, University of Warwick
3Knowledge Media Institute, The Open University
Outline
8-11 June 2015, ICCSS 2015
• Motivation
• Problem description
• Proposed solution
• Evaluation
• ACQUA
Motivation
Questions on social networking sites
Recommendations & opinions
Authoritative responses
Expert & empirical knowledge
Queries on CQA
Problem description
State of the art solutions

Reputation based:
“…we observe significant assortativity in the reputations of co-answerers, relationships between reputation and answer speed, and that the probability of an answer being chosen as the best one strongly depends on temporal characteristics of answer arrivals.”
Ashton Anderson, Daniel Huttenlocher, Jon Kleinberg, Jure Leskovec. Discovering Value from Community Activity on Focused Question Answering Sites: A Case Study of Stack Overflow. KDD 2012.

Answer rating based:
“When available, scoring (or rating) features improve prediction results significantly, which demonstrates the value of community feedback and reputation for identifying valuable answers.”
Grégoire Burel, Yulan He, Harith Alani. Automatic Identification of Best Answers in Online Enquiry Communities. ESWC 2012.
Best answer prediction in Social Q&A
• Binary classification problem
• Is it solved?
  • Yes, partially
• Current solutions depend on:
  Answer ratings (knowledge is future and unknown)
  • Score, #comments
  User ratings (knowledge is past and not always available)
  • User reputation
  • UpVotes etc.
  • Preferential attachment
State of the art solutions: Summary
[Bar chart: Average Precision (0% to 100%) for Linguistic, User Ratings and Answer ratings features; the chart carries an "Our solution" annotation]
StackExchange network
SE “is all about getting answers, it’s not a discussion forum, there’s no chit-chat”
As of 20 June 2014:
• 123 Q&A sites
• 5,622,330 users
• 9.5 million questions
• 16.3 million answers
• 9.3 million visits per day
September 2013 dump: Questions with Accepted Answers
[Pie chart: StackOverflow 91%, The Rest 9%]
[Bar chart for stackoverflow, series "Non Accepted Answers" and "Accepted Answers", data labels 3,375,817 and 3,795,276, axis from 0 to 8,000,000]
Shallow Linguistic features
• Long history, coming from studies on readability
1. Average number of characters per word
2. Average number of words per sentence
3. Number of words in the longest sentence
4. Answer length
5. Log Likelihood (Pitler & Nenkova, 2008)
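As a rough illustration of the five features listed above, the following sketch computes them for a single answer. Several details are assumptions that the slides do not state: the tokenizer and sentence splitter are deliberately naive, answer length is measured in characters, and the log likelihood is taken to be a unigram log likelihood under a background word-count model with add-one smoothing. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercased word tokens (a deliberately simple, assumed tokenizer)."""
    return re.findall(r"[A-Za-z0-9']+", text.lower())


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (an assumption)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def shallow_features(answer, background_counts):
    """Compute the five shallow features for one answer.

    background_counts maps word -> count in a reference corpus,
    e.g. all text of the community (assumed setup).
    """
    words = tokenize(answer)
    sentences = [tokenize(s) for s in split_sentences(answer)]
    sentences = [s for s in sentences if s]
    if not words or not sentences:
        raise ValueError("answer contains no words")

    total = sum(background_counts.values())
    vocab = len(background_counts) + 1  # one extra slot for unseen words
    # Assumed form: sum of log unigram probabilities, add-one smoothed.
    log_likelihood = sum(
        math.log((background_counts.get(w, 0) + 1) / (total + vocab))
        for w in words
    )
    return {
        "CharactersPerWord": sum(len(w) for w in words) / len(words),
        "WordsPerSentence": len(words) / len(sentences),
        "LongestSentence": max(len(s) for s in sentences),
        "Length": len(answer),  # characters; the unit is an assumption
        "LL": log_likelihood,
    }


background = Counter(tokenize("use a list it is fast"))
features = shallow_features("Use a list. It is fast and simple!", background)
```

The feature keys mirror the names used in the information gain table later in the deck, so the output can be fed directly into a ranking step.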
StackOverflow: Overview of shallow features’ evolution
Shallow features: Observations
• Accepted answers tend to:
  • Be longer
  • Differ more from the community vocabulary
  • Contain shorter words
  • Have a longer longest sentence
  • Have more words per sentence
But how good are shallow features?
• 58% macro precision (our baseline)
• Possible reasons:
  1. Evolution of language characteristics
     • Language becomes more eloquent
  2. Variance is huge
  3. A universal classifier looks unreachable, e.g.:
     • SuperUser average length is 577
     • Skeptics average length is 2,154
StackOverflow vs. SuperUser
Proposed solution
Objectives
• Build a classifier which is:
  1. Based solely on linguistic features
  2. Robust
     • Performs as well as classifiers that use user ratings (past knowledge) or answer ratings (future knowledge)
  3. Universal
     • The same classifier is applicable to as many SE websites as possible (domain agnostic)
Feature discretisation: example for Length
1. Group answers by question.
2. Sort by Length in descending order within each question.
3. Use the resulting rank as the discretised feature (LengthD).

Question Id  Answer Id  Length  Rank (LengthD)
1            4          250     1
1            2          200     2
1            3          150     3
5            6          150     1
5            7          100     2
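The grouping and ranking steps of the Length example can be sketched as follows. This is a minimal illustration, not the authors' implementation: the row layout, the `discretise` function name and the tie handling (ties keep input order) are assumptions.

```python
from collections import defaultdict


def discretise(rows, feature):
    """Rank answers within each question by `feature`, descending.

    Returns a dict answer_id -> rank, where rank 1 is the largest value
    among the answers to the same question. Ties keep input order
    (an assumption; the slides do not say how ties are handled).
    """
    by_question = defaultdict(list)
    for row in rows:
        by_question[row["question_id"]].append(row)

    ranks = {}
    for answers in by_question.values():
        ordered = sorted(answers, key=lambda r: r[feature], reverse=True)
        for rank, row in enumerate(ordered, start=1):
            ranks[row["answer_id"]] = rank
    return ranks


# The data from the slide: two questions with three and two answers.
rows = [
    {"question_id": 1, "answer_id": 2, "Length": 200},
    {"question_id": 1, "answer_id": 3, "Length": 150},
    {"question_id": 1, "answer_id": 4, "Length": 250},
    {"question_id": 5, "answer_id": 6, "Length": 150},
    {"question_id": 5, "answer_id": 7, "Length": 100},
]
length_d = discretise(rows, "Length")
```

Because the rank is relative to the other answers to the same question, an answer of length 150 gets rank 3 under question 1 but rank 1 under question 5, which is what makes the feature comparable across sites with very different average lengths.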
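Feature discretisation

Category                   Name                 Information Gain
Linguistic                 Length               0.0226
Linguistic                 LongestSentence      0.0121
Linguistic                 LL                   0.0053
Linguistic                 WordsPerSentence     0.0048
Linguistic                 CharactersPerWord    0.0052
Linguistic Discretisation  LengthD              0.2168
Linguistic Discretisation  LongestSentenceD     0.1750
Linguistic Discretisation  LLD                  0.1180
Linguistic Discretisation  WordsPerSentenceD    0.1404
Linguistic Discretisation  CharactersPerWordD   0.1162

Roughly a 20x increase in Information Gain after discretisation.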
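For readers unfamiliar with the Information Gain column, the sketch below shows the standard definition for a discrete feature: the entropy of the class label minus the expected entropy after splitting on the feature. It is an illustration only. The slides do not say which tool produced the reported values or how the continuous raw features were binned, and measuring entropy in bits (log base 2) is an assumption.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy of a list of labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(label) minus the weighted entropy of the label within each feature value."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: rank 1 answers are accepted, rank 2 answers are not.
ranks = [1, 2, 1, 2]
accepted = [True, False, True, False]
gain = information_gain(ranks, accepted)
```

In this toy case the rank determines the label completely, so the gain equals the full entropy of the label; a feature unrelated to the label would score zero.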
User and answer rating features

Category       Name
Other          Age, CreationDateD, AnswerCount
User Rating    UserReputation, UserUpVotes, UserDownVotes, UserViews, UserUpDownVotes
Answer rating  Score, CommentCount, ScoreRatio
Evaluation
Evaluation Comparison

Case  Features used                                                P     R     FM    AUC
1     Linguistic                                                   0.58  0.60  0.56  0.60
2     Linguistic & Discretisation                                  0.81  0.70  0.74  0.84
3     Linguistic & Discretisation & Other                          0.84  0.70  0.76  0.87
4     Linguistic & Other & User Rating (no discretisation)         0.82  0.69  0.75  0.86
5     Linguistic & Other & User Rating (with discretisation)       0.82  0.72  0.77  0.88
6     All features (Answer and User Rating with discretisation)    0.88  0.85  0.86  0.94
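The P, R and FM columns can be read with the help of the sketch below, which computes macro-averaged precision, recall and F-measure for a binary accepted/not-accepted labelling. The deck describes the baseline as "macro precision"; treating all three columns as macro averages over the two classes is an assumption, and the function name is hypothetical. AUC is not covered here.

```python
def macro_scores(y_true, y_pred):
    """Macro-averaged precision, recall and F-measure over all classes."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f_measures = [], [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(f_measure)
    k = len(classes)
    return sum(precisions) / k, sum(recalls) / k, sum(f_measures) / k


# 1 = accepted answer, 0 = non-accepted answer (toy labels).
p, r, fm = macro_scores([1, 1, 0, 0], [1, 0, 0, 0])
```

Under macro averaging the F-measure is the mean of the per-class F-measures, so it need not equal the harmonic mean of the reported P and R, which is consistent with rows such as case 1 in the table.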
ACQUA
Automatic Community-based Question Answering
https://acqua.kmi.open.ac.uk/
ACQUA - Architecture
ACQUA - Screenshot
Read more about our work
• It’s All in the Content: State of the Art Best Answer Prediction based on Discretisation of Shallow Linguistic Features. WebSci ’14
• ACQUA: Automated Community-based Question Answering through the Discretisation of Shallow Linguistic Features. The Journal of Web Science, 1(1) (preprint available)
Thank you
http://xkcd.com/386/
