Mapping Tweets to Conference Talks: A
Goldmine for Semantics
Milan Stankovic, Hypios, Paris-Sorbonne, FR & Matthew Rowe, KMI, Open University, UK
On Conference We Tweet
Is there a Correspondence?
Why?
[Diagram, built up over several slides: a user made a tweet; the tweet is about a talk; the talk has topic Topic 1, Topic 2, Topic 3.]
• Does the user have an interest in the talk's topics?
• Were users whose tweets are about the same talk at the same talk?
Potential Benefits
• Digital memory
• Conference feedback
– number of tweets for a talk
– conversational aspects
– sentiment analysis
• User profiling and expert finding
• Trending topics
Rich Activity Twitter Event Data
• We take Twitter archives from
TwapperKeeper
• We enrich Tweets with relevant DBPedia
concepts using Zemanta
• We rely on existing Linked Data about talks to
perform the mappings.
ESWC Dataset
• Collected during the Extended Semantic Web
Conference 2010
– Any tweets tagged with “eswc”
• 1082 tweets
• 213 tweets enriched with concepts
Aligning Tweets with Talks
• Goal: Label tweets with talks
• Method:
– Induce a labelling function to perform alignment
– Labelled data = events from Web of Data
– Unlabelled data = tweets
Labelled data: $\{(x_i, y_i)\}_{i=1}^{L}$
Unlabelled data: $\{x_i\}_{i=1}^{U}$
Labelling function: $f : X \to Y$
Aligning Tweets with Talks
1. Feature Extraction:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix swrc: <http://swrc.ontoware.org/ontology#> .
@prefix swc: <http://data.semanticweb.org/ns/swc/ontology#> .
@prefix dog: <http://data.semanticweb.org> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
<http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23> rdf:type swrc:InProceedings ;
dc:subject "Knowledge Acquisition" ;
dc:subject "Semantic Analysis" ;
dc:subject "Social Web" ;
dc:subject "Microblogs" ;
dc:title "Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams" ;
swrc:abstract "Although one might argue that little wisdom can be conveyed in messages of 140 ..." ;
swrc:author <http://data.semanticweb.org/person/claudia-wagner> .
<http://data.semanticweb.org/person/claudia-wagner> rdf:type foaf:Person ;
foaf:name "Claudia Wagner" ;
swrc:affiliation <http://data.semanticweb.org/organization/joanneum-research> ;
foaf:based_near <http://dbpedia.org/resource/Austria> .
Aligning Tweets with Talks
1. Feature Extraction: F1 – Immediate Resource Leaves
Knowledge Acquisition Semantic
Analysis Social Web Microblogs
Exploring the Wisdom of the Tweets:
Knowledge Acquisition from Social
Awareness Streams Although one might
argue that little wisdom can be
conveyed in messages of 140
http://data.semanticweb.org/person/claudia-wagner
Aligning Tweets with Talks
1. Feature Extraction: F2 – 1-step Resource Leaves
http://data.semanticweb.org/person/claudia-wagner Claudia Wagner
http://data.semanticweb.org/organization/joanneum-research
http://dbpedia.org/resource/Austria
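The F1 and F2 feature sets can be illustrated with a small sketch over plain (subject, predicate, object) tuples. This is an interpretation of the slide example rather than the authors' implementation: the function names are hypothetical, and skipping `rdf:type` objects is an assumption inferred from the leaves shown on the slides, which do not include the class names.

```python
RDF_TYPE = "rdf:type"

def immediate_leaves(triples, resource):
    """F1 (as read from the slides): objects attached directly to `resource`.

    Assumption: rdf:type objects are skipped, since the slide output omits them.
    """
    return [o for s, p, o in triples if s == resource and p != RDF_TYPE]

def one_step_leaves(triples, resource):
    """F2 (as read from the slides): each neighbouring resource and its own leaves."""
    subjects = {s for s, _, _ in triples}
    leaves = []
    for neighbour in immediate_leaves(triples, resource):
        if neighbour in subjects:  # follow only objects that are described themselves
            leaves.append(neighbour)
            leaves.extend(immediate_leaves(triples, neighbour))
    return leaves

PAPER = "http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23"
AUTHOR = "http://data.semanticweb.org/person/claudia-wagner"
ORG = "http://data.semanticweb.org/organization/joanneum-research"
AUSTRIA = "http://dbpedia.org/resource/Austria"
TITLE = ("Exploring the Wisdom of the Tweets: "
         "Knowledge Acquisition from Social Awareness Streams")

# A subset of the triples from the slide, written as plain tuples.
TRIPLES = [
    (PAPER, "rdf:type", "swrc:InProceedings"),
    (PAPER, "dc:subject", "Microblogs"),
    (PAPER, "dc:title", TITLE),
    (PAPER, "swrc:author", AUTHOR),
    (AUTHOR, "rdf:type", "foaf:Person"),
    (AUTHOR, "foaf:name", "Claudia Wagner"),
    (AUTHOR, "swrc:affiliation", ORG),
    (AUTHOR, "foaf:based_near", AUSTRIA),
]

f1 = immediate_leaves(TRIPLES, PAPER)
f2 = one_step_leaves(TRIPLES, PAPER)
```

Here `f1` holds the paper's own literals plus the author URI, and `f2` holds the author URI followed by the author's name, affiliation and location, matching the shape of the leaves listed on the slides.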
Aligning Tweets with Talks
1. Feature Extraction: F3 – DBPedia Concepts
http://dbpedia.org/resource/Twitter
http://dbpedia.org/resource/Social_Web
http://dbpedia.org/resource/Microblogs
Aligning Tweets with Talks
2. Feature Vector Composition
Knowledge Acquisition Semantic
Analysis Social Web Microblogs
Exploring the Wisdom of the Tweets:
Knowledge Acquisition from Social
Awareness Streams Although one might
argue that little wisdom can be
conveyed in messages of 140
http://data.semanticweb.org/person/claudia-wagner
knowledge
acquisition
semantic
analysis
social
web
microblogs
exploring
wisdom
tweets
knowledge
acquisition
social
awareness
streams
wisdom
messages
Indexer
knowledge 2
acquisition 2
semantic 1
analysis 1
social 2
web 1
microblogs 1
exploring 1
wisdom 2
tweets 1
awareness 1
streams 1
messages 1
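The indexing step above (leaf text to tokens to term counts) can be sketched as follows. The slides do not say which indexer or stop-word list was used, so the tokenizer, the tiny `STOPWORDS` set and the function name are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; the real one is not given.
STOPWORDS = {"the", "of", "from", "a", "in", "that", "be", "can", "one", "might"}

def term_frequencies(text):
    """Lower-case the text, keep alphabetic tokens, drop stop words, count terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

tf = term_frequencies(
    "Knowledge Acquisition Social Web "
    "Exploring the Wisdom of the Tweets: "
    "Knowledge Acquisition from Social Awareness Streams"
)
```

The resulting `Counter` plays the role of the feature vector: terms that occur in both the subject keywords and the title, such as "knowledge", receive a count of 2.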
Aligning Tweets with Talks
3. Inducing the Labelling Function
– Both tweets and events are provided as feature
vectors
– Induce a labelling function:
Choose the most likely event (y) given the tweet (x)
$f : X \to Y$
Aligning Tweets with Talks
3. Inducing the Labelling Function: Proximity-based Clustering
– Build a centroid vector for each event
• From event feature vectors
– Compare each tweet vector with each centroid
• Choose event (y) which is closest
$\hat{y} = \arg\min_{y \in Y} d(x, \mu_y)$
$\mathrm{manhat}(x, \mu) = \sum_{i=1}^{n} |x_i - \mu_i|$
$\mathrm{eucl}(x, \mu) = \sqrt{\sum_{i=1}^{n} (x_i - \mu_i)^2}$
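A minimal sketch of the proximity-based step: build one centroid per event, then give a tweet the label of the closest centroid under the Manhattan or Euclidean distance. The function names, the toy vectors and the labels `talk_a` and `talk_b` are hypothetical.

```python
import math

def manhattan(x, mu):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, mu))

def euclidean(x, mu):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mu)))

def centroid(vectors):
    """Component-wise mean of an event's feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def closest_event(x, centroids, distance=euclidean):
    """Return the event label whose centroid is nearest to tweet vector x."""
    return min(centroids, key=lambda y: distance(x, centroids[y]))

# Toy term-count vectors over a three-term vocabulary.
centroids = {
    "talk_a": centroid([[2, 0, 1], [4, 0, 1]]),
    "talk_b": centroid([[0, 3, 0]]),
}
tweet = [1, 0, 1]
label = closest_event(tweet, centroids)
label_manhattan = closest_event(tweet, centroids, distance=manhattan)
```

Passing the distance as a parameter keeps the comparison between the two measures to a one-argument change.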
Aligning Tweets with Talks
3. Inducing the Labelling Function: Naive Bayes
Classification
– Assigns the most probable event label given the tweet
features
– Using Bayes Theorem, we write this as:
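$\hat{y} = \arg\max_{y \in Y} P(y \mid x_1, x_2, \ldots, x_n)$
$\hat{y} = \arg\max_{y \in Y} \frac{P(x_1, x_2, \ldots, x_n \mid y)\, P(y)}{P(x_1, x_2, \ldots, x_n)}$
$\hat{y} = \arg\max_{y \in Y} P(x_1, x_2, \ldots, x_n \mid y)\, P(y)$
$\hat{y} = \arg\max_{y \in Y} P(y) \prod_{i} P(x_i \mid y)$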
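The decision rule can be sketched as a small multinomial Naive Bayes classifier working in log space. The slides do not state how the term probabilities were estimated, so the add-one (Laplace) smoothing, the handling of unseen terms, the function names and the toy event descriptions are all assumptions.

```python
import math
from collections import Counter

def train(labelled):
    """labelled: list of (tokens, label). Returns priors, per-label counts, vocabulary."""
    priors = Counter(label for _, label in labelled)
    counts = {}
    for tokens, label in labelled:
        counts.setdefault(label, Counter()).update(tokens)
    vocab = {t for tokens, _ in labelled for t in tokens}
    return priors, counts, vocab

def classify(tokens, priors, counts, vocab):
    """Return argmax over labels of log P(y) + sum_i log P(x_i | y)."""
    total_docs = sum(priors.values())
    best_label, best_score = None, -math.inf
    for label in priors:
        n_terms = sum(counts[label].values())
        score = math.log(priors[label] / total_docs)
        for t in tokens:
            if t in vocab:  # assumption: ignore terms unseen in any event description
                # Assumption: add-one smoothing over the vocabulary.
                score += math.log((counts[label][t] + 1) / (n_terms + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical event descriptions acting as the labelled data.
events = [
    (["wisdom", "tweets", "microblogs", "knowledge"], "talk_a"),
    (["ontology", "reasoning", "owl"], "talk_b"),
]
model = train(events)
first = classify(["tweets", "knowledge", "great"], *model)
second = classify(["owl"], *model)
```

Summing logarithms instead of multiplying probabilities avoids numeric underflow when a tweet has many features.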
Experiments
• Dataset
– Corpus of Tweets collected during ESWC 2010
• Gold Standard Construction
– Used 3 raters to label a portion of tweet corpus
• 200 tweets labelled
– Measured inter-rater agreement
• Using Kappa statistic
– Initial Agreement was too low: 0.328
– Utilised Delphi method to improve agreement
– Second round of labelling produced: 0.820
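The slides only say "Kappa statistic" and do not name the variant. For three raters, Fleiss' kappa is one common choice, so the sketch below assumes it; the function name and the toy ratings are hypothetical and are not the study's data.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa.

    ratings: one row per item; each row lists how many raters chose each category.
    Every row must sum to the same number of raters.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])

    # Observed agreement: mean over items of the per-item pairwise agreement.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    return (p_observed - p_expected) / (1 - p_expected)

# Toy example: 4 tweets, 3 raters, 2 candidate talks.
perfect = fleiss_kappa([[3, 0], [0, 3]])
partial = fleiss_kappa([[3, 0], [0, 3], [2, 1], [1, 2]])
```

A value of 1 means the raters always agree; values near 0 mean agreement is no better than chance.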
Experiments
• Evaluation Measures
– Precision: proportion of event tweets correctly
labelled
– Recall: proportion of an event's tweets
successfully returned
– F-measure: Harmonic mean of precision and
recall
• Placed emphasis on precision over recall
$f\text{-measure} = \frac{(1 + \beta^2) \times P \times R}{\beta^2 \times P + R}$
$\beta \in \{0.2, 0.5, 1\}$
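The weighted F-measure can be computed directly from the formula. The sketch below is illustrative: the function name and the precision and recall values are made up, and returning 0 when both inputs are 0 is a convention chosen here to avoid dividing by zero.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall; beta < 1 favours precision."""
    if precision == 0 and recall == 0:
        return 0.0  # convention chosen here to avoid division by zero
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical scores evaluated at the three beta values from the slide.
scores = {beta: f_measure(0.8, 0.4, beta) for beta in (0.2, 0.5, 1)}
```

With precision above recall, the smaller beta values produce the higher scores, which is how the measure places emphasis on precision.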
Results
Imagine…
Imagine user profiling
ESWC dataset, user Matthew Rowe
Imagine conference feedback
ESWC dataset
directly from Tweets
from mappings (Talks)
We Challenge You
We Challenge You!
• Beat us in mappings!
• We provide the human-generated gold
standard mappings
• Can you find a more precise way to do tweet-
talk mappings?
• Can you find other uses? Let us know!
We Challenge You!
• You can find the gold standard data here:
http://research.hypios.com/?page_id=131
• You can find all the data (and automated
mappings) here:
http://data.hypios.com/tweets/sparql
We Challenge You!
http://data.hypios.com/tweets/sparql
SELECT ?tweet ?talk WHERE {
?tweet <http://linkedevents.org/ontology/illustrate> ?talk.
}
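One way to send this query is a GET request with a `query` parameter, as defined by the SPARQL protocol. The sketch below only builds the request URL and does not contact the server. Whether the endpoint is still reachable is not known, and the helper name is hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "http://data.hypios.com/tweets/sparql"  # endpoint given on the slide
QUERY = """SELECT ?tweet ?talk WHERE {
  ?tweet <http://linkedevents.org/ontology/illustrate> ?talk .
}"""

def build_request_url(endpoint, query):
    """Encode the query as the `query` parameter of a SPARQL protocol GET request."""
    return endpoint + "?" + urlencode({"query": query})

url = build_request_url(ENDPOINT, QUERY)
```

The resulting URL can be opened with any HTTP client; the result format a given endpoint returns depends on its configuration and on the `Accept` header sent.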
brought to you by
milan.stankovic@hypios.com & M.C.Rowe@open.ac.uk
November 2010, Shanghai, China

Data Cloud, More than a CDP by Matt Robison
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

Mapping Tweets to Conference Talks: A Goldmine for Semantics

  • 1. Mapping Tweets to Conference Talks: A Goldmine for Semantics Milan Stankovic, Hypios, Paris-Sorbonne, FR & Matthew Rowe, KMI, Open University, UK
  • 3. Is there a Correspondence?
  • 5. Why? [Diagram: a user made a tweet; the tweet is about a talk; the talk has topic Topic 1, Topic 2, Topic 3.]
  • 6. Why? [Diagram: a user made a tweet; the tweet is about a talk; the talk has topic Topic 1, Topic 2, Topic 3. Does the user have an interest in those topics?]
  • 7. Why? [Diagram: two users each made a tweet; both tweets are about the same talk. Were the users at the same talk?]
  • 8. Potential Benefits • Digital memory • Conference feedback – number of tweets for a talk – conversational aspects – sentiment analysis • User profiling and expert finding • Trending topics
  • 9. Rich Activity Twitter Event Data • We take Twitter archives from TwapperKeeper • We enrich Tweets with relevant DBPedia concepts using Zemanta • We rely on existing Linked Data about talks to perform the mappings.
  • 10. ESWC Dataset • Collected during the Extended Semantic Web Conference 2010 – Any tweets tagged with “eswc” • 1082 tweets • 213 tweets enriched with concepts
  • 11. Aligning Tweets with Talks • Goal: Label tweets with talks • Method: – Induce a labelling function $f: X \to Y$ to perform alignment – Labelled data = events from Web of Data, $\{(x_i, y_i)\}_{i=1}^{L}$ – Unlabelled data = tweets, $\{x_i\}_{i=1}^{U}$
  • 12. Aligning Tweets with Talks 1. Feature Extraction: @prefix swrc: <http://swrc.ontoware.org/ontology#> @prefix swc: <http://data.semanticweb.org/ns/swc/ontology#> @prefix dog: <http://data.semanticweb.org> @prefix dc: <http://purl.org/dc/elements/1.1/> <http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23> rdf:type swrc:InProceedings ; dc:subject "Knowledge Acquisition" ; dc:subject "Semantic Analysis" ; dc:subject "Social Web" ; dc:subject "Microblogs" ; dc:title "Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams" ; swrc:abstract "Although one might argue that little wisdom can be conveyed in messages of 140 ..." ; swrc:author <http://data.semanticweb.org/person/claudia-wagner> . <http://data.semanticweb.org/person/claudia-wagner> rdf:type foaf:Person ; foaf:name "Claudia Wagner" ; swrc:affiliation <http://data.semanticweb.org/organization/joanneum-research> ; foaf:based_near <http://dbpedia.org/resource/Austria>
  • 13. Aligning Tweets with Talks 1. Feature Extraction: F1 - Immediate Resource Leaves (from the Turtle description on slide 12, the leaves attached directly to the paper resource): Knowledge Acquisition; Semantic Analysis; Social Web; Microblogs; Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams; Although one might argue that little wisdom can be conveyed in messages of 140; http://data.semanticweb.org/person/claudia-wagner
  • 14. Aligning Tweets with Talks 1. Feature Extraction: F2 – 1-step Resource Leaves (from the Turtle description on slide 12, the leaves one step away from the paper resource, via its author): http://data.semanticweb.org/person/claudia-wagner; Claudia Wagner; http://data.semanticweb.org/organization/joanneum-research; http://dbpedia.org/resource/Austria
  • 15. Aligning Tweets with Talks 1. Feature Extraction: F3 – DBPedia Concepts (for the paper described on slide 12): http://dbpedia.org/resource/Twitter; http://dbpedia.org/resource/Social_Web; http://dbpedia.org/resource/Microblogs
  • 16. Aligning Tweets with Talks 2. Feature Vector Composition. The extracted leaves (Knowledge Acquisition; Semantic Analysis; Social Web; Microblogs; Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams; Although one might argue that little wisdom can be conveyed in messages of 140; http://data.semanticweb.org/person/claudia-wagner) are reduced to terms: knowledge acquisition semantic analysis social web microblogs exploring wisdom tweets knowledge acquisition social awareness streams wisdom messages. The Indexer then counts each term: knowledge 2, acquisition 2, semantic 1, analysis 1, social 2, web 1, microblogs 1, exploring 1, wisdom 2, tweets 1, awareness 1, streams 1, messages 1
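The indexing step on slide 16 can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the `index_terms` function and the `STOP_WORDS` list are hypothetical, and the slides do not say which tokeniser or stop-word list was actually used, so the resulting terms will not match the slide exactly.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the slides do not specify the one actually used.
STOP_WORDS = {"the", "of", "from", "in", "a", "an", "and",
              "that", "can", "be", "one", "might", "although"}


def index_terms(leaves):
    """Turn a list of literal leaves into a term -> count feature vector."""
    counts = Counter()
    for leaf in leaves:
        # Lower-case the text and keep alphabetic tokens only (drops "140", ":").
        for token in re.findall(r"[a-z]+", leaf.lower()):
            if token not in STOP_WORDS:
                counts[token] += 1
    return counts


# Literal leaves taken from the F1 example (slide 13).
leaves = [
    "Knowledge Acquisition",
    "Semantic Analysis",
    "Social Web",
    "Microblogs",
    "Exploring the Wisdom of the Tweets: Knowledge Acquisition "
    "from Social Awareness Streams",
    "Although one might argue that little wisdom can be conveyed "
    "in messages of 140",
]

vector = index_terms(leaves)
print(vector.most_common(4))
```

The same function could be applied to the text of a tweet, so that tweets and events end up in the same term space.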
  • 17. Aligning Tweets with Talks 3. Inducing the Labelling Function – Both tweets and events are provided as feature vectors – Induce a labelling function $f: X \to Y$: Choose the most likely event (y) given the tweet (x)
  • 18. Aligning Tweets with Talks 3. Inducing the Labelling Function: Proximity-based Clustering – Build a centroid vector for each event • From event feature vectors – Compare each tweet vector with each centroid • Choose event (y) which is closest: $\hat{y} = \arg\min_{y \in Y} d(x, \mu_y)$, where the distance $d$ is either $manhat(x, \mu) = \sum_{i=1}^{n} |x_i - \mu_i|$ or $eucl(x, \mu) = \sqrt{\sum_{i=1}^{n} (x_i - \mu_i)^2}$
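A minimal sketch of the proximity-based approach on slide 18, assuming dense feature vectors of equal length. The function names, the event labels `talk_a` and `talk_b`, and the toy vectors are all made up for illustration; the slides do not show the authors' code.

```python
import math


def manhattan(x, mu):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, mu))


def euclidean(x, mu):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mu)))


def centroid(vectors):
    """Component-wise mean of an event's feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_event(x, centroids, distance=euclidean):
    """Return the event label whose centroid is closest to tweet vector x."""
    return min(centroids, key=lambda y: distance(x, centroids[y]))


# Toy data (hypothetical): two events in a three-term vocabulary.
event_vectors = {
    "talk_a": [[2, 0, 1], [1, 0, 1]],
    "talk_b": [[0, 3, 0]],
}
centroids = {y: centroid(vs) for y, vs in event_vectors.items()}

tweet = [1, 0, 0]
print(nearest_event(tweet, centroids, distance=manhattan))
print(nearest_event(tweet, centroids, distance=euclidean))
```

Passing the distance as a parameter makes it easy to compare the two metrics on the same centroids.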
  • 19. Aligning Tweets with Talks 3. Inducing the Labelling Function: Naive Bayes Classification – Assigns the most probable event label given the tweet features: $\hat{y} = \arg\max_{y \in Y} P(y \mid x_1, x_2, \ldots, x_n)$ – Using Bayes' Theorem, we write this as: $\hat{y} = \arg\max_{y \in Y} \frac{P(x_1, x_2, \ldots, x_n \mid y)\, P(y)}{P(x_1, x_2, \ldots, x_n)} = \arg\max_{y \in Y} P(x_1, x_2, \ldots, x_n \mid y)\, P(y) = \arg\max_{y \in Y} P(y) \prod_{i} P(x_i \mid y)$
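The final expression on slide 19 can be sketched as a small multinomial Naive Bayes scorer. Several choices here are assumptions, not details from the slides: the uniform prior P(y), the add-one (Laplace) smoothing controlled by `alpha`, the use of log-probabilities to avoid underflow, and the toy events and terms.

```python
import math
from collections import Counter


def classify(tweet_terms, event_terms, alpha=1.0):
    """Return the event y maximising log P(y) + sum_i log P(x_i | y).

    event_terms maps each event label to a Counter of term counts.
    Assumptions: uniform prior over events, add-alpha smoothing.
    """
    vocabulary = set()
    for counts in event_terms.values():
        vocabulary.update(counts)

    log_prior = math.log(1.0 / len(event_terms))
    best_event, best_score = None, float("-inf")
    for event, counts in event_terms.items():
        total = sum(counts.values()) + alpha * len(vocabulary)
        score = log_prior
        for term in tweet_terms:
            if term in vocabulary:  # ignore terms unseen in every event
                score += math.log((counts[term] + alpha) / total)
        if score > best_score:
            best_event, best_score = event, score
    return best_event


# Toy event vectors (hypothetical labels and counts).
event_terms = {
    "talk_a": Counter({"twitter": 3, "microblogs": 2}),
    "talk_b": Counter({"ontology": 4, "reasoning": 1}),
}

print(classify(["twitter", "microblogs"], event_terms))
print(classify(["ontology"], event_terms))
```

The denominator P(x_1, ..., x_n) is dropped, as on the slide, because it is the same for every candidate event.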
  • 20. Experiments • Dataset – Corpus of Tweets collected during ESWC 2010 • Gold Standard Construction – Used 3 raters to label a portion of tweet corpus • 200 tweets labelled – Measured inter-rater agreement • Using the Kappa statistic – Initial agreement was too low: 0.328 – Used the Delphi method to improve agreement – Second round of labelling produced: 0.820
  • 21. Experiments • Evaluation Measures – Precision: proportion of event tweets correctly labelled – Recall: proportion of tweets successfully returned for an event – F-measure: Harmonic mean of precision and recall • Placed emphasis on precision over recall: $F_\beta = \frac{(1 + \beta^2) \times P \times R}{\beta^2 \times P + R}$, with $\beta \in \{0.2, 0.5, 1\}$
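As a worked illustration of the F-measure on slide 21, the sketch below evaluates it for the three β values listed. The precision and recall figures are made-up inputs, not results from the experiments, and the `f_beta` helper is hypothetical.

```python
def f_beta(precision, recall, beta):
    """Weighted F-measure; beta < 1 weights precision more than recall."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0  # convention when both precision and recall are zero
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical precision/recall pair, used only to show the effect of beta.
precision, recall = 0.8, 0.4
for beta in (0.2, 0.5, 1):
    print(beta, round(f_beta(precision, recall, beta), 3))
```

With precision higher than recall, the score rises as β shrinks, which is the sense in which smaller β values emphasise precision.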
  • 24. Imagine user profiling ESWC dataset, user Matthew Rowe
  • 25. Imagine conference feedback ESWC dataset directly from Tweets from mappings (Talks)
  • 27. We Challenge You! • Beat us in mappings! • We provide the human-generated gold standard mappings • Can you find a more precise way to do tweet-talk mappings? • Can you find other uses? Let us know!
  • 28. We Challenge You! • You can find the gold standard data here: http://research.hypios.com/?page_id=131 • You can find all the data (and automated mappings) here: http://data.hypios.com/tweets/sparql
  • 29. We Challenge You! http://data.hypios.com/tweets/sparql SELECT ?tweet ?talk WHERE { ?tweet <http://linkedevents.org/ontology/illustrate> ?talk. }
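One possible way to run the query on slide 29 from Python, using only the standard library. This is a sketch under assumptions: that the endpoint is still reachable, that it accepts the query as a `query` GET parameter as the SPARQL protocol describes, and that it can return the standard SPARQL JSON results format. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://data.hypios.com/tweets/sparql"
QUERY = """
SELECT ?tweet ?talk WHERE {
  ?tweet <http://linkedevents.org/ontology/illustrate> ?talk .
}
"""


def build_query_url(endpoint, query):
    """Encode the SPARQL query as a GET request URL."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def parse_bindings(results_json):
    """Extract (tweet, talk) URI pairs from a SPARQL JSON results document."""
    data = json.loads(results_json)
    return [(b["tweet"]["value"], b["talk"]["value"])
            for b in data["results"]["bindings"]]


def fetch_mappings(endpoint=ENDPOINT, query=QUERY, timeout=30):
    """Run the query; assumes the endpoint is online and serves JSON results."""
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

Calling `fetch_mappings()` would return a list of (tweet, talk) pairs if the endpoint responds; the network call is kept in its own function so the URL building and result parsing can be checked offline.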
  • 30. brought to you by milan.stankovic@hypios.com & M.C.Rowe@open.ac.uk November 2010, Shanghaï, China