Natural Language Processing for the Semantic Web
Isabelle Augenstein 
Department of Computer Science, University of Sheffield, UK 
i.augenstein@sheffield.ac.uk 
Linked Data for NLP Tutorial, ESWC Summer School 2014 
August 24, 2014
Why Natural Language Processing?
semi-structured information
unstructured information
How to link this information to a knowledge base automatically?
Information Extraction!
Information Extraction
Between AD 1400 and 1450, China was a global superpower run by one family – the Ming dynasty – who established Beijing as the capital and built the Forbidden City.
Named Entities
Between AD 1400 and 1450, China was a global superpower run by one family – the Ming dynasty – who established Beijing as the capital and built the Forbidden City.
Named Entity Recognition (NER)
Entity            Named Entity Classification (NEC)   Named Entity Linking
China             /location/country                   /m/0d05w3
Ming dynasty      /royalty/royal_line                 /m/0bw_m
Beijing           /location/city                      /m/01914
Forbidden City    /location/city                      /m/0j0b2
Named Entities: Definition
Named Entities: Proper nouns, which refer to real-life entities
Named Entity Recognition: Detecting boundaries of named entities (NEs)
Named Entity Classification: Assigning classes to NEs, such as PERSON, LOCATION, ORGANISATION, or fine-grained classes such as ROYAL LINE
Named Entity Linking / Disambiguation: Linking NEs to concrete entries in a knowledge base, for example:
China -> LOCATION: Republic of China, country in East Asia
      -> LOCATION: China proper, core region of China during Qing dynasty
      -> LOCATION: China, Texas
      -> PERSON: China, Brazilian footballer born in 1964
      -> MUSIC: China, a 1979 album by Vangelis
      -> …
Relations, Time Expressions and Events
Between AD 1400 and 1450, China was a global superpower run by one family – the Ming dynasty – who established Beijing as the capital and built the Forbidden City.
Named Entity Recognition
Relation Extraction:
  /royalty/royal_line/kingdom_s_ruled
  /location/country/capital
  /location/location/containedby
Temporal Extraction:
  Between AD 1400 and 1450 -> 1400-XX-XX -- 1450-XX-XX
Event Extraction:
  Event: /royalty/royal_line: Ming dynasty
  /royalty/royal_line/ruled_from: 1400-XX-XX
  /royalty/royal_line/ruled_to: 1450-XX-XX
  /royalty/royal_line/kingdom_s_ruled: China
Relations, Time Expressions and Events: Definition
Relations: Two or more entities which relate to one another in real life
Relation Extraction: Detecting relations between entities and assigning relation types to them, such as CAPITAL-OF
Temporal Extraction: Recognising and normalising time expressions: times (e.g. “3 in the afternoon”), dates (e.g. “tomorrow”), durations (e.g. “since yesterday”), and sets (e.g. “twice a month”)
Events: Real-life events that happened at some point in space and time, e.g. kingdom, assassination, exhibition
Event Extraction: Extracting events consisting of the name and type of event, time and location
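The normalisation step of temporal extraction can be sketched for the running example sentence. This is only a minimal illustration: the regular expression and the function name `normalise_year_range` are assumptions made here, and the pattern covers just "between [AD] YYYY and YYYY" ranges, not times, durations or sets.

```python
import re

# Hypothetical pattern: matches only "between [AD] YYYY and YYYY" year ranges.
YEAR_RANGE = re.compile(r"\b[Bb]etween\s+(?:AD\s+)?(\d{3,4})\s+and\s+(\d{3,4})\b")


def normalise_year_range(text):
    """Return (start, end) in YYYY-XX-XX form, or None if no range is found."""
    match = YEAR_RANGE.search(text)
    if match is None:
        return None
    start, end = match.groups()
    # Month and day are not given in the text, so they stay underspecified (XX).
    return f"{int(start):04d}-XX-XX", f"{int(end):04d}-XX-XX"


sentence = "Between AD 1400 and 1450, China was a global superpower."
print(normalise_year_range(sentence))
```

The XX placeholders mirror the notation used on the slides for unknown month and day.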
Summary: Introduction
• Information extraction (IE) methods such as named entity recognition (NER), named entity classification (NEC), named entity linking, relation extraction (RE), temporal extraction, and event extraction can help to add markup to Web pages
• Information extraction approaches can serve two purposes:
  • Annotating every single mention of an entity, relation or event, e.g. to add markup to Web pages
  • Aggregating those mentions to populate knowledge bases, e.g. based on confidence values and majority voting
    China LOCATION 0.9
    China LOCATION 0.8
    China PERSON 0.4
    -> China LOCATION
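The aggregation idea in the China example can be sketched as confidence-weighted voting. This is one possible reading of "confidence values and majority voting", not necessarily the scheme used in the tutorial; the function name `aggregate_mentions` and the choice to sum confidences per label are assumptions.

```python
from collections import defaultdict


def aggregate_mentions(mentions):
    """Pick one label per surface form by summing mention confidences.

    `mentions` is an iterable of (surface form, label, confidence) triples.
    Summing confidences is an assumption; plain vote counting would also work.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for surface, label, confidence in mentions:
        scores[surface][label] += confidence
    # For each surface form, keep the label with the highest total score.
    return {
        surface: max(labels, key=labels.get)
        for surface, labels in scores.items()
    }


mentions = [
    ("China", "LOCATION", 0.9),
    ("China", "LOCATION", 0.8),
    ("China", "PERSON", 0.4),
]
print(aggregate_mentions(mentions))
```

With the three mentions from the slide, LOCATION outweighs PERSON both by vote count and by summed confidence.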
Information Extraction: Methods
• Focus of the rest of the tutorial and hands-on session: Named entity recognition and classification (NERC)
• Possible methodologies
  • Rule-based approaches: write manual extraction rules
  • Machine learning based approaches
    • Supervised learning: manually annotate text, train machine learning model
    • Unsupervised learning: extract language patterns, cluster similar ones
    • Semi-supervised learning: start with a small number of language patterns, iteratively learn more (bootstrapping)
  • Gazetteer-based method: use existing list of named entities
  • Combination of the above
Information Extraction: Methods
Developing a NERC involves programming based around APIs...
which can be frustrating at times
and (at least basic) knowledge about linguistics
Background: Linguistics
Levels: Acoustic signal, Phonemes, Morphemes, Words, Sentences, Meaning
Fields: Phonology and phonetics; Lexicon; Morphology; Syntax; Semantics and Discourse
Examples, from signal to meaning:
• [acoustic signal] -> /l/
• /ʃ/ -> sh
• wait + ed -> waited, cat -> cat
• wait -> V, cat -> N
• [NP The (D) farmer (N)] hit (V) [NP the (D) donkey (N)].
• Every farmer who owns a donkey beats it.
  ∀x∀y (farmer(x) ∧ donkey(y) ∧ own(x, y) → beat(x, y))
Background: NLP Tasks
Levels: Acoustic signal, Phonemes, Morphemes, Words, Sentences, Meaning
• Phonology, phonetics: Speech recognition
• Lexicon, Morphology: Lemmatisation or stemming, part of speech (POS) tagging
• Syntax: Chunking, parsing
• Semantics, Discourse: Semantic and discourse analysis, anaphora resolution
Background: Linguistics
Ambiguities on every level
Levels: Acoustic signal, Phonemes, Morphemes, Words, Sentences, Meaning
• Phonology, phonetics: recognize speech – wreck a nice beach
  /ˈrek.əɡ.naɪz/ /spiːtʃ/ – /rek/ /ə/ /naɪs/ /biːtʃ/
• Lexicon, Morphology: She’d /ʃiːd/ -> she would, she had
• Syntax: Time flies(V/N) like(V/P) an arrow
  The woman saw the man with the binoculars. -> Who had the binoculars?
• Semantics, Discourse: Somewhere in Britain, some woman has a child every thirty seconds.
  -> Same woman or different women?
Information Extraction
Language is ambiguous...
Can we still build named entity extractors that extract all entities from unseen text correctly?
However, we can try to extract most of them correctly using linguistic cues and background knowledge!
NERC: Features
What can help to recognise and/or classify named entities?
• Words:
  • Words in window before and after mention
  • Sequences
  • Bags of words
Between AD 1400 and 1450, China was a global superpower
w: China w-1: , w-2: 1450 w+1: was w+2: a
seq[-]: 1450, seq[+]: was a
bow: China bow[-]: , bow[-]: 1450 bow[+]: was bow[+]: a
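The word features for the mention "China" can be reproduced with a short sketch. The function name `window_features`, the window size of 2, and the pre-tokenised input are assumptions; the feature names follow the notation on the slide.

```python
def window_features(tokens, index, size=2):
    """Word, sequence and bag-of-words features around tokens[index]."""
    features = {"w": tokens[index]}
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    # Individual words at fixed offsets before and after the mention.
    for offset in range(1, size + 1):
        if index - offset >= 0:
            features[f"w-{offset}"] = tokens[index - offset]
        if index + offset < len(tokens):
            features[f"w+{offset}"] = tokens[index + offset]
    # Sequences keep the word order, bags of words discard it.
    features["seq[-]"] = " ".join(left)
    features["seq[+]"] = " ".join(right)
    features["bow[-]"] = sorted(set(left))
    features["bow[+]"] = sorted(set(right))
    return features


# Hand-tokenised version of the example sentence (tokenisation is assumed).
tokens = ["Between", "AD", "1400", "and", "1450", ",",
          "China", "was", "a", "global", "superpower"]
features = window_features(tokens, tokens.index("China"))
print(features)
```

Sequence features preserve order, so "was a" and "a was" differ, whereas the bag-of-words features treat them as the same context.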
NERC: Features
What can help to recognise and/or classify named entities?
• Morphology:
  • Capitalisation: initial upper case (China), all upper case (IBM), mixed case (eBay)
  • Symbols: contains $, £, €, Roman numerals (IV), ...
  • Contains period (google.com), apostrophe (Mandy’s), hyphen (speed-o-meter), ampersand (Fisher & Sons)
  • Stem or lemma (cats -> cat), prefix (disadvantages -> dis), suffix (cats -> s), interfix (speed-o-meter -> o)
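Several of the surface features listed above can be computed with plain string checks. The sketch below is illustrative only: the function name `shape_features` and the feature names are assumptions, and the Roman numeral test is deliberately loose (it accepts any string made of the letters I, V, X, L, C, D, M).

```python
import re


def shape_features(token):
    """Boolean capitalisation and symbol features for a single token."""
    return {
        "init_cap": token[:1].isupper(),
        "all_caps": token.isupper(),
        # Mixed case: an upper-case letter after the first character,
        # plus at least one lower-case letter (e.g. eBay).
        "mixed_case": any(c.isupper() for c in token[1:])
        and any(c.islower() for c in token),
        "has_currency": any(c in "$£€" for c in token),
        "has_period": "." in token,
        "has_apostrophe": "'" in token or "’" in token,
        "has_hyphen": "-" in token,
        "has_ampersand": "&" in token,
        # Loose check: does not validate that the numeral is well formed.
        "roman_numeral": bool(re.fullmatch(r"[IVXLCDM]+", token)),
    }


for example in ["China", "IBM", "eBay", "IV", "google.com", "speed-o-meter"]:
    active = [name for name, value in shape_features(example).items() if value]
    print(example, active)
```

Stems, lemmas and affixes are left out here because they need a morphological analyser or stemmer rather than string checks.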
NERC: Features
What can help to recognise and/or classify named entities?
• POS (part of speech) tags
  • Most named entities are nouns
  • Prokofyev (2014)
Excerpt shown on the slide (Prokofyev 2014, Section 4.1 Part-of-Speech Tags):
"Part-Of-Speech (POS) tags have often been considered as an important discriminative feature for term identification. Many works on key term identification apply either fixed or regular expression POS tag patterns to improve their effectiveness. Nonetheless, POS tags alone cannot produce high-quality results. As can be seen from the overall POS tag distribution graph extracted from one of our collections (see Figure 3), many of the most frequent tag patterns (e.g., JJNN tagging adjectives and nouns) are far from yielding perfect results."
[Figure 3: Top 6 most frequent part-of-speech tag patterns of the SIGIR collection]
Morphology: Penn Treebank POS tags
• Nouns (all start with N)
• Verbs (all start with V)
• Adjectives (all start with J)
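Because of this naming regularity, a Penn Treebank tag can be mapped to a coarse word class by looking at its first letter. A minimal sketch, where the function name `coarse_pos` and the label strings are assumptions:

```python
def coarse_pos(tag):
    """Map a Penn Treebank POS tag to a coarse word class by its first letter."""
    if tag.startswith("N"):
        return "noun"       # NN, NNS, NNP, NNPS
    if tag.startswith("V"):
        return "verb"       # VB, VBD, VBG, VBN, VBP, VBZ
    if tag.startswith("J"):
        return "adjective"  # JJ, JJR, JJS
    return "other"


for tag in ["NNP", "VBD", "JJ", "DT"]:
    print(tag, coarse_pos(tag))
```

A feature such as "is a noun" can then be derived from any tagger that emits Penn Treebank tags.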
NERC: Features
What can help to recognise and/or classify named entities?
• Gazetteers
  • Retrieved from HTML lists or tables
  • Using regular expression patterns and search engines (e.g. “Religions such as * ”)
  • Retrieved from knowledge bases
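The pattern-based way of building a gazetteer can be sketched with a regular expression for "<category> such as ...". Everything here is an illustrative assumption: the function name `harvest_gazetteer`, the restriction to single capitalised words, and the use of a small list of local snippets in place of search engine results.

```python
import re

# A list item is assumed to be a single capitalised word (e.g. "Buddhism").
ITEM = r"[A-Z][\w-]*"
# Items separated by commas, "and" or "or".
ITEM_LIST = rf"{ITEM}(?:(?:,\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+){ITEM})*"


def harvest_gazetteer(category, texts):
    """Collect the words that follow '<category> such as' in the given texts."""
    pattern = re.compile(rf"\b(?i:{re.escape(category)}) such as ({ITEM_LIST})")
    entries = set()
    for text in texts:
        for match in pattern.finditer(text):
            entries.update(re.findall(rf"\b{ITEM}", match.group(1)))
    return entries


# Stand-in snippets; in practice these would come from Web search results.
snippets = [
    "Religions such as Buddhism, Hinduism and Islam are practised widely.",
    "He studied religions such as Sikhism or Jainism.",
]
print(sorted(harvest_gazetteer("Religions", snippets)))
```

Multi-word names would need a more permissive item pattern, and harvested entries are usually filtered before being used as a gazetteer.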
NERC: Training Models
Extensive choice of machine learning algorithms for training NERCs
NERC: Training Models
• Unfortunately, there isn’t enough time to explain machine learning algorithms in detail
• CRFs (conditional random fields) are one of the most widely used algorithms for NERC
  • CRFs are graphical models that treat NERC as a sequence labelling task
  • Each token is labelled as the beginning of a named entity (B), inside a named entity (I), or outside any named entity (O)
    China (B-LOC) built (O) the (O) Forbidden (B-LOC) City (I-LOC) . (O)
• For now, we will focus on rule- and gazetteer-based NERC
  • It is fairly easy to write manual extraction rules for NEs, and they can achieve high performance when combined with gazetteers
  • This can be done with the GATE software (General Architecture for Text Engineering) and JAPE rules
  - Hands-on session
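To make the B/I/O labelling concrete, the sketch below turns a tagged token sequence back into entity spans. It only decodes tags that are already given and does not train or run a CRF; the function name `bio_to_spans` and the handling of stray I- tags (ignored) are assumptions.

```python
def bio_to_spans(tokens, tags):
    """Convert parallel token and BIO tag lists into (entity text, type) pairs."""
    spans = []
    current_tokens, current_type = [], None

    def close_span():
        # Store the entity collected so far, if there is one.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            close_span()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            current_tokens.append(token)
        else:
            # "O" tags, and I- tags without a matching B-, end the current span.
            close_span()
            current_tokens, current_type = [], None
    close_span()
    return spans


tokens = ["China", "built", "the", "Forbidden", "City", "."]
tags = ["B-LOC", "O", "O", "B-LOC", "I-LOC", "O"]
print(bio_to_spans(tokens, tags))
```

The B tag is what lets two adjacent entities of the same type be kept apart, which a plain inside/outside labelling cannot do.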
Outlook: NLP & ML Software
Natural Language Processing:
- GATE (general purpose architecture, includes other NLP and ML software as plugins)
- Stanford NLP (Java)
- OpenNLP (Java)
- NLTK (Python)
Machine Learning:
- scikit-learn (Python, rich documentation, highly recommended!)
- Mallet (Java)
- WEKA (Java)
- Alchemy (graphical models, C++)
- FACTORIE (graphical models, Scala)
- CRFSuite (efficient implementation of CRFs, C/C++ with Python bindings)

Mais conteúdo relacionado

Destaque

Weakly Supervised Machine Reading
Weakly Supervised Machine ReadingWeakly Supervised Machine Reading
Weakly Supervised Machine ReadingIsabelle Augenstein
 
USFD at SemEval-2016 - Stance Detection on Twitter with Autoencoders
USFD at SemEval-2016 - Stance Detection on Twitter with AutoencodersUSFD at SemEval-2016 - Stance Detection on Twitter with Autoencoders
USFD at SemEval-2016 - Stance Detection on Twitter with AutoencodersIsabelle Augenstein
 
Lodifier: Generating Linked Data from Unstructured Text
Lodifier: Generating Linked Data from Unstructured TextLodifier: Generating Linked Data from Unstructured Text
Lodifier: Generating Linked Data from Unstructured TextIsabelle Augenstein
 
Distant Supervision with Imitation Learning
Distant Supervision with Imitation LearningDistant Supervision with Imitation Learning
Distant Supervision with Imitation LearningIsabelle Augenstein
 
Relation Extraction from the Web using Distant Supervision
Relation Extraction from the Web using Distant SupervisionRelation Extraction from the Web using Distant Supervision
Relation Extraction from the Web using Distant SupervisionIsabelle Augenstein
 
Seed Selection for Distantly Supervised Web-Based Relation Extraction
Seed Selection for Distantly Supervised Web-Based Relation ExtractionSeed Selection for Distantly Supervised Web-Based Relation Extraction
Seed Selection for Distantly Supervised Web-Based Relation ExtractionIsabelle Augenstein
 
Running Natural Language Queries on MongoDB
Running Natural Language Queries on MongoDBRunning Natural Language Queries on MongoDB
Running Natural Language Queries on MongoDBMongoDB
 
Social Machines - 2017 Update (University of Iowa)
Social Machines - 2017 Update (University of Iowa)Social Machines - 2017 Update (University of Iowa)
Social Machines - 2017 Update (University of Iowa)James Hendler
 
Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Peter Mika
 
TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...
TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...
TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...Grokking VN
 
Social Machines: The coming collision of Artificial Intelligence, Social Netw...
Social Machines: The coming collision of Artificial Intelligence, Social Netw...Social Machines: The coming collision of Artificial Intelligence, Social Netw...
Social Machines: The coming collision of Artificial Intelligence, Social Netw...James Hendler
 
NLIDB(Natural Language Interface to DataBases)
NLIDB(Natural Language Interface to DataBases)NLIDB(Natural Language Interface to DataBases)
NLIDB(Natural Language Interface to DataBases)Swetha Pallati
 
Lecture: Semantic Word Clouds
Lecture: Semantic Word CloudsLecture: Semantic Word Clouds
Lecture: Semantic Word CloudsMarina Santini
 
Building blocks for building bots
Building blocks for building botsBuilding blocks for building bots
Building blocks for building botsRita Zhang
 
"There's a bot for that!" - The World of Conversational UIs and Chat Bots
"There's a bot for that!" - The World of Conversational UIs and Chat Bots"There's a bot for that!" - The World of Conversational UIs and Chat Bots
"There's a bot for that!" - The World of Conversational UIs and Chat BotsVishrut Shukla
 

Destaque (17)

Weakly Supervised Machine Reading
Weakly Supervised Machine ReadingWeakly Supervised Machine Reading
Weakly Supervised Machine Reading
 
USFD at SemEval-2016 - Stance Detection on Twitter with Autoencoders
USFD at SemEval-2016 - Stance Detection on Twitter with AutoencodersUSFD at SemEval-2016 - Stance Detection on Twitter with Autoencoders
USFD at SemEval-2016 - Stance Detection on Twitter with Autoencoders
 
Lodifier: Generating Linked Data from Unstructured Text
Lodifier: Generating Linked Data from Unstructured TextLodifier: Generating Linked Data from Unstructured Text
Lodifier: Generating Linked Data from Unstructured Text
 
SEALS @ WWW2012
SEALS @ WWW2012SEALS @ WWW2012
SEALS @ WWW2012
 
Distant Supervision with Imitation Learning
Distant Supervision with Imitation LearningDistant Supervision with Imitation Learning
Distant Supervision with Imitation Learning
 
Relation Extraction from the Web using Distant Supervision
Relation Extraction from the Web using Distant SupervisionRelation Extraction from the Web using Distant Supervision
Relation Extraction from the Web using Distant Supervision
 
Seed Selection for Distantly Supervised Web-Based Relation Extraction
Seed Selection for Distantly Supervised Web-Based Relation ExtractionSeed Selection for Distantly Supervised Web-Based Relation Extraction
Seed Selection for Distantly Supervised Web-Based Relation Extraction
 
Running Natural Language Queries on MongoDB
Running Natural Language Queries on MongoDBRunning Natural Language Queries on MongoDB
Running Natural Language Queries on MongoDB
 
Social Machines - 2017 Update (University of Iowa)
Social Machines - 2017 Update (University of Iowa)Social Machines - 2017 Update (University of Iowa)
Social Machines - 2017 Update (University of Iowa)
 
Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012
 
TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...
TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...
TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...
 
Social Machines: The coming collision of Artificial Intelligence, Social Netw...
Social Machines: The coming collision of Artificial Intelligence, Social Netw...Social Machines: The coming collision of Artificial Intelligence, Social Netw...
Social Machines: The coming collision of Artificial Intelligence, Social Netw...
 
NLIDB(Natural Language Interface to DataBases)
NLIDB(Natural Language Interface to DataBases)NLIDB(Natural Language Interface to DataBases)
NLIDB(Natural Language Interface to DataBases)
 
Lecture: Semantic Word Clouds
Lecture: Semantic Word CloudsLecture: Semantic Word Clouds
Lecture: Semantic Word Clouds
 
Building blocks for building bots
Building blocks for building botsBuilding blocks for building bots
Building blocks for building bots
 
"There's a bot for that!" - The World of Conversational UIs and Chat Bots
"There's a bot for that!" - The World of Conversational UIs and Chat Bots"There's a bot for that!" - The World of Conversational UIs and Chat Bots
"There's a bot for that!" - The World of Conversational UIs and Chat Bots
 
Human Neural Machine
Human Neural MachineHuman Neural Machine
Human Neural Machine
 

Semelhante a Natural Language Processing for the Semantic Web

Give Me An Example Of Essay.pdf
Give Me An Example Of Essay.pdfGive Me An Example Of Essay.pdf
Give Me An Example Of Essay.pdfElizabeth Brown
 
Persuasive Essay On Everyday Use
Persuasive Essay On Everyday UsePersuasive Essay On Everyday Use
Persuasive Essay On Everyday UseTakyra Roberts
 
Essay On Disha. Online assignment writing service.
Essay On Disha. Online assignment writing service.Essay On Disha. Online assignment writing service.
Essay On Disha. Online assignment writing service.Kathleen Ward
 
Economics of Shakespeare
Economics of ShakespeareEconomics of Shakespeare
Economics of ShakespeareYumonomics
 
Legitimate Essay Writing Service.pdfLegitimate Essay Writing Service
Legitimate Essay Writing Service.pdfLegitimate Essay Writing ServiceLegitimate Essay Writing Service.pdfLegitimate Essay Writing Service
Legitimate Essay Writing Service.pdfLegitimate Essay Writing ServiceRhonda Middleton
 
Legitimate Essay Writing Service.pdf
Legitimate Essay Writing Service.pdfLegitimate Essay Writing Service.pdf
Legitimate Essay Writing Service.pdfEvelin Santos
 
How To Write A Literary Analysis Essay A Step-B
How To Write A Literary Analysis Essay A Step-BHow To Write A Literary Analysis Essay A Step-B
How To Write A Literary Analysis Essay A Step-BKathy Miller
 
The Federalist Papers Audiolibro Alexander
The Federalist Papers Audiolibro AlexanderThe Federalist Papers Audiolibro Alexander
The Federalist Papers Audiolibro AlexanderLisa Riley
 
Cultivating a Community-Oriented Organization
Cultivating a Community-Oriented OrganizationCultivating a Community-Oriented Organization
Cultivating a Community-Oriented OrganizationTim Falls
 
Essay Writing Co Uk. Buy Essay Uk Orders, Buy ready essay
Essay Writing Co Uk. Buy Essay Uk Orders, Buy ready essayEssay Writing Co Uk. Buy Essay Uk Orders, Buy ready essay
Essay Writing Co Uk. Buy Essay Uk Orders, Buy ready essayNoel Brooks
 
Franklin D Roosevelt Essay
Franklin D Roosevelt EssayFranklin D Roosevelt Essay
Franklin D Roosevelt Essaybdg81dzt
 
Franklin D Roosevelt Essay.pdf
Franklin D Roosevelt Essay.pdfFranklin D Roosevelt Essay.pdf
Franklin D Roosevelt Essay.pdfAlison Parker
 
What’s in a Name? Text and Image for Indexing Prosopographical Data
What’s in a Name?  Text and Image for Indexing Prosopographical DataWhat’s in a Name?  Text and Image for Indexing Prosopographical Data
What’s in a Name? Text and Image for Indexing Prosopographical DataEquipex Biblissima
 

Semelhante a Natural Language Processing for the Semantic Web (14)

Give Me An Example Of Essay.pdf
Give Me An Example Of Essay.pdfGive Me An Example Of Essay.pdf
Give Me An Example Of Essay.pdf
 
Persuasive Essay On Everyday Use
Persuasive Essay On Everyday UsePersuasive Essay On Everyday Use
Persuasive Essay On Everyday Use
 
Essay On Disha. Online assignment writing service.
Essay On Disha. Online assignment writing service.Essay On Disha. Online assignment writing service.
Essay On Disha. Online assignment writing service.
 
Economics of Shakespeare
Economics of ShakespeareEconomics of Shakespeare
Economics of Shakespeare
 
Legitimate Essay Writing Service.pdfLegitimate Essay Writing Service
Legitimate Essay Writing Service.pdfLegitimate Essay Writing ServiceLegitimate Essay Writing Service.pdfLegitimate Essay Writing Service
Legitimate Essay Writing Service.pdfLegitimate Essay Writing Service
 
Legitimate Essay Writing Service.pdf
Legitimate Essay Writing Service.pdfLegitimate Essay Writing Service.pdf
Legitimate Essay Writing Service.pdf
 
How To Write A Literary Analysis Essay A Step-B
How To Write A Literary Analysis Essay A Step-BHow To Write A Literary Analysis Essay A Step-B
How To Write A Literary Analysis Essay A Step-B
 
The Federalist Papers Audiolibro Alexander
The Federalist Papers Audiolibro AlexanderThe Federalist Papers Audiolibro Alexander
The Federalist Papers Audiolibro Alexander
 
Cultivating a Community-Oriented Organization
Cultivating a Community-Oriented OrganizationCultivating a Community-Oriented Organization
Cultivating a Community-Oriented Organization
 
Essay Writing Co Uk. Buy Essay Uk Orders, Buy ready essay
Essay Writing Co Uk. Buy Essay Uk Orders, Buy ready essayEssay Writing Co Uk. Buy Essay Uk Orders, Buy ready essay
Essay Writing Co Uk. Buy Essay Uk Orders, Buy ready essay
 
Bibliotheca Digitalis Summer school: What’s in a Name: Text and Image for ind...
Bibliotheca Digitalis Summer school: What’s in a Name: Text and Image for ind...Bibliotheca Digitalis Summer school: What’s in a Name: Text and Image for ind...
Bibliotheca Digitalis Summer school: What’s in a Name: Text and Image for ind...
 
Franklin D Roosevelt Essay
Franklin D Roosevelt EssayFranklin D Roosevelt Essay
Franklin D Roosevelt Essay
 
Franklin D Roosevelt Essay.pdf
Franklin D Roosevelt Essay.pdfFranklin D Roosevelt Essay.pdf
Franklin D Roosevelt Essay.pdf
 
What’s in a Name? Text and Image for Indexing Prosopographical Data
What’s in a Name?  Text and Image for Indexing Prosopographical DataWhat’s in a Name?  Text and Image for Indexing Prosopographical Data
What’s in a Name? Text and Image for Indexing Prosopographical Data
 

Natural Language Processing for the Semantic Web

  • 1. Natural Language Processing for the Semantic Web Isabelle Augenstein Department of Computer Science, University of Sheffield, UK i.augenstein@sheffield.ac.uk Linked Data for NLP Tutorial, ESWC Summer School 2014 August 24, 2014
  • 5. Why Natural Language Processing? Semi-structured information vs. unstructured information. How to link this information to a knowledge base automatically? Information Extraction!
  • 6. Information Extraction: Between AD 1400 and 1450, China was a global superpower run by one family – the Ming dynasty – who established Beijing as the capital and built the Forbidden City.
  • 8. Named Entities: Between AD 1400 and 1450, China was a global superpower run by one family – the Ming dynasty – who established Beijing as the capital and built the Forbidden City.
      Named Entity Recognition (NER)
      Named Entity Classification (NEC):
        China: /location/country
        Ming dynasty: /royalty/royal_line
        Beijing: /location/city
        Forbidden City: /location/city
  • 9. Named Entities: Between AD 1400 and 1450, China was a global superpower run by one family – the Ming dynasty – who established Beijing as the capital and built the Forbidden City.
      Named Entity Recognition
      Named Entity Classification and Named Entity Linking (class, then knowledge base ID):
        China: /location/country, /m/0d05w3
        Ming dynasty: /royalty/royal_line, /m/0bw_m
        Beijing: /location/city, /m/01914
        Forbidden City: /location/city, /m/0j0b2
  • 10. Named Entities: Definition
      Named Entities: proper nouns which refer to real-life entities
      Named Entity Recognition: detecting the boundaries of named entities (NEs)
      Named Entity Classification: assigning classes to NEs, such as PERSON, LOCATION, ORGANISATION, or fine-grained classes such as ROYAL LINE
      Named Entity Linking / Disambiguation: linking NEs to concrete entries in a knowledge base. Example: China
        -> LOCATION: Republic of China, country in East Asia
        -> LOCATION: China proper, core region of China during Qing dynasty
        -> LOCATION: China, Texas
        -> PERSON: China, Brazilian footballer born in 1964
        -> MUSIC: China, a 1979 album by Vangelis
        -> …
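The linking step on this slide can be pictured as a candidate lookup followed by a choice among the candidates. The sketch below is only an illustration: `CANDIDATES` and `candidate_entries` are hypothetical names, and the entries are the ones listed on the slide rather than a real knowledge base. It shows that knowing the class (LOCATION) narrows the candidates for "China" but does not resolve the ambiguity on its own.

```python
# Candidate senses for "China", copied from the slide; a real knowledge base holds many more.
CANDIDATES = {
    "China": [
        ("LOCATION", "Republic of China, country in East Asia"),
        ("LOCATION", "China proper, core region of China during Qing dynasty"),
        ("LOCATION", "China, Texas"),
        ("PERSON", "China, Brazilian footballer born in 1964"),
        ("MUSIC", "China, a 1979 album by Vangelis"),
    ]
}


def candidate_entries(mention, ne_class=None):
    """Return knowledge base candidates for a mention, optionally filtered by NE class."""
    entries = CANDIDATES.get(mention, [])
    if ne_class is None:
        return entries
    return [entry for entry in entries if entry[0] == ne_class]


# Classification alone leaves three LOCATION candidates, so a linker still needs context.
for ne_class, description in candidate_entries("China", "LOCATION"):
    print(ne_class, description)
```

A full linker would rank the remaining candidates using the surrounding text, which is outside the scope of this sketch.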
  • 11. Relations: Between AD 1400 and 1450, China was a global superpower run by one family – the Ming dynasty – who established Beijing as the capital and built the Forbidden City.
      Named Entity Recognition
      Relation Extraction: /royalty/royal_line/kingdom_s_ruled, /location/country/capital, /location/location/containedby
  • 12. Relations and Time Expressions: Between AD 1400 and 1450, China was a global superpower run by one family – the Ming dynasty – who established Beijing as the capital and built the Forbidden City.
      Named Entity Recognition
      Relation Extraction: /royalty/royal_line/kingdom_s_ruled, /location/country/capital, /location/location/containedby
      Temporal Extraction: "Between AD 1400 and 1450" -> 1400-XX-XX -- 1450-XX-XX
  • 13. Relations, Time Expressions and Events: Between AD 1400 and 1450, China was a global superpower run by one family – the Ming dynasty – who established Beijing as the capital and built the Forbidden City.
      Named Entity Recognition
      Relation Extraction: /royalty/royal_line/kingdom_s_ruled, /location/country/capital, /location/location/containedby
      Temporal Extraction: "Between AD 1400 and 1450" -> 1400-XX-XX -- 1450-XX-XX
      Event Extraction:
        Event: /royalty/royal_line: Ming dynasty
        /royalty/royal_line/ruled_from: 1400-XX-XX
        /royalty/royal_line/ruled_to: 1450-XX-XX
        /royalty/royal_line/kingdom_s_ruled: China
  • 14. Relations, Time Expressions and Events: Definition
      Relations: two or more entities which relate to one another in real life
      Relation Extraction: detecting relations between entities and assigning relation types to them, such as CAPITAL-OF
      Temporal Extraction: recognising and normalising time expressions: times (e.g. “3 in the afternoon”), dates (“tomorrow”), durations (“since yesterday”), and sets (e.g. “twice a month”)
      Events: real-life events that happened at some point in space and time, e.g. kingdom, assassination, exhibition
      Event Extraction: extracting events consisting of the name and type of event, time and location
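As a rough illustration of "recognising and normalising" a time expression, the sketch below handles only the single pattern used in the running example ("Between AD 1400 and 1450") and rewrites it in the underspecified date format shown on the earlier slide. The regular expression and the function name `normalise_year_ranges` are assumptions made for this example; real temporal taggers cover far more expression types.

```python
import re

# Hypothetical pattern: "between [AD] YYYY and YYYY". Times, durations and sets are not covered.
YEAR_RANGE = re.compile(r"\b[Bb]etween\s+(?:AD\s+)?(\d{3,4})\s+and\s+(\d{3,4})\b")


def normalise_year_ranges(text):
    """Return (surface string, normalised value) pairs for year ranges found in text."""
    results = []
    for match in YEAR_RANGE.finditer(text):
        start, end = int(match.group(1)), int(match.group(2))
        # Month and day are unknown, so they stay underspecified as XX.
        value = f"{start:04d}-XX-XX -- {end:04d}-XX-XX"
        results.append((match.group(0), value))
    return results


sentence = "Between AD 1400 and 1450, China was a global superpower run by one family."
print(normalise_year_ranges(sentence))
```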
  • 15. Summary: Introduction
      • Information extraction (IE) methods such as named entity recognition (NER), named entity classification (NEC), named entity linking, relation extraction (RE), temporal extraction, and event extraction can help to add markup to Web pages
      • Information extraction approaches can serve two purposes:
        • Annotating every single mention of an entity, relation or event, e.g. to add markup to Web pages
        • Aggregating those mentions to populate knowledge bases, e.g. based on confidence values and majority voting
      Example: China LOCATION 0.9, China LOCATION 0.8, China PERSON 0.4 -> China LOCATION
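The aggregation example on this slide can be sketched as confidence-weighted voting. This is one plausible reading of "confidence values and majority voting", not necessarily the scheme used in the tutorial; the function name `aggregate` is hypothetical.

```python
from collections import defaultdict


def aggregate(mentions):
    """Sum confidences per (entity, label) and keep the best-scoring label per entity.

    mentions: iterable of (entity, label, confidence) tuples.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for entity, label, confidence in mentions:
        scores[entity][label] += confidence
    return {entity: max(labels, key=labels.get) for entity, labels in scores.items()}


# The three mention-level predictions from the slide.
mentions = [("China", "LOCATION", 0.9), ("China", "LOCATION", 0.8), ("China", "PERSON", 0.4)]
print(aggregate(mentions))  # {'China': 'LOCATION'}
```

Summing confidences lets two moderately confident predictions outweigh a single one; a plain majority vote would ignore the confidence values altogether.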
  • 16. Information Extraction: Methods
      • Focus of the rest of the tutorial and hands-on session: named entity recognition and classification (NERC)
      • Possible methodologies:
        • Rule-based approaches: write manual extraction rules
        • Machine learning based approaches:
          • Supervised learning: manually annotate text, train machine learning model
          • Unsupervised learning: extract language patterns, cluster similar ones
          • Semi-supervised learning: start with a small number of language patterns, iteratively learn more (bootstrapping)
        • Gazetteer-based method: use existing list of named entities
        • Combination of the above
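To make "write manual extraction rules" concrete, here is a deliberately naive rule, assumed for illustration only: treat every maximal run of capitalised tokens (except the sentence-initial one) as a named entity candidate. The function name `capitalised_spans` is hypothetical.

```python
import re


def capitalised_spans(sentence):
    """Return maximal runs of capitalised tokens, ignoring the first token of the sentence."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    spans, current = [], []
    for index, token in enumerate(tokens):
        # The sentence-initial token is capitalised whether or not it is a name, so skip it.
        if index > 0 and token[0].isupper():
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


sentence = ("Between AD 1400 and 1450, China was a global superpower run by one family "
            "– the Ming dynasty – who established Beijing as the capital and built "
            "the Forbidden City.")
print(capitalised_spans(sentence))
```

On the running example the rule finds "China", "Beijing" and "Forbidden City", but it also returns "AD" and truncates "Ming dynasty" to "Ming", which is the kind of error that motivates combining rules with gazetteers or learned models.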
  • 18. Information Extraction: Methods. Developing a NERC involves programming based around APIs... which can be frustrating at times
  • 19. Information Extraction: Methods. ...and (at least basic) knowledge about linguistics
  • 20. Background: Linguistics
      Levels of representation: acoustic signal, phonemes, morphemes, words, sentences, meaning
      Phonology, phonetics: acoustic signal -> /l/, /ʃ/ -> sh
      Morphology: wait + ed -> waited, cat -> cat
      Lexicon: wait -> V, cat -> N
      Syntax: [NP The (D) farmer (N)] hit (V) [NP the (D) donkey (N)].
      Semantics, Discourse: Every farmer who owns a donkey beats it. ∀x∀y (farmer(x) ∧ donkey(y) ∧ own(x, y) → beat(x, y))
  • 21. Background: NLP Tasks
      Levels of representation: acoustic signal, phonemes, morphemes, words, sentences, meaning
      Phonology, phonetics: speech recognition
      Morphology, Lexicon: lemmatisation or stemming, part of speech (POS) tagging
      Syntax: chunking, parsing
      Semantics, Discourse: semantic and discourse analysis, anaphora resolution
  • 32. Background: Linguistics. Ambiguities on every level:
      Phonology, phonetics: recognize speech – wreck a nice beach; /ˈrek.əɡ.naɪz/ /spiːtʃ/ – /rek/ /ə/ /naɪs/ /biːtʃ/
      Morphology, Lexicon: She’d /ʃiːd/ -> she would, she had
      Syntax: Time flies(V/N) like(V/P) an arrow. The woman saw the man with the binoculars. -> Who had the binoculars?
      Semantics, Discourse: Somewhere in Britain, some woman has a child every thirty seconds. -> Same woman or different women?
      Y U SO AMBIGUOUS?
  • 35. Information Extraction: Language is ambiguous... Can we still build named entity extractors that extract all entities from unseen text correctly? However, we can try to extract most of them correctly using linguistic cues and background knowledge!
  • 36. NERC: Features. What can help to recognise and/or classify named entities?
      • Words:
        • Words in window before and after mention
        • Sequences
        • Bags of words
      Example: Between AD 1400 and 1450, China was a global superpower
        w: China, w-1: ",", w-2: 1450, w+1: was, w+2: a
        seq[-]: "1450,", seq[+]: "was a"
        bow: China, bow[-]: ",", bow[-]: 1450, bow[+]: was, bow[+]: a
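The word features listed for the mention "China" can be computed with a few lines of code. The sketch below assumes pre-tokenised input and a window of two tokens on each side; the function name `window_features` and the choice to represent bags of words as sorted lists are assumptions, while the feature names follow the slide.

```python
def window_features(tokens, index, size=2):
    """Word, window, sequence and bag-of-words features for the token at `index`."""
    features = {"w": tokens[index]}
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    # Positional features: w-1 is the token directly before the mention, w+1 directly after.
    for offset, token in enumerate(reversed(left), start=1):
        features[f"w-{offset}"] = token
    for offset, token in enumerate(right, start=1):
        features[f"w+{offset}"] = token
    # Sequences keep token order; bags of words discard it.
    features["seq[-]"] = " ".join(left)
    features["seq[+]"] = " ".join(right)
    features["bow[-]"] = sorted(set(left))
    features["bow[+]"] = sorted(set(right))
    return features


tokens = ["Between", "AD", "1400", "and", "1450", ",", "China", "was", "a", "global", "superpower"]
features = window_features(tokens, tokens.index("China"))
for name, value in features.items():
    print(name, value)
```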
  • 37. NERC: Features. What can help to recognise and/or classify named entities?
      • Morphology:
        • Capitalisation: initial upper case (China), all upper case (IBM), mixed case (eBay)
        • Symbols: contains $, £, €, Roman numerals (IV), ...
        • Contains period (google.com), apostrophe (Mandy’s), hyphen (speed-o-meter), ampersand (Fisher & Sons)
        • Stem or lemma (cats -> cat), prefix (disadvantages -> dis), suffix (cats -> s), interfix (speed-o-meter -> o)
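Most of the surface features on this slide are simple string tests. The following sketch covers the capitalisation and symbol features plus a one-character suffix; stems, lemmas, prefixes and interfixes need a morphological analyser and are left out. The function name `shape_features` and the feature keys are hypothetical, and the Roman numeral check is a loose character test rather than a validator.

```python
import re


def shape_features(token):
    """Capitalisation and symbol features for a single token."""
    return {
        "init_upper": token[:1].isupper(),
        "all_upper": token.isupper(),
        # Mixed case: an upper-case letter after the first character plus some lower case.
        "mixed_case": any(c.isupper() for c in token[1:]) and any(c.islower() for c in token),
        "has_currency": any(c in "$£€" for c in token),
        # Loose check: only tests that every character is a Roman numeral letter.
        "roman_numeral": re.fullmatch(r"[IVXLCDM]+", token) is not None,
        "has_period": "." in token,
        "has_apostrophe": "'" in token or "’" in token,
        "has_hyphen": "-" in token,
        "has_ampersand": "&" in token,
        "suffix1": token[-1:].lower(),
    }


for example in ["China", "IBM", "eBay", "IV", "google.com", "Mandy’s", "speed-o-meter", "cats"]:
    active = [name for name, value in shape_features(example).items() if value is True]
    print(example, active)
```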
  • 38. NERC: Features. What can help to recognise and/or classify named entities?
      • POS (part of speech) tags
      • Most named entities are nouns
      • Prokofyev (2014), Section 4.1 "Part-of-Speech Tags": "Part-Of-Speech (POS) tags have often been considered as an important discriminative feature for term identification. Many works on key term identification apply either fixed or regular expression POS tag patterns to improve their effectiveness. Nonetheless, POS tags alone cannot produce high-quality results. As can be seen from the overall POS tag distribution graph extracted from one of our collections (see Figure 3), many of the most frequent tag patterns (e.g., JJ NN tagging adjectives and nouns) are far from yielding perfect results."
  • 39. [Excerpt from Prokofyev (2014), truncated on the slide. Figure 3: top 6 most frequent part-of-speech tag patterns of the SIGIR collection. Table 1: contingency table for punctuation marks appearing immediately before an n-gram (Valid column only: +punctuation 1622, −punctuation 6523, total 8145). Table 2: contingency table for punctuation marks appearing immediately after an n-gram (Valid column only: +punctuation 4887).]
  • 41. Morphology: Penn Treebank POS tags • Nouns (all start with N) • Verbs (all start with V) • Adjectives (all start with J)
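The first-letter rule on this slide makes it easy to collapse Penn Treebank tags into coarse classes. The sketch below assumes the tokens have already been tagged by some POS tagger; the function name `coarse_pos` and the "other" fallback class are made up for the example.

```python
def coarse_pos(tag):
    """Map a Penn Treebank tag to a coarse class using its first letter."""
    if tag.startswith("N"):   # NN, NNS, NNP, NNPS
        return "noun"
    if tag.startswith("V"):   # VB, VBD, VBG, VBN, VBP, VBZ
        return "verb"
    if tag.startswith("J"):   # JJ, JJR, JJS
        return "adjective"
    return "other"


# Hypothetical tagger output for part of the running example.
tagged = [("China", "NNP"), ("was", "VBD"), ("a", "DT"),
          ("global", "JJ"), ("superpower", "NN")]
coarse = [(word, coarse_pos(tag)) for word, tag in tagged]
```

A feature such as "is a noun" can then be read directly from the coarse class.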
  • 44. NERC: Features • What can help to recognise and/or classify named entities? • Gazetteers • Retrieved from HTML lists or tables • Retrieved using regular expression patterns and search engines (e.g. “Religions such as * ”) • Retrieved from knowledge bases
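As a rough illustration of the pattern-based approach, the sketch below harvests gazetteer entries from text that follows a cue phrase such as "Religions such as". It is a simplification under stated assumptions: it runs on a local string rather than on search engine results, it only captures single capitalised words, and the function name `harvest_entries` is hypothetical.

```python
import re

# Separators inside an enumeration: ", ", ", and " or " and ".
SEPARATOR = r",\s*(?:and\s+)?|\s+and\s+"


def harvest_entries(cue_phrase, text):
    """Collect capitalised words enumerated after a cue phrase."""
    pattern = re.compile(
        re.escape(cue_phrase)
        + r"\s+([A-Z]\w+(?:(?:" + SEPARATOR + r")[A-Z]\w+)*)"
    )
    entries = set()
    for match in pattern.finditer(text):
        entries.update(re.split(SEPARATOR, match.group(1)))
    return entries


text = "Religions such as Buddhism, Hinduism and Islam are practised worldwide."
religions = harvest_entries("Religions such as", text)
```

The resulting set can serve as a gazetteer, so that membership of a token in `religions` becomes a binary feature.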
  • 45. NERC: Training Models • Extensive choice of machine learning algorithms for training NERCs
  • 47. NERC: Training Models • Unfortunately, there isn’t enough time to explain machine learning algorithms in detail • CRFs (conditional random fields) are among the most widely used algorithms for NERC • They are graphical models that treat NERC as a sequence labelling task • Each token is labelled as the beginning of a named entity (B), inside a named entity (I), or outside any named entity (O) • Example: China (B-LOC) built (O) the (O) Forbidden (B-LOC) City (I-LOC) . (O) • For now, we will focus on rule- and gazetteer-based NERC • It is fairly easy to write manual extraction rules for NEs, and they can achieve high performance when combined with gazetteers • This can be done with the GATE software (General Architecture for Text Engineering) and JAPE rules: see the hands-on session
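To make the B/I/O scheme concrete, here is a minimal gazetteer-based tagger in Python that reproduces the labels of the example sentence. It is only a sketch: it is not the GATE/JAPE approach used in the hands-on session, and the function name, the greedy longest-match strategy and the toy gazetteer are assumptions for illustration.

```python
def gazetteer_bio_tag(tokens, gazetteer):
    """Assign BIO labels by greedy longest-match lookup in a gazetteer.

    `gazetteer` maps tuples of tokens to an entity class, e.g. ("Forbidden", "City") -> "LOC".
    """
    max_len = max((len(entry) for entry in gazetteer), default=0)
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            entity_class = gazetteer.get(tuple(tokens[i:i + n]))
            if entity_class is not None:
                labels[i] = f"B-{entity_class}"        # beginning token
                for j in range(i + 1, i + n):
                    labels[j] = f"I-{entity_class}"    # inside tokens
                matched = n
                break
        i += matched or 1  # skip past the match, or move on by one token
    return labels


gazetteer = {("China",): "LOC", ("Forbidden", "City"): "LOC"}
tokens = ["China", "built", "the", "Forbidden", "City", "."]
labels = gazetteer_bio_tag(tokens, gazetteer)
```

A CRF would predict the same kind of label sequence from features of each token and its neighbours instead of from exact gazetteer matches.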
  • 48. Outlook: NLP and ML Software • Natural Language Processing: - GATE (general purpose architecture, includes other NLP and ML software as plugins) - Stanford NLP (Java) - OpenNLP (Java) - NLTK (Python) • Machine Learning: - scikit-learn (Python, rich documentation, highly recommended!) - Mallet (Java) - WEKA (Java) - Alchemy (graphical models, C++) - FACTORIE (graphical models, Scala) - CRFSuite (efficient implementation of CRFs, C/C++ with Python bindings)
  • 49. Outlook: NLP and ML Software • Ready-to-use NERC software: - ANNIE (rule-based, part of GATE) - Wikifier (based on Wikipedia) - FIGER (based on Wikipedia, fine-grained Freebase NE classes) • Almost ready-to-use NERC software: - CRFSuite (already includes a Python implementation for feature extraction; you just need to feed it with training data, which you can also download) • Ready-to-use RE software: - ReVerb (Open IE, extracts patterns for any kind of relation) - MultiR (distant supervision, relation extractor trained on Freebase) • Web content extraction software: - Boilerpipe (extracts the main text content from Web pages) - Jsoup (traverses elements of Web pages individually, also allows extracting text)
  • 50. Natural Language Processing for the Semantic Web • Thank you for your attention! Questions?