How to compute
semantic relationships
between entities
and facts out of
natural texts
Michael Fuchs
Technology Evangelist
ABBYY
fuchs@abbyy.com
Agenda
1. How machines read pixels
2. Documents, words, layout & semantics
3. Syntactic & semantic text parsing
4. Live demo
5. Q&A
How machines read pixels
Pixel analysis
Find text/image blocks
Separate pixels into characters
How machines read pixels
Recognize individual characters
Build proper words as editable text
Requirements to make a machine read text:
-> Linguistics: Alphabets & Morphology Dictionaries
-> Math, AI, Statistics, Experience, and…
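As an illustration of the step that separates pixels into characters, here is a minimal sketch of one classic technique, a vertical projection profile: columns without ink are treated as gaps between glyphs. The bitmap and the function name `split_columns` are invented for this example, and the slides do not say which segmentation method ABBYY actually uses.

```python
def split_columns(bitmap):
    """Return (start, end) column ranges that contain ink.

    bitmap is a list of rows; 1 means ink, 0 means background.
    Blank columns are assumed to separate neighbouring characters.
    """
    width = len(bitmap[0])
    ink = [any(row[x] for row in bitmap) for x in range(width)]
    segments, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a new glyph begins
        elif not has_ink and start is not None:
            segments.append((start, x))    # glyph ended at a blank column
            start = None
    if start is not None:
        segments.append((start, width))    # glyph touches the right edge
    return segments


# Toy 3x7 bitmap with three ink regions (made up for illustration).
BITMAP = [
    [1, 0, 0, 1, 1, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 0, 1],
]
segments = split_columns(BITMAP)
```

Touching or broken glyphs defeat this simple rule, which is one reason the requirements list also names linguistics and statistics.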
What is needed to make
a machine understand the meaning
of words, sentences, texts?
Documents & Words
What is a document?
a) Bag of words?
Statistics can give
basic insights
-> No real semantic
understanding
b) Words in order?
Layouts generate
visual pattern
-> Semantics can be
derived from layout
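A small sketch of why a bag of words gives statistics but no real semantic understanding: two sentences with opposite meanings produce identical word counts. The example sentences and the helper `bag_of_words` are assumptions made for illustration.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text, strip simple punctuation, and count the tokens."""
    tokens = (tok.strip(".,!?;:").lower() for tok in text.split())
    return Counter(tok for tok in tokens if tok)


a = bag_of_words("The dog bites the man.")
b = bag_of_words("The man bites the dog.")

# Word order is discarded, so both sentences collapse to the same bag.
same_bag = a == b
```

The counts still show what a text is about (dog, man, bites), but not who did what to whom.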
Documents, Words and Layout
Document with layout
Text document with “simulated” layout
Text with line breaks
Text only
-> Rules can extract data out of (semi-)structured texts and documents
-> Layout helps to identify the semantic meaning of data
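A minimal sketch of such a rule, assuming a semi-structured text in which every line has the form "label: value". The sample text, the field names, and the pattern `FIELD_RULE` are hypothetical.

```python
import re

# Hypothetical semi-structured text; the line layout carries the meaning.
SAMPLE = """Invoice No: 2024-0042
Date: 2024-03-15
Total: 1,250.00 EUR
"""

# One rule: a label made of letters and spaces, a colon, then the value.
FIELD_RULE = re.compile(r"^(?P<label>[A-Za-z ]+):\s*(?P<value>.+)$", re.MULTILINE)


def extract_fields(text: str) -> dict:
    """Apply the rule to every line and collect label/value pairs."""
    return {
        m.group("label").strip(): m.group("value").strip()
        for m in FIELD_RULE.finditer(text)
    }


fields = extract_fields(SAMPLE)
```

The rule works only because the layout is predictable. The same information written as a free-running sentence would not match.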
Text and Structure
Is “plain” natural language text unstructured?
-> yes, at least for almost all IT systems
-> not for humans who can read and
speak the language
-> Facts and their relations can’t be reliably
detected with “simple” rules
Text, Structure & Translation
Is a word by word translation enough?
-> … well – not really…
-> Semantic understanding of the words and
their relationship in sentences is needed!
-> That is true for humans and machines
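A toy sketch of word-by-word translation, assuming a three-entry German-English glossary written for this example. Each word is translated correctly on its own, yet the result is not idiomatic English.

```python
# Hypothetical mini glossary (German -> English), for illustration only.
GLOSSARY = {"ich": "I", "habe": "have", "hunger": "hunger"}


def word_by_word(sentence: str, glossary: dict) -> str:
    """Replace each word independently, ignoring grammar and context."""
    return " ".join(glossary.get(word.lower(), word) for word in sentence.split())


literal = word_by_word("Ich habe Hunger", GLOSSARY)
# literal is "I have hunger"; an English speaker would say "I am hungry".
```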
Text & Structure
Why is natural language text understanding difficult for machines?
-> Languages are not strictly logical, and meaning depends on context
-> One word – different variants, e.g. go, went, gone
– different usage, e.g. as verb, noun, adjective
– different meanings, e.g. run, plant, apple …
-> Different words – the same concept, e.g. to buy/sell something
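The two lookups below sketch these difficulties with a hand-made lexicon: several surface forms map to one lemma, and one lemma maps to several senses. All entries are assumptions made for illustration; a real system would draw on full morphology dictionaries.

```python
# Hypothetical hand-made lexicon, not taken from any real dictionary.
LEMMA_OF = {"go": "go", "went": "go", "gone": "go"}
SENSES_OF = {
    "run": ["move fast on foot", "operate a machine"],
    "plant": ["living organism", "factory"],
    "apple": ["fruit", "company name"],
}


def lemma(word: str) -> str:
    """Map a surface form to its base form; unknown words map to themselves."""
    return LEMMA_OF.get(word.lower(), word.lower())


def is_ambiguous(word: str) -> bool:
    """A word is ambiguous here if its lemma has more than one listed sense."""
    return len(SENSES_OF.get(lemma(word), [])) > 1


variants = {lemma(w) for w in ["go", "went", "gone"]}
```

Choosing the right sense of "plant" in a given sentence needs context, which a lookup table alone cannot provide.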
Basic Language Structure
-> Morphology = rules for how words are formed and inflected
-> Syntax = rules used to build correct sentences
-> Semantics = the meaning and usage of words
-> Semantic Relations = reflect/organise the meaning and
relations of words and sentences.
How do we get to the inside of a sentence?
Compreno System Architecture
[Architecture diagram; component labels:]
Parser: Morphological analyzer, Syntactic and semantic analysis, Disambiguation, Anaphora resolution
Semantic representation of text
Information Extraction Module: Identification rules, Interpretation rules, Extraction rules
RDF Graph
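To make the final output stage more concrete, here is a minimal sketch that serializes extracted facts as RDF in N-Triples syntax using plain Python. The namespace `http://example.org/`, the predicate names, and the facts are placeholders; this is not actual Compreno output.

```python
BASE = "http://example.org/"  # placeholder namespace, an assumption


def to_ntriples(facts, base=BASE):
    """Render (subject, predicate, object) tuples as N-Triples lines."""
    return "\n".join(
        f"<{base}{s}> <{base}{p}> <{base}{o}> ." for s, p, o in facts
    )


# Hypothetical facts, loosely based on a simple example sentence.
facts = [("Mary", "saw", "students"), ("students", "wore", "masks")]
graph_text = to_ntriples(facts)
```

Each line is one edge of the graph, so facts extracted from different sentences can be merged by concatenating their triples.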
Morphology Analysis
Sentence Analysis with Semantic Info
How to get the correct
semantic meaning of words?
ABBYY’s answer:
Universal Semantic Hierarchy
= language independent semantic concepts
ABBYY’s Universal Semantic Hierarchy
Table columns: Semantic Meaning, “Vocabulary” EN, “Vocabulary” DE
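The idea of language-independent concepts can be sketched as a lookup from (language, word) pairs to shared concept identifiers. The identifiers and entries below are invented for illustration and are not taken from ABBYY's actual hierarchy.

```python
# Concept identifiers are hypothetical placeholders.
CONCEPT_OF = {
    ("en", "buy"): "ACQUIRE_FOR_MONEY",
    ("en", "purchase"): "ACQUIRE_FOR_MONEY",
    ("de", "kaufen"): "ACQUIRE_FOR_MONEY",
    ("en", "sell"): "TRANSFER_FOR_MONEY",
    ("de", "verkaufen"): "TRANSFER_FOR_MONEY",
}


def same_concept(a, b) -> bool:
    """True if both (language, word) pairs map to the same known concept."""
    concept_a, concept_b = CONCEPT_OF.get(a), CONCEPT_OF.get(b)
    return concept_a is not None and concept_a == concept_b


cross_language_match = same_concept(("en", "buy"), ("de", "kaufen"))
```

Because rules can then be written against concepts instead of words, one rule could in principle cover every language with a vocabulary mapped to the hierarchy.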
Handling Lexical Ambiguity
Recovering Omitted Words and Links (Ellipsis)
Recovered Node
Ellipsis
Identifying Pronoun Referents (Anaphora)
Mary saw her students. They were wearing masks. She was surprised.
(Mary → her, Mary → she, students → they).
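The links in this example can be reproduced with a deliberately naive resolver: each pronoun is attached to the most recent earlier mention that agrees in number and gender. The two small lexicons are hand-written assumptions. This sketch only illustrates the task and is far simpler than a parser-based approach such as the one the slides describe.

```python
import re

# Hypothetical lexicons: word -> (number, gender); None means "any gender".
PRONOUNS = {
    "her": ("singular", "feminine"),
    "she": ("singular", "feminine"),
    "they": ("plural", None),
}
ENTITIES = {
    "mary": ("singular", "feminine"),
    "students": ("plural", None),
}


def resolve(text: str):
    """Link each pronoun to the nearest preceding entity that agrees with it."""
    seen, links = [], []
    for token in re.findall(r"[A-Za-z]+", text):
        key = token.lower()
        if key in ENTITIES:
            seen.append(key)
        elif key in PRONOUNS:
            number, gender = PRONOUNS[key]
            for candidate in reversed(seen):  # most recent mention first
                c_number, c_gender = ENTITIES[candidate]
                if c_number == number and (gender is None or c_gender == gender):
                    links.append((candidate, key))
                    break
    return links


links = resolve("Mary saw her students. They were wearing masks. She was surprised.")
```

A second female name anywhere in the text would already break this agreement heuristic, which is why anaphora is listed among the complex language phenomena.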
From Text to Semantic with Compreno
DEMO
Summary: What is ABBYY Compreno?
● … an NLP technology featuring a unique model-based approach that employs
universal language models and identifies language structures.
● … combines syntactic and semantic analysis with machine learning
on untagged text corpora.
● … makes it possible to create a semantic representation of text
● … able to resolve complex language phenomena:
− lexical ambiguity
− recovery of omitted words and links (ellipsis)
− identification of pronoun referents (anaphora)
− coreference
− coordination and more
● … support of English, Russian, German in progress
QUESTIONS?
Thank you for
your attention!

Mais conteúdo relacionado

Destaque

Robert Isele | eccenca CorporateMemory - Semantically integrated Enterprise D...
Robert Isele | eccenca CorporateMemory - Semantically integrated Enterprise D...Robert Isele | eccenca CorporateMemory - Semantically integrated Enterprise D...
Robert Isele | eccenca CorporateMemory - Semantically integrated Enterprise D...semanticsconference
 
Philippe Martin and Jérémy Bénard | Importing, Translating and Exporting Know...
Philippe Martin and Jérémy Bénard | Importing, Translating and Exporting Know...Philippe Martin and Jérémy Bénard | Importing, Translating and Exporting Know...
Philippe Martin and Jérémy Bénard | Importing, Translating and Exporting Know...semanticsconference
 
Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Un...
Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Un...Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Un...
Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Un...semanticsconference
 
Vladimir Alexiev | Semantic Enrichment of Twitter Microposts Helps Understand...
Vladimir Alexiev | Semantic Enrichment of Twitter Microposts Helps Understand...Vladimir Alexiev | Semantic Enrichment of Twitter Microposts Helps Understand...
Vladimir Alexiev | Semantic Enrichment of Twitter Microposts Helps Understand...semanticsconference
 
Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...
Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...
Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...semanticsconference
 
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...semanticsconference
 
David Kuilman | Creating a Semantic Enterprise Content model to support conti...
David Kuilman | Creating a Semantic Enterprise Content model to support conti...David Kuilman | Creating a Semantic Enterprise Content model to support conti...
David Kuilman | Creating a Semantic Enterprise Content model to support conti...semanticsconference
 
Victor Charpenay | Standardized Semantics for an Open Web of Things
Victor Charpenay | Standardized Semantics for an Open Web of ThingsVictor Charpenay | Standardized Semantics for an Open Web of Things
Victor Charpenay | Standardized Semantics for an Open Web of Thingssemanticsconference
 
Shuangyong Song, Qingliang Miao and Yao Meng | Linking Images to Semantic Kno...
Shuangyong Song, Qingliang Miao and Yao Meng | Linking Images to Semantic Kno...Shuangyong Song, Qingliang Miao and Yao Meng | Linking Images to Semantic Kno...
Shuangyong Song, Qingliang Miao and Yao Meng | Linking Images to Semantic Kno...semanticsconference
 
Kolawole John Adebayo, Luigi Di Caro and Guido Boella | A Supervised Keyphras...
Kolawole John Adebayo, Luigi Di Caro and Guido Boella | A Supervised Keyphras...Kolawole John Adebayo, Luigi Di Caro and Guido Boella | A Supervised Keyphras...
Kolawole John Adebayo, Luigi Di Caro and Guido Boella | A Supervised Keyphras...semanticsconference
 
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...semanticsconference
 
Najmeh Mousavi Nejad, Simon Scerri, Sören Auer and Elisa M. Sibarani | EULAid...
Najmeh Mousavi Nejad, Simon Scerri, Sören Auer and Elisa M. Sibarani | EULAid...Najmeh Mousavi Nejad, Simon Scerri, Sören Auer and Elisa M. Sibarani | EULAid...
Najmeh Mousavi Nejad, Simon Scerri, Sören Auer and Elisa M. Sibarani | EULAid...semanticsconference
 
Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for EnterpriseChalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprisesemanticsconference
 
Kostas Kastrantas | Business Opportunities with Linked Open Data
Kostas Kastrantas  | Business Opportunities with Linked Open DataKostas Kastrantas  | Business Opportunities with Linked Open Data
Kostas Kastrantas | Business Opportunities with Linked Open Datasemanticsconference
 
OWL-based validation by Gavin Mendel Gleasonand Bojan Bozic, Trinity College,...
OWL-based validation by Gavin Mendel Gleasonand Bojan Bozic, Trinity College,...OWL-based validation by Gavin Mendel Gleasonand Bojan Bozic, Trinity College,...
OWL-based validation by Gavin Mendel Gleasonand Bojan Bozic, Trinity College,...semanticsconference
 
Thomas Vavra | New Ways of Handling Old Data
Thomas Vavra | New Ways of Handling Old DataThomas Vavra | New Ways of Handling Old Data
Thomas Vavra | New Ways of Handling Old Datasemanticsconference
 
OOPS!: on-line ontology diagnosis by Maria Poveda
OOPS!: on-line ontology diagnosis by Maria PovedaOOPS!: on-line ontology diagnosis by Maria Poveda
OOPS!: on-line ontology diagnosis by Maria Povedasemanticsconference
 
Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern...
Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern...Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern...
Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern...semanticsconference
 
Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE
Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINEFelix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE
Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINEsemanticsconference
 
Sören Auer | Enterprise Knowledge Graphs
Sören Auer | Enterprise Knowledge GraphsSören Auer | Enterprise Knowledge Graphs
Sören Auer | Enterprise Knowledge Graphssemanticsconference
 

Destaque (20)

Robert Isele | eccenca CorporateMemory - Semantically integrated Enterprise D...
Robert Isele | eccenca CorporateMemory - Semantically integrated Enterprise D...Robert Isele | eccenca CorporateMemory - Semantically integrated Enterprise D...
Robert Isele | eccenca CorporateMemory - Semantically integrated Enterprise D...
 
Philippe Martin and Jérémy Bénard | Importing, Translating and Exporting Know...
Philippe Martin and Jérémy Bénard | Importing, Translating and Exporting Know...Philippe Martin and Jérémy Bénard | Importing, Translating and Exporting Know...
Philippe Martin and Jérémy Bénard | Importing, Translating and Exporting Know...
 
Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Un...
Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Un...Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Un...
Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Un...
 
Vladimir Alexiev | Semantic Enrichment of Twitter Microposts Helps Understand...
Vladimir Alexiev | Semantic Enrichment of Twitter Microposts Helps Understand...Vladimir Alexiev | Semantic Enrichment of Twitter Microposts Helps Understand...
Vladimir Alexiev | Semantic Enrichment of Twitter Microposts Helps Understand...
 
Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...
Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...
Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...
 
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...
 
David Kuilman | Creating a Semantic Enterprise Content model to support conti...
David Kuilman | Creating a Semantic Enterprise Content model to support conti...David Kuilman | Creating a Semantic Enterprise Content model to support conti...
David Kuilman | Creating a Semantic Enterprise Content model to support conti...
 
Victor Charpenay | Standardized Semantics for an Open Web of Things
Victor Charpenay | Standardized Semantics for an Open Web of ThingsVictor Charpenay | Standardized Semantics for an Open Web of Things
Victor Charpenay | Standardized Semantics for an Open Web of Things
 
Shuangyong Song, Qingliang Miao and Yao Meng | Linking Images to Semantic Kno...
Shuangyong Song, Qingliang Miao and Yao Meng | Linking Images to Semantic Kno...Shuangyong Song, Qingliang Miao and Yao Meng | Linking Images to Semantic Kno...
Shuangyong Song, Qingliang Miao and Yao Meng | Linking Images to Semantic Kno...
 
Kolawole John Adebayo, Luigi Di Caro and Guido Boella | A Supervised Keyphras...
Kolawole John Adebayo, Luigi Di Caro and Guido Boella | A Supervised Keyphras...Kolawole John Adebayo, Luigi Di Caro and Guido Boella | A Supervised Keyphras...
Kolawole John Adebayo, Luigi Di Caro and Guido Boella | A Supervised Keyphras...
 
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
 
Najmeh Mousavi Nejad, Simon Scerri, Sören Auer and Elisa M. Sibarani | EULAid...
Najmeh Mousavi Nejad, Simon Scerri, Sören Auer and Elisa M. Sibarani | EULAid...Najmeh Mousavi Nejad, Simon Scerri, Sören Auer and Elisa M. Sibarani | EULAid...
Najmeh Mousavi Nejad, Simon Scerri, Sören Auer and Elisa M. Sibarani | EULAid...
 
Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for EnterpriseChalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
 
Kostas Kastrantas | Business Opportunities with Linked Open Data
Kostas Kastrantas  | Business Opportunities with Linked Open DataKostas Kastrantas  | Business Opportunities with Linked Open Data
Kostas Kastrantas | Business Opportunities with Linked Open Data
 
OWL-based validation by Gavin Mendel Gleasonand Bojan Bozic, Trinity College,...
OWL-based validation by Gavin Mendel Gleasonand Bojan Bozic, Trinity College,...OWL-based validation by Gavin Mendel Gleasonand Bojan Bozic, Trinity College,...
OWL-based validation by Gavin Mendel Gleasonand Bojan Bozic, Trinity College,...
 
Thomas Vavra | New Ways of Handling Old Data
Thomas Vavra | New Ways of Handling Old DataThomas Vavra | New Ways of Handling Old Data
Thomas Vavra | New Ways of Handling Old Data
 
OOPS!: on-line ontology diagnosis by Maria Poveda
OOPS!: on-line ontology diagnosis by Maria PovedaOOPS!: on-line ontology diagnosis by Maria Poveda
OOPS!: on-line ontology diagnosis by Maria Poveda
 
Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern...
Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern...Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern...
Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern...
 
Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE
Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINEFelix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE
Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE
 
Sören Auer | Enterprise Knowledge Graphs
Sören Auer | Enterprise Knowledge GraphsSören Auer | Enterprise Knowledge Graphs
Sören Auer | Enterprise Knowledge Graphs
 

Semelhante a Michael Fuchs | How to compute semantic relationships between entities and facts out of natural texts

Introduction to Semantic Technology for SharePoint Administrators
Introduction to Semantic Technology for SharePoint AdministratorsIntroduction to Semantic Technology for SharePoint Administrators
Introduction to Semantic Technology for SharePoint AdministratorsBradley Bennet
 
Information retrieval based on word sens 1
Information retrieval based on word sens 1Information retrieval based on word sens 1
Information retrieval based on word sens 1ATHMAN HAJ-HAMOU
 
Coaching kippsters to guided reading success
Coaching kippsters to guided reading successCoaching kippsters to guided reading success
Coaching kippsters to guided reading successbvardiman
 
Conceptual foundations of text mining and preprocessing steps nfaoui el_habib
Conceptual foundations of text mining and preprocessing steps nfaoui el_habibConceptual foundations of text mining and preprocessing steps nfaoui el_habib
Conceptual foundations of text mining and preprocessing steps nfaoui el_habibEl Habib NFAOUI
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language ProcessingMichel Bruley
 
The search engine index
The search engine indexThe search engine index
The search engine indexCJ Jenkins
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional SemanticsAndre Freitas
 
Survey methods of_teaching_esl_reading
Survey methods of_teaching_esl_readingSurvey methods of_teaching_esl_reading
Survey methods of_teaching_esl_readingMarv1
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingBhavya Chawla
 
Reading Automaticity by David LaBerge and S Jay Samuels
Reading Automaticity by David LaBerge  and S Jay SamuelsReading Automaticity by David LaBerge  and S Jay Samuels
Reading Automaticity by David LaBerge and S Jay SamuelsDoha Zallag
 
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUECOMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUEJournal For Research
 

Semelhante a Michael Fuchs | How to compute semantic relationships between entities and facts out of natural texts (20)

NLP
NLPNLP
NLP
 
Textmining
TextminingTextmining
Textmining
 
Introduction to Semantic Technology for SharePoint Administrators
Introduction to Semantic Technology for SharePoint AdministratorsIntroduction to Semantic Technology for SharePoint Administrators
Introduction to Semantic Technology for SharePoint Administrators
 
Nlp
NlpNlp
Nlp
 
NLP
NLPNLP
NLP
 
NLP
NLPNLP
NLP
 
nlp (1).pptx
nlp (1).pptxnlp (1).pptx
nlp (1).pptx
 
Information retrieval based on word sens 1
Information retrieval based on word sens 1Information retrieval based on word sens 1
Information retrieval based on word sens 1
 
Coaching kippsters to guided reading success
Coaching kippsters to guided reading successCoaching kippsters to guided reading success
Coaching kippsters to guided reading success
 
Conceptual foundations of text mining and preprocessing steps nfaoui el_habib
Conceptual foundations of text mining and preprocessing steps nfaoui el_habibConceptual foundations of text mining and preprocessing steps nfaoui el_habib
Conceptual foundations of text mining and preprocessing steps nfaoui el_habib
 
Nlp (1)
Nlp (1)Nlp (1)
Nlp (1)
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language Processing
 
NLP todo
NLP todoNLP todo
NLP todo
 
The search engine index
The search engine indexThe search engine index
The search engine index
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
 
NLP.pptx
NLP.pptxNLP.pptx
NLP.pptx
 
Survey methods of_teaching_esl_reading
Survey methods of_teaching_esl_readingSurvey methods of_teaching_esl_reading
Survey methods of_teaching_esl_reading
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Reading Automaticity by David LaBerge and S Jay Samuels
Reading Automaticity by David LaBerge  and S Jay SamuelsReading Automaticity by David LaBerge  and S Jay Samuels
Reading Automaticity by David LaBerge and S Jay Samuels
 
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUECOMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
 

Mais de semanticsconference

Linear books to open world adventure
Linear books to open world adventureLinear books to open world adventure
Linear books to open world adventuresemanticsconference
 
Session 1.2 high-precision, context-free entity linking exploiting unambigu...
Session 1.2   high-precision, context-free entity linking exploiting unambigu...Session 1.2   high-precision, context-free entity linking exploiting unambigu...
Session 1.2 high-precision, context-free entity linking exploiting unambigu...semanticsconference
 
Session 4.3 semantic annotation for enhancing collaborative ideation
Session 4.3   semantic annotation for enhancing collaborative ideationSession 4.3   semantic annotation for enhancing collaborative ideation
Session 4.3 semantic annotation for enhancing collaborative ideationsemanticsconference
 
Session 1.1 dalicc - data licenses clearance center
Session 1.1   dalicc - data licenses clearance centerSession 1.1   dalicc - data licenses clearance center
Session 1.1 dalicc - data licenses clearance centersemanticsconference
 
Session 1.3 context information management across smart city knowledge domains
Session 1.3   context information management across smart city knowledge domainsSession 1.3   context information management across smart city knowledge domains
Session 1.3 context information management across smart city knowledge domainssemanticsconference
 
Session 0.0 aussenac semanticsnl-pwebsem2017-v4
Session 0.0   aussenac semanticsnl-pwebsem2017-v4Session 0.0   aussenac semanticsnl-pwebsem2017-v4
Session 0.0 aussenac semanticsnl-pwebsem2017-v4semanticsconference
 
Session 0.0 keynote sandeep sacheti - final hi res
Session 0.0   keynote sandeep sacheti - final hi resSession 0.0   keynote sandeep sacheti - final hi res
Session 0.0 keynote sandeep sacheti - final hi ressemanticsconference
 
Session 1.1 linked data applied: a field report from the netherlands
Session 1.1   linked data applied: a field report from the netherlandsSession 1.1   linked data applied: a field report from the netherlands
Session 1.1 linked data applied: a field report from the netherlandssemanticsconference
 
Session 1.2 enrich your knowledge graphs: linked data integration with pool...
Session 1.2   enrich your knowledge graphs: linked data integration with pool...Session 1.2   enrich your knowledge graphs: linked data integration with pool...
Session 1.2 enrich your knowledge graphs: linked data integration with pool...semanticsconference
 
Session 1.4 connecting information from legislation and datasets using a ca...
Session 1.4   connecting information from legislation and datasets using a ca...Session 1.4   connecting information from legislation and datasets using a ca...
Session 1.4 connecting information from legislation and datasets using a ca...semanticsconference
 
Session 1.4 a distributed network of heritage information
Session 1.4   a distributed network of heritage informationSession 1.4   a distributed network of heritage information
Session 1.4 a distributed network of heritage informationsemanticsconference
 
Session 0.0 media panel - matthias priem - gtuo - semantics 2017
Session 0.0   media panel - matthias priem - gtuo - semantics 2017Session 0.0   media panel - matthias priem - gtuo - semantics 2017
Session 0.0 media panel - matthias priem - gtuo - semantics 2017semanticsconference
 
Session 1.3 semantic asset management in the dutch rail engineering and con...
Session 1.3   semantic asset management in the dutch rail engineering and con...Session 1.3   semantic asset management in the dutch rail engineering and con...
Session 1.3 semantic asset management in the dutch rail engineering and con...semanticsconference
 
Session 1.3 energy, smart homes & smart grids: towards interoperability...
Session 1.3   energy, smart homes & smart grids: towards interoperability...Session 1.3   energy, smart homes & smart grids: towards interoperability...
Session 1.3 energy, smart homes & smart grids: towards interoperability...semanticsconference
 
Session 1.2 improving access to digital content by semantic enrichment
Session 1.2   improving access to digital content by semantic enrichmentSession 1.2   improving access to digital content by semantic enrichment
Session 1.2 improving access to digital content by semantic enrichmentsemanticsconference
 
Session 2.3 semantics for safeguarding & security – a police story
Session 2.3   semantics for safeguarding & security – a police storySession 2.3   semantics for safeguarding & security – a police story
Session 2.3 semantics for safeguarding & security – a police storysemanticsconference
 
Session 2.5 semantic similarity based clustering of license excerpts for im...
Session 2.5   semantic similarity based clustering of license excerpts for im...Session 2.5   semantic similarity based clustering of license excerpts for im...
Session 2.5 semantic similarity based clustering of license excerpts for im...semanticsconference
 
Session 4.2 unleash the triple: leveraging a corporate discovery interface....
Session 4.2   unleash the triple: leveraging a corporate discovery interface....Session 4.2   unleash the triple: leveraging a corporate discovery interface....
Session 4.2 unleash the triple: leveraging a corporate discovery interface....semanticsconference
 
Session 1.6 slovak public metadata governance and management based on linke...
Session 1.6   slovak public metadata governance and management based on linke...Session 1.6   slovak public metadata governance and management based on linke...
Session 1.6 slovak public metadata governance and management based on linke...semanticsconference
 
Session 5.6 towards a semantic outlier detection framework in wireless sens...
Session 5.6   towards a semantic outlier detection framework in wireless sens...Session 5.6   towards a semantic outlier detection framework in wireless sens...
Session 5.6 towards a semantic outlier detection framework in wireless sens...semanticsconference
 

Mais de semanticsconference (20)

Linear books to open world adventure
Linear books to open world adventureLinear books to open world adventure
Linear books to open world adventure
 
Session 1.2 high-precision, context-free entity linking exploiting unambigu...
Session 1.2   high-precision, context-free entity linking exploiting unambigu...Session 1.2   high-precision, context-free entity linking exploiting unambigu...
Session 1.2 high-precision, context-free entity linking exploiting unambigu...
 
Session 4.3 semantic annotation for enhancing collaborative ideation
Session 4.3   semantic annotation for enhancing collaborative ideationSession 4.3   semantic annotation for enhancing collaborative ideation
Session 4.3 semantic annotation for enhancing collaborative ideation
 
Session 1.1 dalicc - data licenses clearance center
Session 1.1   dalicc - data licenses clearance centerSession 1.1   dalicc - data licenses clearance center
Session 1.1 dalicc - data licenses clearance center
 
Session 1.3 context information management across smart city knowledge domains
Session 1.3   context information management across smart city knowledge domainsSession 1.3   context information management across smart city knowledge domains
Session 1.3 context information management across smart city knowledge domains
 
Session 0.0 aussenac semanticsnl-pwebsem2017-v4
Session 0.0   aussenac semanticsnl-pwebsem2017-v4Session 0.0   aussenac semanticsnl-pwebsem2017-v4
Session 0.0 aussenac semanticsnl-pwebsem2017-v4
 
Session 0.0 keynote sandeep sacheti - final hi res
Session 0.0   keynote sandeep sacheti - final hi resSession 0.0   keynote sandeep sacheti - final hi res
Session 0.0 keynote sandeep sacheti - final hi res
 
Session 1.1 linked data applied: a field report from the netherlands
Session 1.1   linked data applied: a field report from the netherlandsSession 1.1   linked data applied: a field report from the netherlands
Session 1.1 linked data applied: a field report from the netherlands
 
Session 1.2 enrich your knowledge graphs: linked data integration with pool...
Session 1.2   enrich your knowledge graphs: linked data integration with pool...Session 1.2   enrich your knowledge graphs: linked data integration with pool...
Session 1.2 enrich your knowledge graphs: linked data integration with pool...
 
Session 1.4 connecting information from legislation and datasets using a ca...
Session 1.4   connecting information from legislation and datasets using a ca...Session 1.4   connecting information from legislation and datasets using a ca...
Session 1.4 connecting information from legislation and datasets using a ca...
 
Session 1.4 a distributed network of heritage information
Session 1.4   a distributed network of heritage informationSession 1.4   a distributed network of heritage information
Session 1.4 a distributed network of heritage information
 
Session 0.0 media panel - matthias priem - gtuo - semantics 2017
Session 0.0   media panel - matthias priem - gtuo - semantics 2017Session 0.0   media panel - matthias priem - gtuo - semantics 2017
Session 0.0 media panel - matthias priem - gtuo - semantics 2017
 
Michael Fuchs | How to compute semantic relationships between entities and facts out of natural texts

  • 1. How to compute semantic relationships between entities and facts out of natural texts. Michael Fuchs, Technology Evangelist, ABBYY, fuchs@abbyy.com
  • 2. Agenda: 1. How machines read pixels; 2. Documents, words, layout & semantics; 3. Syntactic & semantic text parsing; 4. Live demo; 5. Q&A
  • 3. How machines read pixels: Pixel analysis; Find text/image blocks; Separate pixels into characters
  • 4. How machines read pixels: Recognize individual characters; Build proper words as editable text. Requirements to make a machine read text: -> Linguistics: Alphabets & Morphology Dictionaries -> Math, AI, Statistics, Experience, and…
  • 5. What is needed to make a machine understand the meaning of words, sentences, texts?
  • 6. Documents & Words: What is a document? a) Bag of words? Statistics can give basic insights -> No real semantic understanding. b) Words in order? Layouts generate visual patterns -> Semantics can be derived from layout
  • 7. Documents, Words and Layout: Document with layout; Text document with “simulated” layout; Text with line breaks; Text only. -> Rules can extract data out of (semi-)structured texts and documents -> Layout helps to identify the semantic meaning of data
  • 8. Text and Structure: Is “plain” natural language text unstructured? -> Yes, at least for almost all IT systems -> Not for humans who can read and speak the language -> Facts and their relations can’t be reliably detected with “simple” rules
  • 9. Text, Structure & Translation: Is a word-by-word translation enough? -> … well – not really… -> Semantic understanding of the words and their relationships in sentences is needed! -> That is true for humans and machines
  • 10. Text & Structure: Why is natural language text understanding difficult for machines? -> Languages are not logical and are context dependent -> One word – different usage, e.g. as verb, noun, adjective – different meanings, e.g. run, plant, apple … – different variants, e.g. go, went, gone -> Different words – the same concept, e.g. to buy/sell something
  • 11. Basic Language Structure: -> Morphology = rules for how to use words -> Syntax = rules used to build correct sentences -> Semantics = the meaning and usage of words -> Semantic Relations = reflect/organise the meaning and relations of words and sentences. How to get to the inside of a sentence?
  • 12. Compreno System Architecture (diagram labels): Parser; Morphological analyzer; Syntactic and semantic analysis; Anaphora resolution; Disambiguation; Semantic representation of text; Information Extraction Module; Extraction rules; Interpretation rules; Identification rules; RDF Graph
  • 14. Sentence Analysis with Semantic Info
  • 15. How to get the correct semantic meaning of words? ABBYY’s answer: Universal Semantic Hierarchy = language-independent semantic concepts
  • 16. ABBYY’s Universal Semantic Hierarchy: Semantic Meaning; “Vocabulary” EN; “Vocabulary” DE
  • 18. Recovering Omitted Words and Links (Ellipsis): Recovered Node; Ellipsis
  • 19. Identifying Pronoun Referents (Anaphora): Mary saw her students. They were wearing masks. She was surprised. (Mary → her, Mary → she, students → they)
  • 20. From Text to Semantic with Compreno
  • 21. DEMO
  • 22. Summary: What is ABBYY Compreno? ● … an NLP technology featuring a unique model-based approach that employs universal language models and identifies language structures ● … combines both syntactic and semantic analysis, as well as machine learning on untagged text corpora ● … allows creating a semantic representation of text ● … able to resolve complex language phenomena: − lexical ambiguity − recovery of omitted words and links (ellipsis) − identification of pronoun referents (anaphora) − coreference − coordination and more ● … support of English, Russian, German in progress
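
Slide 6 contrasts a document seen as a bag of words with one seen as words in order. The following is a minimal Python sketch of that contrast, assuming naive whitespace tokenization; the helper names `tokens` and `bag_of_words` are illustrative and not part of ABBYY Compreno. Two sentences that say opposite things produce the same bag, so word counts alone cannot tell who saw whom.

```python
from collections import Counter


def tokens(text: str) -> list[str]:
    """Split on whitespace, strip basic punctuation, and lowercase (naive assumption)."""
    stripped = (word.strip(".,!?").lower() for word in text.split())
    return [word for word in stripped if word]


def bag_of_words(text: str) -> Counter:
    """Count each token, discarding word order."""
    return Counter(tokens(text))


sentence_a = "Mary saw the students."
sentence_b = "The students saw Mary."

# Identical word counts: the bag-of-words view treats both sentences as equal.
same_bag = bag_of_words(sentence_a) == bag_of_words(sentence_b)

# Different token sequences: only the ordered view keeps the two apart.
same_order = tokens(sentence_a) == tokens(sentence_b)

print(same_bag, same_order)
```

Keeping word order is still only a first step: deciding which word is the subject and which the object requires the syntactic and semantic analysis the later slides describe.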