Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
What You've Learned About Ontology Matching So Far
1. What you have learned so far
Ontology matching
J´rˆme Euzenat
eo
Data can be expressed in RDF
Linked through URIs
Modelled with OWL ontologies
&
Retrieved through SPARQL queries
Montbonnot, France
Jerome.Euzenat@inrialpes.fr
Thanks to Pavel Shvaiko and Natasha Noy for our collaboration on former versions of these
slides
J´rˆme Euzenat
eo Ontology matching 2 / 36
Being serious about the semantic web Ontology heterogeneity
Monograph
Item integer pages
price isbn
It is not one person’s ontology string
title author
It is not several people common ontology doi title
creator uri
It is many people’s many ontologies pp Essay
So it is a mess, but a meaningful mess. Person Literary critics
DVD
Human Politics
Book
Biography
author Writer
Heterogeneity is not a bug, it is a feature subject
Paperback
Autobiography
Hardcover
Literature
CD
J´rˆme Euzenat
eo Ontology matching 3 / 36 J´rˆme Euzenat
eo Ontology matching 4 / 36
2. Heterogeneity problem How can we address the problem?
Resources being expressed in different ways must be reconciled before being
used.
Mismatch between formalized knowledge can occur when:
different languages are used (OWL vs. Topic maps);
different terminologies are used: First ontology parameters
English vs. Chinese;
Book vs. Monograph. Initial alignment matching Resulting alignment
different models are used:
different classes: Autobiography vs. Paperback; Second ontology resources
classes vs. property: Essay vs. literarygenre;
classes vs. instances: One physical book as an instance vs. one work as
an instance.
different scopes and granularity are used.
Only books vs. cultural items vs. any product;
Books detailed to the print and translation level vs. books as works.
J´rˆme Euzenat
eo Ontology matching 5 / 36 J´rˆme Euzenat
eo Ontology matching 6 / 36
Ontology alignment Expressive alignments (EDOAL)
≥ Monograph
integer Volume
Item pages Pocket ≥
price 14 size
string isbn
title ≥ author
doi title Book
creator uri =
Essay topic Autobiography
pp =
≥ Literary critics author
Person
DVD
≤ Human Politics
Book
Biography
author ≥ Writer
subject ∀x, Pocket(x) ⇐ Volume(x) ∧ size(x, y ) ∧ y ≤ 14
Paperback
Autobiography ∀x, Book(x) ∧ author (x, y ) ∧ topic(x, y ) ≡ Autobiography (x)
Hardcover
Literature
CD
J´rˆme Euzenat
eo Ontology matching 7 / 36 J´rˆme Euzenat
eo Ontology matching 8 / 36
3. Transformation and mediation Ontology networks
a2 b5
SELECT x.doi SELECT x.isbn o2
WHERE x : Book WHERE x : Autobiography o5
AND x.author = ”Bertrand Russell” AND x.author = ”Bertrand Russell” b2 c2 A2,4 f5 g5
AND x.topic = ”Bertrand Russell” a4
A1,2
a1 f2 g2 d2 e2 h5 j5
o4
mediator
A2,3
o1 b4 c4
b1 c1 a3
f4 g 4 d4 e4
d1 e1 o3
b3 c3
A3,4
x.doi=http://dx.doi.org/10.1080/041522862X x.isbn=041522862X A1,3
f3 g3 d3 e 3
J´rˆme Euzenat
eo Ontology matching 9 / 36 J´rˆme Euzenat
eo Ontology matching 10 / 36
Why should we deal with this? Applications: Query answering
Applications of semantic integration
First Second
Matcher
Catalogue integration ontology ontology
Schema and data integration
Query answering Alignment
Peer-to-peer information sharing
Web service composition Generator
Agent communication
Data transformation First query reformulated query Second
mediator
Ontology evolution peer reformulated answer answer peer
Data interlinking
J´rˆme Euzenat
eo Ontology matching 11 / 36 J´rˆme Euzenat
eo Ontology matching 12 / 36
4. Applications: Agent communication Data interlinking
First Second
Matcher
ontology ontology First Second
Matcher
ontology ontology
Alignment
Alignment
Generator
Transformed Generator
message ontology
Translator
Transformed message First Second
First Second links
dataset dataset
agent agent
J´rˆme Euzenat
eo Ontology matching 13 / 36 J´rˆme Euzenat
eo Ontology matching 14 / 36
Ontology matching in three steps On what basis can we match?
Reconciliation can be performed in 3 steps o Content: relying on what is inside the ontology
o
Name, comments, alternate names, names of related entities: NLP, IR,
etc.
Match, Matcher Internal structure: constraints on relations, typing
External structure: relations between entities: Data mining, Discrete
thereby determines the alignment A mathematics
Extension: Statistics, data analysis, data mining, machine learning
Semantics (models): Reasoning techniques
Generate Generator
Context: the relations of the ontology with the outside
a processor (for merging, transforming, etc.) Transformation Annotated resources:
The web
External ontologies: dbpedia, etc.
Apply External resources: wordnet, etc.
J´rˆme Euzenat
eo Ontology matching 15 / 36 J´rˆme Euzenat
eo Ontology matching 16 / 36
5. Name similarity Structure similarity
Monograph Monograph
Item pages Item integer pages
price isbn creator string isbn
title author author
DVD
doi title uri title
creator ≥ Essay Book Essay
pp
price
Person Literary critics Literary critics
DVD title
Human Politics doi Human Politics
Book pp
Biography Biography
author Writer author Person Writer
subject subject
Paperback Paperback
Autobiography Autobiography
Hardcover Hardcover
Literature Literature
CD CD
J´rˆme Euzenat
eo Ontology matching 17 / 36 J´rˆme Euzenat
eo Ontology matching 18 / 36
Instance similarity Combining different techniques
Monograph Basic matchers provide candidate correspondences, most of the systems use
Item several such matchers and further combine and filter their results.
o
Essay M A
Literary critics
DVD A A A
Politics
Book
Biography M A M A
Paperback
Autobiography o Matcher composition Aggregation Filtering
Hardcover
Bertrand Russell: My life Literature
CD Iteration
Albert Camus: La chute
J´rˆme Euzenat
eo Ontology matching 19 / 36 J´rˆme Euzenat
eo Ontology matching 20 / 36
6. How well do these approaches work? Evaluation process
Ontology Alignment Evaluation Initiative (OAEI)
Formal comparative evaluation of different ontology-matching tools;
Run every year since 2004; R
Variety of test cases (in size, in formalism, in content); o parameters m
evaluator
Results consistent across test cases;
Results very dependent on the tasks and the data (from under 50% of matching A
precision and recall to well over 80% if ontologies are relatively similar)
Progress every year! o resources
http://oaei.ontologymatching.org
Now involved in the SEALS (Semantics Evaluation At Large Scale) project.
J´rˆme Euzenat
eo Ontology matching 21 / 36 J´rˆme Euzenat
eo Ontology matching 22 / 36
Benchmark results (precision and recall Tools you should be aware of
curves)
1.
Frameworks
2010
ASMOV Alignment API: used by many tools; provides an exchange format and
2009 evaluation tools for OAEI. Alignment server for sharing.
Lily PROMPT (a Prot´g´ plug-in): includes a user interface and a plug-in
e e
2008 architecture.
precision
Lily COMA++: oriented toward database integration (many basic algorithms
implemented).
2007
ASMOV Matching systems
2006 OAEI best performers (Falcon, RiMOM, ASMOV, etc.)
RiMOM Available systems (FOAM, Falcon, COMA++, Aroma, etc.)
2005
Falcon
0. edna
0. recall 1.
J´rˆme Euzenat
eo Ontology matching 23 / 36 J´rˆme Euzenat
eo Ontology matching 24 / 36
7. The data interlinking problem Example: Linking INSEE and NUTS
owl:sameAs NUTS: Nomenclature of territorial units for statistics
#INSEE INSEE name NUTS Level #NUTS
URI1 URI2 1 Pays 0 34
1 142
26 R´gion
e 2 344
100 D´partement
e 3 1488
Data interlinking 342 Arrondissement
4036 Canton 4
52422 Commune 5
o o
J´rˆme Euzenat
eo Ontology matching 25 / 36 J´rˆme Euzenat
eo Ontology matching 26 / 36
INSEE and NUTS: ontology alignment Simple alignments are not sufficient
Territoire FR Region
Territoire FR Region =
nom name
integer
code = name
nom string level Region
Pays ≤
code
Region = ≤
≤ hasSubRegion Departement NUTSRegion
subdivision ≤ =
chef-lieu Country DEP 75 ≤ FR101
≤ nom = = name
Departement ≤ NUTSRegion
=
Commune Paris
Arrondissement LAURegion
=
Commune COM 75056
nom
J´rˆme Euzenat
eo Ontology matching 27 / 36 J´rˆme Euzenat
eo Ontology matching 28 / 36
8. Expressive alignments are necessary Query generation
SELECT ?r
PREFIX insee: <http://rdf.insee.fr/ontologie-geo-2006.rdf#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
FROM <http://rdf.insee.fr/geo/regions-2011.rdf>
NUTSRegion WHERE {
?r rdf:type insee:Region .
= 2 level
Region = }
=
FR1 hasParentRegion
= SELECT ?n
subdivision hasSubRegion
PREFIX nuts: <http://ec.europa.eu/eurostat/ramon/ontologies/geographic.rdf#
= PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
nom name
FROM <http://ec.europa.eu/eurostat/ramon/rdfdata/nuts2008/>
WHERE {
?n rdf:type nuts:NUTSRegion .
?n nuts:level 2^^xsd:int .
?n nuts:hasParentRegion nuts:FR1 .
}
J´rˆme Euzenat
eo Ontology matching 29 / 36 J´rˆme Euzenat
eo Ontology matching 30 / 36
Query generation What does this mean?
CONSTRUCT { ?r owl:sameAs ?n . }
PREFIX insee: <http://rdf.insee.fr/ontologie-geo-2006.rdf#>
PREFIX nuts: <http://ec.europa.eu/eurostat/ramon/ontologies/geographic.rdf#> Ontology alignments are schema-level expression of correspondences;
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> They are useful for focussing the search;
FROM <http://rdf.insee.fr/geo/regions-2011.rdf>
FROM <http://ec.europa.eu/eurostat/ramon/rdfdata/nuts2008/> Expressive alignments are necessary;
WHERE { They can be turned into SPARQL-based link generators.
?r rdf:type insee:Region .
?r insee:nom ?l . but it is also necessary to express instance level constraints:
?n rdf:type nuts:NUTSRegion . for converting data (e.g., mph vs. m/s);
?n nuts:name ?l . for expressing matching constraint on data (e.g., similarity).
?n nuts:level 2^^xsd:int .
?n nuts:hasParentRegion nuts:FR1 .
}
J´rˆme Euzenat
eo Ontology matching 31 / 36 J´rˆme Euzenat
eo Ontology matching 32 / 36
9. General framework Selected challenges
Scalability and efficiency
Current matchers can be fast, scale and accurate, but not all at once.
owl:sameAs New sources of matching
Context-based matching,
URI1 URI2 General purpose matching (vs. special purpose matching)
Data interlinking Matcher combination,
Matcher selection and self-configuration,
User involvement,
o A o Matching (serendipitously) while working,
How to explain alignments?
Social and collaborative ontology matching,
Alignment management: infrastructure and support,
Ontology matching
How do we maintain alignments when ontologies evolve?
Reasoning with alignments,
Being robust to incorrect alignments.
and, of course, many others,
J´rˆme Euzenat
eo Ontology matching 33 / 36 J´rˆme Euzenat
eo Ontology matching 34 / 36
Further reading
“Ontology Matching” by Euzenat and
Shvaiko Jerome.Euzenat@inria.fr
Proceedings of ISWC, ASWC, ESWC,
WWW conferences, etc.
Journal of web semantics, Semantic web http://exmo.inrialpes.fr
journal, Journal on data semantics, etc.
http://www.ontologymatching.org
J´rˆme Euzenat
eo Ontology matching 35 / 36