Abstract
An increasing number of applications rely on RDF, OWL 2, and SPARQL for storing and querying data. SPARQL, however, is not targeted towards end-users, and suitable query interfaces are needed. Faceted search is a prominent approach for end-user data access, and several RDF-based faceted search systems have been developed. There is, however, a lack of rigorous theoretical underpinning for faceted search in the context of RDF and OWL 2. In this paper, we provide such solid foundations. We formalise faceted interfaces for this context, identify a fragment of first-order logic capturing the underlying queries, and study the complexity of answering such queries for RDF and OWL 2 profiles. We then study interface generation and update, and devise efficiently implementable algorithms. Finally, we have implemented and tested our faceted search algorithms for scalability, with encouraging results.
Semantic Faceted Search with SemFacet presentation
1. Semantic Faceted Search
with SemFacet
Evgeny Kharlamov
Information Systems Group
Department of Computer Science
University of Oxford
2. Finding Data w/ Keywords is Hard
§ Keyword search is the paradigm to access data on the Web, company websites, etc.
§ Limitations of keyword search
  § Too many docs contain keywords
  § Meaning is not built into keywords
  § Becomes the art of “finding the best combination”
  § Limited control on search
3. How to Improve Search Experience?
§ Improve the search paradigm
  § End-user oriented query formulation interfaces
  § Faceted search
§ Improve the data model
  § Semantic Web models
§ Our proposal:
  § do both and combine
    § Faceted search
    § Semantic Web model
6. Enhancing Keyword Search with Facets
§ A facet = control mechanism
  § Name
  § Set of values
§ Facets in action
  § Choose a value
  § Restrict search result
§ Advantages of facets
  § Let you say what you really mean
  § Give control over search
7. Faceted Search in a Nutshell
§ Search over one set of items
§ Items annotated with
  § Strings
§ Search result: subset of items
[Figure: facet "stars" (3-stars, 4-stars, 5-stars) and facet "restaurant" (Asian, Italian, French)]
Example: Find 4-star hotels with French restaurants
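The facet mechanism described above (one set of items, string annotations, each chosen value restricting the result) can be sketched in a few lines of Python. This is only an illustrative sketch: the hotel names, the `HOTELS` dictionary and the `faceted_search` function are hypothetical and do not come from the slides.

```python
from typing import Dict, List, Set

# Hypothetical items: each hotel is annotated with string values per facet.
HOTELS: Dict[str, Dict[str, Set[str]]] = {
    "Hotel A": {"stars": {"4-stars"}, "restaurant": {"French"}},
    "Hotel B": {"stars": {"4-stars"}, "restaurant": {"Italian"}},
    "Hotel C": {"stars": {"5-stars"}, "restaurant": {"French", "Asian"}},
    "Hotel D": {"stars": {"3-stars"}, "restaurant": {"Asian"}},
}


def faceted_search(items: Dict[str, Dict[str, Set[str]]],
                   selection: Dict[str, str]) -> List[str]:
    """Return the items that carry every selected (facet, value) pair."""
    return sorted(
        name
        for name, annotations in items.items()
        if all(value in annotations.get(facet, set())
               for facet, value in selection.items())
    )


# "Find 4-star hotels with French restaurants"
result = faceted_search(HOTELS, {"stars": "4-stars", "restaurant": "French"})
print(result)
```

Each selected value only ever narrows the answer, so an empty selection returns the whole item set.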
12. Semantic Web Models
§ RDF data model
  § objects annotated with strings and objects
§ OWL 2 ontologies
  § structure vocabularies of annotations
[Figure: RDF graph with edges labelled "stars", "restaurant", "type" and "walking distance to", and values "4-stars" and "French"]
A French restaurant is a restaurant that offers French cuisine:
FrenchRestaurant ⊑ Restaurant ⊓ ∃ offers.FrenchCuisine
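To make the two ingredients concrete, the following Python sketch stores RDF-style data as (subject, predicate, object) triples and applies only the subclass part of the axiom above (every FrenchRestaurant is a Restaurant). The identifiers `hotel1`, `rest1` and `EiffelTower`, and the function `materialise_types`, are assumptions made for illustration; the existential part of the axiom (∃ offers.FrenchCuisine) is deliberately not materialised here.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical RDF-style data: objects annotated with strings and other objects.
data: Set[Triple] = {
    ("hotel1", "stars", "4-stars"),
    ("hotel1", "restaurant", "rest1"),
    ("rest1", "type", "FrenchRestaurant"),
    ("hotel1", "walking distance to", "EiffelTower"),
}

# Only the subclass part of the axiom: FrenchRestaurant is a kind of Restaurant.
SUBCLASS_OF: Dict[str, str] = {"FrenchRestaurant": "Restaurant"}


def materialise_types(triples: Set[Triple],
                      subclass_of: Dict[str, str]) -> Set[Triple]:
    """Add inferred 'type' triples until nothing new can be derived."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closed):
            if p == "type" and o in subclass_of:
                inferred = (s, "type", subclass_of[o])
                if inferred not in closed:
                    closed.add(inferred)
                    changed = True
    return closed


closed = materialise_types(data, SUBCLASS_OF)
print(("rest1", "type", "Restaurant") in closed)
```

With the inferred triple in place, a facet value "Restaurant" would also match `rest1`, even though the data only states that it is a French restaurant.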
15. Enhancing Search with SW in Practice
Plain text:
Hello, my name is John Doe.
I study at the University of Dreams.
My daughter is Alice....
Embedding semantic annotations:
<section itemscope itemtype="http://dava-vocabulary.org/Person"
         itemid="http://myitems/john-doe-1234">
  Hello, my name is
  <span itemprop="name">John Doe</span>.
  I study at the
  <span itemprop="affiliation">University of Dreams</span>.
  My daughter is
  <span itemtype="http://dava-vocabulary.org/children"
        itemid="http://myitems/alice-doe-5678">
    Alice </span>
  ....
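As a rough illustration of how such embedded annotations can be read back by a program, the sketch below collects the text of every element that carries an `itemprop` attribute, using Python's standard `html.parser`. The class name `ItempropCollector` and the shortened `SAMPLE` markup are assumptions for this example; the sketch ignores nested item scopes and void elements, which a full microdata processor would have to handle.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class ItempropCollector(HTMLParser):
    """Collect the text content of elements that have an itemprop attribute."""

    def __init__(self) -> None:
        super().__init__()
        # One entry per open tag: its itemprop name, or None if it has none.
        self._open: List[Optional[str]] = []
        self.properties: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        self._open.append(dict(attrs).get("itemprop"))

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        prop = self._open[-1] if self._open else None
        text = data.strip()
        if prop and text:
            self.properties[prop] = text


# Shortened, closed version of the annotated text (assumption for the demo).
SAMPLE = """
<section itemscope itemtype="http://dava-vocabulary.org/Person">
  Hello, my name is <span itemprop="name">John Doe</span>.
  I study at the <span itemprop="affiliation">University of Dreams</span>.
</section>
"""

parser = ItempropCollector()
parser.feed(SAMPLE)
parser.close()
print(parser.properties)
```

The extracted name/value pairs are exactly the kind of annotations that can later be exposed as facets.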
16. Semantic Web Models
§ RDF data model
  § objects annotated with strings and objects
§ OWL 2 ontologies
  § structure vocabularies of annotations
From 2011 to 2012, the fraction of structured data went from 3.5% to 13%.
18. How to Improve Search Experience?
§ Improve the search paradigm
  § End-user oriented query formulation interfaces
  § Faceted Search
§ Improve the data model
  § Semantic Web models
    § RDF Data
    § OWL 2 ontologies
§ Our proposal:
  § Semantic Faceted Search that combines
    § Faceted search
    § Semantic Web model
19. Semantic Faceted Search in a Nutshell
§ Search over several sets of items
§ Items annotated with
  § Strings
  § Items
§ Search result:
  § user-chosen subset of items
[Figure: facets "stars", "restaurant", "type" and "walking distance to", with values 3-stars, 4-stars, 5-stars, Asian, Italian, French]
Example: Find 4-star hotels with French restaurants that are within walking distance of the Eiffel Tower
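Because items can now be annotated with other items, the example query spans several sets (hotels, restaurants, landmarks). A minimal Python sketch of evaluating it over a small triple set follows; the graph contents, the modelling of cuisine as a `type` edge, and the helper `subjects` are assumptions for illustration, not the system's actual data or API.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical graph: hotels link to restaurant items, which have their own type.
GRAPH: Set[Triple] = {
    ("hotel1", "stars", "4-stars"),
    ("hotel1", "restaurant", "rest1"),
    ("hotel1", "walking distance to", "Eiffel Tower"),
    ("hotel2", "stars", "4-stars"),
    ("hotel2", "restaurant", "rest2"),
    ("hotel2", "walking distance to", "Eiffel Tower"),
    ("hotel3", "stars", "5-stars"),
    ("hotel3", "restaurant", "rest1"),
    ("rest1", "type", "French"),
    ("rest2", "type", "Italian"),
}


def subjects(graph: Set[Triple], predicate: str, objects: Set[str]) -> Set[str]:
    """Subjects linked by `predicate` to at least one node in `objects`."""
    return {s for s, p, o in graph if p == predicate and o in objects}


# Evaluate the nested facet first, then combine the hotel-level facets.
french_restaurants = subjects(GRAPH, "type", {"French"})
hotels = (
    subjects(GRAPH, "stars", {"4-stars"})
    & subjects(GRAPH, "restaurant", french_restaurants)
    & subjects(GRAPH, "walking distance to", {"Eiffel Tower"})
)
print(sorted(hotels))
```

The nested restaurant condition is resolved before the hotel conditions, so each step works on plain sets of nodes.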
26. Research Contributions
§ Solid foundation for Semantic F-Search
  § Projection of ontologies on graph data structures
  § Allows ontologies to be incorporated into faceted search
  § Gives better faceted interfaces
  § Generate more facets / Prune irrelevant facets
§ Scalable algorithms to generate and update facets from data and ontologies
§ Algorithms to evaluate faceted queries over semantic data
  § Exploit bottom-up query evaluation
[Screenshot: faceted search interface for the keyword search "politicians", showing facets "type" (USpres), "Country", "has child" (ANY) and "is graduated from" (Stanford Uni., Harvard Uni., Georgetown Uni.), with "More Focus" and "Remove" controls]
[Screenshot: answer snippet for http://en.wikipedia.org/wiki/Bill_Clinton, showing the opening of the Wikipedia article]
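The idea of generating facets from the data and updating them after each selection can be sketched as follows. This is a simplified illustration that uses data only (no ontology reasoning); the individuals `p1` to `p3`, the value "Senator" and the functions `generate_facets` and `select` are hypothetical and are not the algorithms of the paper.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical data about three individuals.
GRAPH: Set[Triple] = {
    ("p1", "type", "USpres"),
    ("p1", "is graduated from", "Georgetown Uni."),
    ("p2", "type", "USpres"),
    ("p2", "is graduated from", "Harvard Uni."),
    ("p3", "type", "Senator"),
    ("p3", "is graduated from", "Stanford Uni."),
}


def generate_facets(graph: Set[Triple], answers: Set[str]) -> Dict[str, List[str]]:
    """Build facets (predicate -> values) from triples about the current answers."""
    facets = defaultdict(set)
    for s, p, o in graph:
        if s in answers:
            facets[p].add(o)
    return {name: sorted(values) for name, values in facets.items()}


def select(graph: Set[Triple], answers: Set[str], facet: str, value: str) -> Set[str]:
    """Keep only the answers that have the chosen value for the chosen facet."""
    return {s for s in answers if (s, facet, value) in graph}


answers = {"p1", "p2", "p3"}
initial = generate_facets(GRAPH, answers)

# Choosing type = USpres shrinks the answers, so the facets are regenerated
# and values that no longer apply are pruned.
answers = select(GRAPH, answers, "type", "USpres")
updated = generate_facets(GRAPH, answers)
print(updated["is graduated from"])
```

Regenerating the facets from the restricted answer set is what keeps the interface from offering values that would lead to an empty result.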
27. SemFacet System
§ Integration of
  § Keyword search and
  § Semantic faceted search
§ Main features
  § Automatic generation of f-search interfaces over RDF data and OWL 2 ontologies
  § In memory
  § Online and offline reasoning
  § Efficient on millions of triples
  § Flexible configuration
  § Interchangeable triple stores
    § RDFox, PAGOdA, HermiT, Sesame
  § Configurable answers (snippets)
  § Support of Or and And facets
[Architecture diagram with three layers (Presentation Layer, Application Layer, Data Layer). Components shown: Faceted Query Interface, Answers as Snippets, Facet Generator, Query Converter, Snippet Generator, KBS (Keyword Based Search) Engine, Triple Store with Ontology and Data (RDFox, PAGOdA, HermiT, Sesame), Inverted Index (e.g. DBpedia Abstracts)]
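The keyword-based search component relies on an inverted index over text such as abstracts. The sketch below shows the general technique only: the document identifiers, the sample texts and the functions `build_index` and `keyword_search` are assumptions, and nothing here reflects how the SemFacet KBS engine is actually implemented (for example, it does no stemming or ranking).

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Hypothetical abstracts keyed by item identifier.
ABSTRACTS: Dict[str, str] = {
    "doc1": "American politician who served as President",
    "doc2": "French restaurant near the tower",
    "doc3": "Politician and lawyer",
}


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lower-cased token to the set of documents containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(doc_id)
    return index


def keyword_search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the documents that contain every token of the query."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


index = build_index(ABSTRACTS)
print(sorted(keyword_search(index, "politician")))
```

In a combined system, the items returned by the keyword search would form the initial answer set that the facets then refine.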
28. SemFacet Team
§ Marcelo Arenas
§ Bernardo Cuenca Grau
§ Evgeny Kharlamov
§ Sarunas Marciuska
§ Dmitriy Zheleznyakov