The growing number of datasets published on the Web as linked data brings both opportunities for high data availability and challenges inherent to querying data in a semantically heterogeneous and distributed environment. Approaches used for querying siloed databases fail at Web-scale because users don't have an a priori understanding of all the available datasets. This article investigates the main challenges in constructing a query and search solution for linked data and analyzes existing approaches and trends.
2. IEEE Internet Computing
Digital Enterprise Research Institute www.deri.ie
A. Freitas, E. Curry, J. G. Oliveira, and S. O’Riain,
“Querying Heterogeneous Datasets on the Linked Data Web:
Challenges, Approaches, and Trends,” IEEE Internet Computing,
vol. 16, no. 1, pp. 24-33, 2012.
http://doi.ieeecomputersociety.org/10.1109/MIC.2011.141
http://andrefreitas.org
4. Querying Data over the Web
The figure shows (a) a natural language query over two search engines;
(b) the corresponding SPARQL representation; and (c) the semantic gap
between the user’s information needs and the data representation.
5. Expressivity-Usability Trade-Off
Expressivity–usability trade-off for querying over structured data.
Blue dots indicate that an ideal query mechanism for linked data must
provide both high expressivity and high usability.
7. Challenges
The analysis examines existing approaches from
the perspective of the usability-expressivity
trade-off.
This focus guides the categorization and
analysis of existing challenges, approaches,
and trends.
8. Challenge Dimensions
Query Expressivity
Ability to query datasets by referencing elements
in data model structure, as well as to operate
over the data (aggregate results, express
conditional statements, etc.)
Usability
Easy-to-operate, intuitive, and task-efficient
query interface
Vocabulary-level Semantic Matching
Ability to semantically match user query terms to
dataset vocabulary-level terms
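
To make two of these dimensions concrete, the following is a minimal Python sketch, not a method taken from the article. It assumes a toy in-memory list of triples and a hand-written phrase table (`TRIPLES`, `VOCABULARY_HINTS`, and both function names are hypothetical). `match_predicate` illustrates vocabulary-level matching of a user term to a dataset predicate, and `subjects_with_at_least` illustrates expressivity as an aggregate combined with a condition.

```python
from collections import Counter

# Toy dataset of (subject, predicate, object) triples.
# All identifiers here are illustrative assumptions, not a real vocabulary.
TRIPLES = [
    ("Natalie_Portman", "starring", "Star_Wars"),
    ("Natalie_Portman", "starring", "Black_Swan"),
    ("Ewan_McGregor", "starring", "Star_Wars"),
    ("Natalie_Portman", "birthPlace", "Jerusalem"),
]

# Hand-written phrase table mapping dataset predicates to user phrasings.
# A real system would need a far richer semantic model than this lookup.
VOCABULARY_HINTS = {
    "starring": {"acted in", "starred in", "played in"},
    "birthPlace": {"born in", "place of birth"},
}


def match_predicate(user_term):
    """Map a user query term to a dataset predicate, or None if unknown."""
    term = user_term.strip().lower()
    for predicate, phrases in VOCABULARY_HINTS.items():
        if term == predicate.lower() or term in phrases:
            return predicate
    return None


def subjects_with_at_least(predicate, min_count):
    """Aggregate: count statements per subject, keep those meeting a threshold."""
    counts = Counter(s for s, p, _ in TRIPLES if p == predicate)
    return {s: n for s, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    predicate = match_predicate("acted in")
    print(predicate)
    print(subjects_with_at_least(predicate, 2))
```

The exact-phrase lookup is deliberately naive: it fails on any wording missing from the table, which is the gap that vocabulary-level semantic matching is meant to close.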
9. Challenge Dimensions
Entity Reconciliation
Matches entities expressed in the query to
semantically equivalent dataset entities
Semantic Tractability
Ability to answer queries not supported by
explicit dataset statements
– For example, “Is Natalie Portman an Actress?” can be
supported by the statement “Natalie Portman starred in
Star Wars,” instead of an explicit statement “Natalie
Portman occupation Actress,” which might not be
present in the dataset
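
The Natalie Portman example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the article's technique: the triples, the label table, and the rule table (`TRIPLES`, `LABELS`, `IMPLIED_OCCUPATIONS`) are hypothetical, reconciliation is reduced to normalized label comparison, and the inference is a single hand-written rule.

```python
import re

# Toy dataset: note that there is no explicit "occupation" statement.
TRIPLES = [
    ("Natalie_Portman", "starring", "Star_Wars"),
]

# Hypothetical label table used for entity reconciliation.
LABELS = {
    "Natalie_Portman": "Natalie Portman",
    "Star_Wars": "Star Wars",
}

# Hypothetical rule: a statement with this predicate implies these occupations.
IMPLIED_OCCUPATIONS = {
    "starring": {"Actor", "Actress"},
}


def normalize(text):
    """Lowercase and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def reconcile(mention):
    """Entity reconciliation: map a query mention to a dataset entity."""
    wanted = normalize(mention)
    for entity, label in LABELS.items():
        if normalize(label) == wanted:
            return entity
    return None


def has_occupation(mention, occupation):
    """Answer from an explicit statement if present, else from an implied one."""
    entity = reconcile(mention)
    if entity is None:
        return False
    for subject, predicate, obj in TRIPLES:
        if subject != entity:
            continue
        if predicate == "occupation" and obj == occupation:
            return True
        if occupation in IMPLIED_OCCUPATIONS.get(predicate, ()):
            return True
    return False


if __name__ == "__main__":
    print(has_occupation("natalie portman", "Actress"))
    print(has_occupation("Natalie Portman", "Director"))
```

A fixed rule table does not scale to open vocabularies; it only shows what it means to answer a query from a statement that supports it implicitly.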
11. Approaches
Information Retrieval approaches
Entity-centric search
Structure search
Natural Language approaches
Question Answering
Semantic best-effort natural language interfaces
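
As a rough illustration of the entity-centric search category, the sketch below builds a keyword index over short textual descriptions of entities and ranks entities by the number of matched query terms. The descriptions, the function names, and the scoring are assumptions made for this example and do not describe any specific system surveyed in the article.

```python
from collections import defaultdict

# Hypothetical textual descriptions derived from each entity's statements.
ENTITY_DESCRIPTIONS = {
    "Natalie_Portman": "natalie portman starring star wars",
    "Ewan_McGregor": "ewan mcgregor starring star wars",
    "Star_Wars": "star wars film",
}


def build_index(descriptions):
    """Build an inverted index from token to the set of entities containing it."""
    index = defaultdict(set)
    for entity, text in descriptions.items():
        for token in text.lower().split():
            index[token].add(entity)
    return index


def search(index, query):
    """Rank entities by how many distinct query tokens they match."""
    scores = defaultdict(int)
    for token in set(query.lower().split()):
        for entity in index.get(token, ()):
            scores[entity] += 1
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda entity: (-scores[entity], entity))


if __name__ == "__main__":
    index = build_index(ENTITY_DESCRIPTIONS)
    print(search(index, "natalie portman"))
    print(search(index, "star wars film"))
```

Keyword matching of this kind is easy to use but returns entities rather than answers to structured questions, which places it toward the high-usability, low-expressivity end of the trade-off.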
17. Addressing the Challenges
The functionality analysis of existing
approaches provides insights on how the
major challenges should be addressed.
This set of strategic functionalities defines
the set of trends.
20. Trends
Complementary Search and Query Services
User Interaction and Feedback Mechanisms
Semantic Best-Effort Query Model
Natural Language Processing Techniques
Distributional Semantic Model
External Knowledge Sources for Semantic
Enrichment
Integrated Entity Reconciliation Techniques