This document outlines Michael Bergman's perspective on pragmatic approaches to the semantic web. It argues that while linked data is useful, the focus should be broader to include structured data from any source and ontologies for interoperability. Some successes so far include technologies from Google, Bing, and Siri that use structured data and natural language processing. The document also recommends a layered approach that preserves existing assets while exposing structured data, metadata, and relations to enable interoperability.
1. Pragmatic Approaches to the Semantic Web
or, Why Aren’t We in Hyperland Yet?
Michael K. Bergman
2. Outline
Intro to SD and Me
Summary of Main Thesis
A Wee Bit of History
What is Not Working?
Problems with Linked Data
What is Working?
Some Pragmatic Lessons
SD’s Pragmatic Approach
Conclusion and Q & A
3. Structured Dynamics
Founded 2008; predecessor Zitgist LLC; two principals
Privately held, revenue funded
Boutique semantic technology shop
Services and consulting:
Semantic enterprise adoption
Ontology development and mapping
Tech transfer and training
Development and software:
Open source OSF stack
Data conversion and migration
Client-specific development
4. Current Products and OSF Stack
the pivotal product; Web services middleware that provides distributed data access and federation
Drupal-based structured data linkage to structWSF
spreadsheet, JSON and XML authoring and conversion framework
reference set of linking subjects and basis for domain vocabularies
an ontology- and entity-driven information extraction and tagging system
8. Main Arguments
Not against linked data
Proponent and explicator since 2006
But linked data is burdensome and not pivotal to interoperability
Interoperability requires:
Structured data (from any source)
Canonical data model (RDF)
(Relatively simple) ontologies for world views, schema
Curation
12. Linked Data
“Linked Data is a set of best practices for publishing
and deploying instance and class data using the RDF
data model, naming the data objects using uniform
resource identifiers (URIs), thereby exposing the data
for access via the HTTP protocol, while emphasizing
data interconnections, interrelationships and context
useful to both humans and machine agents.”
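As a minimal sketch of what this definition describes, the Python snippet below names data objects with URIs and writes them out as N-Triples lines. The example.org URIs and the helper name to_ntriples are assumptions made for illustration; only the rdf:type and rdfs:label URIs are the standard W3C ones. Publishing linked data in practice also involves serving these URIs over HTTP, which is not shown here.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines.

    Subjects and predicates are treated as URIs. An object is treated as a
    URI when it starts with http:// or https://, otherwise as a plain
    string literal (a simplification chosen for this sketch).
    """
    lines = []
    for s, p, o in triples:
        if o.startswith(("http://", "https://")):
            obj = f"<{o}>"
        else:
            # Escape backslashes and double quotes inside the literal.
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return lines


# Hypothetical instance data: one class assertion and one label.
triples = [
    ("http://example.org/id/ottawa", RDF_TYPE, "http://example.org/ontology/City"),
    ("http://example.org/id/ottawa", RDFS_LABEL, "Ottawa"),
]

for line in to_ntriples(triples):
    print(line)
```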
14. Some Disappointments to Date
Full semantic Web vision
Widescale adoption of the semantic Web, linked data
Lack of intelligent agents
Many aspects of the practice of linked data
16. Problems with Linked Data
Burdensome on publishers
Naïve linkages:
Overuse of sameAs
Lack of accurate alignments
(Often) poor data quality
Wrong focus
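The sameAs point can be illustrated with a small sketch. Because sameAs is read as an identity assertion, a chain of individually plausible "close enough" links collapses into a single identity cluster. The Python below groups identifiers under that transitive reading; the function name same_as_clusters and all identifiers are hypothetical, invented for this example.

```python
def same_as_clusters(links):
    """Group identifiers into clusters under a transitive reading of sameAs."""
    parent = {}

    def find(x):
        # Return the representative of x's cluster, registering x if new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for node in parent:
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())


# Hypothetical links, each asserted by a different publisher.
links = [
    ("a:Paris_city", "b:Paris_commune"),
    ("b:Paris_commune", "c:Paris_metro_area"),
    ("c:Paris_metro_area", "d:Ile_de_France_region"),
]

for cluster in same_as_clusters(links):
    print(sorted(cluster))
```

In this toy data the city and the surrounding region end up in one cluster, although no publisher asserted that directly. A weaker, approximate relation between related but non-identical things would avoid that collapse.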
17. Some Conditions for Interoperability
<Interoperability> <needsMapping> <Predicates>
<Interoperability> <needsReference> <Nouns>
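One way to read these two statements: records from different sources become comparable once their attribute names (predicates) are mapped to shared ones and their entity names (nouns) are resolved to common reference concepts. The Python sketch below assumes two toy sources; the mapping tables, the ref: identifiers, and the record values are all made up for illustration.

```python
# Hypothetical mapping of source-specific predicates to shared ones.
PREDICATE_MAP = {
    "pop": "population",
    "inhabitants": "population",
    "nation": "country",
    "country_name": "country",
}

# Hypothetical reference nouns: surface labels resolved to one concept.
REFERENCE_NOUNS = {
    "usa": "ref:United_States",
    "u.s.": "ref:United_States",
    "united states": "ref:United_States",
}


def normalize(record):
    """Rewrite a record using shared predicates and reference nouns."""
    out = {}
    for key, value in record.items():
        predicate = PREDICATE_MAP.get(key, key)  # unmapped keys pass through
        if isinstance(value, str):
            value = REFERENCE_NOUNS.get(value.strip().lower(), value)
        out[predicate] = value
    return out


# Toy records with placeholder values, not real statistics.
source_a = {"pop": 1000, "nation": "USA"}
source_b = {"inhabitants": 1000, "country_name": "United States"}

print(normalize(source_a) == normalize(source_b))
```

Neither source had to change how it publishes; the semantics are applied at the point where the data is combined.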
26. Some Lessons Learned
Structure is good in any form
Keep semantic technology in the background
Open Web ("follow your nose", FYN) likely to be disappointing
Ontologies essential for alignments
NLP an essential contributor to structure
Metadata an essential contributor to characterization, use
Linked data is a burden to publishers, places semantic emphasis on wrong part of chain
28. Preserving Existing Assets
Relational databases (RDBMSs)
Distributed structured assets
spreadsheets
lightweight datastores
Web pages and Web sites
Existing documents and text
Web databases and APIs
Other databases (RDF, OO, etc.)
29. irON Dataset Exchange Framework
Simple authoring and dataset creation
irON includes an abstract notation and vocabulary for instance records
Notations for:
Instance records
Schema
Datasets and metadata
Linkages to other schema
Serializations available for:
XML (irXML)
JSON (irJSON)
CSV/spreadsheets (commON)
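The idea of one abstract instance record with several serializations can be sketched with the Python standard library. This is not the irXML, irJSON, or commON notation, whose actual syntax is not shown in these slides; the record fields (id, type, prefLabel) and the function names are assumptions chosen only to illustrate the pattern.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical instance record: a flat set of attribute-value pairs.
record = {"id": "rec-1", "type": "Person", "prefLabel": "Ada Lovelace"}


def to_json(rec):
    """Serialize the record as a JSON object with sorted keys."""
    return json.dumps(rec, sort_keys=True)


def to_csv(rec):
    """Serialize the record as a header row plus one data row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=sorted(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buf.getvalue()


def to_xml(rec):
    """Serialize the record as one element per attribute."""
    root = ET.Element("record")
    for key in sorted(rec):
        ET.SubElement(root, key).text = str(rec[key])
    return ET.tostring(root, encoding="unicode")


print(to_json(record))
print(to_csv(record), end="")
print(to_xml(record))
```

Because all three outputs come from the same in-memory record, an author can work in whichever format is most convenient, such as a spreadsheet, without losing the ability to convert to the others.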
37. Summary
If you can, do linked data; it is a GOOD THING
In any event, expose your data:
Structured (use NLP for unstructured)
Metadata
Definitions
Relations (simple)
“Semsets” (synonyms, acronyms, spelling variants)
Build vocabulary and ontology consortia
Build trust and curation communities
Semantics essential at the interoperability level, not necessarily publication or data transfer
38. Take Aways
James Hendler:
“A little bit of semantics goes a long way”
Leverage linked data, but broaden focus
Consider adopting the semantic enterprise as the broader focus