Self-service Linked Government Data
1. Digital Enterprise Research Institute www.deri.ie
Fadi Maali, Richard Cyganiak, Vassilios Peristeras
firstname.lastname@deri.org
Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
Enabling networked knowledge
6. Why Linked Government Data (LGD)?
Web accessible
Interlinkable
Decentralised publishing of data
Standardised
7. LGD
We need government data as Linked Data, not just Raw Data
... aha, and of good quality!
8. LGD is Costly
We want governments to provide Linked Data, not just Raw Data… and of good quality
http://code.google.com/p/google-refine/
10. Self-service Approach
DIY
Provide tools, models and algorithms that enable the self-service approach (a publishing pipeline)
11. Publishing pipeline requirements
Interactive approach
Graphical user interface
Reproducibility and traceability
Flexibility
Decentralisation
Results sharing
13. Google Refine
Powerful data editing, transformation and enriching capabilities
Import capabilities, e.g. JSON, Excel, CSV, TSV, XML
Persistent undo/redo history
Popular in open data community
Extensible and under active development
Free and open source
http://code.google.com/p/google-refine/
14. DIY Recipe (1000 feet view)
Publishers provide RDF representation of their catalogues
Tool support to select datasets of interest and put them into RDF
User shares the RDF data
15. DIY Recipe (100 feet view)
Publishers provide RDF representation of their catalogues: dcat
Tool support to select datasets of interest and put them into RDF
User shares the RDF data
16. DIY Recipe (100 feet view)
Publishers provide RDF representation of their catalogues: dcat
Tool support to select datasets of interest and put them into RDF: Google Refine + RDF export extension + RDF reconciliation extension
User shares the RDF data
17. DIY Recipe (100 feet view)
Publishers provide RDF representation of their catalogues: dcat
Tool support to select datasets of interest and put them into RDF: Google Refine + RDF export extension + RDF reconciliation extension
User shares the RDF data: share RDF data publicly (on CKAN.net) along with a sufficient provenance description
23. Data on CKAN.net
24. Data Provenance (simplified)
[Diagram: provenance graph over the nodes :dataset, :export-process, :csv-ds and :json-history, connected by the properties dct:source, :wasExportedBy, :usedData and :operations]
25. DIY Recipe (10 feet view)
Dcat
An RDF vocabulary to describe government catalogues
Current status: First Public Working Draft by the W3C GLD Working Group
http://www.w3.org/TR/vocab-dcat/
Used on data.gov.uk (RDFa) and CKAN-based catalogues
“Enabling Interoperability of Government Data Catalogues.” EGOV 2010
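A minimal sketch of what a dcat catalogue entry can look like, written out as Turtle from Python. The `dcat:` and `dct:` terms are from the published vocabularies; the function names, the example.org URIs and the sample title are hypothetical and only there for illustration.

```python
# Sketch only: describe one catalogue entry with DCAT terms, serialised as Turtle.
# All example.org URIs and the sample title below are hypothetical.

PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dct: <http://purl.org/dc/terms/> .\n"
)


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def dataset_to_turtle(catalog_uri, dataset_uri, title, access_url):
    """Return a Turtle document linking a catalogue to a single dataset."""
    return (
        PREFIXES
        + f"\n<{catalog_uri}> a dcat:Catalog ;\n"
        + f"    dcat:dataset <{dataset_uri}> .\n"
        + f"\n<{dataset_uri}> a dcat:Dataset ;\n"
        + f'    dct:title "{escape(title)}" ;\n'
        + "    dcat:distribution [\n"
        + "        a dcat:Distribution ;\n"
        + f"        dcat:accessURL <{access_url}>\n"
        + "    ] .\n"
    )


turtle = dataset_to_turtle(
    "http://example.org/catalogue",
    "http://example.org/dataset/1",
    'Sample "demo" dataset',
    "http://example.org/files/1.csv",
)
print(turtle)
```

A real pipeline would more likely use an RDF library than string formatting; the point here is only the shape of the description (catalogue, dataset, distribution).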
26. DIY Recipe (10 feet view)
RDF Mapping
27. More on RDF Mapping
RDF-centric mapping
Multiple tree structure
Expression language for custom expressions
Vocabularies/ontologies support
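The idea of an RDF-centric mapping can be sketched in a few lines of Python: each table row becomes one subject URI, and selected columns become predicates. This is not how the RDF export extension is implemented; the base URI, the column-to-predicate mapping and the sample rows are all hypothetical.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical base URI

# Hypothetical mapping: column name -> predicate URI
MAPPING = {
    "name": "http://www.w3.org/2000/01/rdf-schema#label",
    "population": "http://example.org/def/population",
}


def literal(value):
    """Format a cell value as an N-Triples plain literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def row_to_triples(row, key_column="id"):
    """Map one row to N-Triples lines, skipping empty cells."""
    subject = f"<{BASE}area/{quote(row[key_column], safe='')}>"
    return [
        f"{subject} <{predicate}> {literal(row[column])} ."
        for column, predicate in MAPPING.items()
        if row.get(column)
    ]


# Made-up sample data; the second row has an empty population cell.
SAMPLE = "id,name,population\nA1,Area One,1000\nA2,Area Two,\n"

triples = [
    triple
    for row in csv.DictReader(io.StringIO(SAMPLE))
    for triple in row_to_triples(row)
]
print("\n".join(triples))
```

Keeping the mapping as data rather than code is what makes it easy to share and replay on another dataset.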
28. DIY Recipe (10 feet view)
Interlinking
[Diagram components: Google Refine, RDF Reconcile Extension, SPARQL endpoint, SPARQL endpoint with fulltext extension, crafted RDF, Silk Server, Silk LSL]
29. More on Interlinking
Interlinking as a pre-RDF-creation step: fewer unnecessary owl:sameAs links
Focus on the interface
Semi-automatic process with good user support
“Re-using Cool URIs: Entity Reconciliation Against LOD Hubs.” LDOW 2011
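To illustrate the semi-automatic flavour of reconciliation, here is a rough Python sketch that ranks candidate URIs for a cell value by label similarity and leaves the final choice to the user. It assumes an in-memory dictionary in place of a SPARQL endpoint and uses `difflib` similarity as a stand-in scoring method; neither is claimed to match the RDF reconciliation extension. The candidate URIs and the threshold are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical candidates: URI -> label, standing in for results from an endpoint.
CANDIDATES = {
    "http://example.org/place/dublin": "Dublin",
    "http://example.org/place/dublin-city": "Dublin City",
    "http://example.org/place/cork": "Cork",
}


def score(value, label):
    """Case-insensitive similarity between a cell value and a label, in [0, 1]."""
    return SequenceMatcher(None, value.lower().strip(), label.lower().strip()).ratio()


def reconcile(value, candidates, threshold=0.7):
    """Return (uri, score) pairs at or above the threshold, best match first."""
    scored = [(uri, score(value, label)) for uri, label in candidates.items()]
    matches = [pair for pair in scored if pair[1] >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


matches = reconcile("dublin ", CANDIDATES)
for uri, similarity in matches:
    print(f"{similarity:.2f}  {uri}")
```

Because the cell is rewritten to the chosen URI before the RDF is generated, no separate owl:sameAs link is needed for it afterwards.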
30. DIY Recipe (10 feet view)
Sharing
Capture the operations applied to the data
Represent them according to the Open Provenance Model Vocabulary (OPMV)
Share the data and its provenance on CKAN.net
CKAN Extension for Google Refine
http://lab.linkeddata.deri.ie/2011/grefine-ckan/
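A small sketch of how such a provenance description might be expressed with OPMV terms: the exported RDF and the source file are artifacts, and the export is a process connecting them. The OPMV classes and properties used here are from the vocabulary; the function names and the example.org URIs are hypothetical, and the actual CKAN extension may record more detail (for example the operation history).

```python
OPMV = "http://purl.org/net/opmv/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def provenance_triples(rdf_uri, source_uri, process_uri):
    """Describe 'rdf_uri was produced from source_uri by process_uri'."""
    return [
        (rdf_uri, RDF_TYPE, OPMV + "Artifact"),
        (source_uri, RDF_TYPE, OPMV + "Artifact"),
        (process_uri, RDF_TYPE, OPMV + "Process"),
        (rdf_uri, OPMV + "wasGeneratedBy", process_uri),
        (process_uri, OPMV + "used", source_uri),
        (rdf_uri, OPMV + "wasDerivedFrom", source_uri),
    ]


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical URIs for one exported dataset.
prov = provenance_triples(
    "http://example.org/data/1.rdf",
    "http://example.org/data/1.csv",
    "http://example.org/export/1",
)
print(to_ntriples(prov))
```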
31. Case study - Fingal Catalogue
Number of datasets: 74 (68 available in CSV and 56 in XML)
Top publishers: Fingal County Council (41), Central Statistics Office (17), Department of Education and Science (4)
Top domains: Demographics (18), Citizen Participation (18), Education (9)
http://data.fingal.ie
32. Case study - Fingal Catalogue
The catalogue was represented in Dcat
60 datasets were converted to RDF using the publishing pipeline (~300K triples)
Data Cube was used for statistical data
URIs were used consistently and shared among datasets, so the data was interlinked
Externally linked to DBpedia
33. Open Issues
Evaluating/Refining the crowd-sourcing aspects of the RDF creation process
RDF Modeling: Can we assist RDF modeling by examining the raw data?
34. Lessons Learned
Interactive approach
Focus on plumbing tools together but don’t enforce a rigid process
Make it easy to adopt best practices and good recipes