Federal Spatial Data Infrastructure Linked Data Service

Linked data refers to using the Web to connect related data that were not previously linked. The data are identified, shared and referenced by Uniform Resource Identifiers (URIs). The Resource Description Framework (RDF) and related standards such as SPARQL are used to encode, link and query the data.

  1. Federal Spatial Data Infrastructure Linked Data Service. Federal Office of Topography swisstopo, Coordination, Geo-Information and Services (COGIS)
  2. Agenda (www.geo.admin.ch/linkeddata, Pasquale Di Donato - COGIS)
     • Data
     • Vocabularies
     • Tools
     • Triple Store and LD Frontend
     • Infrastructure
     [Architecture diagram: swissBOUNDARIES3D, RDF Serialization, Vocabularies, Triple Store (Virtuoso Open Source), Linked Data Frontend (Trifid, open source), WWW; interfaces SPARQL, Linked Data and HTML for SPARQL, RDF and HTML clients; https://ld.geo.admin.ch, https://sparql.geo.admin.ch]
  3. Data
     • swissBOUNDARIES3D (2016 and 2017)
       • National boundaries
       • Cantons
       • Districts
       • Municipalities
     • WGS84 (long/lat) as the CRS for geometries, to foster wider cross-community interoperability and usage
  4. Vocabularies
     • Reuse existing vocabularies as much as possible:
       • GeoSPARQL
       • GeoNames
       • schema.org
       • Dublin Core
       • Wikidata
       • DBpedia
     • We had to define a few properties:
       • bfsNumber
       • cantonalTerritory
       • lakeArea
  5. Vocabularies [Diagram: GeoNames, GeoSPARQL, DBpedia, Dublin Core and schema.org, together with ld.geo.admin.ch/def/]
  6. RDF Serialization
     • GeoTriples
       • An R2RML/RML mapping generator/editor
       • An R2RML/RML mapping processor
       • Extends R2RML and RML to model the transformation of geospatial data into RDF graphs
       • CLI and GUI
       • Open Source
     • Updates:
       • Yearly
       • Manual process so far; an update takes about 1 hour
       • 99% of the process can be automated
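To make the serialization step more concrete, here is a minimal hand-written sketch of what a mapping from one boundary record to RDF might produce, written as N-Triples. It is not GeoTriples and does not use R2RML/RML; it only illustrates the kind of output such a mapping describes. The `https://ld.geo.admin.ch/def/` namespace for `bfsNumber`, the `/geometry` IRI suffix, the record layout and the rectangle used as geometry are all assumptions made for illustration.

```python
GEO = "http://www.opengis.net/ont/geosparql#"
SCHEMA = "http://schema.org/"
XSD = "http://www.w3.org/2001/XMLSchema#"
DEF = "https://ld.geo.admin.ch/def/"  # assumed namespace for the custom properties
BASE = "https://ld.geo.admin.ch/boundaries/municipality/"


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def record_to_ntriples(record):
    """Turn one municipality record (a plain dict) into N-Triples lines."""
    subject = f"<{BASE}{record['bfs_number']}>"
    # Hypothetical IRI pattern for the geometry node.
    geometry = f"<{BASE}{record['bfs_number']}/geometry>"
    return [
        f'{subject} <{SCHEMA}name> "{escape_literal(record["name"])}" .',
        f'{subject} <{DEF}bfsNumber> "{record["bfs_number"]}"^^<{XSD}integer> .',
        f"{subject} <{GEO}hasGeometry> {geometry} .",
        f'{geometry} <{GEO}asWKT> "{escape_literal(record["wkt"])}"^^<{GEO}wktLiteral> .',
    ]


# Illustrative record: the polygon is a made-up rectangle in WGS84 long/lat,
# not the real municipal boundary.
sample = {
    "bfs_number": 351,
    "name": "Bern",
    "wkt": "POLYGON((7.40 46.93, 7.48 46.93, 7.48 46.97, 7.40 46.97, 7.40 46.93))",
}

if __name__ == "__main__":
    for line in record_to_ntriples(sample):
        print(line)
```

A real mapping would be declared in R2RML/RML and executed by the mapping processor, so a yearly update could be rerun instead of being repeated by hand.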
  7. Links [Diagram: https://ld.geo.admin.ch/boundaries/municipality/351 linked via rdfs:seeAlso to http://classifications.data.admin.ch/municipality/351 and http://www.wikidata.org/entity/Q70; the diagram also references the Wikidata property https://www.wikidata.org/wiki/Property:P1325]
  8. Triple Store
     • Triple Store and SPARQL Endpoint: Virtuoso OS (7.2)
     • GeoSPARQL support is limited
       • Only WGS84 as CRS (CRS information has to be removed from the RDF, otherwise the data is not loaded)
       • Built-in spatial functions (no GeoSPARQL functions)
       • Spatial queries are somewhat inaccurate (especially point-in-polygon queries)
     • Not easily scalable:
       • Cannot share the same DB among servers (lock file)
       • Replication features not available in the OS version
     • No viable OS (and possibly not Java-based) alternative available for production environments
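The note about removing CRS information can be illustrated with a small sketch. In GeoSPARQL, a WKT literal may start with a CRS IRI in angle brackets followed by the geometry text. The function below drops such a prefix, if present. Whether the source data carried the CRS in exactly this form is an assumption, and the sample point is illustrative.

```python
import re

# A GeoSPARQL wktLiteral may begin with "<crs-iri> " before the WKT text.
_CRS_PREFIX = re.compile(r"^\s*<[^>]*>\s*")


def strip_crs(wkt_literal):
    """Return the WKT text without a leading '<crs-iri>' prefix, if any."""
    return _CRS_PREFIX.sub("", wkt_literal, count=1)


if __name__ == "__main__":
    with_crs = "<http://www.opengis.net/def/crs/OGC/1.3/CRS84> POINT(7.44 46.95)"
    print(strip_crs(with_crs))
    print(strip_crs("POINT(7.44 46.95)"))
```

Dropping the prefix is only safe here because all geometries are already in WGS84 long/lat; with mixed coordinate systems the data would have to be reprojected first.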
  9. LD Front End
     • Trifid by Zazuko
     • Lightweight Linked Data Server and Proxy
     • Open Source
     • LD Interface to SPARQL
     • HTML rendering (with embedded RDF)
     • Content negotiation
     • SPARQL client based on a “geo-enabled” YASGUI
     • Listing of resources in a graph
     • Search
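Content negotiation means that the same resource URI can return HTML to a browser and RDF to a data client, depending on the `Accept` header of the request. The sketch below only builds the two requests and does not send them. The resource URI is the one shown on the Links slide; that the server answers `text/turtle` is an assumption.

```python
from urllib.request import Request

RESOURCE = "https://ld.geo.admin.ch/boundaries/municipality/351"


def build_request(url, media_type):
    """Build a GET request that asks for one specific representation."""
    return Request(url, headers={"Accept": media_type})


# Same URI, two different representations requested.
html_request = build_request(RESOURCE, "text/html")
turtle_request = build_request(RESOURCE, "text/turtle")  # assumed to be supported

if __name__ == "__main__":
    for req in (html_request, turtle_request):
        print(req.get_method(), req.full_url, "Accept:", req.get_header("Accept"))
```

Passing one of these requests to `urllib.request.urlopen` would perform the actual call; the response `Content-Type` header then tells the client which representation was served.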
  11. Infrastructure
      • t2.large = 2 CPUs, 8 GiB memory
      • Availability > 98% so far
      • Virtuoso has just one backend (so far it is not load-balanced)
      [Diagram: dev and prod environments, each with a load balancer (lb) in front of two t2.large instances running Rancher; one instance runs Virtuoso, Trifid and upkick, the other Trifid and upkick]
  12. Infrastructure
      • After each code commit, a new Trifid image is built and automatically deployed
      • Upkick continuously checks whether a new image is available and, if so, launches the deployment
      [Diagram: GitHub (geoadmin/linkeddata-trifid, zazuko/trifid), Docker Hub (tenforce/virtuoso/, swisstopo/linkeddata-trifid/), Rancher running Virtuoso, Trifid and upkick, ld.geo.admin.ch]
  13. Open issues: ld.geo.admin.ch
      • Usage statistics
      • Implement «410 Gone» for resources that are no longer available on the server
      • Virtuoso:
        • GeoSPARQL support is not optimal
        • BBOX used for spatial queries: can lead to inaccurate results
        • Scaling is difficult
      • More data will hopefully come
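Why a bounding-box test can give inaccurate answers is easy to show with a small sketch. The code compares a BBOX check with an exact point-in-polygon test (ray casting, even-odd rule) on an L-shaped polygon. The polygon and points are abstract illustrative coordinates, and the code says nothing about how Virtuoso evaluates spatial predicates internally.

```python
def bbox(polygon):
    """Return (min_x, min_y, max_x, max_y) of a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_bbox(point, box):
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def in_polygon(point, polygon):
    """Ray casting with the even-odd rule.

    Points lying exactly on an edge are not handled specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# L-shaped polygon: its bounding box also covers the empty upper-right corner.
L_SHAPE = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]

if __name__ == "__main__":
    for p in [(0.5, 3), (3, 3)]:
        print(p, "bbox:", in_bbox(p, bbox(L_SHAPE)), "polygon:", in_polygon(p, L_SHAPE))
```

The point (3, 3) falls inside the bounding box but outside the polygon, which is the kind of false positive a BBOX-based query can return for irregular boundaries such as municipalities.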
  14. Open issues: IDs
      • Most of the data we deal with contains identifiers defined in legislation, e.g.:
        • Verordnung über das eidgenössische Gebäude- und Wohnungsregister (Ordinance on the Federal Register of Buildings and Dwellings)
          • EGID, EWID
        • Verordnung über die geografischen Namen (Ordinance on Geographical Names)
          • Gemeindeverzeichnis (register of municipalities): BFS number
          • Ortschaftenverzeichnis (register of localities): PLZ
          • Verzeichnis der Strassen (register of streets): ESID
          • Verzeichnis der Gebäudeadressen (register of building addresses): EGID, EDID, EGAID
          • Verzeichnis der Stationsnamen (register of station names)
      • A common vocabulary would help and would improve interoperability among different publication systems. It needs to be:
        • Commonly agreed
        • «Official»
        • Lightweight
        • Dynamic