SlideShare uma empresa Scribd logo
1 de 35
Statistical Data in RDF
Knowledge Engineering Group
Seminar, November 4th 2010
Jindřich Mynarz
@jindrichmynarz
Scope of the talk
• not microdata (e.g., survey data)
• but aggregated data (e.g., averages)
• only RDF
• overview of existing statistical datasets
RDF
• separation of content and layout
o in tabular data table layout defines
the way of interpretation
• flexible, schema-less data format
o not overly inclusive, nor overly
exclusive
Existing statistics in RDF
• CIA World Factbook
• U.S. Census 2000 dataset
• LOIUS - Italian linked university statistics
• Linked Environment Data
• EnAKTing datasets
• data.gov.uk datasets
Eurostat data
• Freie Universität Berlin - D2R Server
• riese (RDFizing and Interlinking the EuroStat Data Set
Effort)
• OntologyCentral - real-time wrapper
• Eurostat's own RDF datasets
Governmental statistics
• data.gov
• data.gov.uk
o EnAKTing mashups and data visualizations
o population, crime, CO2 emissions, transport, agriculture,
education...
Data modelling
• what is being modelled?
o the real world
o a part of the real world
o statistics
• two parts of modelling
o structural semantics
o domain semantics
Structural semantics
• means of expression for the cube's
structure
• groups, slices, time series
• addressed in Data Cube
vocabulary
Domain semantics
• how a dataset refers to the
things that it is about
• connecting statistical
observations to the model of
the domain described by
them
• domain is a set of non-
information resources
Vocabularies
• number of ad hoc vocabularies
• riese
• SCOVO
• SCOVOLink
• Data Cube
• SDMX/RDF
SCOVO
• The Statistical Core Vocabulary
• inspired by riese vocabulary
• modelling of dimensions and observations as separate
resources
• lightweight, easy to adopt
• SCOVOLink addresses domain semantics
Data Cube
• inspired by SCOVO
o added expressive power
• generalization from SDMX/RDF
• re-use of SKOS for codelists
Data Cube
Data Cube
• dimensions (rdf:Property)
• coded values (skos:Concept)
Data Cube
SDMX/RDF
• Statistical Data and Metadata eXchange reformulated in
RDF
• built on top of Data Cube
• contains:
o sdmx
o sdmx-attribute
o sdmx-code
o sdmx-concept
o sdmx-dimension
o sdmx-measure
o sdmx-metadata
o sdmx-subject 
Important parts of modelling
• re-use
• units
• time
• identifiers
• URI patterns
Re-use oriented design
• re-purposing parts of the
existing datasets
• re-using shared
vocabularies
• vocabulary hi-jacking and
extension
Units of measurement
• implicit
o “78693011 mˆ2”, “117 
b”
o eurostat:total_ar
ea_km2
• explicit
o :unit, sdmx-
attribute:unitMea
sure
Modelling of time
• exclusion of the dimension of time (D2R Eurostat, U.S. 
Census 2000)
• time dimension (riese, SDMX/RDF)
o dimension:Time, sdmx:TimeRole
o time series
Identifiers
• blank nodes
• URIs
• HTTP URIs
URI design patterns
• on the Web
o http://
• human-readable
o what/is/this/about
• clustered by resource type
o type/unique-id
• standardized
o {provider 1}/path/to/an/observation
o {provider 2}/path/to/an/observation
• hierarchical
o {broader}/{narrower} 
• reflecting the location of an observation in a data cube
o {dimension 1}/{dimension 2}
Following steps
• data conversion
• interlinking dataset's resources
• linking external datasets
• publishing
Legacy datasets
• statistics-specific data formats
• implicit context of interpretation
• parsing, cleaning
• conversion mechanisms
o SQL DB wrappers (e.g., 
D2R Server)
o real-time exporters (e.g., 
OntologyCentral)
o RDFizers (e.g. RDF123)
o custom-built scripts
Linking
• re-use by reference
• lightweight intergration
• linkable data
• linking properties
o e.g., owl:sameAs, 
skos:closeMatch
Publishing data
• new dissemination standards
• exchanging data with the Web
• RDF dumps
• linked data distribution
• SPARQL
• RDFa
Linked open data cloud
Benefits
• data can be intergrated
• open data
• re-usable data
• data available for applications
Integration
• combining and merging with other datasets
• re-use oriented design
Open data
• freedom of information for public sector information
• open licences
o Creative Commons, Open Government Licence...
• public domain
Anyone can solve the cube
• data is available for individual
analysis
• offices for national statistics
still have the monopoly on
data collection, but no longer
on interpretation of that data
• data-driven journalism
Building on top of statistical data
• once the data is available
useful applications can be
built on top of it
• data visualizations
• data analysis tools
Questions!
Thank you for attention!
Image credits
Semantic Web Rubik's Cube. http://www.flickr.com/photos/dullhunk/3448804778/
Rubik's Cube. http://www.flickr.com/photos/bramus/3249196137/
Hypercube. http://commons.wikimedia.org/wiki/File:Hypercube.png
PICOL: Pictorial communication language. http://picol.org/
Dictionary. http://www.flickr.com/photos/horiavarlan/4268897748/
Oops! http://www.flickr.com/photos/rore/299375688/
Tape Measure. http://www.flickr.com/photos/wwarby/4915969081/
Rubik's Cube 1. http://www.flickr.com/photos/lifeontheedge/374960949/
Detroit's Skyline. http://www.flickr.com/photos/showmeone/4154861617/
Linked Oped Data Cloud. http://richard.cyganiak.de/2007/10/lod/
Cube. http://followtherhythm.deviantart.com/art/cube-128329792
Data Cube diagram. http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/qb-
fig1.png

Mais conteúdo relacionado

Mais procurados

FMI Open Data Interface and Data Models
FMI Open Data Interface and Data ModelsFMI Open Data Interface and Data Models
FMI Open Data Interface and Data ModelsRoope Tervo
 
PIC Tier-1 (LHCP Conference / Barcelona)
PIC Tier-1 (LHCP Conference / Barcelona)PIC Tier-1 (LHCP Conference / Barcelona)
PIC Tier-1 (LHCP Conference / Barcelona)Josep Flix
 
AusPlots field data collection with AusScribe
AusPlots field data collection with AusScribeAusPlots field data collection with AusScribe
AusPlots field data collection with AusScribeTERN Australia
 
Collaboratively Conceived, Designed and Implemented: Matching Visualization ...
Collaboratively Conceived, Designed and Implemented:  Matching Visualization ...Collaboratively Conceived, Designed and Implemented:  Matching Visualization ...
Collaboratively Conceived, Designed and Implemented: Matching Visualization ...Nancy Hoebelheinrich
 
BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...
BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...
BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...BigData_Europe
 
Eco-informatics: Data services for bringing together and publishing the full ...
Eco-informatics: Data services for bringing together and publishing the full ...Eco-informatics: Data services for bringing together and publishing the full ...
Eco-informatics: Data services for bringing together and publishing the full ...TERN Australia
 
Data Infrastructure Development for SKA/Jasper Horrell
Data Infrastructure Development for SKA/Jasper HorrellData Infrastructure Development for SKA/Jasper Horrell
Data Infrastructure Development for SKA/Jasper HorrellAfrican Open Science Platform
 
Application packaging and systematic processing in earth observation exploita...
Application packaging and systematic processing in earth observation exploita...Application packaging and systematic processing in earth observation exploita...
Application packaging and systematic processing in earth observation exploita...terradue
 
2004-09-12 Data and Tools for Web-Based Monitoring and Analysis
2004-09-12 Data and Tools for Web-Based Monitoring and Analysis2004-09-12 Data and Tools for Web-Based Monitoring and Analysis
2004-09-12 Data and Tools for Web-Based Monitoring and AnalysisRudolf Husar
 
Open Data and and INSPIRE
Open Data and and INSPIREOpen Data and and INSPIRE
Open Data and and INSPIRERoope Tervo
 
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...Mario Juric
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataGiorgos Santipantakis
 
Strahlendorff - Insitu searching challenges
Strahlendorff - Insitu searching challengesStrahlendorff - Insitu searching challenges
Strahlendorff - Insitu searching challengesMikko Strahlendorff
 
Meteorological and Aviation Weather Open Data implementation utilising OGC st...
Meteorological and Aviation Weather Open Data implementation utilising OGC st...Meteorological and Aviation Weather Open Data implementation utilising OGC st...
Meteorological and Aviation Weather Open Data implementation utilising OGC st...Roope Tervo
 
Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics iosrjce
 
Weather Data Analytics Using Hadoop
Weather Data Analytics Using HadoopWeather Data Analytics Using Hadoop
Weather Data Analytics Using HadoopNajima Begum
 
LSST/DM: Building a Next Generation Survey Data Processing System
LSST/DM: Building a Next Generation Survey Data Processing SystemLSST/DM: Building a Next Generation Survey Data Processing System
LSST/DM: Building a Next Generation Survey Data Processing SystemMario Juric
 
Bioclouds CAMDA (Robert Grossman) 09-v9p
Bioclouds CAMDA (Robert Grossman) 09-v9pBioclouds CAMDA (Robert Grossman) 09-v9p
Bioclouds CAMDA (Robert Grossman) 09-v9pRobert Grossman
 

Mais procurados (20)

FMI Open Data Interface and Data Models
FMI Open Data Interface and Data ModelsFMI Open Data Interface and Data Models
FMI Open Data Interface and Data Models
 
PIC Tier-1 (LHCP Conference / Barcelona)
PIC Tier-1 (LHCP Conference / Barcelona)PIC Tier-1 (LHCP Conference / Barcelona)
PIC Tier-1 (LHCP Conference / Barcelona)
 
AusPlots field data collection with AusScribe
AusPlots field data collection with AusScribeAusPlots field data collection with AusScribe
AusPlots field data collection with AusScribe
 
Collaboratively Conceived, Designed and Implemented: Matching Visualization ...
Collaboratively Conceived, Designed and Implemented:  Matching Visualization ...Collaboratively Conceived, Designed and Implemented:  Matching Visualization ...
Collaboratively Conceived, Designed and Implemented: Matching Visualization ...
 
BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...
BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...
BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...
 
Eco-informatics: Data services for bringing together and publishing the full ...
Eco-informatics: Data services for bringing together and publishing the full ...Eco-informatics: Data services for bringing together and publishing the full ...
Eco-informatics: Data services for bringing together and publishing the full ...
 
Data Infrastructure Development for SKA/Jasper Horrell
Data Infrastructure Development for SKA/Jasper HorrellData Infrastructure Development for SKA/Jasper Horrell
Data Infrastructure Development for SKA/Jasper Horrell
 
Application packaging and systematic processing in earth observation exploita...
Application packaging and systematic processing in earth observation exploita...Application packaging and systematic processing in earth observation exploita...
Application packaging and systematic processing in earth observation exploita...
 
2004-09-12 Data and Tools for Web-Based Monitoring and Analysis
2004-09-12 Data and Tools for Web-Based Monitoring and Analysis2004-09-12 Data and Tools for Web-Based Monitoring and Analysis
2004-09-12 Data and Tools for Web-Based Monitoring and Analysis
 
Open Data and and INSPIRE
Open Data and and INSPIREOpen Data and and INSPIRE
Open Data and and INSPIRE
 
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
Strahlendorff - Insitu searching challenges
Strahlendorff - Insitu searching challengesStrahlendorff - Insitu searching challenges
Strahlendorff - Insitu searching challenges
 
Meteorological and Aviation Weather Open Data implementation utilising OGC st...
Meteorological and Aviation Weather Open Data implementation utilising OGC st...Meteorological and Aviation Weather Open Data implementation utilising OGC st...
Meteorological and Aviation Weather Open Data implementation utilising OGC st...
 
Application of web ontology to harvest estimation of rice in Thailand
Application of web ontology to harvest estimation of rice in ThailandApplication of web ontology to harvest estimation of rice in Thailand
Application of web ontology to harvest estimation of rice in Thailand
 
Application of web ontology to harvest estimation of rice in thailand
Application of web ontology to harvest estimation of rice in thailandApplication of web ontology to harvest estimation of rice in thailand
Application of web ontology to harvest estimation of rice in thailand
 
Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics
 
Weather Data Analytics Using Hadoop
Weather Data Analytics Using HadoopWeather Data Analytics Using Hadoop
Weather Data Analytics Using Hadoop
 
LSST/DM: Building a Next Generation Survey Data Processing System
LSST/DM: Building a Next Generation Survey Data Processing SystemLSST/DM: Building a Next Generation Survey Data Processing System
LSST/DM: Building a Next Generation Survey Data Processing System
 
Bioclouds CAMDA (Robert Grossman) 09-v9p
Bioclouds CAMDA (Robert Grossman) 09-v9pBioclouds CAMDA (Robert Grossman) 09-v9p
Bioclouds CAMDA (Robert Grossman) 09-v9p
 

Semelhante a Statistical Data Modelling and Publishing in RDF

IASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesIASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesDr.-Ing. Thomas Hartmann
 
Lecture1 introduction to big data
Lecture1 introduction to big dataLecture1 introduction to big data
Lecture1 introduction to big datahktripathy
 
An R primer for SQL folks
An R primer for SQL folksAn R primer for SQL folks
An R primer for SQL folksThomas Hütter
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked DataMarin Dimitrov
 
RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)Vladimir Alexiev, PhD, PMP
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked DataEUCLID project
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jDebanjan Mahata
 
Linked Data Usecases
Linked Data UsecasesLinked Data Usecases
Linked Data UsecasesMyungjin Lee
 
Experimental transformation of ABS data into Data Cube Vocabulary (DCV) form...
Experimental transformation of  ABS data into Data Cube Vocabulary (DCV) form...Experimental transformation of  ABS data into Data Cube Vocabulary (DCV) form...
Experimental transformation of ABS data into Data Cube Vocabulary (DCV) form...Alistair Hamilton
 
20141030 LinDA Workshop echallenges2014 - State of the art in open data infra...
20141030 LinDA Workshop echallenges2014 - State of the art in open data infra...20141030 LinDA Workshop echallenges2014 - State of the art in open data infra...
20141030 LinDA Workshop echallenges2014 - State of the art in open data infra...LinDa_FP7
 
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinDBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinAnja Jentzsch
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?Ivan Herman
 
Esta ld -exploring-spatio-temporal-linked-statistical-data
Esta ld -exploring-spatio-temporal-linked-statistical-dataEsta ld -exploring-spatio-temporal-linked-statistical-data
Esta ld -exploring-spatio-temporal-linked-statistical-datageoknow
 
ESTA-LD exploring spatio-temporal linked statistical data
ESTA-LD exploring spatio-temporal linked statistical dataESTA-LD exploring spatio-temporal linked statistical data
ESTA-LD exploring spatio-temporal linked statistical datageoknow
 
Imcs review 2013_04_v7
Imcs review 2013_04_v7Imcs review 2013_04_v7
Imcs review 2013_04_v7Karel Charvat
 

Semelhante a Statistical Data Modelling and Publishing in RDF (20)

IASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesIASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with Triples
 
Lecture1 introduction to big data
Lecture1 introduction to big dataLecture1 introduction to big data
Lecture1 introduction to big data
 
An R primer for SQL folks
An R primer for SQL folksAn R primer for SQL folks
An R primer for SQL folks
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
Data Mining Newspapers Metadata
Data Mining Newspapers MetadataData Mining Newspapers Metadata
Data Mining Newspapers Metadata
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4j
 
The Ariadne Project
The Ariadne ProjectThe Ariadne Project
The Ariadne Project
 
2013.05 - LDOW 2013 @ WWW 2013
2013.05 - LDOW 2013 @ WWW 20132013.05 - LDOW 2013 @ WWW 2013
2013.05 - LDOW 2013 @ WWW 2013
 
Bosch, Wackerow: Linked data on the web
Bosch, Wackerow: Linked data on the web Bosch, Wackerow: Linked data on the web
Bosch, Wackerow: Linked data on the web
 
Linked Data Usecases
Linked Data UsecasesLinked Data Usecases
Linked Data Usecases
 
Experimental transformation of ABS data into Data Cube Vocabulary (DCV) form...
Experimental transformation of  ABS data into Data Cube Vocabulary (DCV) form...Experimental transformation of  ABS data into Data Cube Vocabulary (DCV) form...
Experimental transformation of ABS data into Data Cube Vocabulary (DCV) form...
 
dwdm unit 1.ppt
dwdm unit 1.pptdwdm unit 1.ppt
dwdm unit 1.ppt
 
20141030 LinDA Workshop echallenges2014 - State of the art in open data infra...
20141030 LinDA Workshop echallenges2014 - State of the art in open data infra...20141030 LinDA Workshop echallenges2014 - State of the art in open data infra...
20141030 LinDA Workshop echallenges2014 - State of the art in open data infra...
 
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinDBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?
 
Esta ld -exploring-spatio-temporal-linked-statistical-data
Esta ld -exploring-spatio-temporal-linked-statistical-dataEsta ld -exploring-spatio-temporal-linked-statistical-data
Esta ld -exploring-spatio-temporal-linked-statistical-data
 
ESTA-LD exploring spatio-temporal linked statistical data
ESTA-LD exploring spatio-temporal linked statistical dataESTA-LD exploring spatio-temporal linked statistical data
ESTA-LD exploring spatio-temporal linked statistical data
 
Imcs review 2013_04_v7
Imcs review 2013_04_v7Imcs review 2013_04_v7
Imcs review 2013_04_v7
 

Mais de Jindřich Mynarz

EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.orgEC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.orgJindřich Mynarz
 
Applying Linked Open Data to Public Procurement
Applying Linked Open Data to Public ProcurementApplying Linked Open Data to Public Procurement
Applying Linked Open Data to Public ProcurementJindřich Mynarz
 
Integration of an Automatic Indexing System within the Document Flow of a Gre...
Integration of an Automatic Indexing System within the Document Flow of a Gre...Integration of an Automatic Indexing System within the Document Flow of a Gre...
Integration of an Automatic Indexing System within the Document Flow of a Gre...Jindřich Mynarz
 
Linked data as a library data platform
Linked data as a library data platformLinked data as a library data platform
Linked data as a library data platformJindřich Mynarz
 

Mais de Jindřich Mynarz (6)

EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.orgEC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
 
Applying Linked Open Data to Public Procurement
Applying Linked Open Data to Public ProcurementApplying Linked Open Data to Public Procurement
Applying Linked Open Data to Public Procurement
 
Linking library data
Linking library dataLinking library data
Linking library data
 
Integration of an Automatic Indexing System within the Document Flow of a Gre...
Integration of an Automatic Indexing System within the Document Flow of a Gre...Integration of an Automatic Indexing System within the Document Flow of a Gre...
Integration of an Automatic Indexing System within the Document Flow of a Gre...
 
Linked data as a library data platform
Linked data as a library data platformLinked data as a library data platform
Linked data as a library data platform
 
Linked library data
Linked library dataLinked library data
Linked library data
 

Último

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 

Último (20)

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 

Statistical Data Modelling and Publishing in RDF

  • 1. Statistical Data in RDF Knowledge Engineering Group Seminar, November 4th 2010 Jindřich Mynarz @jindrichmynarz
  • 2. Scope of the talk • not microdata (e.g., survey data) • but aggregated data (e.g., averages) • only RDF • overview of existing statistical datasets
  • 3. RDF • separation of content and layout o in tabular data table layout defines the way of interpretation • flexible, schema-less data format o not overly inclusive, nor overly exclusive
  • 4. Existing statistics in RDF • CIA World Factbook • U.S. Census 2000 dataset • LOIUS - Italian linked university statistics • Linked Environment Data • EnAKTing datasets • data.gov.uk datasets
  • 5. Eurostat data • Freie Universität Berlin - D2R Server • riese (RDFizing and Interlinking the EuroStat Data Set Effort) • OntologyCentral - real-time wrapper • Eurostat's own RDF datasets
  • 6. Governmental statistics • data.gov • data.gov.uk o EnAKTing mashups and data visualizations o population, crime, CO2 emissions, transport, agriculture, education...
  • 7. Data modelling • what is being modelled? o the real world o a part of the real world o statistics • two parts of modelling o structural semantics o domain semantics
  • 8. Structural semantics • means of expression for the cube's structure • groups, slices, time series • addressed in Data Cube vocabulary
  • 9. Domain semantics • how a dataset refers to the things that it is about • connecting statistical observations to the model of the domain described by them • domain is a set of non- information resources
  • 10. Vocabularies • number of ad hoc vocabularies • riese • SCOVO • SCOVOLink • Data Cube • SDMX/RDF
  • 11. SCOVO • The Statistical Core Vocabulary • inspired by riese vocabulary • modelling of dimensions and observations as separate resources • lightweight, easy to adopt • SCOVOLink addresses domain semantics
  • 12. Data Cube • inspired by SCOVO o added expressive power • generalization from SDMX/RDF • re-use of SKOS for codelists
  • 14. Data Cube • dimensions (rdf:Property) • coded values (skos:Concept)
  • 16. SDMX/RDF • Statistical Data and Metadata eXchange reformulated in RDF • built on top of Data Cube • contains: o sdmx o sdmx-attribute o sdmx-code o sdmx-concept o sdmx-dimension o sdmx-measure o sdmx-metadata o sdmx-subject 
  • 17. Important parts of modelling • re-use • units • time • identifiers • URI patterns
  • 18. Re-use oriented design • re-purposing parts of the existing datasets • re-using shared vocabularies • vocabulary hi-jacking and extension
  • 19. Units of measurement • implicit o “78693011 mˆ2”, “117  b” o eurostat:total_ar ea_km2 • explicit o :unit, sdmx- attribute:unitMea sure
  • 20. Modelling of time • exclusion of the dimension of time (D2R Eurostat, U.S.  Census 2000) • time dimension (riese, SDMX/RDF) o dimension:Time, sdmx:TimeRole o time series
  • 22. URI design patterns • on the Web o http:// • human-readable o what/is/this/about • clustered by resource type o type/unique-id • standardized o {provider 1}/path/to/an/observation o {provider 2}/path/to/an/observation • hierarchical o {broader}/{narrower}  • reflecting the location of an observation in a data cube o {dimension 1}/{dimension 2}
  • 23. Following steps • data conversion • interlinking dataset's resources • linking external datasets • publishing
  • 24. Legacy datasets • statistics-specific data formats • implicit context of interpretation • parsing, cleaning • conversion mechanisms o SQL DB wrappers (e.g.,  D2R Server) o real-time exporters (e.g.,  OntologyCentral) o RDFizers (e.g. RDF123) o custom-built scripts
  • 25. Linking • re-use by reference • lightweight intergration • linkable data • linking properties o e.g., owl:sameAs,  skos:closeMatch
  • 26. Publishing data • new dissemination standards • exchanging data with the Web • RDF dumps • linked data distribution • SPARQL • RDFa
  • 28. Benefits • data can be intergrated • open data • re-usable data • data available for applications
  • 29. Integration • combining and merging with other datasets • re-use oriented design
  • 30. Open data • freedom of information for public sector information • open licences o Creative Commons, Open Government Licence... • public domain
  • 31. Anyone can solve the cube • data is available for individual analysis • offices for national statistics still have the monopoly on data collection, but no longer on interpretation of that data • data-driven journalism
  • 32. Building on top of statistical data • once the data is available useful applications can be built on top of it • data visualizations • data analysis tools
  • 34. Thank you for attention!
  • 35. Image credits Semantic Web Rubik's Cube. http://www.flickr.com/photos/dullhunk/3448804778/ Rubik's Cube. http://www.flickr.com/photos/bramus/3249196137/ Hypercube. http://commons.wikimedia.org/wiki/File:Hypercube.png PICOL: Pictorial communication language. http://picol.org/ Dictionary. http://www.flickr.com/photos/horiavarlan/4268897748/ Oops! http://www.flickr.com/photos/rore/299375688/ Tape Measure. http://www.flickr.com/photos/wwarby/4915969081/ Rubik's Cube 1. http://www.flickr.com/photos/lifeontheedge/374960949/ Detroit's Skyline. http://www.flickr.com/photos/showmeone/4154861617/ Linked Oped Data Cloud. http://richard.cyganiak.de/2007/10/lod/ Cube. http://followtherhythm.deviantart.com/art/cube-128329792 Data Cube diagram. http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/qb- fig1.png