Statistical Data in RDF
Knowledge Engineering Group
Seminar, November 4th 2010
Jindřich Mynarz
@jindrichmynarz
Scope of the talk
• not microdata (e.g., survey data)
• but aggregated data (e.g., averages)
• only RDF
• overview of existing statistical datasets
RDF
• separation of content and layout
o in tabular data, the table layout determines
how the data is interpreted
• flexible, schema-less data format
o neither overly inclusive nor overly
exclusive
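A minimal sketch of the point above, assuming a hypothetical three-column table: in the table, the meaning of a cell depends on its position under a header, while each triple names its own subject and property. The `EX` namespace, the property names, and the values are invented for illustration and do not come from any of the datasets mentioned in the talk.

```python
# Hypothetical table: the header row gives the meaning of each column.
header = ["country", "year", "population"]
rows = [
    ["CZ", "2009", "100"],  # invented values, for illustration only
    ["SK", "2009", "200"],
]

EX = "http://example.org/"  # assumed namespace, not from the talk


def row_to_triples(header, row, index):
    """Turn one table row into (subject, predicate, object) triples."""
    subject = f"{EX}observation/{index}"
    return [(subject, f"{EX}{column}", value) for column, value in zip(header, row)]


triples = [t for i, row in enumerate(rows) for t in row_to_triples(header, row, i)]
for s, p, o in triples:
    print(s, p, o)
```

Once the rows are expressed this way, the statements can be reordered or merged with other data without losing their interpretation, which is not true of cells in a table.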
Existing statistics in RDF
• CIA World Factbook
• U.S. Census 2000 dataset
• LOIUS - Italian linked university statistics
• Linked Environment Data
• EnAKTing datasets
• data.gov.uk datasets
Eurostat data
• Freie Universität Berlin - D2R Server
• riese (RDFizing and Interlinking the EuroStat Data Set Effort)
• OntologyCentral - real-time wrapper
• Eurostat's own RDF datasets
Governmental statistics
• data.gov
• data.gov.uk
o EnAKTing mashups and data visualizations
o population, crime, CO2 emissions, transport, agriculture,
education...
Data modelling
• what is being modelled?
o the real world
o a part of the real world
o statistics
• two parts of modelling
o structural semantics
o domain semantics
Structural semantics
• means of expression for the cube's
structure
• groups, slices, time series
• addressed in Data Cube
vocabulary
Domain semantics
• how a dataset refers to the
things that it is about
• connecting statistical
observations to the model of
the domain described by
them
• domain is a set of non-information resources
Vocabularies
• a number of ad hoc vocabularies
• riese
• SCOVO
• SCOVOLink
• Data Cube
• SDMX/RDF
SCOVO
• The Statistical Core Vocabulary
• inspired by riese vocabulary
• modelling of dimensions and observations as separate
resources
• lightweight, easy to adopt
• SCOVOLink addresses domain semantics
Data Cube
• inspired by SCOVO
o added expressive power
• generalization from SDMX/RDF
• re-use of SKOS for codelists
Data Cube
• dimensions (rdf:Property)
• coded values (skos:Concept)
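A minimal sketch of how a single observation might look with the Data Cube vocabulary, assuming the dimension and measure properties published with SDMX/RDF (`sdmx-dimension:refArea`, `sdmx-dimension:refPeriod`, `sdmx-measure:obsValue`). The `ex:` namespace, the identifiers, and the value are hypothetical; the function only builds a Turtle string and does not validate it.

```python
# Namespaces of the Data Cube, SDMX/RDF and SKOS vocabularies; "ex" is a placeholder.
PREFIXES = {
    "qb": "http://purl.org/linked-data/cube#",
    "sdmx-dimension": "http://purl.org/linked-data/sdmx/2009/dimension#",
    "sdmx-measure": "http://purl.org/linked-data/sdmx/2009/measure#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ex": "http://example.org/",  # assumed namespace, not from the talk
}


def observation_turtle(obs_id, dataset, area, period, value):
    """Serialise one observation as a Turtle snippet (illustrative only)."""
    header = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    body = [
        # The coded value of the area dimension is a skos:Concept.
        f"ex:{area} a skos:Concept .",
        f"ex:{obs_id} a qb:Observation ;",
        f"    qb:dataSet ex:{dataset} ;",
        # Dimensions are properties that locate the observation in the cube.
        f"    sdmx-dimension:refArea ex:{area} ;",
        f'    sdmx-dimension:refPeriod "{period}" ;',
        # The measure property carries the observed value.
        f"    sdmx-measure:obsValue {value} .",
    ]
    return "\n".join(header + [""] + body)


print(observation_turtle("obs1", "population", "CZ", "2009", 100))
```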
SDMX/RDF
• Statistical Data and Metadata eXchange reformulated in
RDF
• built on top of Data Cube
• contains:
o sdmx
o sdmx-attribute
o sdmx-code
o sdmx-concept
o sdmx-dimension
o sdmx-measure
o sdmx-metadata
o sdmx-subject 
Important parts of modelling
• re-use
• units
• time
• identifiers
• URI patterns
Re-use oriented design
• re-purposing parts of the
existing datasets
• re-using shared
vocabularies
• vocabulary hijacking and
extension
Units of measurement
• implicit
o “78693011 m^2”, “117 b”
o eurostat:total_area_km2
• explicit
o :unit, sdmx-attribute:unitMeasure
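A minimal sketch of moving from an implicit unit to an explicit one, assuming the unit is embedded in a literal such as "78693011 m^2". The regular expression and the `split_unit` and `explicit_triples` helpers are hypothetical; real datasets would need a mapping from unit strings to unit resources, which is left out here.

```python
import re

# A number followed by an optional free-text unit, e.g. "78693011 m^2".
LITERAL = re.compile(r"^\s*(?P<value>[0-9]+(?:\.[0-9]+)?)\s*(?P<unit>\S.*)?$")


def split_unit(literal):
    """Split a literal with an embedded unit into (value, unit)."""
    match = LITERAL.match(literal)
    if match is None:
        raise ValueError(f"cannot parse literal: {literal!r}")
    number = match.group("value")
    value = float(number) if "." in number else int(number)
    unit = (match.group("unit") or "").strip() or None
    return value, unit


def explicit_triples(observation, literal):
    """State the value and the unit as two separate properties."""
    value, unit = split_unit(literal)
    triples = [(observation, "sdmx-measure:obsValue", value)]
    if unit is not None:
        triples.append((observation, "sdmx-attribute:unitMeasure", unit))
    return triples


print(explicit_triples("ex:obs1", "78693011 m^2"))
```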
Modelling of time
• exclusion of the dimension of time (D2R Eurostat, U.S. 
Census 2000)
• time dimension (riese, SDMX/RDF)
o dimension:Time, sdmx:TimeRole
o time series
Identifiers
• blank nodes
• URIs
• HTTP URIs
URI design patterns
• on the Web
o http://
• human-readable
o what/is/this/about
• clustered by resource type
o type/unique-id
• standardized
o {provider 1}/path/to/an/observation
o {provider 2}/path/to/an/observation
• hierarchical
o {broader}/{narrower} 
• reflecting the location of an observation in a data cube
o {dimension 1}/{dimension 2}
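A minimal sketch of the last pattern, a URI that reflects the location of an observation in a data cube. The base URI, the dataset name, and the order of the dimensions are assumptions made for illustration; the talk does not prescribe a particular layout.

```python
from urllib.parse import quote


def observation_uri(base, dataset, dimensions):
    """Build an HTTP URI of the form {base}/{dataset}/{dimension 1}/{dimension 2}/..."""
    segments = [dataset] + [str(value) for value in dimensions]
    # Percent-encode each segment so that spaces or slashes in a value
    # cannot change the structure of the path.
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"{base.rstrip('/')}/{path}"


print(observation_uri("http://example.org/observation/", "population", ["CZ", 2009]))
```

Keeping the dimension order fixed per dataset means the same observation always gets the same URI, whoever generates it.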
Next steps
• data conversion
• interlinking the dataset's resources
• linking external datasets
• publishing
Legacy datasets
• statistics-specific data formats
• implicit context of interpretation
• parsing, cleaning
• conversion mechanisms
o SQL DB wrappers (e.g., 
D2R Server)
o real-time exporters (e.g., 
OntologyCentral)
o RDFizers (e.g. RDF123)
o custom-built scripts
Linking
• re-use by reference
• lightweight integration
• linkable data
• linking properties
o e.g., owl:sameAs, 
skos:closeMatch
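A minimal sketch of generating candidate links to an external dataset by comparing labels. The matching rule (case-insensitive label equality) and the example URIs are assumptions; the sketch emits skos:closeMatch by default because owl:sameAs is a much stronger claim that would normally need checking before it is published.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"


def normalise(label):
    """Lower-case a label and collapse whitespace before comparison."""
    return " ".join(label.lower().split())


def candidate_links(local, external, predicate=SKOS_CLOSE_MATCH):
    """Link resources whose labels match.

    `local` and `external` map resource URIs to labels.
    Returns (subject, predicate, object) triples.
    """
    by_label = {}
    for uri, label in external.items():
        by_label.setdefault(normalise(label), []).append(uri)
    links = []
    for uri, label in local.items():
        for target in by_label.get(normalise(label), []):
            links.append((uri, predicate, target))
    return links


# Hypothetical resources from two datasets.
local = {"http://example.org/area/CZ": "Czech Republic"}
external = {"http://example.com/country/czech-republic": "czech  republic"}
print(candidate_links(local, external))
```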
Publishing data
• new dissemination standards
• exchanging data with the Web
• RDF dumps
• linked data distribution
• SPARQL
• RDFa
Linked open data cloud
Benefits
• data can be integrated
• open data
• re-usable data
• data available for applications
Integration
• combining and merging with other datasets
• re-use oriented design
Open data
• freedom of information for public sector information
• open licences
o Creative Commons, Open Government Licence...
• public domain
Anyone can solve the cube
• data is available for individual
analysis
• offices for national statistics
still have the monopoly on
data collection, but no longer
on interpretation of that data
• data-driven journalism
Building on top of statistical data
• once the data is available,
useful applications can be
built on top of it
• data visualizations
• data analysis tools
Questions!
Thank you for your attention!
Image credits
Semantic Web Rubik's Cube. http://www.flickr.com/photos/dullhunk/3448804778/
Rubik's Cube. http://www.flickr.com/photos/bramus/3249196137/
Hypercube. http://commons.wikimedia.org/wiki/File:Hypercube.png
PICOL: Pictorial communication language. http://picol.org/
Dictionary. http://www.flickr.com/photos/horiavarlan/4268897748/
Oops! http://www.flickr.com/photos/rore/299375688/
Tape Measure. http://www.flickr.com/photos/wwarby/4915969081/
Rubik's Cube 1. http://www.flickr.com/photos/lifeontheedge/374960949/
Detroit's Skyline. http://www.flickr.com/photos/showmeone/4154861617/
Linked Open Data Cloud. http://richard.cyganiak.de/2007/10/lod/
Cube. http://followtherhythm.deviantart.com/art/cube-128329792
Data Cube diagram. http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/qb-fig1.png

  • 5. Eurostat data • Freie Universität Berlin - D2R Server • riese (RDFizing and Interlinking the EuroStat Data Set Effort) • OntologyCentral - real-time wrapper • Eurostat's own RDF datasets
  • 6. Governmental statistics • data.gov • data.gov.uk o EnAKTing mashups and data visualizations o population, crime, CO2 emissions, transport, agriculture, education...
  • 7. Data modelling • what is being modelled? o the real world o a part of the real world o statistics • two parts of modelling o structural semantics o domain semantics
  • 8. Structural semantics • means of expression for the cube's structure • groups, slices, time series • addressed in Data Cube vocabulary
  • 9. Domain semantics • how a dataset refers to the things that it is about • connecting statistical observations to the model of the domain described by them • domain is a set of non-information resources
  • 10. Vocabularies • number of ad hoc vocabularies • riese • SCOVO • SCOVOLink • Data Cube • SDMX/RDF
  • 11. SCOVO • The Statistical Core Vocabulary • inspired by riese vocabulary • modelling of dimensions and observations as separate resources • lightweight, easy to adopt • SCOVOLink addresses domain semantics
  • 12. Data Cube • inspired by SCOVO o added expressive power • generalization from SDMX/RDF • re-use of SKOS for codelists
  • 14. Data Cube • dimensions (rdf:Property) • coded values (skos:Concept)
  • 16. SDMX/RDF • Statistical Data and Metadata eXchange reformulated in RDF • built on top of Data Cube • contains: o sdmx o sdmx-attribute o sdmx-code o sdmx-concept o sdmx-dimension o sdmx-measure o sdmx-metadata o sdmx-subject 
  • 17. Important parts of modelling • re-use • units • time • identifiers • URI patterns
  • 18. Re-use oriented design • re-purposing parts of the existing datasets • re-using shared vocabularies • vocabulary hi-jacking and extension
  • 19. Units of measurement • implicit o “78693011 m^2”, “117 b” o eurostat:total_area_km2 • explicit o :unit, sdmx-attribute:unitMeasure
  • 20. Modelling of time • exclusion of the dimension of time (D2R Eurostat, U.S. Census 2000) • time dimension (riese, SDMX/RDF) o dimension:Time, sdmx:TimeRole o time series
  • 22. URI design patterns • on the Web o http:// • human-readable o what/is/this/about • clustered by resource type o type/unique-id • standardized o {provider 1}/path/to/an/observation o {provider 2}/path/to/an/observation • hierarchical o {broader}/{narrower} • reflecting the location of an observation in a data cube o {dimension 1}/{dimension 2}
  • 23. Following steps • data conversion • interlinking dataset's resources • linking external datasets • publishing
  • 24. Legacy datasets • statistics-specific data formats • implicit context of interpretation • parsing, cleaning • conversion mechanisms o SQL DB wrappers (e.g., D2R Server) o real-time exporters (e.g., OntologyCentral) o RDFizers (e.g., RDF123) o custom-built scripts
  • 25. Linking • re-use by reference • lightweight integration • linkable data • linking properties o e.g., owl:sameAs, skos:closeMatch
  • 26. Publishing data • new dissemination standards • exchanging data with the Web • RDF dumps • linked data distribution • SPARQL • RDFa
  • 28. Benefits • data can be integrated • open data • re-usable data • data available for applications
  • 29. Integration • combining and merging with other datasets • re-use oriented design
  • 30. Open data • freedom of information for public sector information • open licences o Creative Commons, Open Government Licence... • public domain
  • 31. Anyone can solve the cube • data is available for individual analysis • offices for national statistics still have the monopoly on data collection, but no longer on interpretation of that data • data-driven journalism
  • 32. Building on top of statistical data • once the data is available, useful applications can be built on top of it • data visualizations • data analysis tools
  • 34. Thank you for your attention!
  • 35. Image credits Semantic Web Rubik's Cube. http://www.flickr.com/photos/dullhunk/3448804778/ Rubik's Cube. http://www.flickr.com/photos/bramus/3249196137/ Hypercube. http://commons.wikimedia.org/wiki/File:Hypercube.png PICOL: Pictorial communication language. http://picol.org/ Dictionary. http://www.flickr.com/photos/horiavarlan/4268897748/ Oops! http://www.flickr.com/photos/rore/299375688/ Tape Measure. http://www.flickr.com/photos/wwarby/4915969081/ Rubik's Cube 1. http://www.flickr.com/photos/lifeontheedge/374960949/ Detroit's Skyline. http://www.flickr.com/photos/showmeone/4154861617/ Linked Open Data Cloud. http://richard.cyganiak.de/2007/10/lod/ Cube. http://followtherhythm.deviantart.com/art/cube-128329792 Data Cube diagram. http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/qb-fig1.png
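
Slides 14, 19, 20 and 22 together suggest how a single observation could be written down: a URI that reflects its position in the cube ({dimension 1}/{dimension 2}), a time dimension, and an explicit unit of measurement. The Python sketch below is one possible reading of those slides, not the speaker's method. It builds a Turtle snippet with plain string formatting and uses terms from the Data Cube and SDMX/RDF vocabularies. The base URI http://example.org/, the dataset name "area", the region code "region-1", the year and the unit URI are all hypothetical placeholders; only the figure 78693011 is borrowed from the implicit-unit example on slide 19.

```python
from urllib.parse import quote

# Hypothetical base URI for illustration; not taken from the slides.
BASE = "http://example.org/"

# Namespaces of the Data Cube and SDMX/RDF vocabularies, plus XSD datatypes.
PREFIXES = {
    "qb": "http://purl.org/linked-data/cube#",
    "sdmx-dimension": "http://purl.org/linked-data/sdmx/2009/dimension#",
    "sdmx-measure": "http://purl.org/linked-data/sdmx/2009/measure#",
    "sdmx-attribute": "http://purl.org/linked-data/sdmx/2009/attribute#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def prefix_block():
    """Return the @prefix declarations needed by the Turtle snippet."""
    return "\n".join(f"@prefix {p}: <{u}> ." for p, u in PREFIXES.items())


def observation_uri(dataset, *dimension_values):
    """Build a URI clustered by resource type (observation/...) whose path
    lists the dimension values that locate the observation in the cube."""
    path = "/".join(quote(str(v), safe="") for v in dimension_values)
    return f"{BASE}observation/{quote(dataset, safe='')}/{path}"


def observation_turtle(dataset, area, year, value, unit_uri):
    """Serialize one observation with a time dimension and an explicit unit."""
    subject = observation_uri(dataset, area, year)
    lines = [
        f"<{subject}> a qb:Observation ;",
        f"    qb:dataSet <{BASE}dataset/{quote(dataset, safe='')}> ;",
        f"    sdmx-dimension:refArea <{BASE}area/{quote(area, safe='')}> ;",
        f'    sdmx-dimension:refPeriod "{year}"^^xsd:gYear ;',
        f'    sdmx-measure:obsValue "{value}"^^xsd:decimal ;',
        # Explicit unit: a separate statement instead of a "78693011 m^2" literal.
        f"    sdmx-attribute:unitMeasure <{unit_uri}> .",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(prefix_block())
    print()
    # All arguments are placeholders chosen for the example.
    print(observation_turtle("area", "region-1", 2010, 78693011,
                             BASE + "unit/square-metre"))
```

Keeping the unit as its own statement means a consumer does not have to parse it out of a literal or a property name such as eurostat:total_area_km2, which is the contrast slide 19 draws between implicit and explicit units.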