Interaction with Linked Data
Presented by:
Barry Norton
Michael Meier
Motivation: Music!
[Figure: architecture of the music portal. Components shown: data acquisition (LD Wrapper, R2R transformation, Physical Wrapper) over metadata, streaming providers, downloads, musical content and other content; integrated dataset (interlinking, cleansing, vocabulary mapping); LD dataset access (SPARQL endpoint, publishing as RDF/XML and RDFa); application (visualization module, analysis & mining module).]
EUCLID – Interaction with Linked Data
Motivation: Music! (2)
• Our aim: build a music-based portal using Linked Data technologies
• So far, we have studied different mechanisms to consume Linked Data (Chapters 1 to 3):
  • Executing SPARQL queries
  • Dereferencing URIs
  • Downloading RDF dumps
  • Extracting RDFa data
• The output of these mechanisms is data in machine-readable formats
Motivation: Music! (3)
Examples of machine-readable output:
Motivation: Music! (4)
Visualization techniques are needed in order to transform the machine-readable data into this:
Source: http://musicbrainz.fluidops.net/
Motivation: Music! (5)
In addition, visualization techniques allow for:
• Telling a story
• Engaging our pattern-matching brain
• Identifying data characteristics that cannot be directly inferred from statistical properties:
  • Anscombe’s quartet: four very different datasets with nearly identical summary statistics.
Image: http://en.wikipedia.org/wiki/Anscombe's_quartet
Source: Donaldson, I. and Lamere P. Using Visualizations for Music Discovery
Image: Chan W., Qu. H, Mak, W. Visualizing the
Semantic Structure in Classical Musical Works.
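The Anscombe's quartet point can be checked numerically. The following is a minimal sketch using the published quartet values and only the Python standard library; the helper name `summary` is introduced here for illustration.

```python
from statistics import mean, variance

# Anscombe's quartet: four (x, y) datasets with very different shapes.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summary(xs, ys):
    """Return (mean of x, mean of y, sample variance of x), rounded to 2 decimals."""
    return round(mean(xs), 2), round(mean(ys), 2), round(variance(xs), 2)


summaries = {name: summary(xs, ys) for name, (xs, ys) in quartet.items()}
for name, stats in summaries.items():
    print(name, stats)
```

All four datasets produce the same rounded summary, so the differences between them only become visible when the points are plotted.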
Agenda
1. Linked Data visualization
2. Linked Data search
3. Methods for Linked Data analysis
LINKED DATA VISUALIZATION
LD Visualization Techniques
• Linked Data visualization techniques should provide
graphical representations of the information within
the LD datasets
• Visualization techniques should be selected according to:
– The type of data: Specific types of data should be
visualized in a certain way
– The purpose of the visualization: Depending on the type
of analysis/application to employ
LD Visualization Techniques (2)
• (Raw) RDF data: Instance data, taxonomies,
ontologies, vocabularies.
• Analytically extracted data: Subset of the data, called the region of interest (ROI), obtained via data extraction mechanisms, for example, SPARQL queries.
• Visualization abstraction: It is obtained by
applying visualization transformations to render the
data into displayable information.
• View: Final result. The visual mapping
transformations obtain a graphic representation of
the data using the selected visualization technique.
• User interaction: The user interacts (click,
zoom, etc.) with the visualization, which may trigger
a new visualization process.
Overview of the Linked Data Visualization process:
RDF data → [Data extraction] → Analytically extracted data → [Visualization transformation] → Visualization abstraction → [Visual mapping transformation] → View → (optional) User interaction
Process partially based on: Brunetti, J.M.; Auer, S.; García, R. The Linked Data Visualization Model.
LD Visualization Techniques (3)
Example of the Linked Data Visualization process
Analytically extracted data:
country          releases
United Kingdom   225
United States    140
Germany          30
Luxembourg       29
SELECT ?country (COUNT(?release) AS ?releases)
WHERE {
<http://dbpedia.org/resource/The_Beatles> foaf:made
?release .
?release a mo:Release ;
mo:label ?label .
?label foaf:based_near ?country .}
GROUP BY ?country
ORDER BY DESC(?releases)
Data extraction
SPARQL query: Retrieve number of releases per
country of The Beatles
#widget : HeatMap |
input = 'country_code' |
output = {{ 'releases' }}
Visualization
transformation
country_code releases
GB 225
US 140
DE 30
LU 29
?country_code2 := REPLACE(str(?country), "http://ontologi.es/place/", "", "i")
?country_code := REPLACE(?country_code2, "%", "", "i")
Formatting the names of the countries
Visual mapping transformation: selecting the visualization technique (input, output)
Can be performed in a single step
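The transformation steps of this example can be sketched outside the widget environment. The following is a minimal illustration, not the Information Workbench implementation: the country URIs are assumed values under the http://ontologi.es/place/ prefix, and the text bars stand in for the heat map.

```python
import re

# Assumed shape of the query result: (country URI, number of releases).
# The URIs are hypothetical values under the http://ontologi.es/place/ prefix.
rows = [
    ("http://ontologi.es/place/GB", 225),
    ("http://ontologi.es/place/US", 140),
    ("http://ontologi.es/place/DE", 30),
    ("http://ontologi.es/place/LU", 29),
]

# Case-insensitive prefix match, mirroring the "i" flag of REPLACE.
PREFIX = re.compile(re.escape("http://ontologi.es/place/"), re.IGNORECASE)


def to_country_code(uri):
    """Visualization transformation: strip the URI prefix and any '%' characters."""
    return PREFIX.sub("", uri).replace("%", "")


def render_bars(table, width=40):
    """Visual mapping transformation: map each count to a bar of '#' characters."""
    largest = max(count for _, count in table)
    lines = []
    for code, count in table:
        bar = "#" * max(1, round(width * count / largest))
        lines.append(f"{code:<3}{bar} {count}")
    return lines


abstraction = [(to_country_code(uri), n) for uri, n in rows]
view = render_bars(abstraction)
print("\n".join(view))
```

Keeping the two transformations as separate functions mirrors the process model; composing them in one call corresponds to performing them in a single step.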
View: [figure: the resulting HeatMap view of The Beatles releases per country]
Challenges for Linked Data Visualization
• Enabling user interaction
– Users must be able to navigate through the data by exploiting the
connections between Linked Data resources
– The user might edit the underlying data to enrich it by:
• Creating additional metadata
• Highlighting or correcting errors
• Validating data
• Supporting data reusability
– The output (the plotted data or the visualization itself) might be
encoded using standard ontologies and vocabularies
• Scalability
– Linked Data visualization techniques should support the efficient display of large amounts of data
Challenges for Linked Open Data Visualization
• Extracting data from different repositories
– A Linked Data set might be partitioned into several repositories
– The region of interest (ROI) might include data from different data
sets, requiring the access to distributed repositories
• Handling heterogeneous data
– The same data (concepts) might be modeled differently, for example,
using different vocabularies
– Certain values might have different formats, for example, dates
represented as DD-MM-YYYY, MM-DD-YYYY or just YYYY
• Dealing with missing values
– Because Linked Data is semi-structured, some instances might have missing values for certain properties
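The date-format and missing-value issues above can be illustrated with a small normalization step. This is a sketch under assumptions: the sample values are hypothetical, and only the year is extracted because DD-MM-YYYY and MM-DD-YYYY cannot be told apart reliably without extra knowledge about the source.

```python
import re

# A four-digit group delimited by word boundaries is treated as the year.
YEAR = re.compile(r"\b(\d{4})\b")


def extract_year(value):
    """Return the four-digit year found in a date string, or None if absent."""
    if value is None:
        return None
    match = YEAR.search(value)
    return int(match.group(1)) if match else None


# Hypothetical property values as they might arrive from different datasets.
raw_dates = ["22-03-1963", "11-22-1963", "1965", None, "unknown"]

years = [extract_year(d) for d in raw_dates]
known = [y for y in years if y is not None]
missing = len(years) - len(known)

print(years)
print("missing values:", missing)
```

A visualization can then plot the known values and report the number of missing ones separately instead of silently dropping them.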
Classification of Visualization Techniques
Task: Comparison of attributes / values
• Bar/column and pie chart
• Line charts
• Histogram
Task: Analysis of relationships and hierarchies
• Graph
• Arc diagram
• Matrix
• Node-link visualizations
• Space-filling techniques: Treemaps, icicles and sunburst, circle packing and rose diagrams
Task: Analysis of temporal or geographical events
• Timeline
• Maps
Task: Analysis of multi-dimensional data
• Parallel coordinates
• Radar/star chart
• Scatter plot
Comparison of Attributes / Values
Bar/column chart: Allows the comparison of values of different categories.
Pie chart: Useful for comparing percentages or proportions.
Line chart: Displays data as a series of data points whose measurement points (x-axis) are ordered.
Histogram: Graphical representation of the distribution of the data.
Image sources: http://mbostock.github.io/protovis/ and http://musicbrainz.fluidops.net
Analysis of Relationships and Hierarchies
Graph: The data entries are represented as nodes and the links as edges.
Arc diagram: The nodes are displayed in one dimension, and the arcs represent the connections.
Adjacency matrix diagram: The nodes are displayed as rows and columns, and the links between the nodes are entries in the matrix.
Node-link visualizations: The data is organized in hierarchies.
Source of images: http://mbostock.github.io/protovis/
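To make the adjacency matrix representation concrete, the sketch below builds one from a list of links. The resource names are hypothetical placeholders, and the links are treated as directed, as RDF triples are.

```python
# Hypothetical links between resources, as (subject, object) pairs.
links = [
    ("TheBeatles", "JohnLennon"),
    ("TheBeatles", "PaulMcCartney"),
    ("JohnLennon", "PaulMcCartney"),
]


def adjacency_matrix(edges):
    """Return the sorted node list and a 0/1 matrix with one row and column per node."""
    nodes = sorted({node for edge in edges for node in edge})
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        # Row = link source, column = link target.
        matrix[index[source]][index[target]] = 1
    return nodes, matrix


nodes, matrix = adjacency_matrix(links)
for node, row in zip(nodes, matrix):
    print(f"{node:<14}", row)
```

The same edge list could feed a graph or arc diagram; only the visual mapping changes.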
Analysis of Relationships and Hierarchies (2)
Space-filling techniques:
Treemaps: Subdivide an area into rectangles.
Icicles and sunburst: Hierarchies are represented by adjacencies.
Circle packing: Containment is used to represent the hierarchies.
Rose diagrams: Sectors have equal angles, and the data is represented by the extent of each sector.
Source of images: http://mbostock.github.io/protovis/
Analysis of Temporal or Geographical Events
Timeline
• Discrete data points in time
• Continuous data in time
Maps
• Choropleth maps: Aggregate data by geographical area
• Location maps: Display geo-points on a map
• Dorling cartograms: Aggregate data and replace each area with a circle
Sources: http://mbostock.github.io/protovis/, http://www.kottke.org/08/08/2008-movie-box-office-chart, http://musicbrainz.fluidops.net, Google Map API
Analysis of Multidimensional Data
Scatter plot: Displays the values of two variables as points in the plane, which helps reveal correlations between them.
Radar/star chart: Displays multivariate data as a two-dimensional chart. The axes correspond to the variables.
Parallel coordinates: Allows visualizing high-dimensional data. Each vertical axis denotes a dimension, and a multidimensional point is represented as a polyline with vertices on the axes.
Source: http://mbostock.github.io/protovis/
Other Visualization Techniques
• Text-based visualizations: tag clouds
• Some of the previously presented techniques can be
combined to produce more complex data
visualizations
Phrase Net of Beatles Lyrics
DBpedia music genres
Source: http://www.wordle.net
Source: http://many-eyes.com
Applications of Linked Data Visualization Techniques
• Get an overview of the data
• Identification of relevant resources, classes or properties in datasets
• Learning about certain underlying characteristics of the data, e.g., vocabularies or ontologies
• Detecting missing links between nodes in an RDF graph
• Discovering new paths between nodes in an RDF graph
• Identifying hidden patterns in the data
• Finding errors or atypical values (outliers)
Linked Data Visualization Tool Requirements
The requirements for visualization tools that consume Linked Data can be
summarized as follows:
• Data navigation and exploration capabilities in order to understand the
structure and the content
• Exploiting data structures:
• Links to visualize hierarchies or graphs
• Multi-dimensional data
• User interaction:
• Basic and advanced querying
• Filtering values
• Interactive UI: responsive to the user input
• Publication/syndication of the graphical representation of the data
• Data extraction in order to export the data so that it can be reused by third parties
Linked Data Visualization Tool Types
1. LD browsers with text-based representation
• Dereference URIs to retrieve the resource description
• Use a textual representation of LD resources
• Display text and images appropriately
• Mainly support exploratory browsing and knowledge discovery
2. LD and RDF browsers with visualization options
• Exploit pictures, graphics, images and other visual representations of the data
• Support user interaction: allow for querying, filtering and jumping between resources
• Suitable for browsing and knowledge discovery as well as analytic activities
Linked Data Visualization Tool Types (2)
3. Visualization toolkits
• Frameworks providing a wide range of visualization techniques
• General toolkits support LD visualization by applying a set of
transformations of the data
• Some toolkits are specially designed to consume LD
4. SPARQL visualization
• These tools allow transforming the output of SPARQL queries
into graphics
• Contact SPARQL endpoints in order to evaluate the query
• Suitable for analytical activities
Linked Data Visualization Tool Types (3)
LD browsers with text-based presentations: Sig.ma, Sindice, OpenLink RDF Browser, Marbles, Disco Hyperdata Browser, Piggy Bank (SIMILE), Zitgist DataViewer, iLOD, URI Burner, Dipper – Talis Platform Browser
LD and RDF browsers with visualization options: Tabulator, IsaViz, OpenLink Data Explorer, RDF Gravity, RelFinder, DBpedia Mobile, LESS, SIMILE Exhibit, Haystack, FoaF Explorer, Humboldt, LENA, Noadster
Visualization toolkits:
• Linked Data tools: Information Workbench, Visual RDF (by Graves), LOD Live, LOD Visualization
• General data tools: Data-Driven Documents (D3), NetworkX, Many Eyes, Tableau, Prefuse
SPARQL visualization: Information Workbench, Google Visualization API, SPARQL package for R, Gruff (for AllegroGraph)
Linked Data Visualization Examples (1)
Sig.ma
Source: http://sig.ma/search?q=The+Beatles
Retrieves information from
different LD sources
Keyword
search
Displays
values per
predicate
Displays
the source
for each
value
Linked Data Visualization Examples (2)
Sig.ma
Source: http://sig.ma/search?q=The+Beatles
Displays
values per
predicate:
May include (redundant)
information in different
languages, for example: annés
and anno
Summary:
• Sig.ma lists all the triples and groups them per predicate
• Useful for browsing predicates and
values within data sets
• The meaning of the values is not evident
URIs are clickable, allowing
navigation through RDF
resources
Linked Data Visualization Examples (3)
Sindice
Keyword
search
Filtering
per type
of
document
Retrieves links
to documents
Allows accessing
cache documents
Allows inspecting
resources
Source: http://sindice.com/search?q=The+Beatles
Linked Data Visualization Examples (4)
Sindice
Both interfaces display the
set of triples related to the
inspected resource
Cache triples
Live triples
Linked Data Visualization Examples (5)
Information Workbench
• Demo available at: http://musicbrainz.fluidops.net
• Displays human-readable content about Linked Data
resources
• Supports visualization techniques (different types of charts,
maps, timelines, etc.) to plot results from SPARQL queries
• Allows the user to interact with the displayed data
Linked Data Visualization Examples (6)
Information Workbench: Browsing a music artist
(1) Search options (2) Search results
Linked Data Visualization Examples (7)
Information Workbench: Browsing a music artist
(3) Browsing the selected resource
Linked Data Visualization Examples (8)
Information Workbench: Visualization techniques
(3) Browsing the selected resource
Linked Data Visualization Examples (9)
Information Workbench: User interaction
LD visualizations must support navigation through the data
Source: http://musicbrainz.fluidops.net/resource/Analytical5
Linked Data Visualization Examples (9)
Information Workbench: SPARQL visualization
Implements widgets which allow:
• Retrieving ROI via SPARQL queries
• Selecting the appropriate visualization technique
• Configuring parameters of the visualization
Linked Data Visualization Examples (10)
Information Workbench: SPARQL visualization
SELECT ?release
((SUM(xsd:double(?duration/60000))) AS ?avg)
WHERE {
<http://dbpedia.org/resource/The_Beatles>
foaf:made ?release .
?release mo:record ?record .
?record mo:track ?track .
?track mo:duration ?duration .}
GROUP BY ?release
ORDER BY DESC(?avg)
LIMIT 10
SPARQL query
Result set
Top ten The Beatles releases according to the sum of track durations in minutes
Linked Data Visualization Examples (11)
Information Workbench: SPARQL visualization
Top ten The Beatles releases according to the sum of track durations in minutes
Widget
Visualization: Bar chart
{{#widget: BarChart |
query ='SELECT (COUNT(?Release) AS ?COUNT) ?label WHERE {
<http://musicbrainz.org/artist/8538e728-ca0b-4321-b7e5-cff6565dd4c0#_> foaf:made ?Release .
?Release rdf:type mo:Release .
?Release dc:title ?label .}
GROUP BY ?label
ORDER BY DESC(?COUNT)
LIMIT 20'
| settings = 'Settings:barvertical_mb'
| asynch = 'true'
| input = 'label'
| output = 'COUNT'
| height = '300'}}
Linked Data Visualization Examples (12)
Information Workbench: SPARQL visualization
Top ten The Beatles releases according to the sum of track durations in minutes
Other visualizations of the same result set …
Line chart:
Pie chart:
Linked Data Visualization Examples (13)
Information Workbench: Automated Widget Suggestion
Suggested widgets: bar chart, line chart, pie chart, table, pivot view
The user selects a suggested visualization, and the visualization is then built automatically.
Linked Data Visualization Examples (14)
Other tools
LOD live (Source: http://en.lodlive.it)
• Graph visualizations
• Interactive UI (the graph can be expanded by clicking on the nodes)
• Live access to SPARQL endpoints
LODVisualization (Source: http://lodvisualization.appspot.com)
• Hierarchy visualizations: treemaps and trees
• Live access to SPARQL endpoints (supporting JSON and SPARQL 1.1)
Linking Open Data Cloud Visualization (1)
“The Linking Open Data cloud diagram”
by Richard Cyganiak and Anja Jentzsch
Source: http://lod-cloud.net
• The nodes correspond
to Linked Data sets
• The edges represent
connections between
Linked Data sets
• The size of the nodes is
proportional to the
number of triples in
each data set
• The datasets are
categorized by
knowledge domains
represented with colors
Linking Open Data Cloud Visualization (2)
Image source: http://twitpic.com/17qj1h
“Linked Open Data Cloud” generated by Gephi
• The central cluster (green) displays DBpedia as a central focus
• The size of the nodes reflects the size of the datasets
• The length of the connections encodes information about the data structure
Source: A. Dadzie and M. Rowe. Approaches to Visualizing Linked Data: A Survey. 2011
Linking Open Data Cloud Visualization (3)
“Linked Open Data Graph” by Protovis
Source: http://inkdroid.org/lod-graph/
• The data to be displayed are
retrieved using the CKAN API
• The nodes represent Linked Data
sets available in the Data Hub “lod-
cloud” group
• The size of the nodes is proportional
to the data set size
• Edges are connections between data
sets
• The colors reflect the CKAN rating
and the intensity of the color reflects
the number of received ratings
• The nodes can be clicked to go to the
data set CKAN page
LD Reporting
• Visualization techniques are used to create the reports included in data monitoring and management solutions
• These reports provide an overview of the dataset by generating a low-level descriptive analysis:
• Quantitative information about the dataset
• Users may interact with the data via dashboards
• Some systems support this feature over structured data:
• Google Webmaster Tools (https://www.google.com/webmasters/tools)
• Information Workbench (http://www.fluidops.com/information-workbench)
• eCloudManager (http://www.fluidops.com/ecloudmanager)
Google Webmaster Tools: Structured Data Dashboard (1)
• Provides webmasters with information about the structured data embedded in their websites (and recognized by Google)
• The dashboard has three levels:
i. Site-level view: aggregates the data by classes defined in
the vocabulary schema
ii. Item-type-level view: provides details per page for each
type of resource
iii. Page-level view: shows the attributes of every type of
resource on a given web page
Google Webmaster Tools: Structured Data Dashboard (2)
Source: http://googlewebmastercentral.blogspot.de/2012/07/introducing-structured-data-dashboard.html
Site-level view
Google Webmaster Tools: Structured Data Dashboard (3)
Source: http://googlewebmastercentral.blogspot.de/2012/07/introducing-structured-data-dashboard.html
Page-level view
Site-level view
LINKED DATA SEARCH
Semantic Search Process
Using semantic models for the search process
[Figure: the semantic search process. Elements shown: user query (e.g. keywords, NL), optional query visualization, entity extraction / semantic query analysis, graph matching over the data graphs, presentation / ranking, and result visualization/presentation. The user and system stages are labeled Analysis, Presentation and Refinement; the figure also marks Semantic Search and Faceted Search.]
Image based on: Tran, T., Herzig, D., Ladwig, G. SemSearchPro - Using semantics through the search process
Image source: http://musicontology.com
Semantic Search: Example (1)
User query (NL): “songs written by members of the beatles”
Entity extraction: song | written by | member (of) | (the) beatles
Query expansion for “song”: synonyms song, track, melody, tune, …
Entity mapping: candidate mo:Track
Semantic Search: Example (2)
User query (NL): “songs written by members of the beatles”
Entity extraction: song | written by | member (of) | (the) beatles
Query expansion for “written by”: synonyms writer, composer, creator, …
Entity mapping: candidate mo:composer (inverse of “written by”)
Image source: http://musicontology.com
Semantic Search: Example (3)
User query (NL): “songs written by members of the beatles”
Entity extraction: song | written by | member (of) | (the) beatles
Query expansion for “member (of)”: mo:member_of (inverse of mo:member)
Image source: http://musicontology.com
Semantic Search: Example (4)
User query (NL): “songs written by members of the beatles”
Entity extraction: song | written by | member (of) | (the) beatles
Entity mapping for “(the) beatles”, candidates:
• Beatles (Book)
• The Beatles (Music Group)
• Beatle (Animal)
• Beatle (Automobile)
How to identify the right “Beatle”? Examine the context (contextual analysis)
Semantic Search: Example (5)
User query (NL): “songs written by members of the beatles”
Entity extraction: song | written by | member (of) | (the) beatles
Contextual analysis for “(the) beatles”: [Figure: ontology subgraph with foaf:Agent, mo:MusicArtist and mo:MusicGroup connected by rdfs:subClassOf, together with mo:composer, mo:Track and mo:member. This subgraph is part of the query.]
Entity mapping: The Beatles (Music Group) → dbpedia:The_Beatles
Semantic Search: Example (6)
User query (NL): “songs written by members of the beatles”
Entity extraction: song | written by | member (of) | (the) beatles
Query: [Figure: graph pattern with the variables ?x and ?y, the classes mo:Track and foaf:Agent, and the resource dbpedia:The_Beatles]
Results (answers presented to the user; the results could be ranked):
• (I want to) Come Home
• Angel in Disguise
• Another Day
• …
Semantic Search
• Aims at understanding the meaning of the resources specified
in the query
• Different approaches to exploit semantics:
• Query expansion using ontologies
Since ontologies represent knowledge about specific domains, they can
be used to expand the query by incorporating related ontology terms into
the query.
• Contextual analysis
In LD, this approach may explore the resources specified in the query and their
adjacent nodes in the RDF graph. Mainly applied to disambiguate query terms.
• Reasoning
In some cases, the answer to a specific query is not explicitly contained in the
data, but it can be computed by using reasoning methods.
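Query expansion and contextual analysis can be sketched with plain dictionaries. This is an illustrative toy, not how any of the engines discussed here work internally: the synonym table, the term-to-ontology mapping, and the candidate types prefixed with `ex:` are all hypothetical.

```python
# Hypothetical synonym table and term-to-ontology mapping, for illustration only.
SYNONYMS = {
    "song": ["track", "melody", "tune"],
    "written by": ["writer", "composer", "creator"],
}
ONTOLOGY_TERMS = {"track": "mo:Track", "composer": "mo:composer"}

# Hypothetical candidates for an ambiguous term, each with an assumed type.
CANDIDATES = {
    "beatles": [
        ("Beatles (Book)", "ex:Book"),
        ("The Beatles (Music Group)", "mo:MusicGroup"),
        ("Beatle (Animal)", "ex:Animal"),
        ("Beatle (Automobile)", "ex:Automobile"),
    ]
}


def expand(term):
    """Query expansion: return the term followed by its known synonyms."""
    return [term] + SYNONYMS.get(term, [])


def map_to_ontology(term):
    """Entity mapping: ontology terms matched by the term or one of its synonyms."""
    return [ONTOLOGY_TERMS[c] for c in expand(term) if c in ONTOLOGY_TERMS]


def disambiguate(term, expected_type):
    """Contextual analysis: keep the candidates whose type fits the query context."""
    return [label for label, kind in CANDIDATES.get(term, []) if kind == expected_type]


print(map_to_ontology("song"))
print(map_to_ontology("written by"))
print(disambiguate("beatles", "mo:MusicGroup"))
```

Here the expected type comes from the context: a resource that has members in the Music Ontology sense should be a music group, which rules out the other candidates.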
Semantic Search & Linked Data
Semantic Search vs. SPARQL query
Component | Semantic search | SPARQL query
Keyword or NL / concept matching | Performs entity extraction and matching to formal concepts | Not supported
Fuzzy concepts/relations/logics | Allows the application of fuzzy qualifiers as query constraints | Not supported
Graph patterns | Uses the context and other semantic information to locate interesting sub-graphs | Applies pattern matching
Path discovery | Finds new interesting links that may lead to additional information | Not supported
Semantic Search: Google (1)
Input: query in NL
Output: List of answers
Google performs semantic search on certain entities and queries!
Semantic Search: Google (2)
Input: question in NL
Output: List of web pages ranked using Google's PageRank algorithm, so that the most relevant pages are displayed first
Semantic Search: DuckDuckGo (1)
Input: question in NL
Output: List of answers
Semantic Search: DuckDuckGo (2)
Performs disambiguation of the query terms.
The 45 suggestions are grouped by classes according to their corresponding knowledge domain:
This approach is called faceted search
Faceted Search: Example
Information Workbench: Searching for artists in categories
Facet
Facet
Facet
Source: http://musicbrainz.fluidops.net/resource/mo:MusicArtist?view=pivot
Depictions of artists
Faceted Search
• Facets = properties
• Suitable for browsing multi-dimensional taxonomies based on
the search attributes
• Allows user to explore the data:
• User submits a (keyword) query
• The faceted system dynamically identifies the relevant facets (properties) for the given query and the constraints (values of those properties), and displays the search results
• User may “drill down” by applying specific constraints to the search results
• Information can be accessed and ranked in multiple ways
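The faceted search loop described above (keyword query, facet identification with counts, drill-down) can be sketched in a few lines. The records and facet names below are hypothetical sample data, and the keyword match is a simple substring test.

```python
from collections import Counter

# Hypothetical records; each property other than the name acts as a facet.
artists = [
    {"name": "The Beatles", "type": "group", "country": "GB"},
    {"name": "John Lennon", "type": "person", "country": "GB"},
    {"name": "Bob Dylan", "type": "person", "country": "US"},
    {"name": "The Beach Boys", "type": "group", "country": "US"},
    {"name": "The Who", "type": "group", "country": "GB"},
]


def keyword_search(records, keyword):
    """Return the records whose name contains the keyword (case-insensitive)."""
    return [r for r in records if keyword.lower() in r["name"].lower()]


def facet_counts(records, facets):
    """For each facet, count how many results carry each value (the preview)."""
    return {f: Counter(r[f] for r in records if f in r) for f in facets}


def drill_down(records, facet, value):
    """Restrict the results to those matching the selected constraint."""
    return [r for r in records if r.get(facet) == value]


results = keyword_search(artists, "the")
counts = facet_counts(results, ["type", "country"])
refined = drill_down(results, "country", "GB")

print(counts)
print([r["name"] for r in refined])
```

The counts are computed here by scanning every result, which is exactly the cost that large-scale systems try to avoid when they estimate facet previews.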
Faceted Search (2)
Challenges for supporting Faceted Search
• Identifying which facets to surface:
• In heterogeneous datasets, data entries may have different facets
• Dynamically identify the most appropriate facets for each query
• Ordering the facets depending on the relevance to the query
• Computing previews:
• Accurately predicting counts, without examining all the results
• Offering facet preview to give users an idea of what to expect
Source: Teevan, J., Dumais, S., Gutt, Z. Challenges for Supporting Faceted Search in Large, Heterogeneous Corpora like the Web
Faceted Search: LD Example (1)
FacetedDBLP
• Retrieves information from the DBLP collection
• Shows the result set with different facets:
• Publication years
• Authors
• Conferences
• It is implemented on top of the DBLP++ dataset (an enhancement of DBLP that includes additional keywords and abstracts):
• DBLP++ is stored in a MySQL database
• Uses D2R Server to expose the data as RDF triples
Faceted Search: LD Example (2)
Input: “crowdsourcing”
Facets
485 results
FacetedDBLP
Classification of Search Engines
Semantic
Search
Systems
Faceted
Search
Systems
Google (GKG)
Bing
KIM
sig.ma
LOD cloud cache
/facet
Longwell
mSpace
Exhibit (SIMILE)
PoolParty Semantic
Search Server
DuckDuckGo
Hakia
SenseBot
PowerSet
DeepDive
Kosmix
Factibles
Lexxe
Information Workbench
Searching for Semantic Data
Search for
• Ontologies
• Vocabularies
• RDF documents
Semantic Data Search Engines (1)
Searching for ontologies
Swoogle: http://swoogle.umbc.edu (keyword search)
Watson: http://kmi-web05.open.ac.uk/WatsonWUI (keyword search)
Semantic Data Search Engines (2)
Searching for vocabularies: LOV Portal
• Allows searching for properties, classes or vocabularies in the Linked Open Vocabularies (LOV) catalog
• The LOV search engine implements faceted search on:
• The knowledge domain
• The role of the resource matched from the input query
• The vocabulary containing the resource
• Results are ranked according to a score considering:
• Relevancy to the query (string)
• Importance of the matched element labels
• Number of LOV vocabularies that refer to the element
Semantic Data Search Engines (3)
Facets
84 results
Input: “artist”
CH 3
Searching for vocabularies: LOV Portal
Semantic Data Search Engines (4)
Searching for documents
Semantic Web Search Engine (SWSE): http://swse.deri.org
Sindice: http://sindice.com
METHODS FOR LINKED DATA ANALYSIS
Features of Data Analysis
Statistical analysis
• Allows describing the data via Exploratory Data Analysis (EDA) methods
• Includes statistical inference and prediction
Data aggregation & filtering
• One of the first steps in data analysis is pre-processing in order to select the
appropriate data to study
Visualization techniques can be built on top of these as part of data analysis
Machine learning
• Focuses on prediction
• Combines Artificial Intelligence and Statistics
• Includes supervised and unsupervised learning (not covered in this course)
LD Data Aggregation & Filtering
• Data aggregation refers to merging or summarizing several values into a single one
• Filtering allows retrieving relevant data properties and selecting a particular range of data values
• SPARQL supports these features via SELECT queries as follows:
Feature | SPARQL capabilities
Aggregation | Combining aggregate functions (COUNT, SUM, AVG, …) with the GROUP BY operator
Filtering | Combining projection with the FILTER and HAVING operators
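The roles of FILTER, GROUP BY, aggregate functions and HAVING can be mirrored on an in-memory result set. The rows below are hypothetical (release title, track duration in minutes) pairs; the comments name the SPARQL construct each step corresponds to.

```python
from collections import defaultdict

# Hypothetical rows: (release title, track duration in minutes).
tracks = [
    ("Help!", 2.3), ("Help!", 2.6),
    ("Revolver", 3.0), ("Revolver", 2.0), ("Revolver", 2.5),
    ("Single", 7.1),
]


def aggregate(rows):
    """GROUP BY release, then compute COUNT, SUM and AVG per group."""
    groups = defaultdict(list)
    for release, minutes in rows:
        groups[release].append(minutes)
    return {
        release: {"count": len(v), "sum": sum(v), "avg": sum(v) / len(v)}
        for release, v in groups.items()
    }


# FILTER(?minutes < 5): applied to individual rows before grouping.
filtered = [(release, minutes) for release, minutes in tracks if minutes < 5]

stats = aggregate(filtered)

# HAVING(COUNT(?track) >= 3): applied to whole groups after aggregation.
having = {release: s for release, s in stats.items() if s["count"] >= 3}

print(having)
```

The ordering matters: FILTER removes rows before they reach a group, whereas HAVING discards groups based on their aggregated values.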
LD Statistical Analysis
• Statistical analysis supports descriptive and predictive
operations
• SPARQL supports some descriptive operations (average,
maximum, minimum) but does not offer more sophisticated
statistical features like:
• Fitting distributions
• Linear regressions
• Analysis of variance
• …
• Some approaches are able to consume data retrieved from
SPARQL endpoints:
– “R for SPARQL” by Willem Robert van Hage & Tomi Kauppinen
– “Performing Statistical Methods on Linked Data” by Zapilko & Mathiak
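As an example of an operation that has to happen outside SPARQL, the sketch below fits a simple linear regression by ordinary least squares to values that are assumed to have been retrieved from an endpoint. The year and release counts are made-up numbers chosen to lie on a straight line.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical aggregated query result: releases per year.
years = [1963, 1964, 1965, 1966]
releases = [3, 5, 7, 9]

slope, intercept = linear_fit(years, releases)
print(f"releases grow by about {slope:.1f} per year")
```

In practice the aggregation (releases per year) would be done in SPARQL and only the model fitting would be delegated to a statistical environment such as R.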
R – Statistical Computing
• R is a language and environment for statistical computing
• R provides a wide variety of statistical and graphical
techniques
• Linear and nonlinear modeling
• Classical statistical tests
• Time-series analysis
• Classification (Machine Learning)
• Clustering (Machine Learning)
• Extensible with further functionalities
• R is available as Free Software (under the terms of the
GNU general public license)
Statistical Analysis with R
R for SPARQL
• The R for SPARQL package makes it possible to:
• Connect to a SPARQL endpoint over HTTP
• Pose a SELECT query or an UPDATE operation (LOAD, INSERT, DELETE)
• If given a SELECT query, it returns the results as a data frame
• The results can directly be mapped and visualized
• Posing requests:
• If the parameter query is given, it is assumed that the input is a SELECT query
and a GET request will be performed to get the results from the URL of the
endpoint
• If the parameter update is given, it is assumed that the input is an UPDATE operation and a POST request will be submitted to the URL of the endpoint. Nothing is returned
Source: http://linkedscience.org/tools/sparql-package-for-r/
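The GET/POST distinction described above follows the SPARQL 1.1 Protocol and can be sketched independently of R. The code below only builds the HTTP requests and does not send them; the function name `build_request` and the endpoint URL are assumptions made for illustration, and this is not the R package's implementation.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(endpoint, query=None, update=None):
    """Build (but do not send) an HTTP request for a SPARQL endpoint."""
    if (query is None) == (update is None):
        raise ValueError("give exactly one of query or update")
    if query is not None:
        # SELECT query: the query string travels in the URL of a GET request.
        return Request(
            endpoint + "?" + urlencode({"query": query}),
            headers={"Accept": "application/sparql-results+json"},
        )
    # UPDATE operation: the update string travels in the body of a POST request.
    body = urlencode({"update": update}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical endpoint URL, used only to show the shape of the requests.
endpoint = "http://example.org/sparql"
get_req = build_request(endpoint, query="SELECT * WHERE { ?s ?p ?o } LIMIT 1")
post_req = build_request(endpoint, update="LOAD <http://example.org/data.rdf>")

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```

Passing either request to `urllib.request.urlopen` would actually contact the endpoint, which is omitted here on purpose.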
R for SPARQL: Example (1)
1. Download the R package and load it:
• library(SPARQL)
• library(sp) # used for plotting spatial data
2. Define the endpoint with the triples
• endpoint = "http://spatial.linkedscience.org/sparql"
3. Define the query
• q = "SELECT ?cell ?row ?col ?polygon ?DEFOR_2002
WHERE {
?cell a <http://linkedscience.org/lsv/ns#Item> ;
<http://spatial.linkedscience.org/context/amazon/Lin> ?row ;
<http://spatial.linkedscience.org/context/amazon/Col> ?col ;
<http://observedchange.com/tisc/ns#geometry> ?polygon ;
<http://spatial.linkedscience.org/context/amazon/DEFOR_2002> ?DEFOR_2002 .
}"
Source: http://linkedscience.org/tools/sparql-package-for-r
R for SPARQL: Example (2)
4. Run the query and store the results in an object:
• res <- SPARQL(endpoint, q)$results
5. Prepare the results for plotting:
• res$row <- -res$row  # negate the row index (flips the vertical axis)
• coordinates(res) <- ~col+row  # use col and row as spatial coordinates
6. Choose the graphical format and plot the results:
• spplot(res, "DEFOR_2002", col.regions=rev(heat.colors(17))[-1], at=(0:16)/100, main="relative deforestation per pixel during 2002")
Source: http://linkedscience.org/tools/sparql-package-for-r
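For readers who do not work in R, the step from query results to a table can be approximated in Python. The sketch below assumes the endpoint answers in the W3C SPARQL JSON results format and flattens it into a list of rows, then applies the same sign flip as `res$row <- -res$row`. The function name `bindings_to_rows` and the sample response are made up for illustration and are not real deforestation data.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"
# Datatypes converted to Python numbers; everything else stays a string.
NUMERIC = {XSD + "integer": int, XSD + "decimal": float,
           XSD + "double": float, XSD + "float": float}


def bindings_to_rows(document):
    """Flatten a SPARQL JSON results document into a list of dicts."""
    variables = document["head"]["vars"]
    rows = []
    for binding in document["results"]["bindings"]:
        row = {}
        for var in variables:
            term = binding.get(var)  # unbound variables are simply absent
            if term is None:
                row[var] = None
                continue
            convert = NUMERIC.get(term.get("datatype"), str)
            row[var] = convert(term["value"])
        rows.append(row)
    return rows


# Made-up response with two cells; the values are illustrative only.
sample = json.loads("""
{"head": {"vars": ["cell", "row", "col", "DEFOR_2002"]},
 "results": {"bindings": [
   {"cell": {"type": "uri", "value": "http://example.org/cell/1"},
    "row": {"type": "literal", "datatype": "http://www.w3.org/2001/XMLSchema#integer", "value": "1"},
    "col": {"type": "literal", "datatype": "http://www.w3.org/2001/XMLSchema#integer", "value": "4"},
    "DEFOR_2002": {"type": "literal", "datatype": "http://www.w3.org/2001/XMLSchema#double", "value": "0.05"}},
   {"cell": {"type": "uri", "value": "http://example.org/cell/2"},
    "row": {"type": "literal", "datatype": "http://www.w3.org/2001/XMLSchema#integer", "value": "2"},
    "col": {"type": "literal", "datatype": "http://www.w3.org/2001/XMLSchema#integer", "value": "4"}}
 ]}}
""")

rows = bindings_to_rows(sample)
for r in rows:
    r["row"] = -r["row"]  # same sign flip as res$row <- -res$row
print(rows)
```

Rows in this shape can be handed to any plotting or data-frame library; which one to use is left open here.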
R for SPARQL: Example (3)
Source: http://linkedscience.org/tools/sparql-package-for-r
Machine Learning
• Machine Learning techniques make it possible to extract interesting information from data sources and to discover hidden patterns within datasets by generalizing from examples
• Different ML approaches can be applied:
– Clustering: groups similar data into partitions called clusters
– Association rule learning: discovers relations between variables
– Decision tree learning: analyses observations to build a predictive model represented as a tree
– Many others …
• Weka is a Data Mining framework commonly used to apply ML on tabular data:
– www.cs.waikato.ac.nz/ml/weka
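To make the clustering idea concrete, here is a minimal k-means sketch in plain Python. It is not Weka and not necessarily the algorithm a given tool would use; it only illustrates how similar rows of tabular data end up in the same partition. The function name `kmeans`, the seeding rule (the first k points) and the six sample points are assumptions chosen for the example.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of numbers; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if not members:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
                continue
            new_centroids.append(
                tuple(sum(dim) / len(members) for dim in zip(*members)))
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return assignment, centroids


# Made-up two-dimensional feature vectors forming two obvious groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (7.8, 8.2), (0.9, 1.1), (8.1, 7.9)]
assignment, centroids = kmeans(points, k=2)
print(assignment)
print(centroids)
```

Seeding with the first k points keeps the run deterministic; production tools usually pick seeds more carefully.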
Machine Learning on LD
Challenges of applying Machine Learning to LD:
• LD heterogeneity introduces noise into the data:
– The same resource is identified by different URIs
– Predicates have similar semantics but different constraints
• The data is not independent and identically distributed (iid):
– It does not consist of only one type of object
– The entities are related to each other
• LD rarely contains the negative examples needed by ML algorithms:
– For example, statements using owl:differentFrom
Source: http://www.cip.ifi.lmu.de/~nickel/iswc2012-slides
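One possible way to reduce the "same resource, different URIs" noise before learning is to collapse URIs that are already connected by owl:sameAs links onto a single representative. The sketch below does this with a small union-find structure. It assumes such links exist in the dataset; the function name `canonicalize` and all URIs are hypothetical.

```python
def canonicalize(same_as_pairs):
    """Map every URI to one representative of its owl:sameAs group."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in same_as_pairs:
        root_l, root_r = find(left), find(right)
        if root_l != root_r:
            # Keep the lexicographically smaller URI as the representative.
            keep, drop = sorted((root_l, root_r))
            parent[drop] = keep
    return {uri: find(uri) for uri in list(parent)}


# Made-up owl:sameAs statements between three datasets.
pairs = [
    ("http://a.example.org/artist/beatles", "http://b.example.org/band/42"),
    ("http://b.example.org/band/42", "http://c.example.org/id/the-beatles"),
    ("http://a.example.org/artist/queen", "http://c.example.org/id/queen"),
]
mapping = canonicalize(pairs)
for uri, representative in sorted(mapping.items()):
    print(uri, "->", representative)
```

Rewriting subjects and objects through `mapping` before feature extraction gives each real-world entity a single identifier. Choosing the smallest URI as representative is arbitrary and only keeps the output stable.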
Applications of Machine Learning on LD
• Node ranking:
– Ranking nodes according to their relevance for a query
• Link prediction:
– Infer edges between LD resources
– Predict the new edges that will be added to the RDF graph
• Entity resolution:
– Determine whether two URIs correspond to the same real-world object
• Taxonomy learning:
– Infer taxonomies or concept hierarchies from a given
vocabulary or ontology
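As a small illustration of link prediction, the sketch below ranks pairs of resources that are not yet connected by the number of neighbours they share, a simple common-neighbours heuristic. This is only one of many possible scoring schemes and is not taken from the slides. The function name `common_neighbour_scores` and the `ex:` identifiers are hypothetical, and edges are treated as undirected.

```python
from itertools import combinations


def common_neighbour_scores(edges):
    """Score each unlinked pair of nodes by how many neighbours they share."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    scores = {}
    for a, b in combinations(sorted(neighbours), 2):
        if b in neighbours[a]:
            continue  # already linked, nothing to predict
        shared = len(neighbours[a] & neighbours[b])
        if shared:
            scores[(a, b)] = shared
    # Highest score first; ties broken by node names for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up graph of related artists.
edges = [
    ("ex:artistA", "ex:artistB"),
    ("ex:artistA", "ex:artistC"),
    ("ex:artistD", "ex:artistB"),
    ("ex:artistD", "ex:artistC"),
    ("ex:artistD", "ex:artistE"),
]
ranked = common_neighbour_scores(edges)
for pair, score in ranked:
    print(pair, score)
```

The top-ranked pairs are candidates for new edges; in practice they would still need to be validated before being added to the RDF graph.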
Summary
• Linked Data visualization techniques:
– Visualizations must be chosen according to the type of the data
– A wide variety of tools support the visualization of SPARQL results
– Visualizations can be used in dashboards to support administrative tasks
• Linked Data search:
– Semantic search: exploits the meaning of user queries (NL or set of keywords) to present useful results
– Faceted search: allows browsing multi-dimensional data
• Linked Data analysis:
– Includes data manipulation such as aggregation & filtering
– Applies statistical methods to get a better understanding of the data
– Machine Learning techniques can be applied for predictive analysis
– Visualization techniques can be built on top of the previous features
For exercises, quizzes and further material, visit our website:
http://www.euclid-project.eu (eBook, Course)
Other channels: @euclid_project, euclidproject
Acknowledgements
• Alexander Mikroyannidis
• Alice Carpentier
• Andreas Harth
• Andreas Wagner
• Andriy Nikolov
• Barry Norton
• Daniel M. Herzig
• Elena Simperl
• Günter Ladwig
• Inga Shamkhalov
• Jacek Kopecky
• John Domingue

• Juan Sequeda
• Kalina Bontcheva
• Maria Maleshkova
• Maria-Esther Vidal
• Maribel Acosta
• Michael Meier
• Ning Li
• Paul Mulholland
• Peter Haase
• Richard Power
• Steffen Stadtmüller
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 

Linked Data Visualization Techniques

  • 1. Interaction with Linked Data. Presented by: Barry Norton, Michael Meier
  • 2. Motivation: Music! [Architecture diagram; labels: Metadata, Streaming providers, Downloads, Other content, Data acquisition, Physical Wrapper, LD Wrapper, R2R Transf., Integrated Dataset, Interlinking, Cleansing, Vocabulary Mapping, LD Dataset Access, SPARQL Endpoint, Publishing, RDF/XML, RDFa, Musical Content Application, Visualization Module, Analysis & Mining Module] EUCLID – Interaction with Linked Data
  • 3. Motivation: Music! (2) • Our aim: build a music-based portal using Linked Data technologies • So far, we have studied different mechanisms to consume Linked Data (chapters 1 to 3): • Executing SPARQL queries • Dereferencing URIs • Downloading RDF dumps • Extracting RDFa data • The output of these mechanisms is data in machine-readable formats
  • 4. Motivation: Music! (3) Examples of machine-readable output:
  • 5. Motivation: Music! (4) Visualization techniques are needed in order to transform the machine-readable data into this: Source: http://musicbrainz.fluidops.net/
  • 6. Motivation: Music! (5) In addition, visualization techniques allow for: • Telling a story • Engaging our pattern-matching brain • Identifying data characteristics that cannot be directly inferred from statistical properties: • Anscombe’s quartet: four very different datasets with the same statistical values. Image: http://en.wikipedia.org/wiki/Anscombe's_quartet Source: Donaldson, I. and Lamere, P. Using Visualizations for Music Discovery. Image: Chan, W., Qu, H., Mak, W. Visualizing the Semantic Structure in Classical Musical Works.
  • 7. Agenda 1. Linked Data visualization 2. Linked Data search 3. Methods for Linked Data analysis
  • 8. LINKED DATA VISUALIZATION
  • 9. LD Visualization Techniques • Linked Data visualization techniques should provide graphical representations of the information within the LD datasets • Visualization techniques should be selected according to: – The type of data: specific types of data should be visualized in a certain way – The purpose of the visualization: depending on the type of analysis or application intended
  • 10. LD Visualization Techniques (2) Overview of the Linked Data visualization process: RDF data → (data extraction) → Analytically extracted data → (visualization transformation) → Visualization abstraction → (visual mapping transformation) → View, with optional user interaction. • (Raw) RDF data: instance data, taxonomies, ontologies, vocabularies. • Analytically extracted data: a subset of the data called the region of interest (ROI), obtained via data extraction mechanisms, for example, SPARQL queries. • Visualization abstraction: obtained by applying visualization transformations that turn the data into displayable information. • View: the final result. The visual mapping transformations produce a graphic representation of the data using the selected visualization technique. • User interaction: the user interacts (click, zoom, etc.) with the visualization, which may trigger a new visualization process. Process partially based on: Brunetti, J.M.; Auer, S.; García, R. The Linked Data Visualization Model.
  • 11. LD Visualization Techniques (3) Example of the Linked Data visualization process. Data extraction (RDF data → analytically extracted data), SPARQL query retrieving the number of releases per country of The Beatles: SELECT ?country (COUNT(?release) AS ?releases) WHERE { <http://dbpedia.org/resource/The_Beatles> foaf:made ?release . ?release a mo:Release ; mo:label ?label . ?label foaf:based_near ?country . } GROUP BY ?country ORDER BY DESC(?releases) Result (country, releases): United Kingdom 225; United States 140; Germany 30; Luxembourg 29; … Visualization transformation (formatting the names of the countries): ?country_code2 := REPLACE(str(?country), "http://ontologi.es/place/", "", "i") ?country_code := REPLACE(?country_code2, "%", "", "i") Result (country_code, releases): GB 225; US 140; DE 30; LU 29; … Visual mapping transformation (selecting the visualization technique, input and output): #widget : HeatMap | input = 'country_code' | output = {{ 'releases' }} → View. Can be performed in a single step.
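The formatting step on this slide (stripping the place prefix and any "%" characters from the country URI) can be sketched outside SPARQL as well. The following is a minimal Python sketch, not the Information Workbench implementation; the function names and the shape of the input URIs are assumptions made for illustration, while the prefix and the counts come from the slide.

```python
import re

# URI prefix taken from the REPLACE call on the slide.
PLACE_PREFIX = "http://ontologi.es/place/"


def to_country_code(country_uri: str) -> str:
    """Strip the place prefix and any '%' characters from a country URI."""
    # Case-insensitive removal, mirroring the "i" flag of SPARQL REPLACE.
    # The prefix is escaped so that '.' is matched literally.
    without_prefix = re.sub(
        re.escape(PLACE_PREFIX), "", country_uri, flags=re.IGNORECASE
    )
    return without_prefix.replace("%", "")


def format_rows(rows):
    """Turn (country URI, releases) pairs into (country_code, releases) pairs."""
    return [(to_country_code(uri), releases) for uri, releases in rows]


# Hypothetical input URIs; the release counts are the ones shown on the slide.
rows = [
    ("http://ontologi.es/place/GB", 225),
    ("http://ontologi.es/place/US", 140),
]
print(format_rows(rows))
```

The resulting (country_code, releases) pairs have the same shape as the table that the HeatMap widget takes as input and output columns.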
  • 12. LD Visualization Techniques (3) Example of the Linked Data visualization process: View
  • 13. Challenges for Linked Data Visualization • Enabling user interaction – Users must be able to navigate through the data by exploiting the connections between Linked Data resources – The user might edit the underlying data to enrich it by: • Creating additional metadata • Highlighting or correcting errors • Validating data • Supporting data reusability – The output (the plotted data or the visualization itself) might be encoded using standard ontologies and vocabularies • Scalability – Linked Data visualization techniques should support the efficient display of large amounts of data
  • 14. Challenges for Linked Open Data Visualization • Extracting data from different repositories – A Linked Data set might be partitioned into several repositories – The region of interest (ROI) might include data from different data sets, requiring access to distributed repositories • Handling heterogeneous data – The same data (concepts) might be modeled differently, for example, using different vocabularies – Certain values might have different formats, for example, dates represented as DD-MM-YYYY, MM-DD-YYYY or just YYYY • Dealing with missing values – Because Linked Data is semi-structured, some instances might have missing values for certain properties
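To illustrate the heterogeneous-format and missing-value challenges, here is a minimal Python sketch that normalizes the three date formats named on the slide. It assumes that each source declares which format it uses, since DD-MM-YYYY and MM-DD-YYYY cannot be told apart reliably from the value alone; the function name and the choice of ISO 8601 output are illustrative assumptions.

```python
from datetime import datetime
from typing import Optional

# Maps the format labels used on the slide to strptime patterns.
PATTERNS = {
    "DD-MM-YYYY": "%d-%m-%Y",
    "MM-DD-YYYY": "%m-%d-%Y",
    "YYYY": "%Y",
}


def normalize_date(raw: Optional[str], source_format: str) -> Optional[str]:
    """Return an ISO 8601 date (or bare year), or None for a missing value."""
    # Missing values stay missing instead of being replaced by a guess.
    if raw is None or not raw.strip():
        return None
    parsed = datetime.strptime(raw.strip(), PATTERNS[source_format])
    # A year-only value keeps year precision; no month or day is invented.
    if source_format == "YYYY":
        return f"{parsed.year:04d}"
    return parsed.date().isoformat()


print(normalize_date("05-10-1962", "DD-MM-YYYY"))
print(normalize_date("10-05-1962", "MM-DD-YYYY"))
print(normalize_date("1962", "YYYY"))
print(normalize_date(None, "YYYY"))
```

Keeping missing values as None leaves it to the visualization layer to decide whether to skip, flag, or impute them.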
  • 15. Classification of Visualization Techniques (task: visualization techniques). Comparison of attributes/values: bar/column and pie charts, line charts, histogram. Analysis of relationships and hierarchies: graph, arc diagram, matrix, node-link visualizations, space-filling techniques (treemaps, icicles and sunburst, circle packing and rose diagrams). Analysis of temporal or geographical events: timeline, maps. Analysis of multi-dimensional data: parallel coordinates, radar/star chart, scatter plot.
  • 16. Comparison of Attributes/Values. Bar/column chart: allows the comparison of values of different categories. Pie chart: useful for comparing percentages or proportions. Line chart: visualizes data as a series of data points, where the measurement points (x-axis) are ordered. Histogram: graphical representation of the distribution of the data. Image sources: http://mbostock.github.io/protovis/ and http://musicbrainz.fluidops.net
  • 17. Analysis of Relationships and Hierarchies. Graph: the data entries are represented as nodes and the links as edges. Arc diagram: the nodes are displayed in one dimension, and the arcs represent the connections. Adjacency matrix diagram: the nodes are displayed as rows and columns, and the links between the nodes are entries in the matrix. Node-link visualizations: the data is organized in hierarchies. Source of images: http://mbostock.github.io/protovis/
  • 18. Analysis of Relationships and Hierarchies (2): space-filling techniques. Treemaps: subdivide the area into rectangles. Icicles and sunburst: hierarchies are represented by adjacencies. Circle packing: containment is used to represent the hierarchies. Rose diagrams: the areas span equal angles, and the data is represented by the extent of each area. Source of images: http://mbostock.github.io/protovis/
  • 19. Analysis of Temporal or Geographical Events. Timeline: discrete data points in time; continuous data in time. Maps: choropleth maps aggregate data by geographical area; location maps display geo-points on a map; Dorling cartograms aggregate data and replace each area with a circle. Sources: http://mbostock.github.io/protovis/, http://www.kottke.org/08/08/2008-movie-box-office-chart, http://musicbrainz.fluidops.net, Google Map API
  • 20. Analysis of Multidimensional Data. Scatter plot: displays the values of two variables as points, which helps reveal correlations between them. Radar/star chart: displays multivariate data as a two-dimensional chart; the axes correspond to the variables. Parallel coordinates: allows visualizing high-dimensional data; each vertical axis denotes a dimension, and a multidimensional point is represented as a polyline with vertices on the axes. Source: http://mbostock.github.io/protovis/
  • 21. Other Visualization Techniques • Text-based visualizations: tag clouds • Some of the previously presented techniques can be combined to produce more complex data visualizations. Examples: Phrase Net of Beatles lyrics; DBpedia music genres. Sources: http://www.wordle.net, http://many-eyes.com
  • 22. Applications of Linked Data Visualization Techniques • Getting an overview of the data • Identifying relevant resources, classes or properties in datasets • Learning about certain underlying characteristics of the data, e.g., vocabularies or ontologies • Detecting missing links between nodes in an RDF graph • Discovering new paths between nodes in an RDF graph • Identifying hidden patterns in the data • Finding errors or atypical values (outliers)
  • 23. Linked Data Visualization Tool Requirements. The requirements for visualization tools that consume Linked Data can be summarized as follows: • Data navigation and exploration capabilities in order to understand the structure and the content • Exploiting data structures: • Links to visualize hierarchies or graphs • Multi-dimensional • User interaction: • Basic and advanced querying • Filtering values • Interactive UI: responsive to user input • Publication/syndication of the graphical representation of the data • Data extraction in order to export the data so that it can be reused by third parties
  • 24. Linked Data Visualization Tool Types 1. LD browsers with text-based representation • Dereference URIs to retrieve the resource description • Use a textual representation of LD resources • Display texts and images adequately • Mainly support exploratory browsing and knowledge discovery 2. LD and RDF browsers with visualization options • Exploit pictures, graphics, images and other visual representations of the data • Support user interaction: allow querying, filtering and jumping between resources • Suitable for browsing and knowledge discovery as well as analytic activities
  • 25. Linked Data Visualization Tool Types (2) 3. Visualization toolkits • Frameworks providing a wide range of visualization techniques • General toolkits support LD visualization by applying a set of transformations to the data • Some toolkits are specially designed to consume LD 4. SPARQL visualization • These tools transform the output of SPARQL queries into graphics • Contact SPARQL endpoints in order to evaluate the query • Suitable for analytical activities
  • 26. Linked Data Visualization Tool Types (3). LD browsers with text-based presentations: Sig.ma, Sindice, OpenLink RDF Browser, Marbles, Disco Hyperdata Browser, Piggy Bank (SIMILE), Zitgist DataViewer, iLOD, URI Burner, Dipper – Talis Platform Browser. LD and RDF browsers with visualization options: Tabulator, IsaViz, OpenLink Data Explorer, RDF Gravity, RelFinder, DBpedia Mobile, LESS, SIMILE Exhibit, Haystack, FoaF Explorer, Humboldt, LENA, Noadster. Visualization toolkits: Information Workbench, Visual RDF (by Graves), LOD Live, LOD Visualization, Data-Driven Documents (D3), NetworkX, Many Eyes, Tableau, Prefuse. SPARQL visualization: Information Workbench, Google Visualization API, SPARQL package for R, Gruff (for AllegroGraph). (The slide further groups the tools in the last two categories into Linked Data tools and general data tools.)
  • 27. Linked Data Visualization Examples (1): Sig.ma. Keyword search; retrieves information from different LD sources; displays values per predicate; displays the source for each value. Source: http://sig.ma/search?q=The+Beatles
  • 28. Linked Data Visualization Examples (2): Sig.ma. Displays values per predicate; may include (redundant) information in different languages, for example: annés and anno. URIs are clickable, allowing navigation through RDF resources. Summary: • Sig.ma lists all the triples and groups them per predicate • Useful for browsing predicates and values within data sets • The meaning of the values is not evident. Source: http://sig.ma/search?q=The+Beatles
  • 29. Linked Data Visualization Examples (3): Sindice. Keyword search; filtering per type of document; retrieves links to documents; allows accessing cached documents; allows inspecting resources. Source: http://sindice.com/search?q=The+Beatles
  • 30. Linked Data Visualization Examples (4): Sindice. Cache triples and live triples: both interfaces display the set of triples related to the inspected resource
  • 31. Linked Data Visualization Examples (5): Information Workbench • Demo available at: http://musicbrainz.fluidops.net • Displays human-readable content about Linked Data resources • Supports visualization techniques (different types of charts, maps, timelines, etc.) to plot results from SPARQL queries • Allows the user to interact with the displayed data
  • 32. Linked Data Visualization Examples (6): Information Workbench, browsing a music artist. (1) Search options (2) Search results
  • 33. Linked Data Visualization Examples (7): Information Workbench, browsing a music artist. (3) Browsing the selected resource
  • 34. Linked Data Visualization Examples (8): Information Workbench, visualization techniques. (3) Browsing the selected resource
  • 35. Linked Data Visualization Examples (9): Information Workbench, user interaction. LD visualizations must support navigation through the data. Source: http://musicbrainz.fluidops.net/resource/Analytical5
  • 36. Linked Data Visualization Examples (9): Information Workbench, SPARQL visualization. Implements widgets which allow: • Retrieving the ROI via SPARQL queries • Selecting the appropriate visualization technique • Configuring parameters of the visualization
  • 37. Linked Data Visualization Examples (10): Information Workbench, SPARQL visualization. Top ten The Beatles releases according to the sum of track durations in minutes. SPARQL query (note that the alias ?avg holds a sum, not an average): SELECT ?release ((SUM(xsd:double(?duration/60000))) AS ?avg) WHERE { <http://dbpedia.org/resource/The_Beatles> foaf:made ?release . ?release mo:record ?record . ?record mo:track ?track . ?track mo:duration ?duration . } GROUP BY ?release ORDER BY DESC(?avg) LIMIT 10 Result set
  • 38. Linked Data Visualization Examples (11): Information Workbench, SPARQL visualization. Top ten The Beatles releases according to the sum of track durations in minutes. Widget visualization (bar chart): {{#widget: BarChart | query ='SELECT (COUNT(?Release) AS ?COUNT) ?label WHERE { <http://musicbrainz.org/artist/8538e728-ca0b-4321-b7e5-cff6565dd4c0#_> foaf:made ?Release . ?Release rdf:type mo:Release . ?Release dc:title ?label . } GROUP BY ?label ORDER BY DESC(?COUNT) LIMIT 20' | settings = 'Settings:barvertical_mb' | asynch = 'true' | input = 'label' | output = 'COUNT' | height = '300'}}
  • 39. Linked Data Visualization Examples (12): Information Workbench, SPARQL visualization. Top ten The Beatles releases according to the sum of track durations in minutes. Other visualizations of the same result set: line chart, pie chart
  • 40. Linked Data Visualization Examples (13): Information Workbench, automated widget suggestion. Suggested widgets: bar chart, line chart, pie chart, table, pivot view. The user selects a suggested visualization, and the visualization is built automatically
  • 41. Linked Data Visualization Examples (14): Other tools. LOD live (source: http://en.lodlive.it): • Graph visualizations • Interactive UI (the graph can be expanded by clicking on the nodes) • Live access to SPARQL endpoints. LODVisualization (source: http://lodvisualization.appspot.com): • Hierarchy visualizations: treemaps and trees • Live access to SPARQL endpoints (supporting JSON and SPARQL 1.1)
  • 42. Linking Open Data Cloud Visualization (1): “The Linking Open Data cloud diagram” by Richard Cyganiak and Anja Jentzsch. Source: http://lod-cloud.net • The nodes correspond to Linked Data sets • The edges represent connections between Linked Data sets • The size of the nodes is proportional to the number of triples in each data set • The datasets are categorized by knowledge domains, which are represented with colors
  • 43. Linking Open Data Cloud Visualization (2): “Linked Open Data Cloud” generated with Gephi. Image source: http://twitpic.com/17qj1h • The central cluster (green) displays DBpedia as a central focus • The size of the nodes reflects the size of the datasets • The length of the connections encodes information about the data structure. Source: A. Dadzie and M. Rowe. Approaches to Visualizing Linked Data: A Survey. 2011
  • 44. Linking Open Data Cloud Visualization (3): “Linked Open Data Graph” built with Protovis. Source: http://inkdroid.org/lod-graph/ • The data to be displayed are retrieved using the CKAN API • The nodes represent Linked Data sets available in the Data Hub “lod-cloud” group • The size of the nodes is proportional to the data set size • Edges are connections between data sets • The colors reflect the CKAN rating, and the intensity of the color reflects the number of received ratings • The nodes can be clicked to go to the data set's CKAN page
  • 45. LD Reporting • Visualization techniques are used in the creation of reports included in data monitoring and management solutions • Reports provide an overview of the dataset by generating a low-level descriptive analysis: • Quantitative information about the dataset • Users may interact with the data via dashboards • Some systems support this feature over structured data: • Google Webmaster Tools (https://www.google.com/webmasters/tools) • Information Workbench (http://www.fluidops.com/information-workbench) • eCloudManager (http://www.fluidops.com/ecloudmanager)
  • 46. Google Webmaster Tools: Structured Data Dashboard (1) • Provides webmasters with information about the structured data embedded in their websites (and recognized by Google) • The dashboard has three levels: i. Site-level view: aggregates the data by classes defined in the vocabulary schema ii. Item-type-level view: provides details per page for each type of resource iii. Page-level view: shows the attributes of every type of resource on a given web page
  • 47. Google Webmaster Tools: Structured Data Dashboard (2): site-level view. Source: http://googlewebmastercentral.blogspot.de/2012/07/introducing-structured-data-dashboard.html
  • 48. Google Webmaster Tools: Structured Data Dashboard (3): page-level view and site-level view. Source: http://googlewebmastercentral.blogspot.de/2012/07/introducing-structured-data-dashboard.html
  • 49. LINKED DATA SEARCH
  • 50. Semantic Search Process: using semantic models for the search process. [Diagram labels: User; System; User query (e.g. keywords, NL); Query visualization (optional); Entity extraction / semantic query analysis; Graph matching; Data graphs; Query; Presentation / Ranking; Result visualization/presentation; Analysis; Presentation; Refinement; Semantic Search; Faceted Search] Image based on: Tran, T., Herzig, D., Ladwig, G. SemSearchPro - Using semantics through the search process
  • 51. Semantic Search: Example (1). User query (NL): “songs written by members of the beatles”. Entity extraction: song; written by; member (of); (the) beatles. Query expansion for “song”: synonyms track, melody, tune, … Entity mapping: candidate mo:Track. Image source: http://musicontology.com
  • 52. Semantic Search: Example (2). User query (NL): “songs written by members of the beatles”. Entity extraction: song; written by; member (of); (the) beatles. Query expansion for “written by”: synonyms writer, composer, creator, … Entity mapping: candidate mo:composer (related to “written by” through an inverse-of relation). Image source: http://musicontology.com
  • 53. Semantic Search: Example (3). User query (NL): “songs written by members of the beatles”. Entity extraction: song; written by; member (of); (the) beatles. Query expansion and entity mapping for “member (of)”: mo:member_of, which is the inverse of mo:member. Image source: http://musicontology.com
  • 54. Semantic Search: Example (4). User query (NL): “songs written by members of the beatles”. Entity extraction: song; written by; member (of); (the) beatles. Entity mapping for “(the) beatles”, candidates: Beatles (Book), The Beatles (Music Group), Beatle (Animal), Beatle (Automobile). How to identify the right “Beatle”? Examine the context (contextual analysis)
  • 55. Semantic Search: Example (5). User query (NL): “songs written by members of the beatles”. Entity extraction: song; written by; member (of); (the) beatles. Contextual analysis for “(the) beatles”: [graph with the nodes foaf:Agent, mo:MusicArtist, mo:MusicGroup and mo:Track and the edges mo:composer, mo:member and rdfs:subClassOf]; this subgraph is part of the query. Entity mapping: The Beatles (Music Group) → dbpedia:The_Beatles
  • 56. Semantic Search: Example (6). User query (NL): “songs written by members of the beatles”. Entity extraction: song; written by; member (of); (the) beatles. Query: [graph pattern over ?x, ?y, mo:Track, foaf:Agent and dbpedia:The_Beatles]. Results: (I want to) Come Home; Angel in Disguise; Another Day; … The answers are presented to the user; the results could be ranked
  • 57. Semantic Search • Aims at understanding the meaning of the resources specified in the query • Different approaches to exploit semantics: • Query expansion using ontologies: since ontologies represent knowledge about specific domains, they can be used to expand the query by incorporating related ontology terms. • Contextual analysis: in LD, this approach may explore the resources specified in the query and their adjacent nodes in the RDF graph. Mainly applied to disambiguate query terms. • Reasoning: in some cases, the answer to a specific query is not explicitly contained in the data, but it can be computed by using reasoning methods.
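The query-expansion and entity-mapping steps from the example slides can be sketched with two lookup tables. This is a toy Python sketch under stated assumptions: the synonym lists and Music Ontology terms are the ones shown on the example slides, while the dictionaries, function names, and exact-match lookup are hypothetical simplifications of what a real semantic search engine would do.

```python
# Synonyms shown on the example slides for two extracted entities.
SYNONYMS = {
    "song": ["track", "melody", "tune"],
    "written by": ["writer", "composer", "creator"],
}

# Hypothetical label index for the Music Ontology terms named on the slides.
ONTOLOGY_TERMS = {
    "track": "mo:Track",
    "composer": "mo:composer",
    "member": "mo:member",
    "member of": "mo:member_of",
}


def expand(term: str) -> list:
    """Return the term followed by its known synonyms (query expansion)."""
    return [term] + SYNONYMS.get(term, [])


def map_to_ontology(term: str) -> list:
    """Return candidate ontology terms for an extracted entity (entity mapping)."""
    return [ONTOLOGY_TERMS[c] for c in expand(term) if c in ONTOLOGY_TERMS]


for entity in ["song", "written by", "member of"]:
    print(entity, "->", map_to_ontology(entity))
```

Without expansion, "song" would find no candidate because only "track" appears in the label index; this is the gap that expansion through synonyms closes.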
  • 58. Semantic Search & Linked Data: semantic search vs. SPARQL query (component: semantic search / SPARQL query). Keyword or NL / concept matching: performs entity extraction and matching to formal concepts / not supported. Fuzzy concepts/relations/logics: allows the application of fuzzy qualifiers as query constraints / not supported. Graph patterns: uses the context and other semantic information to locate interesting sub-graphs / applies pattern matching. Path discovery: finds new interesting links that may lead to additional information / not supported.
  • 59. Semantic Search: Google (1). Input: query in NL. Output: list of answers. Google performs semantic search on certain entities and queries!
  • 60. Semantic Search: Google (2). Input: question in NL. Output: list of web pages ranked using Google's PageRank algorithm to display the most relevant pages first
  • 61. Semantic Search: DuckDuckGo (1). Input: question in NL. Output: list of answers
  • 62. Semantic Search: DuckDuckGo (2). Performs disambiguation of the query terms. The 45 suggestions are grouped by classes according to their corresponding knowledge domain. This approach is called faceted search
  • 63. Faceted Search: Example. Information Workbench: searching for artists in categories. [Screenshot labels: three facets; depictions of artists] Source: http://musicbrainz.fluidops.net/resource/mo:MusicArtist?view=pivot
  • 64. Faceted Search • Facets = properties • Suitable for browsing multi-dimensional taxonomies based on the search attributes • Allows the user to explore the data: • The user submits a (keyword) query • The faceted system dynamically identifies the relevant facets (properties) for the given query and the constraints (values of those properties), and displays the search results • The user may “drill down” by applying specific constraints to the search results • Information can be accessed and ranked in multiple ways
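The interaction loop described on this slide (keyword query, facet identification with value counts, drill-down) can be sketched over an in-memory list of records. This is a minimal Python sketch with hypothetical records and function names; a real faceted system would work against an index or a SPARQL endpoint and would choose the facets dynamically.

```python
from collections import Counter


def keyword_search(records, keyword):
    """Return the records whose title contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [r for r in records if needle in r["title"].lower()]


def facet_counts(results, facets):
    """Count, per facet (property), how many results carry each value."""
    # Records that lack a facet are skipped, since entries may be heterogeneous.
    return {f: Counter(r[f] for r in results if f in r) for f in facets}


def drill_down(results, facet, value):
    """Keep only the results that satisfy the selected constraint."""
    return [r for r in results if r.get(facet) == value]


# Hypothetical records, invented for illustration only.
records = [
    {"title": "Crowdsourcing for search", "year": 2011, "venue": "A"},
    {"title": "Crowdsourcing linked data", "year": 2012, "venue": "B"},
    {"title": "Linked data browsers", "year": 2012, "venue": "A"},
    {"title": "Crowdsourcing quality", "year": 2012, "venue": "A"},
]

hits = keyword_search(records, "crowdsourcing")
print(facet_counts(hits, ["year", "venue"]))
print(drill_down(hits, "year", 2012))
```

The counts returned by facet_counts correspond to the numbers a faceted interface shows next to each facet value before the user selects it.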
  • 65. Faceted Search (2) Challenges for supporting Faceted Search • Identifying which facets to surface: • In heterogeneous datasets, data entries may have different facets • Dynamically identify the most appropriate facets for each query • Ordering the facets depending on the relevance to the query • Computing previews: • Accurately predicting counts, without examining all the results • Offering facet preview to give users an idea of what to expect 65EUCLID – Interaction with Linked Data Source: Teevan , J., Dumais, S., Gutt. Z. Challenges for Supporting Faceted Search in Large, Heterogeneous Corpora like the Web
  • 66. Faceted Search: LD Example (1) FacetedDBLP • Retrieves information from the DBLP collection • Shows the result set with different facets: • Publication years • Authors • Conferences • It is implemented upon the DBLP++ dataset (enhancement of DBLP including additional keywords and abstracts): • DBLP ++ is stored in a MySQL database • Uses D2R server to consume RDF triples 66EUCLID – Interaction with Linked Data
  • 67. Faceted Search: LD Example (2) 67EUCLID – Interaction with Linked Data Input: “crowdsourcing” Facets 485 results FacetedDBLP
  • 68. Classification of Search Engines 68EUCLID – Interaction with Linked Data Semantic Search Systems Faceted Search Systems Google (GKG)Bing KIM sig.ma LOD cloud cache /facet Longwell mSpace Exhibit (SIMILE) PoolParty Semantic Search Server DuckDuckGo Hakia SenseBot PowerSet DeepDive Kosmix Factibles Lexxe Information Workbench
  • 69. Searching for Semantic Data 69EUCLID – Interaction with Linked Data Search for • Ontologies • Vocabularies • RDF documents
  • 70. Semantic Data Search Engines (1) Searching for ontologies • Swoogle (http://swoogle.umbc.edu): keyword search • Watson (http://kmi-web05.open.ac.uk/WatsonWUI): keyword search
  • 71. Semantic Data Search Engines (2) Searching for vocabularies: LOV Portal • Allows searching for properties, classes or vocabularies in the Linked Open Vocabularies (LOV) catalog • The LOV search engine implements faceted search on: • The knowledge domain • The role of the resource matched by the input query • The vocabulary containing the resource • Results are ranked according to a score that considers: • Relevance to the query (string) • Importance of the matched element labels • Number of LOV vocabularies that refer to the element
  • 72. Semantic Data Search Engines (3) Searching for vocabularies: LOV Portal [Screenshot: LOV search results for the input “artist”, showing 84 results and the available facets] (CH 3)
  • 73. Semantic Data Search Engines (4) Searching for documents • Semantic Web Search Engine (http://swse.deri.org) • Sindice (http://sindice.com)
  • 74. METHODS FOR LINKED DATA ANALYSIS
  • 75. Features of Data Analysis • Statistical analysis: • Allows describing the data via Exploratory Data Analysis (EDA) methods • Includes statistical inference and prediction • Data aggregation & filtering: • One of the first steps in data analysis is pre-processing in order to select the appropriate data to study • Machine learning: • Focuses on prediction • Combines Artificial Intelligence and Statistics • Includes supervised and unsupervised learning (not covered in this course) • Visualization techniques can be built on top of these as part of data analysis
  • 76. LD Data Aggregation & Filtering • Data aggregation refers to merging/summarizing several values into a single one • Filtering allows retrieving relevant data properties and selecting a particular range of data values • SPARQL supports these features via SELECT queries as follows (feature: SPARQL capabilities): • Aggregation: combining aggregate functions (COUNT, SUM, AVG, …) with the GROUP BY operator • Filtering: combining projection with the FILTER and HAVING operators
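To make the mapping between the two features and the SPARQL operators concrete, the sketch below pairs an illustrative SELECT query with an equivalent computation in plain Python. The `ex:` vocabulary, the album rows and the function `albums_per_artist` are assumptions made up for this example; the query string is shown for comparison only and is not sent to any endpoint.

```python
from collections import Counter

# Illustrative SPARQL query over an assumed vocabulary (ex:artist, ex:year):
# artists with more than one album released in 1965 or later.
QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?artist (COUNT(?album) AS ?albums)
WHERE {
  ?album ex:artist ?artist ;
         ex:year   ?year .
  FILTER(?year >= 1965)
}
GROUP BY ?artist
HAVING (COUNT(?album) > 1)
"""

# Hypothetical rows standing in for the matched triples: (album, artist, year).
ALBUMS = [
    ("album1", "artistA", 1963),
    ("album2", "artistA", 1965),
    ("album3", "artistA", 1967),
    ("album4", "artistB", 1970),
    ("album5", "artistB", 1960),
    ("album6", "artistC", 1980),
]


def albums_per_artist(rows, min_year, min_albums):
    """Plain-Python counterpart of the query above."""
    # FILTER: keep only rows in the requested range of values.
    kept = [(album, artist) for album, artist, year in rows if year >= min_year]
    # GROUP BY + COUNT: one count per artist.
    counts = Counter(artist for _, artist in kept)
    # HAVING: filter on the aggregated value.
    return {artist: n for artist, n in counts.items() if n > min_albums}


result = albums_per_artist(ALBUMS, min_year=1965, min_albums=1)
```

Note the difference between the two filtering steps: FILTER is applied to individual solutions before grouping, while HAVING is applied to the groups after aggregation.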
  • 77. LD Statistical Analysis • Statistical analysis supports descriptive and predictive operations • SPARQL supports some descriptive operations (average, maximum, minimum) but does not offer more sophisticated statistical features such as: • Fitting distributions • Linear regressions • Analysis of variance • … • Some approaches are able to consume data retrieved from SPARQL endpoints: – “R for SPARQL” by Willem Robert van Hage & Tomi Kauppinen – “Performing Statistical Methods on Linked Data” by Zapilko & Mathiak
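As an example of an operation that has to happen outside SPARQL, the sketch below fits an ordinary least-squares line to values that are assumed to have been retrieved with a SELECT query. The rows and the variable names (`year`, `releases`) are hypothetical, and the helper `linear_regression` is written out by hand so that the block has no dependencies.

```python
# Hypothetical result rows, as a SELECT ?year ?releases query might return them.
ROWS = [
    {"year": 2000, "releases": 10},
    {"year": 2001, "releases": 12},
    {"year": 2002, "releases": 14},
    {"year": 2003, "releases": 16},
    {"year": 2004, "releases": 18},
]


def linear_regression(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = linear_regression(
    [row["year"] for row in ROWS],
    [row["releases"] for row in ROWS],
)
```

SPARQL could supply the means via AVG, but the fit itself needs the per-row deviations, which is the kind of step that tools such as R take over.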
  • 78. R – Statistical Computing • R is a language and environment for statistical computing • R provides a wide variety of statistical and graphical techniques • Linear and nonlinear modeling • Classical statistical tests • Time-series analysis • Classification (Machine Learning) • Clustering (Machine Learning) • Extensible with further functionalities • R is available as Free Software (under the terms of the GNU General Public License)
  • 79. Statistical Analysis with R
  • 80. R for SPARQL • The R for SPARQL package makes it possible to: • Connect to a SPARQL endpoint over HTTP • Pose a SELECT query or an UPDATE operation (LOAD, INSERT, DELETE) • If given a SELECT query, it returns the results as a data frame • The results can directly be mapped and visualized • Posing requests: • If the parameter query is given, it is assumed that the input is a SELECT query, and a GET request will be performed to get the results from the URL of the endpoint • If the parameter update is given, it is assumed that the input is an UPDATE operation, and a POST request will be submitted to the URL of the endpoint. Nothing is returned Source: http://linkedscience.org/tools/sparql-package-for-r/
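The GET-for-query and POST-for-update behaviour can be illustrated with Python's standard library. This is a sketch that follows the general conventions of the SPARQL 1.1 Protocol (a `query` parameter in the URL, an `update` parameter in a form-encoded body); it is not the implementation of the R package, and the endpoint URL and the function `build_request` are hypothetical. The requests are only constructed, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, used only to build the request objects.
ENDPOINT = "http://example.org/sparql"


def build_request(endpoint, query=None, update=None):
    """Build (but do not send) the HTTP request for a query or an update."""
    if (query is None) == (update is None):
        raise ValueError("give exactly one of 'query' or 'update'")
    if query is not None:
        # SELECT query: parameters travel in the URL of a GET request.
        url = endpoint + "?" + urlencode({"query": query})
        return Request(url, headers={"Accept": "application/sparql-results+xml"})
    # Update operation: parameters travel in the body of a POST request.
    body = urlencode({"update": update}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


select_req = build_request(ENDPOINT, query="SELECT * WHERE { ?s ?p ?o } LIMIT 1")
update_req = build_request(
    ENDPOINT,
    update="INSERT DATA { <http://example.org/s> <http://example.org/p> 'o' }",
)
```

Passing such a request to `urllib.request.urlopen` would actually send it; that step is left out here so that the sketch does not depend on a live endpoint.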
  • 81. R for SPARQL: Example (1) 1. Download the R packages and load them: • library(SPARQL) • library(sp) # used for plotting spatial data 2. Define the endpoint that holds the triples • endpoint = "http://spatial.linkedscience.org/sparql" 3. Define the query • q = "SELECT ?cell ?row ?col ?polygon ?DEFOR_2002 WHERE { ?cell a <http://linkedscience.org/lsv/ns#Item> ; <http://spatial.linkedscience.org/context/amazon/Lin> ?row ; <http://spatial.linkedscience.org/context/amazon/Col> ?col ; <http://observedchange.com/tisc/ns#geometry> ?polygon ; <http://spatial.linkedscience.org/context/amazon/DEFOR_2002> ?DEFOR_2002 . }" Source: http://linkedscience.org/tools/sparql-package-for-r
  • 82. R for SPARQL: Example (2) 4. Assign the result to an object • res <- SPARQL(endpoint,q)$results 5. Handle the results • res$row <- -res$row • coordinates(res) <- ~ col + row 6. Choose the graphical format and plot the results • spplot(res, "DEFOR_2002", col.regions=rev(heat.colors(17))[-1], at=(0:16)/100, main="relative deforestation per pixel during 2002") Source: http://linkedscience.org/tools/sparql-package-for-r
  • 83. R for SPARQL: Example (3) Source: http://linkedscience.org/tools/sparql-package-for-r
  • 84. Machine Learning • Machine Learning techniques make it possible to extract interesting information from data sources, and can be used to discover hidden patterns within datasets by generalizing from examples • Different ML approaches can be applied: • Clustering: groups similar data into data partitions called clusters • Association rule learning: discovers relations between variables • Decision tree learning: analyses observations to build a predictive model represented as a tree • Many others … • Weka is a Data Mining framework commonly used to apply ML on tabular data: – www.cs.waikato.ac.nz/ml/weka
  • 85. Machine Learning on LD Challenges for applying Machine Learning on LD • LD heterogeneity introduces noise into the data: – The same LD resource may have different URIs – Predicates may have similar semantics but different constraints • The data is not independent and identically distributed (iid): – It does not consist of only one type of object – The entities are related to each other • LD rarely contains the negative examples needed by ML algorithms: – For example, owl:differentFrom statements Source: http://www.cip.ifi.lmu.de/~nickel/iswc2012-slides
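One way to reduce the noise caused by several URIs denoting the same resource is to collapse known owl:sameAs links into one representative URI per resource before learning. The sketch below does this with a small union-find structure. It assumes the sameAs pairs are already available; the URIs are hypothetical, and choosing the lexicographically smallest URI as the representative is an arbitrary decision made for this example.

```python
def canonicalize(same_as_pairs):
    """Map every URI to one representative per owl:sameAs equivalence class."""
    parent = {}

    def find(uri):
        # Follow parent links to the class representative, shortening the path.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Arbitrary choice: keep the lexicographically smaller URI.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    return {uri: find(uri) for uri in list(parent)}


# Hypothetical owl:sameAs statements from three datasets.
SAME_AS = [
    ("http://a.example/The_Beatles", "http://b.example/beatles"),
    ("http://b.example/beatles", "http://c.example/artist/1"),
    ("http://a.example/Queen", "http://b.example/queen"),
]

mapping = canonicalize(SAME_AS)
```

Rewriting subjects and objects through `mapping` before building feature vectors means that statements about the same resource end up on a single node instead of being spread over its aliases.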
  • 86. Applications of Machine Learning on LD • Node ranking: – Ranking nodes according to their relevance for a query • Link prediction: – Infer edges between LD resources – Predict the new edges that will be added to the RDF graph • Entity resolution: – Determine whether two URIs correspond to the same real-world object • Taxonomy learning: – Infer taxonomies or concept hierarchies from a given vocabulary or ontology
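Link prediction can be illustrated with a very simple neighbourhood-based heuristic: two unconnected resources that share many neighbours are candidates for a new edge. The sketch below scores candidate pairs with the Jaccard coefficient of their neighbour sets. This is a baseline chosen for brevity, not a method taken from the cited slides; the graph and the function names are hypothetical, and the RDF graph is treated as undirected and unlabelled.

```python
from itertools import combinations

# Hypothetical undirected graph between resources A..E.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]


def adjacency(edges):
    """Neighbour sets for every node of an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def predict_links(edges):
    """Rank unconnected node pairs by the Jaccard overlap of their neighbours."""
    adj = adjacency(edges)
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        common = adj[u] & adj[v]
        if common:
            scores[(u, v)] = len(common) / len(adj[u] | adj[v])
    # Highest score first; ties broken by node names for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranked = predict_links(EDGES)
```

In this toy graph the pair (A, D) ranks first because A and D share two of the three neighbours they have between them. Statistical relational learning methods replace this heuristic with a learned model, but the input and output have the same shape.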
  • 87. Summary • Linked Data visualization techniques: • Visualizations must be chosen according to the type of the data • A wide variety of tools support the visualization of SPARQL results • Might be used in dashboards for supporting administrative tasks • Linked Data search: • Semantic search: exploits the meaning of user queries (NL or set of keywords) to present useful results • Faceted search: allows browsing multi-dimensional data • Linked Data analysis: • Includes data manipulation such as aggregation & filtering • Applies statistical methods to get a better understanding of the data • Machine Learning techniques can be applied for predictive analysis • Visualization techniques can be built on top of the previous features
  • 88. For exercises, quiz and further material visit our website: http://www.euclid-project.eu Other channels: @euclid_project, euclidproject
  • 89. Acknowledgements • Alexander Mikroyannidis • Alice Carpentier • Andreas Harth • Andreas Wagner • Andriy Nikolov • Barry Norton • Daniel M. Herzig • Elena Simperl • Günter Ladwig • Inga Shamkhalov • Jacek Kopecky • John Domingue
 • Juan Sequeda • Kalina Bontcheva • Maria Maleshkova • Maria-Esther Vidal • Maribel Acosta • Michael Meier • Ning Li • Paul Mulholland • Peter Haase • Richard Power • Steffen Stadtmüller 89

Editor's Notes

  7. Semantic query analysis means: query expansion using ontologies, context analysis and reasoning
  8. Source http://www.r-project.org/screenshots/RAqua-scrshot1.jpg
  9. The SPARQL package makes it possible to connect to a SPARQL endpoint over HTTP and pose a SELECT query or an update query (LOAD, INSERT, DELETE). If given a SELECT query, it returns the results as a data frame with a named column for each variable from the SELECT query; a list of prefixes and namespaces that were shortened to qnames is also returned. If given an update query, nothing is returned. If the parameter “query” is given, it is assumed the given query is a SELECT query and a GET request will be done to get the results from the URL of the endpoint. Otherwise, if the parameter “update” is given, it is assumed the given query is an update query and a POST request will be done to send the request to the URL of the endpoint.
  10. Accessing the data: At first, make sure that you have recent versions of the two R packages SPARQL and sp installed. Load the two packages by calling:
      library(SPARQL) # make sure to use at least version 1.9
      library(sp)
  Define the endpoint that will provide you with the triples by
      endpoint <- "http://spatial.linkedscience.org/sparql"
  To reduce the XML’s file size, the data is queried piece-wise. The query is initiated by
      q <- "SELECT ?cell ?row ?col ?polygon WHERE { ?cell a <http://linkedscience.org/lsv/ns#Item> ; <http://spatial.linkedscience.org/context/amazon/Lin> ?row ; <http://spatial.linkedscience.org/context/amazon/Col> ?col ; <http://observedchange.com/tisc/ns#geometry> ?polygon . }"
      res <- SPARQL(url=endpoint, q)$results
  and completed within a loop over all deforestation variables
      for(var in c("DEFOR_2002", "DEFOR_2003", "DEFOR_2004", "DEFOR_2005", "DEFOR_2006", "DEFOR_2007", "DEFOR_2008")) {
        tmp_q <- paste("SELECT ?cell ?",var,"\n WHERE { \n ?cell a <http://linkedscience.org/lsv/ns#Item> ;\n <http://spatial.linkedscience.org/context/amazon/",var,"> ?",var," .\n }\n",sep="")
        cat(tmp_q)
        res <- merge(res, SPARQL(endpoint, tmp_q)$results, by="cell")
      }
  Creating a SpatialPixelsDataFrame: We copy the results to a new object and flip the y-axis:
      amazon <- res
      amazon$row <- -res$row
  Assigning coordinates to a data.frame will result in a Spatial-object. Setting the type to gridded will produce a SpatialPixelsDataFrame:
      coordinates(amazon) <- ~ col+row
      gridded(amazon) <- TRUE
  Plotting and handling the data: As a first application, we produce a map showing relative deforestation per pixel during 2002 by:
      spplot(amazon, "DEFOR_2002", col.regions=rev(heat.colors(17))[-1], at=(0:16)/100, main="relative deforestation per pixel during 2002")