
Building Linked Data Applications


This presentation gives details on technologies and approaches towards exploiting Linked Data by building LD applications. In particular, it gives an overview of popular existing applications and introduces the main technologies that support implementation and development. Furthermore, it illustrates how data exposed through common Web APIs can be integrated with Linked Data in order to create mashups.


Building Linked Data Applications

  1. Building Linked Data Applications. Presented by: Christoph Pinkel
  2. Motivation: Music!
     • Our aim: build a music-based portal using Linked Data technologies (CH 1).
     • So far, we have studied different mechanisms to consume Linked Data (CH 2, CH 3):
       – Executing SPARQL queries
       – Dereferencing URIs
       – Downloading RDF dumps
       – Extracting RDFa data
     • The output of these mechanisms is displayed to the user by applying visualization techniques (CH 4).
     EUCLID – Building Linked Data applications
  3. Motivation: Music!
     [Architecture diagram; labels: Application, Visualization Module, Analysis & Mining Module, LD Dataset Access, SPARQL Endpoint, Publishing, Data acquisition, Integrated Dataset, Vocabulary Mapping, Interlinking, Cleansing, Physical Wrapper, LD Wrapper, R2R Transf., RDFa, RDF/XML, Streaming providers, Downloads, Musical Content, Metadata, Other content]
  4. Agenda
     1. Characterization of Linked Data applications
     2. Linked Data application architecture
     3. Linked Data application development frameworks
     4. Using Web APIs
  5. CHARACTERIZATION OF LINKED DATA APPLICATIONS
  6. Linked Data Application
     [Diagram: LD app = Consumes LD + Manipulates & Produces LD + Web app]
     LD applications have three main parts:
     • Consumes Linked Data: Systems that only consume LD are considered mashups. Consuming LD does not necessarily mean that the sources expose RDF-based data; the app may use wrappers to transform the data into Linked Data.
     • Manipulates/Produces Linked Data: Performs updates to RDF data and makes the data accessible on the Web of Data.
     • Web App/Interface: Often operates on the Web. Makes it easy to integrate and export data.
     Source: M. Hausenblas. “Linked Data Applications”
  7. Categories of Linked Data Applications
     According to their usage, the majority of current Linked Data applications fall into the following categories:
     • Generic Linked Data browsers: Dereference URIs to retrieve the resource description. Consume and expose Linked Data. Examples: Sig.ma, Sindice, Marbles, etc. (CH 4)
     • Linked Data search engines: Allow the user to submit queries. Consume and republish the retrieved data. Examples: Swoogle, Watson, etc. (CH 4)
     • Domain-specific Linked Data applications: Built for specific purposes. Examples: see later.
     Source: M. Hausenblas. “Linked Data Applications”
     Source: M. Martin and S. Auer. “Categorisation of Semantic Web Applications”
  8. Categories of Linked Data Applications (2)
     Furthermore, Linked Data applications can be classified according to the following dimensions:
     • Semantic technology depth
       – Extrinsic: Use of semantics on the surface of the application.
       – Intrinsic: Conventional technologies (e.g., RDBMS) are complemented or replaced with SW equivalents.
     • Information flow direction
       – Consuming: LD is retrieved from the source or via a wrapper.
       – Producing: Publishes LD (in RDF-based formats).
     • Semantic richness
       – Shallow: Simple taxonomies, use of RDF or RDFS.
       – Strong: High-level representation formalisms (OWL variants).
     • Semantic integration
       – Isolated: Creation of own vocabularies.
       – Integrated: Reuse of information at schema or instance level.
     Source: M. Martin and S. Auer. “Categorisation of Semantic Web Applications”
  9. Example: Data.gov.uk
     • Provides a data catalog about the UK’s governmental information.
     Source: http://data.gov.uk
  10. Example: Data.gov.uk (3)
      • A catalog of applications is available at the website.
      Source: http://data.gov.uk/apps
  11. Example: Data.gov
      • Provides a catalog about US governmental data.
      Source: http://catalog.data.gov/dataset
  12. Example: Data.gov (2)
      • App developers can build applications on top of the Data.gov data sets available at: https://catalog.data.gov/dataset
      • The platform also provides a set of applications (mobile apps and web apps) built on top of these data sets.
      Source: http://www.data.gov/research/page/research-apps
  13. Example: BBC – Dynamic Semantic Publishing
      • The BBC DSP architecture aims at automating the aggregation and publishing of interrelated content within the BBC portal.
      • Journalists are able to semantically annotate content with LD concepts through the Graffiti tool.
      • The OWLIM triple store is used to keep the RDF data and to perform reasoning over the data.
      Source: http://www.bbc.co.uk/blogs/bbcinternet/2012/04/sports_dynamic_semantic.html
  14. Example: ResearchSpace
      • The ResearchSpace environment aims at providing a set of RDF data sets and tools to describe concepts and objects related to cultural historical research.
      • The tools are highly interactive: they allow users to access the data and to contribute to the data set by creating RDF annotations.
      [Screenshots: Image Annotation; Geo Mapper]
      Source: https://sites.google.com/a/researchspace.org/researchspace/
  15. Example: ResearchSpace (2)
      The ResearchSpace infrastructure
      [Diagram annotations: RDF data is accessed via SPARQL or the Sesame OpenRDF API; Implement GUI requirements; Store and serve multiresolution images and titles; User Interface]
      Source: https://confluence.ontotext.com/display/ResearchSpace/RS+Infrastructure
  16. Example: ResearchSpace CRM Search System
      [Screenshots: Search by predicates; Faceted search]
      Source: Snapshot from https://www.youtube.com/watch?v=HCnwgq6ebAs
  17. Example: Open Pharmacology Space
      • OPS is a platform that aims at integrating pharmacological data available in open standards.
      • The OPS platform offers an API to access its data.
      • The following applications have been built on top of OPS:
        – Open PHACTS Explorer: Allows browsing the OPS data.
        – ChemBioNavigator: Visualizes the composition of a molecule group.
        – PharmaTrek: Allows navigating the content of ChEMBL.
      Source: http://www.openphacts.org/open-phacts-discovery-platform
  18. Example: Open Pharmacology Space (2)
      The OPS platform architecture
      [Architecture diagram annotations: Produces LD; Consumes LD; Semantic technology depth: intrinsic and extrinsic]
      Source: Williams A., Harland L., Groth P., et al.: Open PHACTS: Semantic interoperability for drug discovery. Drug Discovery Today, June 06, 2012
  19. Example: eCloudManager
      Use case: data center management
      • Multitude of managed resources
        – Hardware (physical storage, network, computational infrastructure)
        – Virtualization capabilities (virtual clusters, live migration)
        – Software applications
      • Multitude of APIs and data sources
      • Tool sprawl!
      Source: http://www.fluidops.com/ecloudmanager/
  20. Example: eCloudManager – Integrated View on the Data Center
      • Integration of different SW and HW components, storage systems, compute infrastructures, applications, CRM systems, ticket systems, project catalogs.
      • Automatic correlation of data retrieved from various systems.
      • Unified view on data and metadata across the border of company units.
      • Exploration, analysis, and actions based on the entire data corpus.
      [Screenshot: integrated view showing connections between hardware layer, application layer, projects, and customers. Labels: Project Data, Applications & Landscapes, Compute Infrastructure, Storage Infrastructure]
      Source: http://www.fluidops.com/ecloudmanager/
  21. ARCHITECTURE OF LINKED DATA APPLICATIONS
  22. Software Architecture
      • Denotes the structures or components of a software system.
      • It comprises:
        – Elements: Software (logic) components, databases, web servers, services, legacy systems or other types of components required in the system.
        – Relationships between the elements: Mechanisms through which the different elements within the architecture communicate.
      • Software architecture also refers to a set of practices to use or design a (software) system.
  23. Multitier Architecture
      • Logically separates the components/functions of the system into different tiers, allowing for easy reuse or replacement of a particular tier.
      • The most common form of multitier architecture is the three-tier architecture:
        – Presentation tier: Corresponds to the user interface. Translates the results into human-readable information. Example: search for the music artist ‘The Beatles’; show the sales per album of ‘The Beatles’.
        – Logic tier: Implements the business logic, analytical computation, etc.; performs detailed processing. Example: retrieve album information; aggregate information per album.
        – Data tier: Stores the data. This tier is independent of the business logic. Example: receives a SPARQL query and returns RDF results.
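The separation into three tiers can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the in-memory list `ALBUM_SALES` stands in for an RDF store, the function names are hypothetical, and the sales figures are invented placeholders rather than real data.

```python
from collections import defaultdict

# Data tier: a stand-in for a triple store. The figures are invented placeholders.
ALBUM_SALES = [
    ("The Beatles", "Revolver", 3),
    ("The Beatles", "Abbey Road", 5),
    ("The Beatles", "Revolver", 2),
    ("U2", "War", 4),
]


def query_sales(artist):
    """Data tier: return (album, units) rows for one artist."""
    return [(album, units) for a, album, units in ALBUM_SALES if a == artist]


def sales_per_album(artist):
    """Logic tier: aggregate the rows returned by the data tier."""
    totals = defaultdict(int)
    for album, units in query_sales(artist):
        totals[album] += units
    return dict(totals)


def render(artist):
    """Presentation tier: turn the aggregate into human-readable text."""
    lines = [f"Sales per album of '{artist}':"]
    for album, units in sorted(sales_per_album(artist).items()):
        lines.append(f"  {album}: {units}")
    return "\n".join(lines)


print(render("The Beatles"))
```

Because each tier only calls the one below it, the list could be swapped for a SPARQL-backed function without touching `render`.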
  24. General Architecture of Linked Data Applications
      [Architecture diagram; labels: Presentation Tier; Logic Tier; Data Tier; Integrated Dataset (Triple Store); Republication Component; Data Access Component; Data Integration Component (Vocabulary Mapping, Interlinking, Cleansing); Physical Wrapper; SPARQL Wr.; R2R Transf.; LD Wrapper; RDF/XML; Web Data accessed via APIs; SPARQL Endpoints; Relational Data; Linked Data]
  25. Architectural Patterns
      1. The Crawling Pattern: Crawls or loads data in advance. Data is managed in one triple store, thus it can be accessed efficiently. The disadvantage of this pattern is that the data might not be up to date.
      2. The On-The-Fly Dereferencing Pattern: URIs are dereferenced at the moment that the app requires the data. This pattern retrieves up-to-date data. Performance is affected when the app must dereference many URIs.
      3. The (Federated) Query Pattern: Submits complex queries to a fixed set of data sources. Enables applications to work with current data directly retrieved from the sources. Finding optimal query execution plans over a large number of sources is a complex problem.
      [Diagrams: App with Data Access and Cache (crawling); App with Data Access (on-the-fly dereferencing); App with Data Access (federated query)]
      Source: T. Heath, C. Bizer. Linked Data: Evolving the Web into a Global Data Space
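The trade-off between the first two patterns (efficient but possibly stale versus current but slower) can be illustrated with a small simulation. This is a sketch under assumptions: the `WEB` dictionary stands in for remote sources, `dereference` simulates an HTTP lookup, and the example.org URI and class names are hypothetical.

```python
# Hypothetical stand-in for the Web: URI -> description.
# A real application would issue HTTP requests instead.
WEB = {"http://example.org/artist/1": {"name": "The Beatles"}}
fetch_count = 0


def dereference(uri):
    """Simulate dereferencing a URI (one 'network' round trip per call)."""
    global fetch_count
    fetch_count += 1
    return dict(WEB[uri])


class CrawlingAccess:
    """Crawling pattern: load everything in advance, answer from the local copy."""

    def __init__(self, seeds):
        self.store = {uri: dereference(uri) for uri in seeds}

    def get(self, uri):
        return self.store[uri]  # fast, but may be stale


class OnTheFlyAccess:
    """On-the-fly dereferencing pattern: fetch when the data is needed."""

    def get(self, uri):
        return dereference(uri)  # up to date, one round trip per call


uri = "http://example.org/artist/1"
crawled = CrawlingAccess([uri])
live = OnTheFlyAccess()
WEB[uri]["name"] = "The Beatles (updated)"  # the source changes after the crawl

print(crawled.get(uri)["name"])  # value as of crawl time
print(live.get(uri)["name"])     # current value
```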
  26. Data Layer
      Data Access Component
      • Linked Data applications may implement a Mediator-Wrapper Architecture to access heterogeneous sources:
        – Wrappers are built around each data source in order to provide a unified view of the retrieved data.
      • The method to access the data depends on the Linked Data architectural pattern.
      • The factors that determine the choice of a pattern are:
        – Number of data sources to access
        – Requirement of consuming up-to-date data
        – Tolerance to high response time
        – Requirement of discovering new data sources
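A mediator-wrapper arrangement can be sketched as follows. The two source formats, the class names, and the `ex:` identifiers are assumptions made for illustration; the point is only that every wrapper exposes the same `triples()` interface, so the mediator can present one unified view.

```python
class JsonApiWrapper:
    """Wraps a (hypothetical) Web API that returns JSON-like records."""

    def __init__(self, records):
        self.records = records

    def triples(self):
        for rec in self.records:
            yield (rec["id"], "foaf:name", rec["title"])


class CsvWrapper:
    """Wraps a (hypothetical) CSV export with 'uri,name' rows."""

    def __init__(self, text):
        self.text = text

    def triples(self):
        for line in self.text.strip().splitlines():
            uri, name = line.split(",", 1)
            yield (uri, "foaf:name", name)


class Mediator:
    """Offers one unified view over all wrapped sources."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def triples(self):
        for wrapper in self.wrappers:
            yield from wrapper.triples()


mediator = Mediator([
    JsonApiWrapper([{"id": "ex:beatles", "title": "The Beatles"}]),
    CsvWrapper("ex:u2,U2\n"),
])
print(list(mediator.triples()))
```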
  27. Data Layer (2)
      Data Access Component (2)
      • The data access component may be implemented by using one or a combination of the following tools (mechanism: example tools):
        – Linked Data crawlers: LDspider (https://code.google.com/p/ldspider/), Slug (https://code.google.com/p/slug-semweb-crawler/)
        – Linked Data client libraries: Semantic Web Client Library (http://wifo5-03.informatik.uni-mannheim.de/bizer/ng4j/semwebclient/), The Tabulator (http://www.w3.org/2005/ajar/tab), Moriarty (https://code.google.com/p/moriarty/)
        – SPARQL client libraries: Jena Semantic Web Framework (http://jena.apache.org/)
        – Federated SPARQL engines: ANAPSID (https://github.com/anapsid/anapsid), FedX (http://www.fluidops.com/fedx/), SPLENDID (https://code.google.com/p/rdffederator/)
        – Search engine APIs: Sindice (http://sindice.com/developers/api), Uberblic (http://uberblic.com/)
  28. Data Layer (3)
      Data Integration Component
      • Consolidates the data retrieved from heterogeneous sources.
      • This component may operate at:
        – Schema level: Performs vocabulary mappings in order to translate data into a single unified schema. Links correspond to RDFS properties or OWL property and class axioms. (CH 2)
        – Instance level: Performs entity resolution via owl:sameAs links. In case the data sources do not provide the links, further tools like Silk or Open Refine can be used to integrate the data. (CH 3)
      [Diagram: Data Access Component; Data Integration Component (Vocabulary Mapping, Interlinking, Cleansing)]
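Instance-level consolidation via owl:sameAs links can be sketched with a small union-find structure. This is one possible approach, not the method of any tool named above; the prefixed names (`dbp:`, `mb:`, `ex:`) and the example triples are made up for illustration.

```python
def canonical_map(triples):
    """Map every URI in an owl:sameAs link to one representative of its cluster."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == "owl:sameAs":
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # keep the lexicographically smaller URI as representative
                small, large = sorted((root_s, root_o))
                parent[large] = small
    return {uri: find(uri) for uri in list(parent)}


def consolidate(triples):
    """Rewrite subjects and objects to their representative; drop the sameAs links."""
    canon = canonical_map(triples)
    return sorted({
        (canon.get(s, s), p, canon.get(o, o))
        for s, p, o in triples
        if p != "owl:sameAs"
    })


data = [
    ("dbp:The_Beatles", "owl:sameAs", "mb:b10bbbfc"),
    ("dbp:The_Beatles", "ex:genre", "Rock"),
    ("mb:b10bbbfc", "ex:origin", "Liverpool"),
]
print(consolidate(data))
```

After consolidation both facts hang off a single identifier, which is what lets a later query see genre and origin together.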
  29. Data Layer (4)
      Integrated Dataset
      • The dataset resulting from integrating and consolidating the data can be cached in an RDF store.
      • There are many solutions to deploy triple/RDF stores, e.g.:
        – OWLIM (http://www.ontotext.com/owlim)
        – Jena TDB (http://jena.apache.org/documentation/tdb/)
        – Cumulus RDF (https://code.google.com/p/cumulusrdf/)
        – AllegroGraph (http://www.franz.com/agraph/allegrograph/)
        – Virtuoso Universal Server (http://virtuoso.openlinksw.com/)
        – RDF3x (https://code.google.com/p/rdf3x/)
      [Diagram: Integrated Dataset; Republication Component]
  30. Data Layer (5)
      Republication Component
      • Exposes portions of the integrated dataset as Linked Data.
      • There are different solutions to make the data accessible:
        – Via SPARQL endpoints (e.g., Sesame OpenRDF SPARQL Endpoint, …)
        – Via APIs (e.g., Linked Data API)
        – As RDF dumps
        – With the built-in means of your framework/CMS (e.g., Drupal, Information Workbench, …)
      [Diagram: Integrated Dataset; Republication Component]
  31. Application and Presentation Layers
      • The logic layer implements sophisticated processing according to the functionalities of the application. This layer may include data mining components as well as reasoners that are not integrated in the data layer.
      • The presentation layer displays the information to the user in various formats, including text, diagrams or other types of visualization techniques. (CH 4)
  32. LINKED DATA APPLICATION DEVELOPMENT FRAMEWORKS: Information Workbench
  33. Information Workbench
      • Platform for the development of Linked Data applications
      • Semantics- & Linked Data-based integration of enterprise and open data sources
      • Intelligent data access and analytics
        – Visual exploration
        – Semantic search
        – Dashboarding and reporting
      • Collaboration and knowledge management platform
        – Wiki-based curation & authoring of data
        – Collaborative workflows
      [Diagram label: Semantic Web Data]
      Source: http://www.fluidops.com/information-workbench/
  34. Information Workbench (2)
      [Architecture diagram annotations: Customized application solutions; Reusable UI and data integration components; Data storage and management platform; External resources to reuse data and create mashups]
  35. Data Storage & Access
      Data management based on the Sesame framework
      • Open source, written in Java
      • Layered architecture for semantic data management
      • Easy to plug in new data management components on demand
      • Most of the existing triple stores support the Sesame API
      [Diagram annotations: Sesame Access API: stable (yet extensible) APIs for data access, manipulation, ...; SAIL API: stackable architecture of custom data management components, e.g., SAIL 1 (Query Optimization Layer) and SAIL 2 (Distributed Query Execution Layer); DB1, DB2, DB3: easy integration by implementing a generic API]
  36. Back-End Configuration Options
      • The back-end data store is specified via the IWB configuration properties
      • Both local and remote data access are possible
      See http://iwb.fluidops.com/resource/Help:RepositoryConfiguration
      [Diagram; labels: Local repository; Remote Sesame repository; Arbitrary SPARQL endpoint; Sesame remote repository client API; Sesame SPARQL repository client API; Sesame HTTP Server; Sesame Sail API; Sesame native store; OWLIM; SYSTAP bigdata; AllegroGraph; SPARQL endpoint; …]
  37. Data Integration: Data Provider Concept
      Data providers support the periodic extraction & integration of data from external data sources into a central repository
      • Lifting from arbitrary data formats to RDF (e.g., relational, XML, CSV)
      • Parametrizable (e.g., connection information, refresh interval, ...)
      • Built-in UI for instantiating providers
      • Intuitive interfaces and APIs for writing own, custom providers
      Steps: (1) Connect to data source, (2) Extract data from source, (3) Convert data into RDF, (4) Store RDF in repository
      Examples: R2RML, SPARQL, XML2RDF, RDF, Groovy Script
  38. Data Warehousing vs. Federation
      • Warehousing
        – Data is copied from the source into the warehouse
        – Query runs in the warehouse
        – Supported in IWB using data providers
      • Federation
        – Data remains in the federated DB
        – Query is pushed down to the federated DB
        – Supported in IWB using SPARQL federation
      [Diagrams: Query → Warehouse ← Load from DBs; Query → Federation → Query to DBs]
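The difference between the two strategies can be made concrete with a toy example. This sketch does not use the Information Workbench or FedX APIs; the two in-memory sources and all class and function names are hypothetical.

```python
# Two hypothetical sources, each a list of (artist, album) rows.
SOURCE_A = [("The Beatles", "Revolver")]
SOURCE_B = [("The Beatles", "Abbey Road"), ("U2", "War")]


def run_query(rows, artist):
    """The 'query': select the albums of one artist."""
    return [album for a, album in rows if a == artist]


class Warehouse:
    """Warehousing: copy the data once, then query the local copy."""

    def __init__(self, sources):
        self.rows = [row for src in sources for row in src]

    def albums(self, artist):
        return run_query(self.rows, artist)


class Federation:
    """Federation: keep references to the sources and push the query down to each."""

    def __init__(self, sources):
        self.sources = sources

    def albums(self, artist):
        return [album for src in self.sources for album in run_query(src, artist)]


warehouse = Warehouse([SOURCE_A, SOURCE_B])
federation = Federation([SOURCE_A, SOURCE_B])
SOURCE_A.append(("The Beatles", "Help!"))  # a source changes after loading

print(warehouse.albums("The Beatles"))   # copy taken before the change
print(federation.albums("The Beatles"))  # sees the new row
```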
  39. Virtualized Data Integration with FedX
      Information Workbench: Integration of Virtualized Data Sources as a Service
      • Transparent & on-demand integration of data sources
      [Layer diagram; labels: Application Layer (Semantic Wiki, Collaboration, Reporting & Analytics, Visual Exploration); Virtualization Layer; Data Layer (SPARQL Endpoints over Data Sources); Metadata Registry; Data Registries (CKAN, data.gov, etc.) + Enterprise Data]
      See http://iwb.fluidops.com/resource/Help:FedX
      See http://www.fluidops.com/fedx/
  40. Customizable User Interface
      [Screenshot annotations: Current resource; Navigation shortcuts; Wiki page management; View selection toolbar; Main view area]
      Demo available at http://musicbrainz.fluidops.net
  41. User Interface Concept: One Page URI
      [Diagram labels: Resource page (four instances); Graph]
  42. Data Driven UI: Ontology as “Structural Backbone”
      [Diagram labels: Resource page; UI templates (Template:…, Template:mo:MusicArtist); Ontology (RDFS/OWL); RDF Data Graph]
  43. Different Views on Every Resource
      [Screenshots: Wiki View; Table View; Graph View; Pivot View]
  44. Widget-Based User Interface
      • Widget categories: Visualization and Exploration; Analytics and Reporting; Authoring and Content Creation; Mashups with Social Media (CH 4)
      • Widgets are not static and can be integrated into the UI using a Wiki-style syntax.
  45. Example: Add Widgets to Wiki
      Example: Show top 10 released records for an artist
          {{#widget: BarChart
           | query ='SELECT distinct (COUNT(?Release) AS ?COUNT) ?label
                     WHERE {
                       ?? foaf:made ?Release .
                       ?Release rdf:type mo:Release .
                       ?Release dc:title ?label .
                     }
                     GROUP BY ?label
                     ORDER BY DESC(?COUNT)
                     LIMIT 10 '
           | input = 'label'
           | output = 'COUNT'
          }}
  46. Music Example
      Page of a class:
      • Shows an overview of MusicArtist instances
      See http://musicbrainz.fluidops.net/resource/mo:MusicArtist
  47. Music Example (2)
      Page of a class template:
      • Defines a layout for displaying each resource of the class
      • Uses semantic wiki syntax
      See http://musicbrainz.fluidops.net/resource/Template:mo:MusicArtist
  48. Music Example (3)
      Page of a class instance:
      • Displays the data about the resource according to the class template
      See http://musicbrainz.fluidops.net/resource/?uri=http%3A%2F%2Fmusicbrainz.org%2Fartist%2Fb10bbbfc-cf9e-42e0-be17-e2c3e1d2600d%23_
  49. Mashups with external sources
      • Relevant information and UI elements from external sources can be incorporated in the wiki view
      • IWB contains multiple mashup widgets for popular social media sources
        – Twitter
        – Youtube
        – Facebook
        – New York Times news
        – LinkedIn
        – …
      Example:
          {{#widget: Youtube
           | searchString = $SELECT ?x WHERE { ?? foaf:name ?x . }$
           | asynch = 'true'
          }}
      Template instantiation: ?? = http://musicbrainz.org/artist/a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432%23_ and ?x = “U2”
  50. Triple Editor
      • Edit structured data associated with a resource
      • Change, add and remove triples
      [Screenshot: Table View]
  51. Ontology-Based Data Input
      The Triple Editor takes into account the ontology definition:
      • The autosuggestion tool considers the domains and ranges of the properties
      Example: properties available for the class mo:MusicGroup are suggested automatically
  52. Validation of User Input
      Validation uses property definitions in the ontology:
      • The property myIntegerProperty has an associated rdfs:range definition.
      • This ensures that all objects must be of XML schema type xsd:integer.
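A range-based check of this kind could look like the sketch below. It is not the Information Workbench implementation: the `RANGES` table is a hypothetical stand-in for the ontology, the `ex:` prefix is assumed, and only `xsd:integer` (an optional sign followed by digits) is handled.

```python
import re

# Hypothetical ontology fragment: property -> declared rdfs:range.
RANGES = {"ex:myIntegerProperty": "xsd:integer"}

# Lexical form of xsd:integer: optional sign followed by one or more digits.
INTEGER_PATTERN = re.compile(r"[+-]?[0-9]+")


def is_valid(prop, lexical_value):
    """Check a user-supplied literal against the property's declared range."""
    declared = RANGES.get(prop)
    if declared is None:
        return True  # no range declared: nothing to check in this sketch
    if declared == "xsd:integer":
        return INTEGER_PATTERN.fullmatch(lexical_value.strip()) is not None
    raise NotImplementedError(f"range {declared} is not handled in this sketch")


print(is_valid("ex:myIntegerProperty", "42"))
print(is_valid("ex:myIntegerProperty", "forty-two"))
```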
  53. Further Information
      • Information Workbench product page: http://www.fluidops.com/information-workbench/
      • Demo system: http://musicbrainz.fluidops.net/
      • Download a free Community Edition version: http://www.fluidops.com/information-workbench/iwb-download/
      • Online documentation: http://help.fluidops.com/help/topic/iwb.help-2.5/help.html
  54. LINKED DATA APPLICATION DEVELOPMENT FRAMEWORKS: Callimachus, LMF & Synth
  55. Callimachus
      • Scalable platform for creating and running data-driven websites.
      • Can be deployed on a server, allowing users to develop their pages and applications via a Web browser.
      • Resources can be created via the user interface to build an application.
      Source: http://callimachusproject.org
  56. Linked Media Framework (LMF)
      • Offers advanced services for linked media management, built on top of:
        – Apache Marmotta (Linked Data platform)
        – Apache Stanbol (extraction and enhancement framework)
        – Apache Solr (indexing)
      • Typical use cases include:
        – Building semantic search over data
        – Publishing legacy data as Linked Data
        – Using a SKOS thesaurus for information extraction
      Source: https://code.google.com/p/lmf/
  57. Synth
      • Development environment implemented with Ruby on Rails.
      • Allows for building applications following the Semantic Hypermedia Design Method (SHDM).
      • Provides a set of modules that receive models and produce the hypermedia app described in the model:
        – Domain
        – Navigation
        – Behavior
        – Interface
      Source: http://www.tecweb.inf.puc-rio.br/synth
  58. USING WEB APIS
  59. Underlying Technology Basics
      HTTP Overview
      • HTTP, by which all documents on the WWW are served, is a client-server protocol
      • Every interaction is based on a request and a response
  60. Underlying Technology Basics (2)
      HTTP Request
      • Method
        – GET (retrieve entity identified by URI)
        – PUT (store entity under the given URI)
        – POST (submit the information as a new subordinate of the resource URI)
        – DELETE (delete entity identified by URI)
        – Additionally HEAD, TRACE, CONNECT, OPTIONS, PATCH
      • URI
      • Header
      • [optional] Body (with POST, PUT)
  61. Underlying Technology Basics (3)
      HTTP Response
      • Response code (integer)
        – 1xx: Provisional response, contains the Status-Line and optional headers.
        – 2xx: Indicates that the application’s request was successfully received, understood, and accepted.
        – 3xx: Further action needs to be taken by the user agent in order to fulfill the request.
        – 4xx: The application’s request was erroneous.
        – 5xx: The server has erred or is incapable of performing the request.
      • Header
      • [optional] Body
      Source: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
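The five response classes are determined by the first digit of the code, which a client can exploit directly. The helper below is a small illustration; the function name and the short class labels are choices made for this sketch.

```python
def status_class(code):
    """Return the class of an HTTP status code, based on its first digit."""
    classes = {
        1: "provisional",
        2: "success",
        3: "further action required",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return classes[code // 100]


for code in (200, 303, 404):
    print(code, status_class(code))
```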
  62. Underlying Technology Basics (4)
      HTTP Request – Response Pattern
      • A client can submit a request
      • The server sends a response
      Client → Web Server: HTTP GET http://en.wikipedia.org/wiki/Beatles
      Web Server → Client: 200 (OK) [HTML page about the Beatles]
  63. Underlying Technology Basics (5)
      HTTP Conneg & Linked Data URI Lookup
      • A foundational issue in Linked Data was the distinction between URIs for real-world objects and URIs for documents (e.g., RDF) that might describe them.
      • This can be handled in the HTTP header together with content negotiation (conneg):
      Client → Web Server: HTTP GET http://dbpedia.org/resource/The_Beatles with Accept: text/html
      Web Server → Client: 303 (see other) http://dbpedia.org/page/The_Beatles
  64. Underlying Technology Basics (6)
      HTTP Conneg & Linked Data URI Lookup (continued)
      Client → Web Server: HTTP GET http://dbpedia.org/resource/The_Beatles with Accept: text/turtle
      Web Server → Client: 303 (see other) http://dbpedia.org/data/beatles.n3
  65. Underlying Technology Basics (7)
      HTTP Conneg & Linked Data URI Lookup (continued)
      Client → Web Server: HTTP GET http://dbpedia.org/data/beatles.n3 with Accept: text/turtle
      Web Server → Client: 200 (OK) [RDF data about the Beatles]
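The lookup sequence from these slides (request the resource URI, receive a 303, then fetch the describing document) can be traced with a simulated server. No network request is made here: `server` is a stand-in function, and only the URLs and media types shown on the slides are used.

```python
RESOURCE = "http://dbpedia.org/resource/The_Beatles"

# Media type -> document describing the resource (URLs as shown on the slides).
DOCUMENTS = {
    "text/html": "http://dbpedia.org/page/The_Beatles",
    "text/turtle": "http://dbpedia.org/data/beatles.n3",
}


def server(url, accept):
    """Simulated server: 303 for the real-world object, 200 for a document."""
    if url == RESOURCE:
        if accept not in DOCUMENTS:
            return 406, None  # no representation for this media type
        return 303, DOCUMENTS[accept]
    if url in DOCUMENTS.values():
        return 200, f"[document at {url}]"
    return 404, None


def lookup(url, accept, max_redirects=5):
    """Client: follow 303 redirects until something other than a redirect arrives."""
    for _ in range(max_redirects):
        status, payload = server(url, accept)
        if status == 303:
            url = payload  # location of the describing document
            continue
        return status, url, payload
    raise RuntimeError("too many redirects")


print(lookup(RESOURCE, "text/html"))
print(lookup(RESOURCE, "text/turtle"))
```

The same resource URI leads to different documents depending on the Accept header, which is the point of the three slides.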
  66. Web APIs
      Motivation
      The Web has more to offer than just the retrieval of static data:
      • Data is often dynamically created as a result of some calculation carried out over input data (e.g., weather information).
      • Data can change frequently (e.g., moving objects).
      • Service endpoints, forms and APIs are used to trigger functionalities in the Web and the real world and provide access to dynamic and static data sources.
      • Web APIs provide a programming interface exposed on the Web to allow apps to make use of these functionalities.
  67. Web APIs (2)
      Motivation
      • The number of Web APIs is increasing significantly
      • Representational State Transfer (REST) plays an important role on the Web
        – Architectural style for client–server interaction
        – Focused on the Web architecture
      • ProgrammableWeb is a general directory for Web APIs:
        – Allows providers to register their API
        – Allows application developers to search for APIs
      Source: http://programmableweb.com
  68. Richardson Maturity Model for REST Services
      • Level 3: HATEOAS
      • Level 2: HTTP Verbs
      • Level 1: Resources and URIs
      Each layer builds on the concepts and technologies of the layers below.
      Source: Richardson, L. & Ruby, S.; RESTful Web Services. O'Reilly, 2007.
  69. Level 1: Thinking in Resources
      • Resources rather than service endpoints:
        – A resource is anything with which a client is able to interact
        – Real-world resources (e.g., car, movie, person…) are projected onto the Web by making the information associated with them accessible on the Web
      • Identifier
        – URIs uniquely identify (many-to-one) a resource on the Web
        – For addressing and manipulation
      • Different representations for one resource are possible
      [Diagram: inside the service boundary, a Movie Resource identified by http://rest-music.org/artist/beatles and http://rest-music.org/artist/the_beatles, with JSON, RDF/N3 and XML representations]
      Source: Webber, J.; Parastatidis, S. & Robinson, I. S.; REST in Practice - Hypermedia and Systems Architecture. O'Reilly, 2010.
  70. Level 2: The Web as a Platform
      Uniform interface for interaction
      • HTTP verbs as methods to act on resources (verb: effect; characteristic):
        – GET: retrieve the representation of a resource identified with a URI; safe, idempotent
        – PUT: create or overwrite a resource identified by a client-generated URI; idempotent
        – POST: create a resource identified by a server-generated URI
        – DELETE: delete a resource (or its representation) identified with a URI; idempotent
        – OPTIONS: request for information about the available communication options; safe, idempotent
      • Response codes to coordinate the interactions (e.g., 200 OK, 201 CREATED, 303 SEE OTHER, 404 NOT FOUND)
  71. Level 2: The Web as a Platform (2)
      Characteristics of HTTP Verbs
      • Safe: Guarantees not to change the resource on the server.
        – Example: Retrieving the representation of a resource does not change it.
          GET /artist/beatles returns: Name: The Beatles; Genre: Rock; Origin: Liverpool; …
      • Idempotent: The effect of several identical requests with an idempotent HTTP verb is the same as for a single request.
        – Example: Once a resource is deleted, deleting it again does not change anything.
          First DELETE /artist/beatles: 200 OK. Second DELETE /artist/beatles: 404 not found.
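Both properties can be checked against a toy server. The `ArtistStore` class below is a hypothetical in-memory stand-in; it illustrates that idempotence concerns the state on the server, so the second DELETE may return a different status code (404) while leaving the state unchanged.

```python
class ArtistStore:
    """In-memory stand-in for a server holding resources by path."""

    def __init__(self):
        self.resources = {"/artist/beatles": {"name": "The Beatles"}}

    def get(self, path):
        if path in self.resources:
            return 200, self.resources[path]
        return 404, None

    def put(self, path, representation):
        created = path not in self.resources
        self.resources[path] = representation
        return (201 if created else 200), representation

    def delete(self, path):
        if path in self.resources:
            del self.resources[path]
            return 200, None
        return 404, None


store = ArtistStore()

before = dict(store.resources)
store.get("/artist/beatles")
assert store.resources == before  # GET is safe: server state is unchanged

first = store.delete("/artist/beatles")
state_after_one = dict(store.resources)
second = store.delete("/artist/beatles")
assert store.resources == state_after_one  # DELETE is idempotent

print(first[0], second[0])
```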
  72. Level 3: The Hypermedia Constraint
      HATEOAS = Hypermedia As The Engine Of Application State
      • Include links in the resource representations to other relevant resources
      Request: HTTP POST http://service.org/music/order
          dbp:Revolver a dbp:Album ;
              mus:upc “094638241720” .
      Response:
          mus:001 a mus:Order .
          mus:001 db:content dbp:Revolver .
          mus:001 mus:price “10€” .
          mus:001 mus:status “awaiting payment” .
          mus:001 mus:pay mus:001_pay .
      Source: Fielding, R. T.; Architectural styles and the design of network-based software architectures, University of California, Irvine, 2000.
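A client that honors the hypermedia constraint discovers its next step from the links in the response instead of hard-coding URLs. The sketch below assumes the order representation has already been parsed into a predicate-to-object dictionary; `KNOWN_RELATIONS` and `next_actions` are hypothetical names.

```python
# Order representation from the response, as predicate -> object pairs.
order = {
    "rdf:type": "mus:Order",
    "mus:status": "awaiting payment",
    "mus:pay": "mus:001_pay",
}

# Hypothetical mapping from link relations the client understands
# to the action each one enables.
KNOWN_RELATIONS = {"mus:pay": "submit payment"}


def next_actions(representation):
    """List the state transitions the server advertises in a representation."""
    return [
        (KNOWN_RELATIONS[predicate], target)
        for predicate, target in representation.items()
        if predicate in KNOWN_RELATIONS
    ]


print(next_actions(order))
```

If the server later stops advertising `mus:pay` (for example once the order is paid), the client simply finds no such action.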
  73. Example: Freebase API
      • To retrieve RDF, Freebase offers a specific API
      • No content negotiation
      • Allows applications to retrieve a subgraph of data connected to a specific Freebase object
      • The URI is the concatenation of the RDF service URL and the Freebase identifier: https://www.googleapis.com/freebase/v1/rdf/<id>
      Source: https://developers.google.com/freebase/
  74. Example: Freebase API (2)
      • Example: GET https://www.googleapis.com/freebase/v1/rdf/m/07c0j
      • Every Freebase fact is mapped to a triple
      • Slashes in the ID are replaced by dots
      • Some facts are mapped to RDF Schema (e.g., /type/object/name to rdfs:label)
      • The response contains the first 100 values for each predicate
          @prefix ns: <http://rdf.freebase.com/ns/>.
          ns:m.07c0j ns:award.award_nominee.award_nominations ns:m.0jwnvw4;
              ns:base.websites.website.website <http://www.thebeatles.com>;
              …
      Source: https://developers.google.com/freebase/
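The two naming rules described on these slides (URL concatenation and slash-to-dot mapping) amount to simple string operations. The helper names below are hypothetical, and the sketch only builds strings: the Freebase API has since been retired, so no request is sent.

```python
RDF_SERVICE = "https://www.googleapis.com/freebase/v1/rdf"


def rdf_url(freebase_id):
    """Concatenate the RDF service URL and a Freebase id such as '/m/07c0j'."""
    if not freebase_id.startswith("/"):
        raise ValueError("Freebase ids start with '/'")
    return RDF_SERVICE + freebase_id


def rdf_local_name(freebase_id):
    """Local name in the ns: namespace: drop the leading slash, turn the rest into dots."""
    return freebase_id.lstrip("/").replace("/", ".")


print(rdf_url("/m/07c0j"))
print("ns:" + rdf_local_name("/m/07c0j"))
```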
  75. Well-Known Non-RDF Web APIs
      • Twitter: Provides access to timelines, tweets, direct messages between users, followers, users, places, … See http://dev.twitter.com
      • LastFM: Provides access to music-related resources: albums, artists, groups, events, venues, … See http://www.last.fm/api
      • Foursquare: Lets users check in at their current location, create tips and lists, access recommendations, … See http://developer.foursquare.com
      • …
  76. Summary
      • Linked Data applications
        – LD application = Consumes LD + Manipulates/Produces LD + Web app
        – Can be categorized according to the following dimensions: semantic technology depth, information flow direction, semantic richness, semantic integration
      • Architecture of Linked Data applications
        – Multitier combined with a wrapper-mediator architecture
        – Architectural patterns to consume LD: crawling, on-the-fly dereferencing, (federated) query pattern
        – Main components: triple store, logic components, UI components, data access & integration component, republishing component
  77. Summary (2)
      • Linked Data application development frameworks: Information Workbench
        – Data storage: Provides warehousing and federation capabilities
        – Data integration: Performed by Data Providers, e.g., OpenRefine
        – Data-driven, widget-based user interface, automatically generated by executing SPARQL queries
        – User input validation via comparison against the underlying ontology
      • Web APIs
        – Basic concepts: Request and Response
        – Request methods: GET, PUT, POST, DELETE (+ HEAD, TRACE, CONNECT, OPTIONS)
        – Response codes: 1xx provisional, 2xx success, 3xx further action required, 4xx client error, 5xx server error
        – Richardson Maturity Model for REST services: Level 1 – Resources and URIs, Level 2 – HTTP verbs, Level 3 – HATEOAS
  78. For exercises, quiz and further material visit our website: http://www.euclid-project.eu
      [Labels: Course; eBook]
      Other channels: @euclid_project, euclidproject

Editor's Notes

  • Maribel’s comment: The portals list “ZIP” as a format for data sets, but they do not provide further information about the format of the zipped files.
  • Maribel’s comment: This graphic only shows the most common data formats in both data sets, but there are more.
