1




State of the Semantic Web

   Ivan Herman, W3C

        May 2009
2




What is the overall status of the
        Semantic Web?
3


       We have the basic technologies
•   Stable specifications for the basics since 2004:
    RDF, OWL
•   Work is being done to properly incorporate rules
•   We have had a standard for querying since 2008:
    SPARQL
•   We have some additional technologies to
    access/create RDF data: GRDDL, RDFa,
    POWDER, …
•   Some fundamental vocabularies became pervasive
    (FOAF, Dublin Core,…)
4


    Lots of Tools (not an exhaustive list!)
•   Categories:
    •   Triple Stores
    •   Inference engines
    •   Converters
    •   Search engines
    •   Middleware
    •   CMS
    •   Semantic Web browsers
    •   Development environments
    •   Semantic Wikis
    •   …
•   Some names:
    •   Jena, AllegroGraph, Mulgara, Sesame, flickurl, …
    •   TopBraid Suite, Virtuoso environment, Falcon, Drupal 7, Redland,
        Pellet, …
    •   Disco, Oracle 11g, RacerPro, IODT, Ontobroker, OWLIM, Talis
        Platform, …
    •   RDF Gateway, RDFLib, Open Anzo, DartGrid, Zitgist, Ontotext,
        Protégé, …
    •   Thetus publisher, SemanticWorks, SWI-Prolog, RDFStore…
    •   …
5


               Lots of tools (cont.)
•   Significant improvements in speed, store capacity,
    etc., are reported every day
•   Some of the tools are open source, some are not;
    some are very mature, some are not: it is the usual
    picture of software tools, nothing special any more!
•   Anybody can start developing RDF-based
    applications today
6


            There is a great community
•   There are lots of tutorials, overviews, and books
    around
    •   again, some of them good, some of them bad, just as
        with any other area…
•   Active developers’ communities
    •   blogs, IRC channels, mailing lists, various fora: more
        than what one person can oversee…
7


                               Great community…




From a presentation given by David Norheim, Computas AS, at the ESTC2008 Conference, Vienna, Austria
8


        Some deployment communities
•   Major communities pick the technology up: digital
    libraries, defence, eGovernment, energy sector,
    financial services, health care, oil and gas industry, life
    sciences …
    •   The health care and life science sector is now very active
•   Semantic Web also appears in the “Web 2.0/Web
    3.0” world (whatever that means)
    •   exchange of social data
    •   personal “space” applications
    •   multimedia asset management (video, photos, audio, …)
    •   etc
9




So what is the Semantic Web?
10




•   There is a growing number of application patterns
    referring to the Semantic Web:
    •   data integration using RDF, SKOS, OWL, …
    •   knowledge engineering with complex ontologies
         •   using, eg, OWL and/or rule based reasoning
    •   better data management, archiving, cataloging, etc
         •   eg, digital library applications
    •   managing, coordinating, combining Web services
    •   intelligent software agents
    •   improving search (usually using domain specific
        vocabularies…)
    •   etc
11



The nice, structured view…
12


But maybe this is where we are?
13




•   Maybe, but being an elephant is not necessarily
    bad!
    •   it shows that the Semantic Web is a mature technology
    •   that there is lots of interest, applications
    •   various application areas pick what they need…
         •   e.g., some need sophisticated knowledge management, so
             they go for complex ontologies…
         •   some concentrate on semantically simpler vocabularies but
             large volume of data
    •   …and that is fine, there is room for many!
14




•   But it is good to (re-)emphasize some principles
•   The Semantic Web:
    •   extends the principles of the Web from documents to
        data; create a Web of data
15




•   It is the Semantic Web, and not only Semantics!
    •   data, ontologies, vocabularies, etc, can (and should!) be
        shared, reused, potentially on Web scale
    •   one can use the Web infrastructure to denote “things”…
         •   Eg: http://www.ivan-herman.net/me denotes, well, me (not my
             home page, not my foaf file, but me!)
    •   … and add relationships for those, too!
•   The major importance of the SW is that it provides
    an abstract integration layer for data on the Web
16




Some new technologies to watch
17




How do I get data out?
18


             How to provide RDF data?
•   Of course, one could create RDF data manually…
•   … but that is unrealistic on a large scale
•   Goal is to generate RDF data automatically when
    possible and “fill in” by hand only when necessary
•   Various data formats should be considered
    •   databases (relational or otherwise)
    •   data in XML, HTML, in pictures, videos, etc
•   The details of the process are still the subject of very
    active R&D!
19


        Bridge to relational databases
•   Huge amounts of data are stored in (relational)
    databases
    •   “RDFying” them is impossible
•   “Bridges” are being defined:
    •   a layer between RDF and the relational data
         •   RDB tables are “mapped” to RDF graphs, possibly on the fly
    •   a number of systems can be used as database as well
        as triple stores (eg, Oracle, OpenLink, …)
•   Work for a standard mapping language may start at
    W3C soon
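
The "mapping" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `city` table, its columns, and the `http://example.org/` namespace are made up here, and the simple "one row, one subject; one column, one predicate" scheme is not the mapping language W3C may standardize.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace, not from the slides


def table_to_triples(conn, table, key):
    """Map every row of `table` to (subject, predicate, object) triples.

    Each row becomes one subject URI built from the key column; every
    other non-NULL column becomes a predicate named after the column.
    """
    # The table name is interpolated directly: acceptable for a sketch
    # with trusted input, not for user-supplied names.
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"
        for column, value in record.items():
            if column == key or value is None:
                continue
            triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples


# Hypothetical relational data, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, area_km2 INTEGER)")
conn.execute("INSERT INTO city VALUES (1, 'Amsterdam', 219)")

city_triples = table_to_triples(conn, "city", "id")
for triple in city_triples:
    print(triple)
```

Because the triples are produced by a query at call time, the same function also illustrates the "on the fly" variant: nothing is stored twice.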
20


         Linking Open Data Project
•   Goal: “expose” open datasets in RDF
•   Set RDF links among the data items from different
    datasets
•   Set up query endpoints
•   Altogether billions of triples, millions of links…
21


         Example data source: DBpedia
•   DBpedia is a community effort to
    •   extract structured (“infobox”) information from Wikipedia
    •   provide a query endpoint to the dataset
    •   interlink the DBpedia dataset with other datasets on the
        Web
22


Extracting Wikipedia structured data
          @prefix dbpedia: <http://dbpedia.org/resource/> .
          @prefix dbterm: <http://dbpedia.org/property/> .

          dbpedia:Amsterdam
            dbterm:officialName "Amsterdam" ;
            dbterm:longd "4" ;
            dbterm:longm "53" ;
            dbterm:longs "32" ;
            ...
            dbterm:leaderTitle "Mayor" ;
            dbterm:leaderName dbpedia:Job_Cohen ;
            ...
            dbterm:areaTotalKm "219" ;
            ...
          dbpedia:ABN_AMRO
            dbterm:location dbpedia:Amsterdam ;
            ...
23


 Automatic links among open datasets
      <http://dbpedia.org/resource/Amsterdam>
        owl:sameAs <http://rdf.freebase.com/ns/...> ;
        owl:sameAs <http://sws.geonames.org/2759793> ;
        ...

      <http://sws.geonames.org/2759793>
        owl:sameAs <http://dbpedia.org/resource/Amsterdam> ;
        wgs84_pos:lat "52.3666667" ;
        wgs84_pos:long "4.8833333" ;
        geo:inCountry <http://www.geonames.org/countries/#NL> ;
        ...




Processors can switch automatically from one to the other…
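
How a processor might "switch" between datasets can be sketched as follows. The triples are transcribed from the DBpedia and Geonames snippets in these slides; the function names and the idea of merging all properties of the aliases are assumptions made for this illustration, not a description of any particular tool.

```python
from collections import defaultdict

OWL_SAME_AS = "owl:sameAs"

# Triples transcribed from the slides (prefixed names kept as plain strings).
triples = [
    ("http://dbpedia.org/resource/Amsterdam", OWL_SAME_AS, "http://sws.geonames.org/2759793"),
    ("http://sws.geonames.org/2759793", OWL_SAME_AS, "http://dbpedia.org/resource/Amsterdam"),
    ("http://sws.geonames.org/2759793", "wgs84_pos:lat", "52.3666667"),
    ("http://sws.geonames.org/2759793", "wgs84_pos:long", "4.8833333"),
    ("http://dbpedia.org/resource/Amsterdam", "dbterm:officialName", "Amsterdam"),
]


def same_as_closure(start, triples):
    """Return every URI reachable from `start` via owl:sameAs, in either direction."""
    links = defaultdict(set)
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            links[s].add(o)
            links[o].add(s)
    seen, todo = {start}, [start]
    while todo:
        for neighbour in links[todo.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                todo.append(neighbour)
    return seen


def describe(uri, triples):
    """Collect the non-sameAs properties of `uri` and of all its aliases."""
    aliases = same_as_closure(uri, triples)
    return {(p, o) for s, p, o in triples if s in aliases and p != OWL_SAME_AS}


amsterdam = describe("http://dbpedia.org/resource/Amsterdam", triples)
print(sorted(amsterdam))
```

Asking about the DBpedia URI returns the Geonames coordinates as well, which is the practical effect of the links.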
24


The LOD “cloud”, March 2008
25


The LOD “cloud”, September 2008
26


The LOD “cloud”, March 2009
27


  Generate (meta)data from unstructured data
•   An emerging approach:
    • use Natural Language Processing (NLP) to analyse text
    • services exist (Reuters’ OpenCalais and Tagaroo, Zemanta)
    • these often return URI-s into, eg, DBpedia
•   Use these techniques to, eg, automatically “tag” entries
    • eg: Twine, Faviki
    • the tag URI-s provide “integration points”
28


Data may be extracted (a.k.a. “scraped”)
•   Different tools, services, etc, come to the fore:
    •   services to get RDF data from images’ XMP data, from
        Flickr…
    •   scripts to convert spreadsheets to RDF
    •   etc
•   Many of these tools are still individual “hacks”, but
    show a general tendency
•   Hopefully more tools will emerge
    •   there is a separate wiki page collecting references to
        existing ones
29


Getting structured data to RDF: GRDDL
•   Access structured data in XML/XHTML and turn it
    into RDF:
    •   defines XML attributes to bind a suitable script to
        transform (part of) the data into RDF
         •   script is usually XSLT but not necessarily
         •   has a variant for XHTML
    •   a “GRDDL Processor” runs the script and produces
        RDF on–the–fly
•   A way to access existing structured data and
    “bring” it to RDF
    •   eg, a possible link to microformats
    •   exposing data from large XML user bases, like XBRL
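
The processing model can be sketched as follows. Since the script is "usually XSLT but not necessarily", this illustration uses a plain Python function as the transformation; the `books` document, the `http://example.org/books2rdf` URI, and the local lookup table are all assumptions made for the example. Only the `grddl:transformation` attribute and its namespace come from the GRDDL specification.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"

# Hypothetical source document: the root element names its transformation.
DOCUMENT = """\
<books xmlns:grddl="http://www.w3.org/2003/g/data-view#"
       grddl:transformation="http://example.org/books2rdf">
  <book id="b1"><title>Example title</title></book>
</books>
"""


def books_to_triples(root):
    """Stand-in for an XSLT script: one dc:title triple per <book>."""
    for book in root.findall("book"):
        yield (f"http://example.org/book/{book.get('id')}", "dc:title", book.findtext("title"))


# A real processor would fetch and run the script behind the URI;
# here the URI is simply looked up in a local table (an assumption).
TRANSFORMATIONS = {"http://example.org/books2rdf": books_to_triples}


def grddl_process(xml_text):
    """Run every transformation the document binds to itself."""
    root = ET.fromstring(xml_text)
    uris = root.get(f"{{{GRDDL_NS}}}transformation", "").split()
    triples = []
    for uri in uris:
        triples.extend(TRANSFORMATIONS[uri](root))
    return triples


grddl_triples = grddl_process(DOCUMENT)
print(grddl_triples)
```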
30


    Getting structured data to RDF: RDFa
•   Extends XHTML with a set of attributes to include
    structured data into XHTML
•   Makes it easy to “bring” existing RDF vocabularies
    into XHTML
    •   uses namespaces for an easy mix of terminologies
•   It can also be used with GRDDL
    •   but: no need to implement a separate transformation
        per vocabulary
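
To give a feel for how the attributes carry the data, here is a deliberately tiny extractor. It is a sketch under strong assumptions: it handles only `about`, `property` and `content`, never closes the scope of `about`, and leaves prefixed names such as `dc:title` unexpanded. A conforming RDFa processor follows many more rules. The sample page and the `http://example.org/` URIs are made up.

```python
from html.parser import HTMLParser


class TinyRDFaParser(HTMLParser):
    """Handles only @about, @property and @content (a small subset of RDFa)."""

    def __init__(self, base):
        super().__init__()
        self.subject = base   # current subject, changed by @about
        self.pending = None   # predicate waiting for the element's text
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            if "content" in attrs:
                # @content overrides the visible text of the element.
                self.triples.append((self.subject, attrs["property"], attrs["content"]))
            else:
                self.pending = attrs["property"]

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None


PAGE = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/" about="http://example.org/talk">
  <h1 property="dc:title">State of the Semantic Web</h1>
  <span property="dc:date" content="2009-05">May 2009</span>
</div>
"""

parser = TinyRDFaParser(base="http://example.org/page")
parser.feed(PAGE)
rdfa_triples = parser.triples
print(rdfa_triples)
```

The same markup is both the human-readable page and the data source, which is why no separate transformation per vocabulary is needed.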
31


How to “assign” RDF data to resources?
•   This is important when the RDF data is used as
    “metadata”
•   Some examples:
    •   copyright information for your photographs
    •   is a Web page usable on a mobile phone and how?
    •   bibliographical data for a publication
    •   annotation of the data resulting from a scientific
        experiment
    •   etc
•   The issue: if I have the URI of the resource
    (photograph, publication, etc), how do I find the relevant
    RDF data?
32


          The data might be embedded
•   Some data formats allow the direct inclusion of
    (RDF) metadata:
    •   SVG (Scalable Vector Graphics)
    •   XHTML+RDFa
    •   microformats+GRDDL
    •   JPG files using the comment area and, eg, Adobe’s
        XMP technology
•   That can include all the information, or link to
    further data
33


                      POWDER
•   POWDER (Protocol for Web Description
    Resources) provides for more elaborate scenarios
•   Lets you define predicates that are automatically
    “assigned” to a set of resources
34


POWDER scenario: copyright for photos
35


             Some technical details…
•   The “description resource” is an XML file
•   This XML file has a canonical conversion to OWL
•   Specialized POWDER services will be set up:
    – give the URI of a Resource and the corresponding
      description resource, return all RDF statements on that URI
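
What such a service does can be sketched as follows. A plain Python structure stands in for the XML description resource, and the URI pattern, the property names, and the photographer are invented for the copyright scenario; none of this is the POWDER syntax itself.

```python
import re

# Hypothetical description resource: a URI pattern plus the statements
# that apply to every resource matching it.
DESCRIPTION_RESOURCE = [
    {
        "iri_pattern": r"^http://example\.org/photos/.*\.jpg$",
        "statements": [
            ("cc:license", "http://creativecommons.org/licenses/by/3.0/"),
            ("dc:creator", "Example Photographer"),
        ],
    },
]


def describe_resource(uri, description_resource):
    """Return all (subject, predicate, object) statements that apply to `uri`."""
    triples = []
    for rule in description_resource:
        if re.match(rule["iri_pattern"], uri):
            triples.extend((uri, p, o) for p, o in rule["statements"])
    return triples


photo_triples = describe_resource("http://example.org/photos/cat.jpg", DESCRIPTION_RESOURCE)
other_triples = describe_resource("http://example.org/about.html", DESCRIPTION_RESOURCE)
print(photo_triples)
print(other_triples)
```

One rule covers every photograph under the path, so no per-file metadata has to be written.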
36


Simple Knowledge Organization System
•   Goal: represent and share classifications,
    glossaries, thesauri, etc, as developed in the “Print World”.
    •   for example:
         •   Dewey Decimal Classification, Art and Architecture
             Thesaurus, ACM classification of keywords and terms…
    •   allow for a quick port of this traditional data, combine it
        with other data
•   This is where SKOS comes in: define classes and
    properties to add those structures to an RDF
    universe
37


            Example: entries in a glossary
  Assertion
    (i) Any expression which is claimed to be true.
    (ii) The act of claiming something to be true.
  Class
    A general concept, category or classification. Something
    used primarily to classify or categorize other things.
  Resource
    (i) An entity; anything in the universe.
    (ii) As a class name: the class of everything; the most
        inclusive category possible.
(from the RDF Semantics Glossary)
38


Example: entries in a glossary in SKOS
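
The figure for this slide is not reproduced in this transcript. As a rough sketch of what a SKOS version of the glossary could look like, the code below turns two of the entries into triples using `skos:Concept`, `skos:prefLabel` and `skos:definition`. The choice of these three terms and the `http://example.org/glossary#` namespace are assumptions made here, not a reconstruction of the original figure.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
GLOSSARY = "http://example.org/glossary#"  # hypothetical namespace

# Two entries quoted from the glossary slide.
ENTRIES = {
    "Assertion": [
        "Any expression which is claimed to be true.",
        "The act of claiming something to be true.",
    ],
    "Class": [
        "A general concept, category or classification. Something "
        "used primarily to classify or categorize other things.",
    ],
}


def glossary_to_skos(entries):
    """Emit one skos:Concept per term, with a label and its definitions."""
    triples = []
    for term, definitions in entries.items():
        concept = GLOSSARY + term
        triples.append((concept, RDF_TYPE, SKOS + "Concept"))
        triples.append((concept, SKOS + "prefLabel", term))
        for text in definitions:
            triples.append((concept, SKOS + "definition", text))
    return triples


skos_triples = glossary_to_skos(ENTRIES)
for triple in skos_triples:
    print(triple)
```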
39


A more complex structure
   (using LCSH terms)
40


                SKOS and digital libraries
•   SKOS plays an important role in “bridging” to digital
    libraries
    •   a huge community with its own traditions, style…
    •   … but a huge amount of data to be “linked” to the
        Semantic Web!
•   Major library metadata standards are being
    redefined in terms of RDF (and SKOS),
    •   eg, “Resource Description and Access” (RDA)
         •   a major cataloguing rule set for librarians
         •   potentially, all major library catalogues around the globe could
             be translated into RDF and, eg, linked as Linked Open
             Data…
41


           Conclusions on data access
•   There are many different data sources around
•   Making them available on the Web and interlinking
    them is essential
    •   “Give your raw data” — Tim Berners-Lee
•   There are a number of technologies to do that:
    •   mapping from databases, GRDDL, RDFa, SKOS,
        POWDER, conversion tools
42




Querying Data
43


              Querying RDF: SPARQL
•   Is a W3C Standard since January 2008
    •   it has already become one of the absolutely essential
        technologies on the SW
•   SPARQL is
    •   a query language based on graph patterns
    •   a protocol layer to use SPARQL over, eg, HTTP
    •   an XML return format for the query results
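
What "based on graph patterns" means can be shown with a toy matcher. It is a sketch, not a SPARQL engine: terms are plain strings, anything starting with `?` is treated as a variable, and the data is a handful of triples taken from the DBpedia example in these slides. The function names are made up for the illustration.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` equals `triple`; None on failure."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was first bound to.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def query(patterns, triples):
    """Return all variable bindings that satisfy every triple pattern (a join)."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in triples:
                candidate = match_pattern(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        solutions = extended
    return solutions


DATA = [
    ("dbpedia:Amsterdam", "dbterm:leaderName", "dbpedia:Job_Cohen"),
    ("dbpedia:Amsterdam", "dbterm:officialName", "Amsterdam"),
    ("dbpedia:ABN_AMRO", "dbterm:location", "dbpedia:Amsterdam"),
]

# Roughly: SELECT * WHERE { ?org dbterm:location ?city .
#                           ?city dbterm:leaderName ?mayor . }
results = query(
    [("?org", "dbterm:location", "?city"),
     ("?city", "dbterm:leaderName", "?mayor")],
    DATA,
)
print(results)
```

The shared variable `?city` is what joins the two patterns into one small graph.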
44


SPARQL as a unifying point!
45


             New SPARQL WG: Goals
•   To define a small set of extensions to SPARQL
•   No complex change, backward compatibility
•   Listen to user and implementation experiences of
    the past few years
•   Group started in February 2009
46


                      Planned features
•   Update, ie, ability to change the RDF store
•   Service description framework
    • what type of extensions, inference possibilities, etc, are available
        at the endpoint
•   Addition to the query language
    • aggregate functions
    • subqueries
    • negation
    • project expressions
47


                   Planned features
             (tentative syntax examples)
•    Aggregate functions and project expressions:
     SELECT AVG(?age) AS average_age WHERE { .... }
     SELECT (?age < 18) AS minor WHERE { ... }

•    Subqueries:
     SELECT ?person (SELECT ?n WHERE { ?person foaf:name ?n } LIMIT 1)
     WHERE { <http://www.ivan-herman.net/me> foaf:knows ?person. }

•    Negation:
     SELECT *
     WHERE { ?x :p ?v. UNSAID { ?x :q ?v. } }
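
The intended meaning of two of these features can be imitated over a list of triples. This is only a reading of the tentative syntax above, with made-up `ex:` data: negation keeps a match of the first pattern only when the second pattern has no match for the same values, and the aggregate averages all matched values.

```python
from statistics import mean

# Hypothetical data; the ex: names are invented for the illustration.
TRIPLES = [
    ("ex:a", "ex:p", "1"),
    ("ex:a", "ex:q", "1"),
    ("ex:b", "ex:p", "2"),
    ("ex:a", "ex:age", 40),
    ("ex:b", "ex:age", 20),
]

# Negation: keep (?x, ?v) from "?x ex:p ?v" only if "?x ex:q ?v" is NOT in the data.
unsaid = [(s, o) for s, p, o in TRIPLES
          if p == "ex:p" and (s, "ex:q", o) not in TRIPLES]

# Aggregate: the average of ?age over every "?x ex:age ?age" match.
average_age = mean(o for s, p, o in TRIPLES if p == "ex:age")

print(unsaid)
print(average_age)
```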
48


     Possible features (time permitting)
•   Definition of “entailment regimes”
    • RDFS, OWL Profiles, RIF
•   Property paths
•   Commonly used functions (eg, string manipulation)
•   Basic control for federated queries
•   Additional query language syntax
    • commas in select lists, some operators in filters
49




Ontologies (OWL)
50


                   Ontologies: OWL
•   This is also a stable specification since 2004
•   Separate layers have been defined, balancing
    expressibility vs. implementability (OWL-Lite,
    OWL-DL, OWL-Full)
•   Looking at the tool list on W3C’s wiki again:
    •   a number of programming environments include OWL
        reasoners
    •   stand-alone reasoners (downloadable or on the Web)
    •   ontology editors come to the fore
51


                        Ontologies
•   Large ontologies are being developed (converted
    from other formats or defined in OWL). For
    example:
    •   eClassOwl: eBusiness ontology for products and
        services, 75,000 classes and 5,500 properties
    •   National Cancer Institute’s ontology: about 58,000
        classes
    •   Open Biomedical Ontologies Foundry: a collection of
        ontologies, including the Gene Ontology, to describe
        gene and gene product attributes; or UniProt for protein
        sequence and annotation terminology and data
    •   BioPAX: for biological pathway data
    •   ISO 15926: “Integration of life-cycle data for process
        plants including oil and gas production facilities”
52


              OWL in applications
•   An increasing number of applications rely on OWL
    (Pfizer, Nasa, Eli Lilly, Elsevier, FAO, …)
•   Not all use complex reasoning; in many cases a
    small fraction of OWL is used
53


                  OWL Working Group
•   A new Working Group works on the revision of
    OWL (a.k.a. OWL 2)
•   The goal of the group:
    1. add a few extensions to current OWL that are useful
      and are known to be implementable
       •   many things happened in research since 2004
    2. define “profiles” of OWL that are:
       •   smaller, easier to implement and deploy
       •   cover important application areas and are easily
           understandable to non-expert users
54


          Some new features in OWL 2
•   Syntactic sugars
    – eg, disjoint union of classes
•   New constructs for properties
    – property chains, reflexive properties
•   Extended datatype facilities
    – define a numerical interval as an OWL Datatype class
•   Profiles
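
What a property chain buys can be shown with a small sketch. The family vocabulary (`ex:hasParent`, `ex:hasBrother`, `ex:hasUncle`) and the helper function are invented for this illustration; only the idea, that following two properties in sequence implies a third, comes from OWL 2.

```python
def apply_property_chain(triples, chain, implied):
    """If x chain[0] y and y chain[1] z hold, infer x implied z."""
    first, second = chain
    inferred = set()
    for s1, p1, o1 in triples:
        if p1 != first:
            continue
        for s2, p2, o2 in triples:
            if p2 == second and s2 == o1:
                inferred.add((s1, implied, o2))
    return inferred


# Hypothetical data.
FAMILY = [
    ("ex:Ann", "ex:hasParent", "ex:Bob"),
    ("ex:Bob", "ex:hasBrother", "ex:Carl"),
]

# Chain: hasParent followed by hasBrother implies hasUncle.
uncles = apply_property_chain(FAMILY, ("ex:hasParent", "ex:hasBrother"), "ex:hasUncle")
print(uncles)
```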
55


The overall structure has not changed
56


                        Profiles
•   OWL 2 has the same duality with Full and DL
•   But for a number of applications, even OWL
    Lite is too much
•   There is a need for “light” versions of OWL: just a
    few extra possibilities added to RDFS
57


            OWL 2 defines “profiles”
•   Further restrictions on how terms can be used and
    what inferences can be expected
•   The semantic approaches are identical, but
    restrictions may ensure even more manageable
    implementations
58


                  OWL 2 profiles
•   Classification and instance queries in polynomial
    time: OWL-EL
•   Implementable on top of conventional relational
    database engines: OWL-QL
•   Implementable on top of traditional rule engines:
    OWL-RL
59


                An example: OWL-RL
•   Goal: to be implementable through rule engines
•   Usage follows a similar approach to RDFS:
    − merge the ontology and the instance data into a big RDF
     graph
    − use the rule engine to add new triples (as long as it is
     possible)
    − then, for example, use SPARQL to query the resulting
     (expanded) graph
•   This application model is very important for RDF
    based applications
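
The three steps above can be sketched with a naive forward-chaining loop. Two RDFS-style rules (subclass transitivity and type inheritance) stand in for the much larger OWL-RL rule set, a Python set comprehension stands in for the final SPARQL query, and the `ex:` class names are invented. Treat it as an illustration of the application model, not as a rule engine.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rule_subclass_transitivity(graph):
    """A subClassOf B and B subClassOf C  =>  A subClassOf C."""
    return {(a, SUBCLASS, c)
            for a, p1, b in graph if p1 == SUBCLASS
            for b2, p2, c in graph if p2 == SUBCLASS and b2 == b}


def rule_type_inheritance(graph):
    """x type C and C subClassOf D  =>  x type D."""
    return {(x, TYPE, d)
            for x, p1, c in graph if p1 == TYPE
            for c2, p2, d in graph if p2 == SUBCLASS and c2 == c}


def expand(ontology, instance_data, rules):
    graph = set(ontology) | set(instance_data)   # step 1: merge into one graph
    while True:                                   # step 2: add triples until none are new
        new = set().union(*(rule(graph) for rule in rules)) - graph
        if not new:
            return graph
        graph |= new


# Hypothetical ontology and instance data.
ONTOLOGY = [("ex:Mayor", SUBCLASS, "ex:Person"), ("ex:Person", SUBCLASS, "ex:Agent")]
INSTANCES = [("ex:Job_Cohen", TYPE, "ex:Mayor")]

expanded = expand(ONTOLOGY, INSTANCES, [rule_subclass_transitivity, rule_type_inheritance])

# Step 3: query the expanded graph (a stand-in for a SPARQL query).
types_of_cohen = {o for s, p, o in expanded if s == "ex:Job_Cohen" and p == TYPE}
print(sorted(types_of_cohen))
```

The query itself needs no knowledge of the ontology: the inferred triples are simply there once the expansion has run.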
60




Miscellaneous
61


        Not everything has been solved…
•   There are a number of issues, problems
    •   missing functionalities: encryption/signatures, fuzzy
        reasoning, …
    •   misconceptions, messaging problems
    •   need for more applications, deployment, acceptance
    •   incorporation of rule languages (that is being worked on
        by the RIF Working Group)
    •   etc
62


                       Other items…
•   Security, trust, provenance
    •   combining cryptographic techniques with the RDF
        model, sign a portion of the graph, etc
    •   trust models
•   Quality constraints on graphs
    •   “may I be sure that certain patterns are present in a
        graph?”
•   Ontology merging, alignment, term equivalences,
    versioning, development, …
•   What does reasoning mean on billions of triples?
•   etc
63


              Other items: uncertainty
•   Fuzzy logic
    •   look at alternatives of DL based on fuzzy logic
    •   alternatively, extend RDF(S) with fuzzy notions
•   Probabilistic statements
    •   have an OWL class membership with a specific
        probability
    •   combine reasoners with Bayesian networks
•   A W3C Incubator Group issued a report on the
    current status, possibilities, directions, etc
    •   report published in April 2008
64


                    Other items: naming
•   The SW infrastructure relies on unique naming of
    “things” via URI-s
•   Lots of discussions are happening that also touch
    upon general Web architecture:
    •   HTTP URI-s or other URN-s?
         •   using non-HTTP unnecessarily complicates the general
             infrastructure
    •   URI-s for “informational resources” and “non
        informational resources”
    •   how to ensure that URI-s used on the SW are
        dereferenceable
    •   etc
65


               Other items: naming (cont)
•   A different aspect of naming: what is the URI for a
    specific entity (regardless of the technical details)
    •   what is the unique URI for, eg, Bach’s Well-Tempered
        Clavier?
         •   obviously important for, eg, music ontologies and data
         •   who has the authority or the means to define and maintain
             such URI-s?
         •   should we define characterizing properties for these and use
             owl:sameAs instead of a URI?
         •   the traditional library community may be of a big help in this
             area
    •   what is the URI of a time-dependent entity (e.g., a specific
        point within a video)?
66


          Revision of the RDF model?
•   Some restrictions in RDF may be unnecessary
    (bNodes as predicates, literals as subject, …)
•   Issue of “named graph”: possibility to give a URI to
    a set of triples and make statements on those
•   Syntax issues in RDF/XML
•   Add a time tag to statements?
•   …
67


         A major problem: messaging
•   Some of the messaging on the Semantic Web has
    gone terribly wrong over the years
•   This has created lots of (unnecessary)
    controversies
•   The whole community should be active in rectifying
    those…
68




       Thank you for your attention!


These slides are also available on the Web:

  http://www.w3.org/2009/Talks/05-Oz-StateOfSW-IH/

 
Introduction to RDFa
Introduction to RDFaIntroduction to RDFa
Introduction to RDFaIvan Herman
 
Introduction to Semantic Web Technologies
Introduction to Semantic Web TechnologiesIntroduction to Semantic Web Technologies
Introduction to Semantic Web TechnologiesIvan Herman
 
A year on the Semantic Web @ W3C
A year on the Semantic Web @ W3CA year on the Semantic Web @ W3C
A year on the Semantic Web @ W3CIvan Herman
 
Introduction to Semantic Web
Introduction to Semantic WebIntroduction to Semantic Web
Introduction to Semantic WebIvan Herman
 
What is the Semantic Web (in 15 minutes...)
What is the Semantic Web (in 15 minutes...)What is the Semantic Web (in 15 minutes...)
What is the Semantic Web (in 15 minutes...)Ivan Herman
 
Semantic Web Tutorial at ESTC2008, Vienna, on September 24, 2008
Semantic Web Tutorial at ESTC2008, Vienna, on September 24, 2008Semantic Web Tutorial at ESTC2008, Vienna, on September 24, 2008
Semantic Web Tutorial at ESTC2008, Vienna, on September 24, 2008Ivan Herman
 
États des lieux du Web sémantique
États des lieux du Web sémantiqueÉtats des lieux du Web sémantique
États des lieux du Web sémantiqueIvan Herman
 

Mais de Ivan Herman (19)

The convergence of Publishing and the Web
The convergence of Publishing and the WebThe convergence of Publishing and the Web
The convergence of Publishing and the Web
 
Livres Numériques / Web : Construire la Convergence
Livres Numériques / Web : Construire la ConvergenceLivres Numériques / Web : Construire la Convergence
Livres Numériques / Web : Construire la Convergence
 
W3C Digital Publishing Interest Group Update
W3C Digital Publishing Interest Group UpdateW3C Digital Publishing Interest Group Update
W3C Digital Publishing Interest Group Update
 
Bridging the Web and Digital Publishing: EPUBWEB
Bridging the Web and Digital Publishing: EPUBWEBBridging the Web and Digital Publishing: EPUBWEB
Bridging the Web and Digital Publishing: EPUBWEB
 
W3C and Digital Publishing
W3C and Digital PublishingW3C and Digital Publishing
W3C and Digital Publishing
 
W3C et les publications numériques
W3C et les publications numériquesW3C et les publications numériques
W3C et les publications numériques
 
Digital Publishing and the Open Web Platform
Digital Publishing and the Open Web PlatformDigital Publishing and the Open Web Platform
Digital Publishing and the Open Web Platform
 
Standardizing for Open Data
Standardizing for Open DataStandardizing for Open Data
Standardizing for Open Data
 
The W3C Prov Vocabulary
The W3C Prov VocabularyThe W3C Prov Vocabulary
The W3C Prov Vocabulary
 
Semantic Web and Related Work at W3C
Semantic Web and Related Work at W3CSemantic Web and Related Work at W3C
Semantic Web and Related Work at W3C
 
On scholarly communication (report of a Dagstuhl workshop)
On scholarly communication (report of a Dagstuhl workshop)On scholarly communication (report of a Dagstuhl workshop)
On scholarly communication (report of a Dagstuhl workshop)
 
Introduction to RDFa
Introduction to RDFaIntroduction to RDFa
Introduction to RDFa
 
RDFa Tutorial
RDFa TutorialRDFa Tutorial
RDFa Tutorial
 
Introduction to Semantic Web Technologies
Introduction to Semantic Web TechnologiesIntroduction to Semantic Web Technologies
Introduction to Semantic Web Technologies
 
A year on the Semantic Web @ W3C
A year on the Semantic Web @ W3CA year on the Semantic Web @ W3C
A year on the Semantic Web @ W3C
 
Introduction to Semantic Web
Introduction to Semantic WebIntroduction to Semantic Web
Introduction to Semantic Web
 
What is the Semantic Web (in 15 minutes...)
What is the Semantic Web (in 15 minutes...)What is the Semantic Web (in 15 minutes...)
What is the Semantic Web (in 15 minutes...)
 
Semantic Web Tutorial at ESTC2008, Vienna, on September 24, 2008
Semantic Web Tutorial at ESTC2008, Vienna, on September 24, 2008Semantic Web Tutorial at ESTC2008, Vienna, on September 24, 2008
Semantic Web Tutorial at ESTC2008, Vienna, on September 24, 2008
 
États des lieux du Web sémantique
États des lieux du Web sémantiqueÉtats des lieux du Web sémantique
États des lieux du Web sémantique
 

Último

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 

Último (20)

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 

Some news about the SW

  • 1. 1 State of the Semantic Web Ivan Herman, W3C May 2009
  • 2. 2 What is the overall status of the Semantic Web?
  • 3. 3 We have the basic technologies • Stable specifications for the basics since 2004: RDF, OWL • Work is being done to properly incorporate rules • We have a standard for query since 2008: SPARQL • We have some additional technologies to access/create RDF data: GRDDL, RDFa, POWDER, … • Some fundamental vocabularies became pervasive (FOAF, Dublin Core, …)
  • 4. 4 Lots of Tools (not an exhaustive list!) • Categories: • Triple Stores • Inference engines • Converters • Search engines • Middleware • CMS • Semantic Web browsers • Development environments • Semantic Wikis • … • Some names: • Jena, AllegroGraph, Mulgara, Sesame, flickurl, … • TopBraid Suite, Virtuoso environment, Falcon, Drupal 7, Redland, Pellet, … • Disco, Oracle 11g, RacerPro, IODT, Ontobroker, OWLIM, Tallis Platform, … • RDF Gateway, RDFLib, Open Anzo, DartGrid, Zitgist, Ontotext, Protégé, … • Thetus publisher, SemanticWorks, SWI-Prolog, RDFStore… • …
  • 5. 5 Lots of tools (cont.) • Significant improvements in speed, store capacity, etc, are reported every day • Some of the tools are open source, some are not; some are very mature, some are not: it is the usual picture of software tools, nothing special any more! • Anybody can start developing RDF-based applications today
  • 6. 6 There is a great community • There are lots of tutorials, overviews, and books around • again, some of them good, some of them bad, just as in any other area… • Active developers’ communities • blogs, IRC channels, mailing lists, various fora: more than what one person can oversee…
  • 7. 7 Great community… From a presentation given by David Norheim, Computas AS, at the ESTC2008 Conference, Vienna, Austria
  • 8. 8 Some deployment communities • Major communities pick the technology up: digital libraries, defence, eGovernment, energy sector, financial services, health care, oil and gas industry, life sciences, … • The health care and life science sector is now very active • Semantic Web also appears in the “Web 2.0/Web 3.0” world (whatever that means) • exchange of social data • personal “space” applications • multimedia asset management (video, photos, audio, …) • etc
  • 9. 9 So what is the Semantic Web?
  • 10. 10 • There is a growing number of application patterns referring to the Semantic Web: • data integration using RDF, SKOS, OWL, … • knowledge engineering with complex ontologies • using, eg, OWL and/or rule based reasoning • better data management, archiving, cataloging, etc • eg, digital library applications • managing, coordinating, combining Web services • intelligent software agents • improving search (usually using domain specific vocabularies…) • etc
  • 12. 12 But maybe this is where we are?
  • 13. 13 • Maybe, but being an elephant is not necessarily bad! • it shows that the Semantic Web is a mature technology • that there is lots of interest, and lots of applications • various application areas pick what they need… • e.g., some need sophisticated knowledge management, so they go for complex ontologies… • some concentrate on semantically simpler vocabularies but large volumes of data • …and that is fine, there is room for many!
  • 14. 14 • But it is good to (re-)emphasize some principles • The Semantic Web: • extends the principles of the Web from documents to data; create a Web of data
  • 15. 15 • It is the Semantic Web, and not only Semantics! • data, ontologies, vocabularies, etc, can (and should!) be shared, reused, potentially on Web scale • one can use the Web infrastructure to denote “things”… • Eg: http://www.ivan-herman.net/me denotes, well, me (not my home page, not my foaf file, but me!) • … and add relationships for those, too! • The major importance of the SW is that it provides an abstract integration layer for data on the Web
  • 17. 17 How do I get data out?
  • 18. 18 How to provide RDF data? • Of course, one could create RDF data manually… • … but that is unrealistic on a large scale • Goal is to generate RDF data automatically when possible and “fill in” by hand only when necessary • Various data formats should be considered • databases (relational or otherwise) • data in XML, HTML, in pictures, videos, etc • Details of the process are still the subject of very active R&D!
  • 19. 19 Bridge to relational databases • Huge amounts of data are stored in (relational) databases • “RDFying” them is impossible • “Bridges” are being defined: • a layer between RDF and the relational data • RDB tables are “mapped” to RDF graphs, possibly on the fly • a number of systems can be used as databases as well as triple stores (eg, Oracle, OpenLink, …) • Work for a standard mapping language may start at W3C soon
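The idea of "mapping" RDB tables to RDF graphs can be sketched in a few lines. This is only an illustration under stated assumptions: the `rows_to_triples` helper, the `http://example.org/` base URI, and the `city` table are all hypothetical, and the row-to-subject, column-to-predicate convention shown here is one plausible choice, not a standard mapping language.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

def rows_to_triples(table: str, key: str, rows: List[Dict[str, object]],
                    base: str = "http://example.org/") -> List[Triple]:
    """Map each row to a subject URI and each non-key column to one triple.

    Hypothetical convention: <base/table/key-value> is the subject and
    <base/table#column> is the predicate; values become plain literals.
    """
    triples: List[Triple] = []
    for row in rows:
        subject = f"<{base}{table}/{row[key]}>"
        for column, value in row.items():
            if column == key or value is None:
                continue  # the key lives in the subject URI; NULLs yield no triple
            predicate = f"<{base}{table}#{column}>"
            triples.append((subject, predicate, f'"{value}"'))
    return triples

# Made-up rows standing in for a relational table
rows = [
    {"id": 1, "name": "Amsterdam", "country": "NL"},
    {"id": 2, "name": "Vienna", "country": None},
]
city_triples = rows_to_triples("city", "id", rows)
```

Running the mapping at query time rather than ahead of time would correspond to the "on the fly" variant mentioned on the slide.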
  • 20. 20 Linking Open Data Project • Goal: “expose” open datasets in RDF • Set RDF links among the data items from different datasets • Set up query endpoints • Altogether billions of triples, millions of links…
  • 21. 21 Example data source: DBpedia • DBpedia is a community effort to • extract structured (“infobox”) information from Wikipedia • provide a query endpoint to the dataset • interlink the DBpedia dataset with other datasets on the Web
  • 22. 22 Extracting Wikipedia structured data
      @prefix dbpedia: <http://dbpedia.org/resource/> .
      @prefix dbterm: <http://dbpedia.org/property/> .
      dbpedia:Amsterdam
          dbterm:officialName "Amsterdam" ;
          dbterm:longd "4" ;
          dbterm:longm "53" ;
          dbterm:longs "32" ;
          ...
          dbterm:leaderTitle "Mayor" ;
          dbterm:leaderName dbpedia:Job_Cohen ;
          ...
          dbterm:areaTotalKm "219" ;
          ...
      dbpedia:ABN_AMRO
          dbterm:location dbpedia:Amsterdam ;
          ...
  • 23. 23 Automatic links among open datasets
      <http://dbpedia.org/resource/Amsterdam>
          owl:sameAs <http://rdf.freebase.com/ns/...> ;
          owl:sameAs <http://sws.geonames.org/2759793> ;
          ...
      <http://sws.geonames.org/2759793>
          owl:sameAs <http://dbpedia.org/resource/Amsterdam> ;
          wgs84_pos:lat "52.3666667" ;
          wgs84_pos:long "4.8833333" ;
          geo:inCountry <http://www.geonames.org/countries/#NL> ;
          ...
      Processors can switch automatically from one to the other…
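How a processor might "switch" between datasets through owl:sameAs links can be sketched as follows. This is a minimal illustration, not how any particular tool works: the `merge_same_as` function is hypothetical, triples are plain Python tuples with abbreviated names, and only the owl:sameAs grouping is modelled (no other OWL semantics).

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"

def merge_same_as(triples):
    """Group subjects linked by owl:sameAs and pool their other properties."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # First pass: every owl:sameAs statement joins two equivalence classes
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    # Second pass: attach each remaining statement to its class representative
    merged = defaultdict(set)
    for s, p, o in triples:
        if p != SAME_AS:
            merged[find(s)].add((p, o))
    return merged

# Abbreviated, illustrative versions of the statements on the slide
triples = [
    ("dbpedia:Amsterdam", "owl:sameAs", "geonames:2759793"),
    ("dbpedia:Amsterdam", "dbterm:leaderTitle", "Mayor"),
    ("geonames:2759793", "wgs84_pos:lat", "52.3666667"),
    ("geonames:2759793", "wgs84_pos:long", "4.8833333"),
]
merged = merge_same_as(triples)
descriptions = list(merged.values())
```

After merging, the DBpedia and Geonames statements about Amsterdam sit in a single description, which is the practical effect of following the link.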
  • 25. 25 The LOD “cloud”, September 2008
  • 27. 27 Generate (meta)data from unstructured data • An emerging approach: • use Natural Language Processing (NLP) to analyse text • services exist (Reuters’ Open Calais and Tagaroo, Zemanta) • these often return URI-s into, eg, DBpedia • Use these techniques to, eg, automatically “tag” entries • eg: Twine, Faviki • the tag URI-s provide “integration points”
  • 28. 28 Data may be extracted (a.k.a. “scraped”) • Different tools, services, etc, come to the fore: • services to get RDF data from images’ XMP data, from Flickr… • scripts to convert spreadsheets to RDF • etc • Many of these tools are still individual “hacks”, but show a general tendency • Hopefully more tools will emerge • there is a separate wiki page collecting references to existing ones
  • 29. 29 Getting structured data to RDF: GRDDL • Access structured data in XML/XHTML and turn it into RDF: • defines XML attributes to bind a suitable script to transform (part of) the data into RDF • script is usually XSLT but not necessarily • has a variant for XHTML • a “GRDDL Processor” runs the script and produces RDF on the fly • A way to access existing structured data and “bring” it to RDF • eg, a possible link to microformats • exposing data from large XML use bases, like XBRL
  • 30. 30 Getting structured data to RDF: RDFa • Extends XHTML with a set of attributes to include structured data into XHTML • Makes it easy to “bring” existing RDF vocabularies into XHTML • uses namespaces for an easy mix of terminologies • It can also be used with GRDDL • but: no need to implement a separate transformation per vocabulary
  • 31. 31 How to “assign” RDF data to resources? • This is important when the RDF data is used as “metadata” • Some examples: • copyright information for your photographs • is a Web page usable on a mobile phone and how? • bibliographical data for a publication • annotation of the data resulting from a scientific experiment • etc • The issue: if I have the URI of the resource (photograph, publication, etc), how do I find the relevant RDF data?
  • 32. 32 The data might be embedded • Some data formats allow the direct inclusion of (RDF) metadata: • SVG (Scalable Vector Graphics) • XHTML+RDFa • microformats+GRDDL • JPG files using the comment area and, eg, Adobe’s XMP technology • That can include all the information, or link to further data
  • 33. 33 POWDER • POWDER (Protocol for Web Description Resources) provides for more elaborate scenarios • Lets you define predicates that are automatically “assigned” to a set of resources
  • 35. 35 Some technical details… • The “description resource” is an XML file • This XML file has a canonical conversion to OWL • Specialized POWDER services will be set up: – give the URI of a Resource and the corresponding description resource, return all RDF statements on that URI
  • 36. 36 Simple Knowledge Organization System • Goal: represent and share classifications, glossaries, thesauri, etc, as developed in the “Print World”. • for example: • Dewey Decimal Classification, Art and Architecture Thesaurus, ACM classification of keywords and terms… • allow for a quick port of this traditional data, combine it with other data • This is where SKOS comes in: define classes and properties to add those structures to an RDF universe
  • 37. 37 Example: entries in a glossary Assertion (i) Any expression which is claimed to be true. (ii) The act of claiming something to be true. Class A general concept, category or classification. Something used primarily to classify or categorize other things. Resource (i) An entity; anything in the universe. (ii) As a class name: the class of everything; the most inclusive category possible. (from the RDF Semantics Glossary)
  • 38. 38 Example: entries in a glossary in SKOS
  • 39. 39 A more complex structure (using LCSH terms)
  • 40. 40 SKOS and digital libraries • SKOS plays an important role in “bridging” to digital libraries • a huge community with its own traditions, style… • … but a huge amount of data to be “linked” to the Semantic Web! • Major library metadata standards are being redefined in terms of RDF (and SKOS), • eg, “Resource Description and Access” (RDA) • a major cataloguing rule set for librarians • potentially, all major library catalogues around the globe could be translated into RDF and, eg, linked as Linked Open Data…
  • 41. 41 Conclusions on data access • There are many different data sources around • Making them available on the Web and interlinking them is essential • “Give your raw data” (Tim Berners-Lee) • There are a number of technologies to do that: • mapping from databases, GRDDL, RDFa, SKOS, POWDER, conversion tools
  • 43. 43 Querying RDF: SPARQL • Is a W3C Standard since January 2008 • it has already become one of the absolutely essential technologies on the SW • SPARQL is • a query language based on graph patterns • a protocol layer to use SPARQL over, eg, HTTP • an XML return format for the query results
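What "a query language based on graph patterns" means can be illustrated with a toy matcher. This is a sketch under stated assumptions, not a SPARQL implementation: `match_pattern` and `query` are hypothetical helpers, the graph is a plain list of tuples with made-up data, and only conjunctions of triple patterns with `?variables` are handled.

```python
def match_pattern(pattern, triple, bindings):
    """Try to extend bindings so that pattern matches triple; None on failure."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable either gets bound now or must agree with its earlier binding
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None  # a constant term must match exactly
    return result

def query(patterns, graph):
    """Evaluate a list of triple patterns (a basic graph pattern) against a graph."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in graph:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions

# Illustrative data only
graph = [
    ("ivan", "foaf:knows", "alice"),
    ("ivan", "foaf:knows", "bob"),
    ("alice", "foaf:name", "Alice"),
    ("bob", "foaf:name", "Bob"),
]
names = query([("ivan", "foaf:knows", "?person"),
               ("?person", "foaf:name", "?n")], graph)
```

Each solution is a dictionary of variable bindings, which plays the role of one row in a SPARQL result set.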
  • 44. 44 SPARQL as a unifying point!
  • 45. 45 New SPARQL WG: Goals • To define a small set of extensions to SPARQL • No complex change, backward compatibility • Listen to user and implementation experiences of the past few years • Group started in February 2009
  • 46. 46 Planned features • Update, ie, ability to change the RDF store • Service description framework • what type of extensions, inference possibilities, etc, are available at the endpoint • Addition to the query language • aggregate functions • subqueries • negation • project expressions
  • 47. 47 Planned features (tentative syntax examples) • Aggregate functions and project expressions: • SELECT AVG(?age) AS average_age WHERE { .... } • SELECT (?age < 18) AS minor WHERE { ... } • Subqueries: • SELECT ?person (SELECT ?n WHERE { ?person foaf:name ?n } LIMIT 1) WHERE { <http://www.ivan-herman.net/me> foaf:knows ?person. } • Negation: • SELECT * WHERE { ?x :p ?v. UNSAID { ?x :q ?v. } }
  • 48. 48 Possible features (time permitting) • Definition of “entailment regimes” • RDFS, OWL Profiles, RIF • Property paths • Commonly used functions (eg, string manipulation) • Basic control for federated queries • Additional query language syntax • commas in select lists, some operators in filters
  • 50. 50 Ontologies: OWL • This is also a stable specification since 2004 • Separate layers have been defined, balancing expressibility vs. implementability (OWL-Lite, OWL-DL, OWL-Full) • Looking at the tool list on W3C’s wiki again: • a number of programming environments include OWL reasoners • stand-alone reasoners (downloadable or on the Web) • ontology editors come to the fore
  • 51. 51 Ontologies • Large ontologies are being developed (converted from other formats or defined in OWL). For example: • eClassOwl: eBusiness ontology for products and services, 75,000 classes and 5,500 properties • National Cancer Institute’s ontology: about 58,000 classes • Open Biomedical Ontologies Foundry: a collection of ontologies, including the Gene Ontology, to describe gene and gene product attributes; or UniProt for protein sequence and annotation terminology and data • BioPAX: for biological pathway data • ISO 15926: “Integration of life-cycle data for process plants including oil and gas production facilities”
  • 52. 52 OWL in applications • An increasing number of applications rely on OWL (Pfizer, NASA, Eli Lilly, Elsevier, FAO, …) • Not all use complex reasoning; in many cases a small fraction of OWL is used
  • 53. 53 OWL Working Group • A new Working Group works on the revision of OWL (a.k.a. OWL 2) • The goal of the group: 1. add a few extensions to current OWL that are useful and are known to be implementable • many things happened in research since 2004 2. define “profiles” of OWL that are: • smaller, easier to implement and deploy • cover important application areas and are easily understandable to non-expert users
  • 54. 54 Some new features in OWL 2 • Syntactic sugars – eg, disjoint union of classes • New constructs for properties – property chains, reflexive properties • Extended datatype facilities – define a numerical interval as an OWL Datatype class • Profiles
  • 55. 55 The overall structure has not changed
  • 56. 56 Profiles • OWL 2 has the same duality with Full and DL • But, for a number of applications, even OWL Lite is too much • There is a need for “light” versions of OWL: just a few extra possibilities added to RDFS
  • 57. 57 OWL 2 defines “profiles” • Further restrictions on how terms can be used and what inferences can be expected • The semantic approaches are identical, but restrictions may ensure even more manageable implementations
  • 58. 58 OWL 2 profiles • Classification and instance queries in polynomial time: OWL-EL • Implementable on top of conventional relational database engines: OWL-QL • Implementable on top of traditional rule engines: OWL-RL
  • 59. 59 An example: OWL-RL • Goal: to be implementable through rule engines • Usage follows a similar approach to RDFS: − merge the ontology and the instance data into a big RDF graph − use the rule engine to add new triples (as long as it is possible) − then, for example, use SPARQL to query the resulting (expanded) graph • This application model is very important for RDF based applications
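The three steps of this application model (merge, expand with rules, then query) can be sketched as a small forward-chaining loop. This is an illustration only: the `saturate` function is hypothetical, the two rules are simple RDFS-style subclass rules chosen for brevity rather than the actual OWL-RL rule set, and the class names are made up.

```python
def saturate(triples):
    """Apply two RDFS-style rules until no new triple can be added."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            for s2, p2, o2 in graph:
                if p2 != "rdfs:subClassOf" or o != s2:
                    continue
                if p == "rdf:type":
                    # Rule 1: an instance of a class is an instance of its superclass
                    new.add((s, "rdf:type", o2))
                elif p == "rdfs:subClassOf":
                    # Rule 2: rdfs:subClassOf is transitive
                    new.add((s, "rdfs:subClassOf", o2))
        if new <= graph:
            return graph  # fixpoint reached: nothing more to add
        graph |= new

# Step 1: merge the ontology and the instance data into one graph
ontology = {
    ("Mayor", "rdfs:subClassOf", "Politician"),
    ("Politician", "rdfs:subClassOf", "Person"),
}
instance_data = {("Job_Cohen", "rdf:type", "Mayor")}

# Step 2: let the rules add new triples for as long as possible
expanded = saturate(ontology | instance_data)

# Step 3: query the expanded graph (a plain filter stands in for SPARQL here)
persons = sorted(s for s, p, o in expanded if p == "rdf:type" and o == "Person")
```

The query in step 3 finds `Job_Cohen` as a `Person` even though that triple was never stated, which is the point of expanding the graph before querying.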
  • 61. 61 Everything has not been solved… • There are a number of issues, problems • missing functionalities: encryption/signatures, fuzzy reasoning, … • misconceptions, messaging problems • need for more applications, deployment, acceptance • incorporation of rule languages (that is being worked on by the RIF Working Group) • etc
  • 62. 62 Other items… • Security, trust, provenance • combining cryptographic techniques with the RDF model, sign a portion of the graph, etc • trust models • Quality constraints on graphs • “may I be sure that certain patterns are present in a graph?” • Ontology merging, alignment, term equivalences, versioning, development, … • What does reasoning mean on billions of triples? • etc
  • 63. 63 Other items: uncertainty • Fuzzy logic • look at alternatives of DL based on fuzzy logic • alternatively, extend RDF(S) with fuzzy notions • Probabilistic statements • have an OWL class membership with a specific probability • combine reasoners with Bayesian networks • A W3C Incubator Group issued a report on the current status, possibilities, directions, etc • report published in April 2008
  • 64. 64 Other items: naming • The SW infrastructure relies on unique naming of “things” via URI-s • Lots of discussions are happening that also touch upon general Web architecture: • HTTP URI-s or other URN-s? • using non-HTTP unnecessarily complicates the general infrastructure • URI-s for “informational resources” and “non informational resources” • how to ensure that URI-s used on the SW are dereferenceable • etc
  • 65. 65 Other items: naming (cont) • A different aspect of naming: what is the URI for a specific entity (regardless of the technical details) • what is the unique URI for, eg, Bach’s Well-Tempered Clavier? • obviously important for, eg, music ontologies and data • who has the authority or the means to define and maintain such URI-s? • should we define characterizing properties for these and use owl:sameAs instead of a URI? • the traditional library community may be of great help in this area • what is the URI of a time-dependent entity (e.g., a specific point within a video)?
  • 66. 66 Revision of the RDF model? • Some restrictions in RDF may be unnecessary (bNodes as predicates, literals as subjects, …) • Issue of “named graph”: possibility to give a URI to a set of triples and make statements on those • Syntax issues in RDF/XML • Add a time tag to statements? • …
  • 67. 67 A major problem: messaging • Some of the messaging on Semantic Web has gone terribly wrong over the years • This has created lots of (unnecessary) controversies • The whole community should be active in rectifying those…
  • 68. 68 Thank you for your attention! These slides are also available on the Web: http://www.w3.org/2009/Talks/05-Oz-StateOfSW-IH/