5. What is the Web? “… the Web, is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images […] and navigate between them via hyperlinks” http://en.wikipedia.org/wiki/World_Wide_Web
7. History of the Web Created by Tim Berners-Lee at CERN in 1989 Mosaic browser in 1993 W3C created in 1994 Exponential growth mid 90s Search engines – Google 1998 Dot-com boom 1999 – 2001 Web 2.0 – blogs, Facebook, Twitter, etc
8. What is the problem? The web is full of documents. We aren’t always interested in documents; we are interested in THINGS. These THINGS might be in documents. We can read an HTML document rendered in a browser and find what we are searching for. This is hard for computers: they have to guess (even though they are pretty good at it).
9. The Web is a Data Shredder Structured Data Unstructured Data Thanks Martin Hepp
10. What would we like? Make it easy for computers/software to find THINGS
21. What is the Semantic Web? Besides publishing documents on the web which computers can’t understand easily Let’s publish on the web something that computers can understand
22. What is the Semantic Web? Besides publishing documents on the web which computers can’t understand easily Let’s publish on the web something that computers can understand DATA
23. The Semantic Web is a web of linked data The current web is a web of linked documents
36. Resource Description Framework (RDF) A data model: a way to model data (e.g., relational databases use the relational data model). RDF is a triple data model: a labeled graph of Subject, Predicate, Object statements. <Juan> <was born in> <California> <California> <is part of> <the USA> <Juan> <has hobby> <Salsa dancing>
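The triple model above can be sketched in a few lines of Python. This is only an illustration: the angle-bracket labels are copied from the slide and are placeholders rather than real URIs, and the helper names `objects` and `born_in_region` are made up for this example.

```python
# Each RDF statement is one (subject, predicate, object) triple.
# The labels are the placeholders from the slide, not real URIs.
triples = {
    ("<Juan>", "<was born in>", "<California>"),
    ("<California>", "<is part of>", "<the USA>"),
    ("<Juan>", "<has hobby>", "<Salsa dancing>"),
}


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def born_in_region(graph, person):
    """Follow two edges of the graph: person -> birthplace -> containing region."""
    return {
        region
        for place in objects(graph, person, "<was born in>")
        for region in objects(graph, place, "<is part of>")
    }


print(objects(triples, "<Juan>", "<has hobby>"))
print(born_in_region(triples, "<Juan>"))
```

Because the object of one triple can be the subject of another, a set of triples forms a labeled graph that can be walked edge by edge.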
37. RDF can be serialized in different ways RDF/XML RDFa (RDF in HTML) N3 Turtle JSON
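As a rough sketch of what "serialized in different ways" means, the same triples can be written out in more than one syntax without changing the data. The example below assumes hypothetical `example.org` URIs and emits N-Triples-style lines (a simple line-based syntax closely related to Turtle) plus a plain JSON list. The JSON layout is made up for illustration and is not one of the standard RDF JSON syntaxes.

```python
import json

# Hypothetical example.org URIs stand in for real identifiers.
TRIPLES = [
    ("http://example.org/Juan", "http://example.org/wasBornIn", "http://example.org/California"),
    ("http://example.org/Juan", "http://example.org/hasHobby", "Salsa dancing"),
]


def term(value):
    """Write URIs in angle brackets and everything else as a quoted literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """One statement per line, terminated by ' .'."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


def to_plain_json(triples):
    """Illustrative JSON layout only; not a standard RDF serialization."""
    rows = [{"subject": s, "predicate": p, "object": o} for s, p, o in triples]
    return json.dumps(rows, indent=2)


print(to_ntriples(TRIPLES))
print(to_plain_json(TRIPLES))
```

Both outputs carry the same two statements; only the surface syntax differs.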
38. So does that mean that I have to publish my data in RDF now?
42. Databases back up documents THINGS have PROPERTIES: a book has a title, an author, … This is a THING: a book titled “Programming the Semantic Web” by Toby Segaran, …
43. Let’s represent the data in RDF Programming the Semantic Web title author book Toby Segaran isbn 978-0-596-15381-6 publisher name Publisher O’Reilly
44. Remember that we are on the web Everything on the web is identified by a URI
45. And now let’s link the data to other data Programming the Semantic Web title author http://…/isbn978 Toby Segaran isbn 978-0-596-15381-6 publisher name http://…/publisher1 O’Reilly
46. And now consider the data from Revyu.com hasReview http://…/review1 http://…/isbn978 description reviewer Awesome Book http://…/reviewer name Juan Sequeda
47. Let’s start to link data hasReview http://…/review1 http://…/isbn978 Programming the Semantic Web title description sameAs hasReviewer Awesome Book author http://…/isbn978 Toby Segaran http://…/reviewer name isbn 978-0-596-15381-6 Juan Sequeda publisher name http://…/publisher1 O’Reilly
48. Juan Sequeda publishes data too http://juansequeda.com/id http://dbpedia.org/Austin livesIn name Juan Sequeda
49. Let’s link more data hasReview http://…/review1 http://…/isbn978 description hasReviewer Awesome Book http://…/reviewer name Juan Sequeda sameAs http://juansequeda.com/id http://dbpedia.org/Austin livesIn name Juan Sequeda
50. And more hasReview http://…/review1 http://…/isbn978 Programming the Semantic Web title description sameAs hasReviewer Awesome Book author http://…/isbn978 Toby Segaran http://…/reviewer name isbn 978-0-596-15381-6 Juan Sequeda publisher sameAs http://…/publisher1 name O’Reilly http://juansequeda.com/id http://dbpedia.org/Austin livesIn name Juan Sequeda
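One way to picture what the sameAs links buy us is to merge nodes that are declared equivalent, so that facts from both sources end up on one node. The sketch below is an assumption-laden illustration, not how any particular tool does it: `revyu:` is a made-up short prefix standing in for the elided Revyu URIs, the direction of the sameAs edge is a guess, and the merge uses a small union-find.

```python
SAME_AS = "sameAs"

# "revyu:" is a hypothetical prefix standing in for the elided URIs on the slides.
review_data = {
    ("revyu:isbn978", "hasReview", "revyu:review1"),
    ("revyu:review1", "description", "Awesome Book"),
    ("revyu:review1", "hasReviewer", "revyu:reviewer"),
    ("revyu:reviewer", "name", "Juan Sequeda"),
}
personal_data = {
    ("http://juansequeda.com/id", "livesIn", "http://dbpedia.org/Austin"),
    ("http://juansequeda.com/id", "name", "Juan Sequeda"),
}
links = {("revyu:reviewer", SAME_AS, "http://juansequeda.com/id")}


def merge_same_as(triples):
    """Rewrite every node to one representative of its sameAs group."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for s, p, o in triples:
        if p == SAME_AS:
            a, b = find(s), find(o)
            if a != b:
                parent[max(a, b)] = min(a, b)  # deterministic representative
    return {(find(s), p, find(o)) for s, p, o in triples if p != SAME_AS}


merged = merge_same_as(review_data | personal_data | links)
for triple in sorted(merged):
    print(triple)
```

After the merge, the review points at the same node that carries the livesIn statement, which is what makes cross-source questions answerable.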
51. Data on the Web that is in RDF and is linked to other RDF data is LINKED DATA
52. Linked Data Principles Use URIs as names for things Use HTTP URIs so that people can look up (dereference) those names. When someone looks up a URI, provide useful information. Include links to other URIs so that they can discover more things.
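The second and third principles (look up a URI, get useful information back) can be sketched as an ordinary HTTP request. The example below assumes the server supports content negotiation for RDF media types, which the slides do not state; `build_lookup` and `dereference` are hypothetical helper names. The request is only constructed and inspected here, so nothing is sent over the network unless `dereference` is called.

```python
from urllib.request import Request, urlopen

# Assumption: the publisher serves RDF when asked for these media types.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9"


def build_lookup(uri):
    """Build a GET request that asks for an RDF representation of `uri`."""
    return Request(uri, headers={"Accept": RDF_ACCEPT})


def dereference(uri, timeout=10):
    """Send the request; returns (content type, raw bytes). Needs network access."""
    with urlopen(build_lookup(uri), timeout=timeout) as response:
        return response.headers.get("Content-Type"), response.read()


# URI taken from the slides; the request is only built, not sent.
request = build_lookup("http://juansequeda.com/id")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

The fourth principle then comes down to the returned data containing further HTTP URIs that can be looked up the same way.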
54. I can query a database with SQL. Is there a way to query Linked Data with a query language?
55. Yes! There is actually a standardized language for that: SPARQL
56. FIND all the reviews on the book “Programming the Semantic Web” by people who live in Austin
57. hasReview http://…/review1 http://…/isbn978 Programming the Semantic Web title description sameAs hasReviewer Awesome Book author http://…/isbn978 Toby Segaran http://…/reviewer name isbn 978-0-596-15381-6 Juan Sequeda publisher sameAs name http://…/publisher1 O’Reilly http://juansequeda.com http://dbpedia.org/Austin livesIn name Juan Sequeda
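To give a feel for how such a query is evaluated, here is a minimal sketch of triple-pattern matching over the graph above, which is the basic idea behind a SPARQL WHERE clause. It is plain Python rather than a SPARQL engine. The `shop:` and `revyu:` prefixes are made-up stand-ins for the elided URIs, the direction of each sameAs edge is an assumption, and the reviewer's own URI is taken from slide 48.

```python
GRAPH = {
    ("shop:isbn978", "title", "Programming the Semantic Web"),
    ("shop:isbn978", "author", "Toby Segaran"),
    ("shop:isbn978", "isbn", "978-0-596-15381-6"),
    ("shop:isbn978", "publisher", "shop:publisher1"),
    ("shop:publisher1", "name", "O’Reilly"),
    ("revyu:isbn978", "sameAs", "shop:isbn978"),
    ("revyu:isbn978", "hasReview", "revyu:review1"),
    ("revyu:review1", "description", "Awesome Book"),
    ("revyu:review1", "hasReviewer", "revyu:reviewer"),
    ("revyu:reviewer", "name", "Juan Sequeda"),
    ("revyu:reviewer", "sameAs", "http://juansequeda.com/id"),
    ("http://juansequeda.com/id", "livesIn", "http://dbpedia.org/Austin"),
    ("http://juansequeda.com/id", "name", "Juan Sequeda"),
}


def match(graph, patterns, binding=None):
    """Yield every variable binding that satisfies all triple patterns.

    Terms starting with '?' are variables; anything else must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        ok = True
        for pattern_term, value in zip(first, triple):
            if pattern_term.startswith("?"):
                if candidate.setdefault(pattern_term, value) != value:
                    ok = False
                    break
            elif pattern_term != value:
                ok = False
                break
        if ok:
            yield from match(graph, rest, candidate)


# Reviews of the book, written by someone who lives in Austin.
QUERY = [
    ("?book", "title", "Programming the Semantic Web"),
    ("?reviewed", "sameAs", "?book"),
    ("?reviewed", "hasReview", "?review"),
    ("?review", "hasReviewer", "?reviewer"),
    ("?reviewer", "sameAs", "?person"),
    ("?person", "livesIn", "http://dbpedia.org/Austin"),
]

reviews = {b["?review"] for b in match(GRAPH, QUERY)}
print(reviews)
```

The query only succeeds because the two sameAs links connect three separately published datasets; remove either link and no review is found.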
58. This looks cool, but let’s be realistic. What is the incentive to publish Linked Data?
59. What was your incentive to publish an HTML page in 1990?
60. 1) Share data in documents 2) Because your neighbor was doing it … later on … 3) Marketing, Advertising, SEO
62. 1) Share data as data 2) Because your neighbor is doing it … 3) (Semantic) SEO ++
63. Linked Data Publishers UK Government US Government BBC Open Calais – Thomson Reuters Freebase/Google NY Times Best Buy CNET DBpedia Overstock.com O’Reilly Media …
64. Publishing Linked Data Legacy Data in Relational Databases D2R Server, Virtuoso, Triplify, Ultrawrap CMS Drupal 7 Native RDF Databases AllegroGraph, Jena, Sesame, Virtuoso, Talis Platform In HTML with RDFa
67. (Semantic) SEO ++ Mark up your HTML with RDFa Use standard vocabularies (ontologies) Google Vocabulary Good Relations Dublin Core Google and Yahoo will crawl this data and use it for better rendering
84. Linked Data Browsers Not actually separate browsers. Run inside of HTML browsers View the data that is returned after looking up a URI in tabular form (IMO) No usability
93. Time to create new and innovative ways to interact with Linked Data New and improved search
94. This may be one of the Killer Apps that we have all been waiting for http://en.wikipedia.org/wiki/File:Mosaic_browser_plaque_ncsa.jpg
95. It’s time to partner with HCI community Semantic Web UIs don’t have to be ugly
96. Linked Data Applications Software system that makes use of data on the web from multiple datasets and that benefits from links between the datasets
98. Discover further information by following the links between different data sources: the fourth principle enables this.
99. Combine the consumed linked data with data from sources (not necessarily Linked Data)
100. Expose the combined data back to the web following the Linked Data principles
102. RiBS - Miranker Lab Ultrawrap: Virtualizes an RDBMS as a graph (RDF). Automatically generates the ontology from the schema. Query an RDBMS in SPARQL (the query language for RDF). Leverages the SQL optimizer to do all the hard work. Insert arbitrary RDF into your RDBMS without altering the schema. Diamond: Linked Data query engine. Link-traversal-based query execution: start with a URI that returns RDF and follow links.
103. Ultrawrap enables your RDBMS to be linked with other RDF data [Diagram: Ultrawrap over Specify, Morphster, Morphbank] NOW WE WANT TO QUERY THIS
104. Query the Web of Linked Data with Diamond SPARQL Query Diamond Ultrawrap Ultrawrap Specify Ultrawrap Morphster Morphbank
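The "start with a URI that returns RDF and follow links" idea can be sketched as a breadth-first traversal. This is a simplified illustration and not Diamond's actual algorithm: the web is faked with an in-memory dictionary, every URI and value in `FAKE_WEB` is made up, and `traverse` and `lookup` are hypothetical names.

```python
from collections import deque

# Made-up data: each key is a URI, each value is the RDF "returned" for it.
FAKE_WEB = {
    "http://example.org/specify/taxon1": {
        ("http://example.org/specify/taxon1", "fullName", "Example taxon"),
        ("http://example.org/specify/taxon1", "sameAs", "http://example.org/morphbank/taxon9"),
    },
    "http://example.org/morphbank/taxon9": {
        ("http://example.org/morphbank/taxon9", "rank", "Species"),
        ("http://example.org/morphbank/taxon9", "kingdom", "Animalia"),
    },
}


def lookup(uri):
    """Stand-in for dereferencing a URI over HTTP."""
    return FAKE_WEB.get(uri, set())


def traverse(seed, lookup, max_documents=10):
    """Collect triples by looking up the seed and every URI reachable from it."""
    seen, queue, collected = set(), deque([seed]), set()
    while queue and len(seen) < max_documents:
        uri = queue.popleft()
        if uri in seen:
            continue
        seen.add(uri)
        for triple in lookup(uri):
            collected.add(triple)
            # Subjects and objects that look like URIs are links to follow.
            for node in (triple[0], triple[2]):
                if node.startswith("http://") and node not in seen:
                    queue.append(node)
    return collected


collected = traverse("http://example.org/specify/taxon1", lookup)
for triple in sorted(collected):
    print(triple)
```

A query can then be evaluated over the collected triples, so the answer may combine values that came from different sources, as in the Specify and Morphbank examples.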
105. Example 1 (Specify – DBpedia) Get full name and guid from taxon with id http://tata.csres.utexas.edu:8080/specify/data/taxon51807#thing AND find any subjects it may have (“skos:subject”)
106. Result Example 1 Note that http://dbpedia.org/resource/Category:Fish_of_Australia comes from a different data source (dbpedia.org)
107. Example 2 (Specify-Morphbank) Get full name and guid from taxon with id http://tata.csres.utexas.edu:8080/specify/data/taxon42947#thing AND the rank and kingdom from Morphbank
108. Result Example 2 Note that full name and guid come from Specify http://tata.csres.utexas.edu:8080/specify/data/taxon42947 AND rank and kingdom come from Morphbank http://tata.csres.utexas.edu:8080/morphbank/data/taxa398354
109. Hot Research Topics Search and Ranking Interlinking Algorithms Provenance, Trust and Privacy Dataset Dynamics UI Distributed Query Evaluation “You want a good thesis? IR is based on precision and recall. The minute you add semantics, it is a meaningless feature. Logic is based on soundness and completeness. We don’t want soundness and completeness. We want a few good answers quickly.” – Jim Hendler at WWW2009 during the LOD gathering