IPTC and the Semantic Web: Two Paths and Seven Lessons
Stuart Myles, Associated Press
29th June 2010

Semantic Web News Vocabularies
- IPTC decided to experiment with the Semantic Web and Linked Data
- The best-known RDF vocabularies are
  - FOAF = Friend of a Friend: http://xmlns.com/foaf/spec/
  - DCMI Terms = Dublin Core Metadata Initiative Terms: http://dublincore.org/
  - Other examples at http://vocab.org/
- The New York Times, Dow Jones and others have identified a need for a news vocabulary
- We held a series of teleconferences to make rapid progress

© 2010 IPTC (www.iptc.org) All rights reserved

Two Paths to the Semantic Web
- We identified two paths into the Semantic Web world:
  - Create a news ontology, based on NewsML-G2
    - Formal semantics for news, specified using OWL
    - “RDFization” of IPTC’s family of news standards
  - Turn IPTC subject codes into Linked Data
    - Connect related data across the web using URIs, HTTP & RDF
    - A set of principles from Tim Berners-Lee: http://www.w3.org/DesignIssues/LinkedData.html
- We decided to pursue the Linked Data path first

Following the Linked Data Path
- The Linked Data principles, as specified by Tim Berners-Lee (TBL):
  - Use URIs as names for things
  - Use HTTP URIs so that people can look up those names
  - When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
  - Include links to other URIs, so that they can discover more things
- Apply the principles to IPTC’s subject codes
  - Already published as XML (G2 Knowledge Items)
  - And as HTML
- The plan: convert the XML into RDF
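
The plan to convert the XML Knowledge Items into RDF can be illustrated with a minimal sketch. It is not the IPTC's actual conversion. The XML fragment is a simplified, hypothetical Knowledge Item; the URI pattern in `BASE`, the choice of `skos:prefLabel` and `skos:broader`, and the helper names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical Knowledge Item fragment; real G2 files carry
# much more metadata. Codes and labels are illustrative only.
KI_XML = """\
<knowledgeItem xmlns="http://iptc.org/std/nar/2006-10-01/">
  <conceptSet>
    <concept>
      <conceptId qcode="subj:15000000"/>
      <name xml:lang="en">sport</name>
    </concept>
    <concept>
      <conceptId qcode="subj:15001000"/>
      <name xml:lang="en">aero and aviation sport</name>
      <broader qcode="subj:15000000"/>
    </concept>
  </conceptSet>
</knowledgeItem>
"""

NAR = "{http://iptc.org/std/nar/2006-10-01/}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
BASE = "http://cv.iptc.org/newscodes/subjectcode/"  # assumed URI pattern
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def qcode_to_uri(qcode):
    """Turn a QCode such as 'subj:15000000' into an HTTP URI (assumed mapping)."""
    scheme, _, code = qcode.partition(":")
    if scheme != "subj" or not code:
        raise ValueError(f"unsupported QCode: {qcode!r}")
    return BASE + code


def knowledge_item_to_triples(xml_text):
    """Return (subject, predicate, object) tuples; objects are N-Triples terms."""
    root = ET.fromstring(xml_text)
    triples = []
    for concept in root.iter(NAR + "concept"):
        subject = qcode_to_uri(concept.find(NAR + "conceptId").get("qcode"))
        triples.append((subject, RDF_TYPE, f"<{SKOS}Concept>"))
        for name in concept.findall(NAR + "name"):
            lang = name.get(XML_LANG, "en")
            # No literal escaping here: fine for this sample, not for real data.
            triples.append((subject, SKOS + "prefLabel", f'"{name.text}"@{lang}'))
        for broader in concept.findall(NAR + "broader"):
            parent = qcode_to_uri(broader.get("qcode"))
            triples.append((subject, SKOS + "broader", f"<{parent}>"))
    return triples


def to_ntriples(triples):
    """Serialise the tuples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(knowledge_item_to_triples(KI_XML)))
```

Each concept gets an HTTP URI as its name, and the `broader` element becomes a link to another URI, which lines up with the first, second and fourth Linked Data principles. A production conversion would more likely use an RDF library than hand-built strings.
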

Lesson #1: One Model, Multiple Vocabularies
- RDF is a single model: Subject, Predicate, Object
  - With multiple syntaxes
    - We selected RDF/XML and RDF/Turtle
  - And multiple “vocabularies”
    - Such as SKOS, Dublin Core
- SKOS = Simple Knowledge Organization System: http://www.w3.org/2004/02/skos/
  - Designed for representing thesauri and classification schemes
- The Semantic Web “way” is
  - Use existing vocabularies as much as possible
  - When you invent a new term, link it to existing terms
- We decided to use SKOS and DC as the main vocabularies

Lesson #2: Tool Support
- The approach:
  - Use RDF in general
  - Reuse existing vocabularies in particular
- The benefit: tools “just work”
- We learnt that this is mostly true…
  - We experimented with Protégé, TopBraid and Sesame
  - Most things worked well in all tools
  - But the “transitive” versions of SKOS broader and narrower (skos:broaderTransitive, skos:narrowerTransitive) aren’t supported well
    - They were late additions to the SKOS standard

Lesson #3: Basics Well Documented
- In general, IPTC KnowledgeItems map well to RDF
  - SKOS concepts
  - Dublin Core properties
- Certain KI properties don’t have a direct mapping
  - Created and updated timestamps of KnowledgeItem properties
- It is difficult to determine more advanced mappings
  - The SKOS wiki had some documentation: http://esw.w3.org/SkosCoreGuideToc/SectionVersioning
  - The SKOS email list seems dormant
  - SemanticOverflow is a great way to get questions answered: http://www.semanticoverflow.com/questions/902/adding-created-modified-properties-to-skos-do-i-need-to-reify
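
The timestamp problem in Lesson #3 is easier to see with triples written out. The sketch below shows the concept-level mapping (SKOS plus Dublin Core) and then standard RDF reification, which is one way to attach a date to a single property value. It is not necessarily the approach the IPTC adopted. The dates, the blank node label `_:stmt1` and the function names are hypothetical, and objects are kept as plain Python strings for brevity.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCT = "http://purl.org/dc/terms/"


def concept_triples(uri, label, created, modified):
    """Concept-level mapping: SKOS for the concept, Dublin Core for its dates."""
    return [
        (uri, RDF + "type", SKOS + "Concept"),
        (uri, SKOS + "prefLabel", label),
        (uri, DCT + "created", created),
        (uri, DCT + "modified", modified),
    ]


def reify(statement_id, triple, modified):
    """Describe one triple as an rdf:Statement so that a date can be attached
    to that statement rather than to the whole concept."""
    subject, predicate, obj = triple
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
        (statement_id, DCT + "modified", modified),
    ]


if __name__ == "__main__":
    # Hypothetical concept URI and dates, for illustration only.
    concept = "http://cv.iptc.org/newscodes/subjectcode/15000000"
    triples = concept_triples(concept, "sport", "2008-01-01", "2010-06-01")
    label_triple = triples[1]
    triples += reify("_:stmt1", label_triple, "2010-06-01")
    for triple in triples:
        print(triple)
```

The concept-level dates take one triple each, while dating a single label takes five. That overhead is one reason a property-level timestamp has no direct mapping.
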

Lesson #4: Pull is Better than Push
- One possibility is to “push” our model into RDF
  - Try to preserve all the original semantics
  - But you don’t gain as much in out-of-the-box tool support
- The other possibility is to “pull” the model into RDF
  - May lose some nuances
  - But you gain in reuse of modeling patterns, vocabularies and tool support
- (In fact, there was some dispute over the intended model of the IPTC KnowledgeItem properties)

Lesson #5: Linking and Mapping
- “Include links to other URIs, so that they can discover more things”
- Linking is the heart of Linked Data
- But linking is more like mapping
  - owl:sameAs seems to have unintended consequences
  - SKOS’s mapping properties offer a range of options
    - closeMatch, exactMatch, broadMatch, narrowMatch, relatedMatch
    - http://www.w3.org/TR/skos-reference/#mapping
- We decided to map the 17 top-level IPTC subject codes to DBpedia
  - Some top-level terms are really “umbrella” terms, which are difficult to map to a single equivalent

Lesson #6: There’s More to be Done
- Although we rapidly produced a Linked Data prototype, it is incomplete
  - Content negotiation requires work from the APA hosting
  - We need to think through and approve the details of the mapping
- The other path remains unexplored
  - Building a news ontology, based on NewsML-G2
  - Can we leverage the work that EBU have already done?
- What about other formats?
  - Particularly RDFa

Lesson #7: There’s a Lot of Interest
- High attendance at the Semantic Web IPTC calls
  - Even though the topic is a bit complex and unfamiliar to most
- Participation was brisk
  - We rapidly developed RDF/XML and RDF/Turtle representations
- Occasional mentions on Twitter generated a lot more retweets and replies than other IPTC-related tweets
- There’s a lot of interest inside and outside the IPTC
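
Returning to the mapping question from Lesson #5, the following sketch contrasts a SKOS mapping property with owl:sameAs. Because owl:sameAs asserts identity, a reasoner may treat every statement about one resource as a statement about the other. The `smush_same_as` function is a deliberately naive stand-in for that inference, not a real reasoner. The IPTC-to-DBpedia pairing shown is illustrative, not an approved IPTC mapping, and all function names are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# The five SKOS mapping properties listed in the SKOS Reference.
MAPPING_PROPERTIES = {
    "exact": SKOS + "exactMatch",
    "close": SKOS + "closeMatch",
    "broad": SKOS + "broadMatch",
    "narrow": SKOS + "narrowMatch",
    "related": SKOS + "relatedMatch",
}


def map_concept(source_uri, target_uri, kind="close"):
    """Build one mapping triple using the chosen SKOS mapping property."""
    return (source_uri, MAPPING_PROPERTIES[kind], target_uri)


def smush_same_as(triples):
    """Naive owl:sameAs inference: copy each statement about a resource to
    its alias (subject position only). SKOS mappings trigger nothing here."""
    aliases = [(s, o) for s, p, o in triples if p == OWL_SAME_AS]
    inferred = set(triples)
    for left, right in aliases:
        for s, p, o in triples:
            if p == OWL_SAME_AS:
                continue
            if s == left:
                inferred.add((right, p, o))
            if s == right:
                inferred.add((left, p, o))
    return inferred


if __name__ == "__main__":
    iptc = "http://cv.iptc.org/newscodes/subjectcode/15000000"
    dbpedia = "http://dbpedia.org/resource/Sport"
    facts = [(iptc, SKOS + "prefLabel", "sport")]

    linked = facts + [map_concept(iptc, dbpedia, "close")]
    merged = facts + [(iptc, OWL_SAME_AS, dbpedia)]

    print(len(smush_same_as(linked)) - len(linked))  # extra triples from closeMatch
    print(len(smush_same_as(merged)) - len(merged))  # extra triples from sameAs
```

With skos:closeMatch the two descriptions stay separate. With owl:sameAs the IPTC label is also asserted of the DBpedia resource, which is a poor fit for an “umbrella” term with no single equivalent.
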

IPTC and the Semantic Web: Next Steps
- Complete the Linked Data mapping of IPTC Subject Codes and Media Codes
- Explore creating a News Ontology
  - Find out more about EBU’s work
- Start an RDFa representation of news metadata
- Reach out to the broader Semantic Web and news communities for feedback and collaboration
- REQUEST to Standards Chair: Can we formalize this effort into an official IPTC Working Group?