6. The Case For Identifiers http://dbpedia.org/resource/Bobby_Moore
7. The Case For Identifiers <rdf:Description>... <xtm:TopicMap>... http://dbpedia.org/resource/Bobby_Moore <rdf:Description>...
8. Identifiers for Things: SHOULD be attributable; SHOULD be discoverable; SHOULD be possible to declare equivalences; SHOULD be hubs for related resources
9. But Identifiers Are URIs... Surely the Web fixes this? Discoverable? Not really: web search doesn’t cut it. Declare equivalences? Sort of, if you use RDF/OWL. Find more information? For human consumption: yes. For application consumption: sort of.
10.
11. Subj3ct Core Model: Subject Declarations (URI, Title, Description); Equivalence Statements (URI to URI mapping); Resource Statements (Subject to URL mapping); Provenance (who said what)
13. Subj3ct Wants To... Help creators of linked data find existing identifiers; help consumers of linked data find related identifiers; make mash-ups easier; make applications smarter; expandable internal taxonomy; expand knowledge-base
14. Summary: If you are publishing Linked Data, use identifiers for the things you describe and consider registering an ATOM feed with Subj3ct. If you want to create the next generation of mash-ups, take a look at our API. Feedback welcome! http://subj3ct.com/
Editor's Notes
This is Alice. She is thinking about Bobby Moore
And here is Bob. He is also thinking about Bobby Moore... what a coincidence!
And here is Robby. He has been programmed to find stuff about Bobby Moore, but he doesn’t do any thinking... he’s just a machine after all.
How do Alice, Bob and Robby reach some agreement about this Bobby Moore entity? How does Alice know that Bob is thinking about the World Cup Hero? How will Robby find the things that Alice and Bob can Google for or write in a Wikipedia page or a blog entry?
Fortunately the Bobby Moore entity has a URI! In this case an identifier coined by dbpedia. If Alice, Bob and Robby all use that URI to identify the Bobby Moore entity, then they know they are all talking about the same thing. Even Robby – and he doesn’t think (but he is really good at comparing URI strings).
And the web is full of resources about Bobby Moore, some are web pages that Alice and Bob can read, some are data resources that Robby loves to process and that Alice (being a demon coder) can happily mash-up. The identifier for Bobby Moore can act as a gateway to all of these related resources.
Identifiers for things on the semantic web are a Good Thing, but there are some catches. They must be URIs (to play nicely with RDF). They must be discoverable (to enable reuse). Popular subjects often have multiple, independently created identifiers. This can lead to balkanization of knowledge resources, and to avoid this it should be possible to declare that identifier X and identifier Y are actually about the same thing. Finally, the web isn’t the web without resources: some of the resources on the web are actually about things (and not just funny pictures of cats). For textual resources such as web pages, full text search engines can often do a good job of helping humans find resources. Machines find it harder, and so to link together data in the linked data web we need some hubs that provide resources. Entity identifiers provide the ideal index key to find resources about those entities.
So what’s the problem? We have the web, we have Linked Data, we have URIs; surely everything is OK and this presentation can be a couple of minutes shorter? Well... not so much. Identifier URIs are not really discoverable: the identifiers coined by organizations are buried in an avalanche of content. Without an identifier-specific search engine it’s difficult to go from the name or description of something (say Bobby Moore) to an identifier that you could plug into a Linked Data consuming application. Equivalences are even harder to find. The only mechanism currently in use is the owl:sameAs property, which means you have to find the RDF resource that contains an owl:sameAs statement mentioning the identifier you know about. Unless you want to crawl the Linked Data web to do that, you are kind of stuck. Related resources: it’s easy for Bob and Alice to use Google. Robby, on the other hand, has a harder time distinguishing between all the resources that come back from Google. It would be nice if Robby could just get a list of linked data resources for the identifier he knows about for Bobby Moore. And if Alice could have that for her “World Cup Heroes” mash-up, that would be nice too. And if Robby and Alice could get an idea about who says that this is a resource about Bobby Moore, well, that would be just shiny.
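To make the equivalence problem concrete, here is a minimal sketch of what Robby would have to do once he has somehow collected a pile of owl:sameAs-style statements: merge them into groups of identifiers that all name the same thing. The function name `equivalence_classes` and the example.org URIs are hypothetical and used only for illustration; only the dbpedia URI comes from the slides. The grouping uses a plain union-find, which is one reasonable choice, not something the talk prescribes.

```python
from collections import defaultdict


def equivalence_classes(same_as_pairs):
    """Group identifiers linked by sameAs-style (uri, uri) pairs."""
    parent = {}

    def find(uri):
        # Register unseen URIs as their own root, then walk up to the root.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in same_as_pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left  # merge the two groups

    groups = defaultdict(set)
    for uri in parent:
        groups[find(uri)].add(uri)
    return list(groups.values())


DBPEDIA = "http://dbpedia.org/resource/Bobby_Moore"
SITE_A = "http://example.org/a/bobby-moore"    # hypothetical identifier
SITE_B = "http://example.org/b/people/moore"   # hypothetical identifier
OTHER_1 = "http://example.org/a/geoff-hurst"   # hypothetical identifier
OTHER_2 = "http://example.org/b/people/hurst"  # hypothetical identifier

classes = equivalence_classes(
    [(SITE_A, DBPEDIA), (DBPEDIA, SITE_B), (OTHER_1, OTHER_2)]
)
for group in classes:
    print(sorted(group))
```

The merging itself is the easy part. The hard part described in the note is finding the statements in the first place, which is the gap a registry is meant to fill.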
Subj3ct is a subject identifier registry. Its goal is to provide the hub services that allow creators and consumers of linked data to find and exchange identifiers for things and the addresses of resources related to things. Subj3ct currently hosts about 16M identifiers from over 70 different sources.
The core model for Subj3ct is pretty simple. Subjects are declared with a URI identifier, a title and a description. The title and description are provided to make the subject discoverable through a text search. Equivalences are statements that two identifiers are actually about the same thing; each maps one subject to another. Resources are statements about the address of additional information about a subject; each maps a subject to a resource URL. All three types of statement also have a provenance, disclosing where the assertion has been made and enabling consumers to pick and choose who they trust.
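The core model above can be sketched as three small record types plus an in-memory registry. This is an illustrative sketch only, not Subj3ct's actual implementation or API: the class names, field names, the `trusted` parameter and the example.org URLs are all assumptions made for the example. Provenance is modelled as a simple `source` string on every statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectDeclaration:
    uri: str
    title: str
    description: str
    source: str  # provenance: who made this declaration


@dataclass(frozen=True)
class EquivalenceStatement:
    subject_uri: str
    equivalent_uri: str
    source: str  # provenance


@dataclass(frozen=True)
class ResourceStatement:
    subject_uri: str
    resource_url: str
    source: str  # provenance


class Registry:
    """Hypothetical in-memory registry holding the three statement types."""

    def __init__(self):
        self.subjects = []
        self.equivalences = []
        self.resources = []

    def search(self, text):
        # Naive substring match over title and description.
        needle = text.lower()
        return [s for s in self.subjects
                if needle in s.title.lower() or needle in s.description.lower()]

    def equivalents(self, uri, trusted=None):
        # Directly asserted equivalences, optionally limited to trusted sources.
        found = set()
        for e in self.equivalences:
            if trusted is not None and e.source not in trusted:
                continue
            if e.subject_uri == uri:
                found.add(e.equivalent_uri)
            elif e.equivalent_uri == uri:
                found.add(e.subject_uri)
        return found

    def resources_for(self, uri, trusted=None):
        return [r.resource_url for r in self.resources
                if r.subject_uri == uri
                and (trusted is None or r.source in trusted)]


MOORE = "http://dbpedia.org/resource/Bobby_Moore"
ALT = "http://example.org/people/bobby-moore"  # hypothetical identifier

registry = Registry()
registry.subjects.append(SubjectDeclaration(
    MOORE, "Bobby Moore", "England captain at the 1966 World Cup", "dbpedia"))
registry.equivalences.append(EquivalenceStatement(ALT, MOORE, "alice"))
registry.resources.append(ResourceStatement(
    MOORE, "http://example.org/data/moore.rdf", "bob"))  # hypothetical URL

print(registry.search("world cup"))
print(registry.equivalents(MOORE, trusted={"alice"}))
print(registry.resources_for(MOORE, trusted={"alice"}))
```

Keeping the source on every statement is what lets a consumer such as Robby filter by who said what, rather than having to accept every assertion in the registry.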
Subj3ct accepts input in the form of feeds from an identifier provider, via a simple ATOM format or RDF using the SKOS vocabulary. There is also a web front-end to allow registered users to create their own identifiers under the subj3ct.com domain. For consumers, the Subj3ct website provides a search interface to perform full-text search of the registered identifiers, with a number of advanced filtering options. There is also a REST API for applications to search the Subj3ct registry.
The goal of Subj3ct is to make it easier to build Linked Data applications. It makes it easier for providers of Linked Data to find existing identifiers and either use them or declare an equivalence with them. It makes it possible for consumers of Linked Data to find related identifiers and resources. It makes mash-ups easier and applications smarter.