1. THE EVOLVING SEMANTIC WORLD
Barbara McGlamery
Taxonomist
Martha Stewart Living Omnimedia
2. ABOUT ME
Masters in Library and Information Science
Long Island University
New York Public Library
Branch librarian
NYPL for the Performing Arts – Drama reference
Entertainment Weekly
Data Manager
Time Inc.
Senior Data Manager, Taxonomist, Metadata Architect, Ontologist
Martha Stewart Living Omnimedia
Taxonomist
4. The Semantic Web is a web of data…. (it) provides a
common framework that allows data to be shared and
reused across applications, enterprise, and community
boundaries.
--W3C
5. "The Semantic Web is not a separate Web but an
extension of the current one, in which information is
given well-defined meaning, better enabling computers
and people to work in cooperation."
--Tim Berners-Lee, James Hendler, and Ora Lassila,
Scientific American, 2001
6. The Semantic Web is about making knowledge
machine- and human-readable
10. BIG S SEMANTIC WEB
…big "S" web technologies provide a
framework for describing data on a web page when
the data on the website is published. If data is read
or captured, because the data's semantic meaning
has already been described, you don't have to go
through the process of understanding the meaning
of the data after the fact.
--Sean Martin, CEO of Cambridge Semantics
11. LITTLE S SEMANTICS
Little "s" web technologies capture and filter data with no
description or understanding of the data provided after
the capture process. The process of understanding the
meaning of that data starts once data capture has
happened. People have to intervene to provide the
context and meaning for language on the web.
--Sean Martin, CEO of Cambridge Semantics
12. Big S
W3C-approved standard
Little s
Looser groups of unaffiliated standards
14. ESSENTIALS OF BIG S SEMANTIC WEB
URI – Uniform Resource Identifier
RDF – Resource Description Framework
OWL – Web Ontology Language
Semantic reasoner (inference engine)
15. URI – UNIFORM RESOURCE IDENTIFIER
Way to identify things
Images, pages of text, locations
De-referenceable
Freebase
http://www.freebase.com/view/en/will_smith
• URIs are unique; no two are the same
• Will Smith
http://www.freebase.com/view/en/will_smith
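A minimal sketch of why a URI makes a better identifier than a name: two records share the label "Will Smith", and only the URI tells them apart. The second URI and both record dictionaries are hypothetical, made up for illustration. The standard-library `urlparse` call only splits the string; it does not dereference it.

```python
from urllib.parse import urlparse

# The first URI is the Freebase identifier from the slide.
# The second is a hypothetical URI for a different person with the same name.
actor = {"label": "Will Smith",
         "uri": "http://www.freebase.com/view/en/will_smith"}
other = {"label": "Will Smith",
         "uri": "http://example.org/people/will_smith_2"}


def same_thing(a: dict, b: dict) -> bool:
    """Two records describe the same thing only if their URIs match."""
    return a["uri"] == b["uri"]


parts = urlparse(actor["uri"])
print(parts.scheme, parts.netloc, parts.path)
print(actor["label"] == other["label"])  # the labels collide
print(same_thing(actor, other))          # the URIs do not
```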
16. RDF – RESOURCE DESCRIPTION FRAMEWORK
Framework used to describe relationships between
objects
Extends and formalizes XML
Subject>Predicate>Object
17. RDF – RESOURCE DESCRIPTION FRAMEWORK
Subject>Predicate>Object
Will Smith > is the lead actor > Bad Boys
Subject: http://ew.com/PersonsTax/Will_Smith
Predicate: http://ew.com/EntertainmentOnt/leadPerformanceIn
Object: http://ew.com/EntertainmentTax/Movies/Bad_Boys
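The Subject>Predicate>Object statement above can be sketched in plain Python. This is only an illustration, not a real RDF library: the `Triple` class and the `to_ntriples` helper are hypothetical names, and the output is a single line in the N-Triples style of writing one statement as three URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# The three URIs shown on the slide.
WILL_SMITH = "http://ew.com/PersonsTax/Will_Smith"
LEAD_PERFORMANCE_IN = "http://ew.com/EntertainmentOnt/leadPerformanceIn"
BAD_BOYS = "http://ew.com/EntertainmentTax/Movies/Bad_Boys"

statement = Triple(WILL_SMITH, LEAD_PERFORMANCE_IN, BAD_BOYS)


def to_ntriples(t: Triple) -> str:
    """Write the statement as one N-Triples-style line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


print(to_ntriples(statement))
```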
18. OWL – WEB ONTOLOGY LANGUAGE
…designed to be used by applications that need to
process the content of information instead of just
presenting it to humans
-- W3C
19. OWL – WEB ONTOLOGY LANGUAGE
Metadata model
Extends RDF to further define properties
Ex: Equivalent relationships
[Diagram: two "is married to" relationships between a pair of entities; images not recoverable]
20. SEMANTIC REASONER
Software able to infer logical consequences from a set
of asserted facts
Follows inference rules specified by OWL properties
Inverse
Transitive
Symmetric
Functional/Inverse functional
Equivalent
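A rough sketch of what a reasoner does with three of the property types listed above (inverse, transitive, symmetric). It is a toy forward-chaining loop over tuples, not an OWL reasoner; the `infer` function, the short property names, and the Time Inc. ownership fact are assumptions made for illustration, while the place names follow the examples in the notes at the end of this deck.

```python
def infer(facts, inverse=None, transitive=(), symmetric=()):
    """Repeatedly apply the rules until no new statements appear."""
    inv = dict(inverse or {})
    inv.update({v: k for k, v in list(inv.items())})  # inverse works both ways
    known = set(facts)
    while True:
        new = set()
        for s, p, o in known:
            if p in symmetric:
                new.add((o, p, s))
            if p in inv:
                new.add((o, inv[p], s))
            if p in transitive:
                for s2, p2, o2 in known:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= known:  # nothing new was inferred
            return known
        known |= new


facts = {
    ("Albuquerque", "isLocatedIn", "NewMexico"),
    ("NewMexico", "isLocatedIn", "USA"),
    ("TimeLifeBuilding", "isNear", "RockefellerCenter"),
    ("TimeInc", "ownerOf", "EntertainmentWeekly"),  # illustrative fact
}

result = infer(
    facts,
    inverse={"ownerOf": "isOwnedBy"},
    transitive={"isLocatedIn"},
    symmetric={"isNear"},
)

for triple in sorted(result - facts):
    print(triple)
```

Only the asserted facts are stored; the three extra statements are derived each time the rules run, which is the trade-off behind the "expensive queries" limitation listed later.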
21. PUTTING IT ALL TOGETHER
Ontology
Rule set
Classes and Properties
Taxonomy
Application of Rule Set
Tags and Relationships
Everything is a statement
Subject>Predicate>Object
Ex: Will Smith is lead performer in Bad Boys
22. BENEFITS OF RDF/OWL
Persistent URIs
Verifiable XML
Unambiguous Relationships
Polyhierarchy
Interoperability
23. LIMITATIONS OF RDF/OWL
Difficult to propagate across web
Challenge to integrate with legacy systems
Expensive queries
No “Killer App”
26. RDFa - Resource Description Framework (in) Attributes
W3C recommendation that adds a set
of attribute-level extensions to XHTML
for embedding rich metadata within
Web documents
Easy to implement
Not HTML 5 compliant
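To make "attribute-level extensions" concrete, here is a small sketch that reads RDFa-style `about` and `property` attributes out of a markup fragment using Python's standard `html.parser`. The fragment, the `RdfaSketch` class, and the choice of the Dublin Core `dc:title` property are assumptions for illustration; a real RDFa processor handles many more attributes and proper subject scoping.

```python
from html.parser import HTMLParser

# Hypothetical XHTML fragment carrying RDFa attributes.
PAGE = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/"
     about="http://ew.com/EntertainmentTax/Movies/Bad_Boys">
  <span property="dc:title">Bad Boys</span>
</div>
"""


class RdfaSketch(HTMLParser):
    """Collects (subject, property, text) from about/property attributes."""

    def __init__(self):
        super().__init__()
        self.subject = None   # most recent `about` value (never reset: a simplification)
        self.pending = None   # property waiting for its text value
        self.triples = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "about" in a:
            self.subject = a["about"]
        if "property" in a:
            self.pending = a["property"]

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None


parser = RdfaSketch()
parser.feed(PAGE)
print(parser.triples)
```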
32. MICRODATA
A WHATWG HTML5 specification used to nest
semantics within existing content on web pages
Officially supported by Bing, Yahoo, & Google
Can embed other markup languages like
RDFa, microformats, and Dublin Core
Not well-known (yet)
34. OPEN GRAPH PROTOCOL
Facebook-created markup language that turns any
web page into an Open Graph object, allowing
any page to become a Facebook page
I “Like” you
Good for targeted advertising
Limited in scope
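Open Graph markup lives in `<meta>` tags whose `property` starts with `og:`. The sketch below pulls those tags out of a page with Python's standard `html.parser`. The page content, the example.com URL, and the `OpenGraphSketch` class are hypothetical; only the `og:` property names follow the protocol.

```python
from html.parser import HTMLParser

# Hypothetical page head with three Open Graph tags and one ordinary meta tag.
PAGE = """
<html><head>
<meta property="og:title" content="Bad Boys" />
<meta property="og:type" content="video.movie" />
<meta property="og:url" content="http://example.com/movies/bad_boys" />
<meta name="description" content="Not an Open Graph tag" />
</head></html>
"""


class OpenGraphSketch(HTMLParser):
    """Collects og:* properties from <meta> tags into a dictionary."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("property") or ""
        if tag == "meta" and prop.startswith("og:"):
            self.properties[prop] = a.get("content")


parser = OpenGraphSketch()
parser.feed(PAGE)
print(parser.properties)
```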
36. BACK-OF-THE-NAPKIN COMPARISON
Features RDF/OWL RDFa MF MD OGP
W3C standard X X X
Extensible X X X
Pre-existing Vocabs X X
Uses URIs X X
Easy to implement X X X X
HTML 5 compliant X X X
Inferencing X
37. STATUS REPORT ON S SEMANTIC WEB
Linked Open Data graph growing
Many countries have developed government sites with
rich semantics
Development of Semantic search
More widespread adoption of lighter semantics
38. WHERE WE MIGHT BE GOING
Pharmaceutical industry identifies trends across clinical
studies, and not just within them
News industry better targets content by locale
Department of Defense using it to make better decisions
in the field
Utilized in advertising to drive more and more revenue
40. Barbara McGlamery
Taxonomist
Martha Stewart Living Omnimedia
(212)827-8817
bmcglamery@marthastewart.com
Editor's Notes
The landscape of the semantic web is changing. Early adopters learned the hard lessons for all of us: semantic web solutions can be difficult to implement and perhaps not vital to every organization’s interests. Barbara McGlamery of Martha Stewart Living Omnimedia will share her experiences of building a Semantic Web tool from scratch for Time Inc. and describe how a smaller, more manageable initiative has been undertaken at Martha Stewart. She’ll share case studies and lessons learned, and give a glimpse of how she sees the industry evolving.
Hello, my name is blah. I am not a technical librarian; I am a librarian, and when I was practicing it was in reference, not systems or back-end. So most of you out there have my respect and awe at knowing how the inside of a cataloging terminal works or TK (find out something LITA librarians do). My foray into the more technical aspects of librarianship came through HTML and web development.
Always refer to acronyms by full names: Resource Description Framework (RDF)
Maybe a grid comparing RDF, Microformats, etc.
Landscape of SW – who created it and why?
In brief, the data is machine-readable.
In short
Mention same as
Extends and formalizes XML
Linking structure of the Web to use URIs to name the relationship between things
Ex:
OWL is designed for use by applications that need to process the content of information instead of just presenting information to humans.
The difference between a semantic reasoner and a regular inference engine is that a semantic reasoner knows the rules of OWL. It is a more specific use.
Inverse – Indicating the reciprocal property. For example, “owner of” is the inverse of the property “is owned by.”
Transitive – Indicating that if this property applies between item 1 and item 2, and between item 2 and item 3, then it also applies between item 1 and item 3. For example, if Albuquerque “is located in” New Mexico, and New Mexico “is located in” the USA, then Albuquerque “is located in” the USA.
Symmetric – Indicating that the inverse of this property is itself. For example, if the Time/Life Building “is near” Rockefeller Center, then it is also true that Rockefeller Center “is near” the Time/Life Building.
Functional – Indicating that there can be only one value for this property for a given resource. For example, take “has birth mother”: if a resource called Bob “has birth mother” Jane and also “has birth mother” Mrs. Smith, then we can assume that Jane and Mrs. Smith are the same person.
Inverse Functional – Indicating that only one resource can have a given value for this property, which allows you to assume that if two or more resources have that value, then they are really just two names for the same thing. This is very much like Functional, but in the opposite direction. For example, if there are two names with the same value for “has Social Security number,” we can assume that those are two names for the same person.
Equivalent Property – Like Equivalent Class, indicates that this property can be extended to the same set of resources that use another property. For example, EW.com’s “lead performance” would be an equivalent property to People’s “starring role.”
The bottom layers contain technologies that are well known from the hypertext web and that, without change, provide the basis for the semantic web.
Middle layers contain technologies standardized by W3C to enable building semantic web applications.
Top layers contain technologies that are not yet standardized or contain just ideas that should be implemented in order to realize the Semantic Web.
Rules further extend OWL’s capabilities.
Proof and Logic establish truth of statements, infer unstated facts.
Trust – Cryptography, authentication, trustworthiness of statements.
Semantic Web Foundations
URI/IRI – URI is an acronym for Uniform Resource Identifier: a compact string of characters used to identify or name a resource. The URL to a web site (e.g. http://www.semanticfocus.com) is a popular example of a URI. IRI is an acronym for Internationalized Resource Identifier, which is a form of URI that uses characters beyond ASCII, thus becoming more useful in an international context.
Unicode – Unicode is the universal standard encoding system and provides a unified system for representing textual data. More than 1 million characters can be encoded to specify any character in any language without a single escape sequence or control code. Before Unicode, there were several different encoding systems, which made communication and integration across borders a big pain. Now it's so much easier. Shout out to my peeps in Bangalore, 'haaaay' (अरे, दोस्त)!
XML – XML is an acronym for Extensible Markup Language. With XML, we have a standard way to compose information so that it can be more easily shared. At the same time, it still affords the freedom to structure that information however the heck we want. It's kind of like HTML, only you get to make up your own tags and attributes. How cool is that?
Namespaces – Namespaces (aka XML Namespaces) are integral to XML. Namespaces provide a means to qualify the tags and attributes in an XML document with URIs, which then makes them truly unique on the Web and thus universal (among other things).
XML Schema – XML Schema describes the structure of XML documents just like DTDs, only better. An XML Schema is known as an XML Schema Definition (XSD). Basically, if you're going to use XML to invent your own document structures, XSD provides the way to define your rules (like guidelines) so that people and machines can understand them, adhere to them, and integrate with them.
XML Query – XML Query (aka XQuery) is a standardized language for combining documents, databases, Web pages and almost anything else. It is very widely implemented, powerful, and easy to learn. XQuery is replacing proprietary middleware languages and Web Application development languages. XQuery is replacing complex Java or C++ programs with a few lines of code.
Personally, I think it is sufficient to refer to these foundational items with just a few broad concepts: Unicode, URI, and XML. Unicode gives us a universal system for encoding information in all of the world's writing systems. URI gives us a standard way to identify and locate resources. XML gives us a way to model information uniquely, yet still share it and integrate it in consistent ways. All together, they help us integrate content and services throughout the Web.
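As a small illustration of namespaces qualifying tags with URIs, the sketch below parses a made-up XML fragment with Python's standard `xml.etree.ElementTree`. The `<movie>` document is hypothetical; the Dublin Core namespace URI is used only as a familiar example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the dc: prefix is bound to a namespace URI.
DOC = (
    '<movie xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Bad Boys</dc:title>"
    "</movie>"
)

root = ET.fromstring(DOC)

# ElementTree expands the prefix, so the tag is looked up by its full URI.
title = root.find("{http://purl.org/dc/elements/1.1/}title")

print(title.tag)
print(title.text)
```

The prefix `dc` is only shorthand; what makes the tag unique is the URI it expands to.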
Really this should exist with the big S semantics, but it’s here because the implementation is so light and doesn’t require an inference engine or the use of unambiguous relationships at all. It is basically using URIs and embedding structured metadata into the HTML.
Adds structured metadata to any XML-based language
XHTML
Linked Open Data (LOD) is a web initiative for organizations to share information in an RDF or RDFa format. It describes the resource that the URI identifies. This makes it possible for a user (or software agent) to "follow your nose" to find out more information related to the identified resource. -- Wikipedia
Citation!
The Web Hypertext Application Technology Working Group (WHATWG) is a community of people interested in evolving HTML
Schema.org is the initiative, backed by Bing, Google, and Yahoo, that is promoting the adoption of the microdata format
Semantic search -- contextual meaning of terms as they appear in the searchable dataspace