This deck traces the development of structured data on the web, from early efforts through schema.org. Early attempts included microformats and ontologies such as FOAF, SKOS, and GoodRelations. Schema.org was introduced in 2011 to give search engines a common vocabulary and to lower the bar for webmasters publishing data. Adoption has been wide: about 15% of sites now carry schema.org markup. The resulting structured data powers search features such as Google's Knowledge Graph and Pinterest's Rich Pins, and supports applications such as recommendations, actions, and reservations.
The Web Comes Alive with Data! Schema.org and Structured Data on the Web: Past, Present, Potential
1. The Web Comes Alive with Data!
Schema.org and Structured Data on the Web: Past, Present, Potential
@jaymyers
Google DevFest Twin Cities
February 8, 2014
2. • Early adopter
• Semantic Web, Linked & Open data enthusiast
• Speaker
3. Web of Today
• 25 million web sites
• Trillions of web pages
• 5 billion web pages change every day
• 1000x more web pages on the “deep web”
7. Goals
• Create a web for both humans and machines
• Entice webmasters to make metadata available through structured HTML
• Gain access to the meaning of web sites
11. Microformats example
<div class="item hproduct">
<ol>
<li class="lister vcard"><a class="url fn" href="http://storename.com">Magers and Quinn</a></li>
<li class="category"><a href="http://storename.com/categories/books">Books</a></li>
</ol>
<img src="http://images.storename.com/products/ramsay-fast-food.jpg" class="photo" alt="gordon ramsay fast food book" />
<p><span class="condition">New:</span> <span class="price">$27.99</span></p>
<p>Pub price: $35.00</p>
<p>Hardcover</p>
<p class="availability">Out of stock</p>
<h1 class="fn">Gordon Ramsay's Fast Food Recipes from the F Word</h1>
<p>By Ramsay, Gordon</p>
<dl class="identifier">
<dt>ISBN:</dt>
<dd>1554700647</dd>
</dl>
<dl class="identifier">
<dt>Publisher:</dt>
<dd>Amer Youth Hostels</dd>
</dl>
<h4>Publishers Comments</h4>
<p class="description">A celebrity host of Hell's Kitchen features more than one hundred accessible recipes that are
organized in accordance with everyday needs and special occasions, in a volume that places an emphasis on fast
preparation and features complementary tips on stocking a pantry.</p>
</div>
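A rough sketch of how a consumer might read the class-based markup above, using only Python's standard `html.parser`. The names `ClassPropertyExtractor`, `extract`, and the trimmed `SAMPLE` are illustrative and not from the talk, and real microformats parsers apply much fuller rules than this.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked on the stack.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class ClassPropertyExtractor(HTMLParser):
    """Collects the text of elements whose class list names a wanted property."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.stack = []  # one frame per open element: list of [property, text parts]
        self.found = []  # (property, text) pairs, recorded when the element closes

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        self.stack.append([[c, []] for c in classes if c in self.wanted])

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        for name, parts in self.stack.pop():
            # Collapse runs of whitespace so wrapped source lines read cleanly.
            self.found.append((name, " ".join("".join(parts).split())))

    def handle_data(self, data):
        # Text belongs to every open element that carries a wanted class.
        for frame in self.stack:
            for _, parts in frame:
                parts.append(data)


# Trimmed, hypothetical version of the hproduct markup on the slide.
SAMPLE = """
<div class="item hproduct">
  <h1 class="fn">Gordon Ramsay's Fast Food Recipes from the F Word</h1>
  <p><span class="condition">New:</span> <span class="price">$27.99</span></p>
  <p class="availability">Out of stock</p>
</div>
"""


def extract(html, wanted=("fn", "price", "availability")):
    parser = ClassPropertyExtractor(wanted)
    parser.feed(html)
    parser.close()
    return parser.found


if __name__ == "__main__":
    for name, value in extract(SAMPLE):
        print(f"{name}: {value}")
```

Because meaning rides on ordinary `class` attributes, the extractor has to know in advance which class names are properties and which are styling hooks.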
20. Value prop:
“Give us your data in a machine-readable format and we’ll make your stuff more attractive in search results”
21. Results
• 1000x increase in structured markup
• Increases in user engagement (click-throughs) for SERP objects created from structured markup
• Small number of interesting applications built on top of structured data
22. But…
• Too many choices (syntax, ontology, etc.), fragmented
• A lot of bad markup – up to 40%
• Not easy enough for your average “Joe Webmaster”
24. schema.org
• Common vocabularies that search engines can understand
• Lower the bar for webmasters to publish data on the web
• Improve user experience through data
25. Introducing: Microdata
<div id="pagecontent" itemscope itemtype="http://schema.org/Person">
<a href="/media/rm974696448/nm2578007?ref_=nm_ov_ph"> <img id="name-poster"
alt="Kim Kardashian Picture" title="Kim Kardashian Picture"
src="http://ia.mediaimdb.com/images/M/MV5BMTc0MjkzOTAxNV5BMl5BanBnXkFtZTcwNTk1NjcyNw@@._V1_SX214_CR0,0,214,317_.jpg" itemprop="image"/>
</a>
<h1 class="header"> <span class="itemprop" itemprop="name">Kim Kardashian</span></h1>
<div class="infobar" id="name-job-categories">
<span class="itemprop" itemprop="jobTitle">Actress</span>
<span class="itemprop" itemprop="jobTitle">Producer</span>
</div>
<div class="inline" itemprop="description">
TV star, entrepreneur, fashion designer, and author (New York Times bestseller - "Kardashian Konfidential"), Kim Kardashian first burst onto the scene in 2007, after the premiere
of her hit E! Entertainment reality series ...
</div>
<time datetime="1980-10-21" itemprop="birthDate">
<a href="/search/name?birth_monthday=1021&refine=birth_monthday&ref_=nm_ov_bth_monthday" >October 21</a>,
<a href="/search/name?birth_year=1980&ref_=nm_ov_bth_year" >1980</a>
</time>
</div>
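A minimal sketch of reading microdata like the sample above, again with Python's standard `html.parser`. The names `MicrodataExtractor`, `extract_item`, and the trimmed `SAMPLE` are assumptions made for illustration. The sketch handles a single flat item only; nested `itemscope` elements and the full value rules of the microdata specification are out of scope.

```python
from html.parser import HTMLParser

VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}

# Elements whose property value comes from an attribute rather than from text.
ATTR_VALUE = {"img": "src", "a": "href", "link": "href",
              "meta": "content", "time": "datetime"}


class MicrodataExtractor(HTMLParser):
    """Reads one flat item: its itemtype plus every itemprop value."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.props = {}   # property name -> list of values, in document order
        self.stack = []   # one (itemprop name or None, text parts) per open element

    def _add(self, name, value):
        self.props.setdefault(name, []).append(value)

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "itemscope" in a and self.item_type is None:
            self.item_type = a.get("itemtype")
        name = a.get("itemprop")
        source = ATTR_VALUE.get(tag)
        if name and source and a.get(source) is not None:
            self._add(name, a[source])
            name = None  # value already taken from the attribute
        if tag not in VOID_TAGS:
            self.stack.append((name, []))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        name, parts = self.stack.pop()
        if name:
            self._add(name, " ".join("".join(parts).split()))

    def handle_data(self, data):
        for name, parts in self.stack:
            if name:
                parts.append(data)


# Trimmed, hypothetical version of the Person markup on the slide.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <h1><span itemprop="name">Kim Kardashian</span></h1>
  <span itemprop="jobTitle">Actress</span>
  <span itemprop="jobTitle">Producer</span>
  <time datetime="1980-10-21" itemprop="birthDate">October 21, 1980</time>
</div>
"""


def extract_item(html):
    parser = MicrodataExtractor()
    parser.feed(html)
    parser.close()
    return parser.item_type, parser.props


if __name__ == "__main__":
    item_type, props = extract_item(SAMPLE)
    print(item_type)
    for name, values in props.items():
        print(f"  {name}: {values}")
```

Unlike the class-based approach, the dedicated `itemprop` and `itemtype` attributes tell the parser which vocabulary is in play without any prior list of class names.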
27. Looks Like We’ve Got Something Here!
• 15% of all sites contain schema.org markup
• Many major sites
• Adoption by content systems like Drupal and WordPress
• Around 1200 object types and growing
• Significant reduction in error rates
33. Other Applications
Gmail “Actions in the Inbox”
• Actions – rent a movie, buy something
• Orders – post-transaction order confirmation, shipping status
• Reservations – restaurant, travel, tickets
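As an illustration of the reservations case, here is a hedged sketch that builds a schema.org `FoodEstablishmentReservation` description in Python and wraps it in a script tag for an HTML email body. The deck does not show the markup itself, so the JSON-LD syntax, the choice of fields, and the helper names `reservation_jsonld` and `as_script_tag` are all assumptions; the fields a mail client actually requires should be checked against its own documentation.

```python
import json


def reservation_jsonld(reservation_id, guest_name, restaurant_name,
                       start_time, party_size):
    """Return a dict describing a confirmed restaurant reservation.

    Property names are schema.org terms; which ones a given consumer
    requires is not covered here.
    """
    return {
        "@context": "http://schema.org",
        "@type": "FoodEstablishmentReservation",
        "reservationId": reservation_id,
        "reservationStatus": "http://schema.org/ReservationConfirmed",
        "underName": {"@type": "Person", "name": guest_name},
        "reservationFor": {"@type": "FoodEstablishment", "name": restaurant_name},
        "startTime": start_time,
        "partySize": party_size,
    }


def as_script_tag(data):
    """Serialize the dict as JSON-LD inside a script element."""
    # Escape "</" so a stray "</script>" in a value cannot end the element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


if __name__ == "__main__":
    # All values below are made up for the example.
    data = reservation_jsonld("ABC123", "Pat Example", "Example Bistro",
                              "2014-02-08T19:30:00-06:00", 2)
    print(as_script_tag(data))
```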
36. Credits
Guha, Ramanathan V. “Light at the End of the Tunnel.” 12th International Semantic Web Conference (ISWC), Sydney, NSW, Australia. 23 October 2013. Keynote Address.