The role of Linked Data in Search and Online Media. Peter Mika, Researcher and Data Architect, Yahoo! Inc.
Search
For Yahoo!, search is much more than 10 blue links: an information box with content from and links to Yahoo! Travel; points of interest in Vienna, Austria; shopping results from Yahoo! Shopping. Since August 2010, ‘regular’ search results are ‘Powered by Bing’.
Not just search: advertising. How could we get publishers to tell us which pages are about a person, or about this particular person?
Creating an ecosystem of publishers, developers and end-users. Publishers provide structured data embedded in HTML. Yahoo crawls this data and makes it available to developers. Developers help to transform data into rich results displays. End users benefit from richer and more relevant search results. (Diagram labels, Yahoo! SearchMonkey: Page Extraction, RDF/Microformat Markup, Acme.com’s Web Pages, Index, DataRSS feed, Web Services, Acme.com’s database.)
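The publisher-to-crawler step above can be sketched as follows. This is a minimal illustration, not SearchMonkey's extraction pipeline: it only collects `property` attributes and is far from a conformant RDFa processor. The class name, function name, sample markup and the `http://example.org/vocab#` vocabulary are all hypothetical.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collects (property, value) pairs from RDFa-style `property` attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name still waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # The value is given explicitly, e.g. <meta property=... content=...>
            self.pairs.append((prop, attrs["content"]))
        else:
            # The value is the text inside the element
            self._pending = prop

    def handle_data(self, data):
        if self._pending is not None and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None


def extract_properties(html):
    """Return the (property, value) pairs embedded in an HTML fragment."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Hypothetical publisher markup; the vocabulary URL is a placeholder.
SAMPLE = (
    '<div vocab="http://example.org/vocab#" typeof="Product">'
    '<span property="name">Acme Anvil</span>'
    '<meta property="price" content="19.99">'
    '</div>'
)

print(extract_properties(SAMPLE))
```

A real crawler would also resolve prefixes and vocabularies, track subjects and nesting, and handle microformats, all of which this sketch leaves out.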
Yahoo has been a first adopter of Semantic Web technology (Yahoo! SearchMonkey). First search engine to support RDFa. Promoting the use of standard ontologies. Helping publishers to get on the Semantic Web. Working with the community on developing and maintaining ontologies: the VoCamp series of events, and working with the W3C. Making the data available: Yahoo! BOSS, an API for developers, and Yahoo! Webscope datasets for research.
Facets and Enhanced Results in Yahoo! Search Restrict search results to pages with product data. Star rating, price, image (where available) displayed as part of the abstract.
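The two behaviours described above, a facet that restricts results to pages with product data and an abstract enriched with rating and price, might look roughly like this. The result dictionaries, field names and URLs are assumptions made for illustration, not Yahoo!'s data model, and image display is omitted.

```python
def restrict_to_products(results):
    """Facet: keep only results whose page carried product metadata."""
    return [r for r in results if "product" in r.get("metadata", {})]


def enhanced_abstract(result):
    """Append rating and price to the abstract where they are available."""
    product = result.get("metadata", {}).get("product", {})
    parts = [result["abstract"]]
    if "rating" in product:
        parts.append(f"Rating: {product['rating']}/5")
    if "price" in product:
        parts.append(f"Price: {product['price']}")
    return " | ".join(parts)


# Hypothetical search results; only the first page embeds product data.
RESULTS = [
    {
        "url": "http://acme.example/anvil",
        "abstract": "Acme Anvil, drop-forged.",
        "metadata": {"product": {"rating": 4.5, "price": "19.99 USD"}},
    },
    {
        "url": "http://acme.example/about",
        "abstract": "About Acme.",
        "metadata": {},
    },
]

for hit in restrict_to_products(RESULTS):
    print(hit["url"], "->", enhanced_abstract(hit))
```

Results without product metadata keep their plain abstract, which matches the "where available" wording on the slide.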
The success of SearchMonkey. Launched in May 2008 and over time rolled out in >20 markets. >400% increase in RDFa data. >15% increase in click-through rates for some sites. User studies confirm that users prefer enhanced results. >15,000 developers, >400 applications in the gallery. Welcomed by both the traditional search press and the SW community. A year later, implemented by Google as “Rich Snippets”: no developer tool, and Google opts to create their own ontology. Embedded metadata is now a part of Search Engine Optimization (SEO).
Percentage of URLs with certain forms of embedded data RDFa data in over 3.5% of webpages
Future expected benefits. Query formulation: “snap to grid”; showing related entities based on an initial query; guiding the user in constructing the query; making the user aware of the interpretation of the query. Ranking: semantic search engines exist as prototypes; Semantic Search workshop series and evaluations (ESWC 2008, WWW 2009, WWW 2010). Result presentation: snippet generation; adaptive and interactive presentation; aggregated search. Task completion.
The Web of Objects
Yahoo! started by cataloguing the best of the Web…
Traditionally, traffic flows from the homepage and search to static content services. (Diagram labels: Homepage, Web search.)
Today, we have a network where new user experiences are created on-demand. (Diagram labels: Homepage, Web search.)
Implicit search: Contextual Shortcuts in Yahoo! News. Hovering over an underlined phrase triggers a search for related news items.
Creating personalized experiences: Content Optimizing Knowledge Engine (COKE). A machine-learning-based ‘search’ algorithm selects the main story and the three alternate stories based on the user's demographics (age, gender, etc.) and previous behavior. This results in a 30-60% increase in CTR compared to editorial selection. Display advertising is a similar top-1 search problem on the collection of advertisements. Users can opt out of the behavioral targeting of ads through AdChoices.
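To make the "top-1 search" framing concrete, here is a deliberately simple sketch that ranks stories by an estimated click-through rate per user segment and returns one main story plus three alternates. It is not the COKE algorithm, which the slides do not describe; the segment keys, story identifiers and CTR values are invented for illustration, and a real system would learn and update these estimates from user behavior.

```python
def select_stories(ctr_estimates, segment, alternates=3):
    """Return (main_story, alternate_stories) for a user segment.

    ctr_estimates maps segment -> {story_id: estimated CTR}.
    """
    scores = ctr_estimates.get(segment, {})
    # Highest estimated CTR first; ties are broken by story id for determinism.
    ranked = sorted(scores, key=lambda story: (-scores[story], story))
    if not ranked:
        return None, []
    return ranked[0], ranked[1:1 + alternates]


# Hypothetical estimates for one demographic segment (age band, gender).
ESTIMATES = {
    ("18-24", "f"): {
        "sports": 0.04,
        "music": 0.09,
        "finance": 0.01,
        "movies": 0.07,
        "travel": 0.03,
    },
}

main, others = select_stories(ESTIMATES, ("18-24", "f"))
print(main, others)
```

The same function applies to display advertising if the story collection is replaced by a collection of advertisements and only the first element is used.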
Hyperlocal experiences at Yahoo! Hyperlocal: showing content from across Yahoo that is relevant to a particular neighbourhood.
From topic pages to creating entire sites: Yahoo’s World Cup site. Yahoo’s World Cup website has been almost three times as popular as the second most visited site. (Hitwise, US, June 2010)
Semantic technologies for content. Like most media companies, Yahoo has a fragmented content landscape: content is acquired from hundreds of data providers, each using their own proprietary formats, and large amounts of structured data are extracted from webpages. The role of semantic technology is to unify content: a unique identifier for each object; an RDF-like representation with rich metadata about each attribute value and relationship (e.g. licensing, serving restrictions, provider); OWL 2 as an ontology language.
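One way to picture an "RDF-like representation with rich metadata about each attribute value" is a statement record that carries its provider, license and serving restriction alongside the subject, predicate and value. The sketch below is an assumption about the general shape only; the class names, fields and identifiers are hypothetical and do not come from Yahoo!'s system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Statement:
    subject: str           # unique identifier of the object
    predicate: str         # attribute or relationship name
    value: str             # literal value or identifier of another object
    provider: str          # which data provider supplied this value
    license: str           # licensing terms attached to this value
    servable: bool = True  # serving restriction: may it be shown to users?


@dataclass
class ObjectGraph:
    statements: List[Statement] = field(default_factory=list)

    def add(self, statement):
        self.statements.append(statement)

    def describe(self, subject, servable_only=True):
        """Return predicate -> value for one object, honoring serving restrictions."""
        return {
            s.predicate: s.value
            for s in self.statements
            if s.subject == subject and (s.servable or not servable_only)
        }


graph = ObjectGraph()
graph.add(Statement("obj:artist/1", "name", "Example Artist", "provider-a", "display-ok"))
graph.add(Statement("obj:artist/1", "genre", "Jazz", "provider-b", "display-ok"))
graph.add(Statement("obj:artist/1", "biography", "Long text", "provider-c",
                    "internal-only", servable=False))

print(graph.describe("obj:artist/1"))
```

Keeping the metadata on each statement, rather than on the object as a whole, lets values from different providers sit side by side under one identifier while still being filtered individually.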
The Web of Objects: a growing graph that will eventually cover the attributes and relationships of all entities known to Yahoo! (Diagram labels: Yahoo! Sports, Yahoo! Movies, Yahoo! News, Yahoo! Local, Yahoo! Music.)
Benefits. Increased coverage for existing products that require entity graphs for navigation. Dynamic interlinking of content, e.g. direct links from Yahoo! News to background information in Yahoo! Music about an artist. Dynamic composition of web pages: topic-entity pages. Better understanding of user intent: semantic analysis of query logs and of navigation paths.
Summary
Linked Data for Search and Online Media. Enriching search: answering specific information needs in vertical domains; helping users to discover related queries/content and to understand search results. Linking owned content assets and the best of the Web: content optimization and display advertising; recommendations (for example, related articles); entire new sites generated on demand. There is a need for both standards and industry agreements in establishing the data layer of the Web.
The End. Credits to Yahoos around the world. Contact me at pmika@yahoo-inc.com. Internships, faculty and student grants available!
Linked Data, Search and Social Media: Peter Mika (Yahoo! Research, Spain)


Editor's Notes

  1. Search is a form of content aggregation
  2. Traditionally, homepage and search are both entry points to content
  3. Again, page optimization and content aggregation are a form of search