The Main Trends in the Use and Development of Semantic Markup

Yuliya Tikhokhod

Presentation for KESW conference

The main trends in the use and development

1. The Main Trends in the Use and Development of Semantic Markup • Yuliya Tikhokhod, Project Manager, Yandex, Russia

2. Agenda • Why does Yandex need semantic markup? • Basic facts about semantic markup • Markup development (schema.org example)

3. Why does Yandex need semantic markup? • If you're so smart, why do you need someone to help?

4. [no text]

5. Yandex

6. Data Mining

7. Data from webmasters

8. Since late 2009, we have been using structured data from webmasters

9. Collecting data • Affiliate program • Forms • XML • Other files • Semantic markup

10. Agenda • Why does Yandex need semantic markup? • Basic facts about semantic markup • Markup development (schema.org example)

11. Semantic Markup • Syntax and Vocabulary • Usage • Statistics

12. [no text]

13. [no text]

14. Timeline (axis years: 2005, 2008, 2009, 2010, 2011, 2013) • Standards: Microformats, RDFa, Microdata, Open Graph, Schema.org, Actions and JSON-LD • Yandex milestones: Yandex began using semantic markup, Enhanced Snippets, Services, Improved search algorithms, First commit to schema.org, Yandex Islands

15. 24% of documents on the Internet contain some semantic markup

16. Statistics for September 2013

17. Usage

18. Agenda • Why does Yandex need semantic markup? • Basic facts about semantic markup • Markup development (schema.org example)

19. [no text]

20. Explore • What? • Where? • How often? • Problems? • Internal discussion • public-vocabs@w3.org • External comments

21. Latest updates

22. Actions

23. GoodRelations

24. LRMI

25. Health and Medical vocabulary

26. New syntax for schema.org: JSON-LD

27. Future work

28. Potential actions

29. Civic services

30. Reservations

31. Event schema update

32. Accessibility

33. Other schemas

34. Useful links • http://schema.org/ • http://blog.schema.org/ • http://www.w3.org/wiki/WebSchemas • public-vocabs@w3.org • http://help.yandex.com/webmaster/?id=1127824 • http://webmaster.yandex.ru/microtest.xml

35. Thank you • Yuliya Tikhokhod, Project Manager • tilid@yandex-team.ru • @tihohodka

Editor's Notes

Hi, my name is Yuliya. I work at Yandex on the Semantic Web project. Today I would like to discuss the main trends in the use and development of semantic markup.
First, I want to talk about why Yandex uses semantic markup. Then we'll talk a little about the basic terms. Finally, we'll discuss how semantic markup develops in general, using schema.org as an example.
So, why do we need all this stuff?
There is a huge pile of raw data on the Internet. But that is not enough to answer our users. To give them a good answer we need knowledge rather than raw data.
We can extract knowledge automatically (using machine learning, language technologies or specialized parsers). And we can get knowledge about the content of web pages directly from webmasters. Both methods have their advantages and disadvantages.
Mining the data ourselves means we do not depend on webmasters. Furthermore, this method is more technological. But sometimes we need a special parser for each website. An important disadvantage of this method is that webmasters have no opportunity to influence what we know about their sites.
On the other hand, receiving data from webmasters also has advantages and disadvantages. It is good that we get information about the content of pages from the people who really know what is written on them. In addition, it takes less effort to use that knowledge in search. But many people are not as honest as I would wish, and they may try to cheat the system. And, of course, not all webmasters want to make the effort to give us any information.
In view of the above, at the end of 2009 we started using additional information sent by webmasters in our services.
How can we collect information from webmasters? First, through special tools. Second, through XML files in special formats. And through other files, even Excel. Another option involves nothing other than the HTML code of the pages: semantic markup is included directly in the page's source code.
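To make the last option concrete, here is a minimal sketch of how a consumer might read microdata out of a page's HTML. It is not Yandex's parser: the page fragment, the recipe values and the names `MicrodataParser` and `extract_microdata` are invented for illustration, and nested items are deliberately ignored.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the recipe and its values are made up.
PAGE = """
<div itemscope itemtype="http://schema.org/Recipe">
  <h1 itemprop="name">Borscht</h1>
  <span itemprop="recipeYield">6 servings</span>
  <meta itemprop="cookTime" content="PT1H30M">
</div>
"""


class MicrodataParser(HTMLParser):
    """Collects flat itemprop/value pairs; nested items are not handled."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._pending = None  # itemprop whose text value has not arrived yet

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # <meta itemprop="..." content="..."> carries the value in an attribute
            self.properties[prop] = attrs["content"]
        else:
            # otherwise the value is the element's text content
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.properties[self._pending] = data.strip()
            self._pending = None


def extract_microdata(html_text):
    """Return (item type URL, dict of properties) for a single flat item."""
    parser = MicrodataParser()
    parser.feed(html_text)
    parser.close()
    return parser.item_type, parser.properties


if __name__ == "__main__":
    print(extract_microdata(PAGE))
```

The point of the sketch is that nothing beyond the page itself is needed: the webmaster annotates existing HTML, and the consumer reads the annotations during normal crawling.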
Let's talk about semantic markup.
I want to say a few words about syntax and vocabulary, talk about how semantic markup is used, and present some statistics.
Semantic markup consists of a syntax and a vocabulary. The first is about how we put information into pages. The second is about what information we give.
There are four main syntaxes for semantic markup: RDFa, Microformats, Microdata and the newest, JSON-LD. And then there are several vocabularies that can be used with these syntaxes. The oldest one is Dublin Core. It was originally created in 1995. In Russia there is even a standard describing Dublin Core. It is very simple and contains only 15 elements. Do not be surprised that microformats are also listed as a vocabulary: in microformats, form and meaning are mixed. GoodRelations is a specialized vocabulary that describes goods and services. The Open Graph Protocol is a Facebook initiative. It is a simple way to convey the most important information about the content of a page. Schema.org is the most promising vocabulary, supported by Google, Bing, Yahoo and Yandex.
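The split between syntax and vocabulary can be illustrated with a short sketch that writes one schema.org description in two syntaxes, JSON-LD and microdata. The helper names `to_json_ld_script` and `to_microdata` are hypothetical, and the sketch assumes a flat item whose values are plain strings.

```python
import html
import json


def to_json_ld_script(item_type, properties):
    """Serialize a flat schema.org description as a JSON-LD <script> element."""
    doc = {"@context": "http://schema.org", "@type": item_type}
    doc.update(properties)
    body = json.dumps(doc, indent=2, ensure_ascii=False)
    # Escape "</" so a value cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


def to_microdata(item_type, properties):
    """Express the same description as microdata attributes on HTML elements."""
    lines = ['<div itemscope itemtype="http://schema.org/%s">' % item_type]
    for name, value in properties.items():
        lines.append('  <span itemprop="%s">%s</span>' % (name, html.escape(value)))
    lines.append("</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    movie = {"name": "Solaris", "datePublished": "1972"}
    print(to_json_ld_script("Movie", movie))
    print(to_microdata("Movie", movie))
```

Both outputs use the same vocabulary terms (`Movie`, `name`, `datePublished`); only the way they are written into the page differs.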
Some history. A long, long time ago, in a galaxy far, far away... wait! That's another story. We began using semantic markup in late 2009. We started making rich snippets and services based on semantic markup. The next year the W3C announced HTML5 and microdata, and we started using this method in our products. We even wrote a vocabulary for data about encyclopedias. Then Facebook announced the Open Graph Protocol. The following year schema.org was created, and the world changed. We came up with new ways to use this markup, as well as changes to schema.org itself. The first Yandex proposal to schema.org was PeopleAudience. It is now accepted and published, but getting there took a lot of time. From the outside it seems that nothing could be easier than adding a few new properties. But you have to predict what people think and what they might think. How will webmasters and consumers use this data? Isn't it too difficult? Do you want to specify the gender of the target audience? Be ready to consider that this might offend people who belong to one sex but identify with the other. Do you want to specify the age of the target audience? That might offend adults who love to read children's books. As of today, we have Actions and the JSON-LD syntax, and we use them in Yandex.Islands.
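As an illustration of what such a proposal looks like in use, the sketch below attaches a `PeopleAudience` to a book description. The property names `suggestedGender`, `suggestedMinAge` and `suggestedMaxAge` are the ones published on schema.org for `PeopleAudience`; whether they match the original Yandex proposal word for word is not stated in the talk. The helper `book_with_audience` and the book title are hypothetical.

```python
import json


def book_with_audience(name, min_age=None, max_age=None, gender=None):
    """Build a schema.org Book whose audience is a PeopleAudience.

    Only the hints that were actually supplied are emitted, so a publisher
    can state an age range without saying anything about gender.
    """
    audience = {"@type": "PeopleAudience"}
    if min_age is not None:
        audience["suggestedMinAge"] = min_age
    if max_age is not None:
        audience["suggestedMaxAge"] = max_age
    if gender is not None:
        audience["suggestedGender"] = gender
    return {
        "@context": "http://schema.org",
        "@type": "Book",
        "name": name,
        "audience": audience,
    }


if __name__ == "__main__":
    # Hypothetical title, used only to show the shape of the output.
    print(json.dumps(book_with_audience("A Picture Book", min_age=3, max_age=7), indent=2))
```

The "suggested" wording reflects the concern raised in the notes: the markup describes whom the content is aimed at without excluding anyone else.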
According to our data, 24% of documents on the Internet contain some semantic markup. Is that a lot or a little? Of course, it is far from 100%, but over the past three years the number has more than doubled.
Here you can see our statistics on the distribution of semantic markup. The most popular vocabulary is the Open Graph Protocol. Next is schema.org. And that small bar is GoodRelations.
How can this data be used? The major consumers are search engines. They use this data to create rich snippets and to receive content from webmasters for certain services. For example, Yandex creates rich snippets for recipes, dictionary articles, movies, chords, etc., and uses information extracted from microdata in its Video, Auto, Images and other services. But search engines are not the only consumers of semantic markup. Other Internet companies can use it too. For example, Pinterest uses Open Graph and schema.org to create Rich Pins. Facebook, Google, Twitter and other social networks can create rich snippets for shared links.
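A rich snippet for a shared link can be sketched as follows: read the page's Open Graph `<meta>` tags and assemble a small preview record. This is an illustrative consumer, not how any of the named companies implement it; `OpenGraphParser`, `build_snippet` and the sample page are assumptions.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> values from a page."""

    def __init__(self):
        super().__init__()
        self.og = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        prop = attrs.get("property") or ""
        if prop.startswith("og:") and "content" in attrs:
            # Keep the first value if a property is repeated.
            self.og.setdefault(prop[3:], attrs["content"])


def build_snippet(html_text, fallback_title="(untitled)"):
    """Return the fields a link preview would typically display."""
    parser = OpenGraphParser()
    parser.feed(html_text)
    parser.close()
    return {
        "title": parser.og.get("title", fallback_title),
        "description": parser.og.get("description", ""),
        "image": parser.og.get("image"),
    }


if __name__ == "__main__":
    # Hypothetical page head; the URL and texts are placeholders.
    page = (
        '<html><head>'
        '<meta property="og:title" content="Borscht recipe">'
        '<meta property="og:description" content="A beet soup.">'
        '<meta property="og:image" content="http://example.com/borscht.jpg">'
        '</head></html>'
    )
    print(build_snippet(page))
```

Pages without Open Graph tags still produce a snippet, just a poorer one, which is the practical incentive for webmasters to add the markup.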
Schema.org does not stand still. Changes happen at two levels: 1) Public feedback and discussion. The most important points from the public discussion go to the working group. 2) The working group consists of delegates from four search engines (Yandex, Google, Bing and Yahoo). They decide whether to make changes or not.
If you have an idea, a problem or a question, you can send it to public-vocabs@w3.org. You can also read this mailing list, reply to questions and help solve other people's problems.
If the idea makes sense, it goes to the working group. First of all we explore the idea. What is the idea? Where should we place this change? How often does this use case occur? What challenges do we face? Then we discuss the idea internally. When everyone agrees, the formulated idea is sent to public-vocabs@w3.org. The next step is collecting feedback from the community. If there are significant comments, we need to repeat the cycle. It may seem that no idea will ever be accepted. But that is not true. And here are some recent updates.
Actions - these are like verbs in the vocabulary.
GoodRelations - this is about the integration between schema.org and GoodRelations.
LRMI - integration with the vocabulary for learning resources metadata.
Health and Medical vocabulary - this is about including the Health and Medical vocabulary in schema.org.
JSON-LD - this is about using schema.org with a new syntax.
And there is some future work.
Potential actions - how to describe an action that will happen in the future.
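A potential action can be sketched in JSON-LD as below. At the time of the talk this was still future work, so the shape shown is the `potentialAction` / `SearchAction` form that schema.org published later, not something taken from the slides. The function name `website_with_search` and the example.com URLs are placeholders.

```python
import json

PLACEHOLDER = "{search_term_string}"


def website_with_search(site_url, search_url_template):
    """Describe a site whose search box is exposed as a potential action.

    The action has not happened yet; the markup only says that a search
    could be performed and which URL pattern would perform it.
    """
    if PLACEHOLDER not in search_url_template:
        raise ValueError("template must contain " + PLACEHOLDER)
    return {
        "@context": "http://schema.org",
        "@type": "WebSite",
        "url": site_url,
        "potentialAction": {
            "@type": "SearchAction",
            "target": search_url_template,
            "query-input": "required name=search_term_string",
        },
    }


if __name__ == "__main__":
    doc = website_with_search(
        "http://example.com/",
        "http://example.com/search?q={search_term_string}",
    )
    print(json.dumps(doc, indent=2))
```

This matches the "verb" idea from the notes: `WebSite` is the noun, and `SearchAction` is what a user could do with it.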
