How to get your data into Sindice and Google with sitemap4rdf
Boris Villazón-Terrazas (OEG), Richard Cyganiak (DERI)
Publishing Linked Data from a triple store
Linked Data frontends for triple stores
Source: Pubby website, http://www4.wiwiss.fu-berlin.de/pubby/
Search engines
Sindice: the best RDF search engine
  • 120M+ documents
  • Continuously updating since 2006
  • Low-latency search API
  • RDF/XML, Turtle, RDFa, microformats
The Sitemap protocol
Sitemap Protocol
  • Used by web crawlers
  • Efficiently find all your content & discover what has been updated
  • http://sitemaps.org/
Sitemap Protocol: Simple example

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://yoursite/</loc>
   </url>
   <url>
      <loc>http://yoursite/products/53546</loc>
   </url>
   <url>
      <loc>http://yoursite/products/98421</loc>
   </url>
   <url>
      <loc>http://yoursite/products/41003</loc>
   </url>
</urlset>
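A file like the simple example can also be produced programmatically. The following is a minimal sketch using Python's standard library; the function name `build_sitemap` and the sample URLs are illustrative assumptions, not part of sitemap4rdf.

```python
import xml.etree.ElementTree as ET

# Namespace required by the Sitemap protocol on the root <urlset> element.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a str) with one <url><loc> entry per URL.

    Hypothetical helper for illustration only.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for u in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = u  # special characters are escaped on output
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([
    "http://yoursite/",
    "http://yoursite/products/53546",
    "http://yoursite/products/98421",
    "http://yoursite/products/41003",
])
print(sitemap_xml)
```

Building the document through an XML library rather than string concatenation keeps characters such as `&` in URLs correctly escaped.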
Sitemap Protocol: Optional parts

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://yoursite/</loc>
      <lastmod>2010-06-24</lastmod>
      <changefreq>daily</changefreq>
   </url>
</urlset>
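The optional `<lastmod>` element is what lets a crawler discover what has been updated. As a rough sketch of the consuming side (not how Sindice or Google actually implement it), the hypothetical function `updated_since` below returns the URLs changed on or after a cutoff date. It assumes every `<lastmod>` value starts with a full `YYYY-MM-DD` date, and it treats entries without `<lastmod>` as possibly changed.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Sample input: one entry with <lastmod>, one without (made-up URLs).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://yoursite/</loc>
      <lastmod>2010-06-24</lastmod>
      <changefreq>daily</changefreq>
   </url>
   <url>
      <loc>http://yoursite/products/53546</loc>
   </url>
</urlset>"""


def updated_since(sitemap_xml, cutoff):
    """Return <loc> values whose <lastmod> is on or after `cutoff`.

    Entries with no <lastmod> are included, since nothing says they are unchanged.
    Assumes <lastmod> begins with a full YYYY-MM-DD date.
    """
    root = ET.fromstring(sitemap_xml)
    changed = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod is None or date.fromisoformat(lastmod.strip()[:10]) >= cutoff:
            changed.append(loc)
    return changed


print(updated_since(SAMPLE, date(2010, 7, 1)))
```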
Sitemap Protocol: Huge sitemaps
  • Gzip-compress your sitemap
  • Limit per file: 50k URLs or 10MB (the current sitemaps.org protocol allows 50MB uncompressed)
  • Above the limit: split into multiple sitemap files and add a sitemap index file
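The splitting step can be sketched as follows. This is an illustration under stated assumptions, not sitemap4rdf's own code: the file naming scheme (`sitemap1.xml.gz`, ...) and the helper names are invented here, and the sketch only enforces the 50,000-URL limit, not the byte-size limit.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit from the Sitemap protocol


def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def urlset_bytes(urls):
    """Serialize one <urlset> document as UTF-8 bytes."""
    root = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for u in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = u
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_split_sitemaps(urls, base_url, max_urls=MAX_URLS):
    """Return (files, index_bytes).

    `files` maps a hypothetical file name to gzip-compressed sitemap bytes;
    `index_bytes` is a <sitemapindex> document pointing at each file.
    """
    files = {}
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for n, chunk in enumerate(chunked(urls, max_urls), start=1):
        name = f"sitemap{n}.xml.gz"
        files[name] = gzip.compress(urlset_bytes(chunk))
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = base_url + name
    return files, ET.tostring(index, encoding="utf-8", xml_declaration=True)


urls = [f"http://yoursite/products/{i}" for i in range(120_000)]
files, index_xml = build_split_sitemaps(urls, "http://yoursite/")
print(sorted(files))
```

A production version would also check the size of each serialized file and split further if a chunk exceeds the byte limit.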
Sitemap Protocol: Discovery
  • Publish the sitemap file
  • Add a line to http://yoursite/robots.txt:
      Sitemap: http://yoursite/sitemap.xml
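To check that the `Sitemap:` line is discoverable, one can parse robots.txt the way a crawler might. The sketch below is a simplified reading of the file, with a hypothetical helper name; it matches the field name case-insensitively and drops everything after a `#`, which would also truncate a URL containing a fragment.

```python
def sitemap_urls(robots_txt):
    """Return the URLs named by 'Sitemap:' lines in a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()      # drop comments
        field, sep, value = line.partition(":")   # split at the first colon only
        if sep and field.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


robots = """User-agent: *
Disallow: /private/
Sitemap: http://yoursite/sitemap.xml
"""
print(sitemap_urls(robots))
```

Python's standard library offers a similar lookup through `urllib.robotparser.RobotFileParser.site_maps()` (Python 3.8+).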
sitemap4rdf Generate Sitemap files from a SPARQL endpoint
sitemap4rdf
  • Simple command line tool
  • Sends a SPARQL query to list all URIs
  • Generates sitemap

  sitemap4rdf http://yoursite/sparql http://yoursite/resource/
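The slides do not show the query sitemap4rdf sends, so the following is only a guess at what "a SPARQL query to list all URIs" could look like. It assumes a SPARQL 1.1 endpoint (for `STRSTARTS`), selects subjects under the given URI prefix, and pages through results with `LIMIT`/`OFFSET`. The function names and the page size are hypothetical; the request URL follows the standard SPARQL protocol `query` parameter.

```python
from urllib.parse import urlencode


def list_uris_query(uri_prefix, limit=10000, offset=0):
    """Build a SPARQL query listing subject URIs that start with `uri_prefix`.

    Illustrative only: not the query sitemap4rdf itself uses.
    """
    # Escape backslashes and double quotes so the prefix is a valid string literal.
    escaped = uri_prefix.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT DISTINCT ?s WHERE {\n"
        "  ?s ?p ?o .\n"
        f'  FILTER(isIRI(?s) && STRSTARTS(STR(?s), "{escaped}"))\n'
        "}\n"
        "ORDER BY ?s\n"
        f"LIMIT {int(limit)} OFFSET {int(offset)}"
    )


def request_url(endpoint, query):
    """Return the GET URL for a SPARQL protocol query request."""
    return endpoint + "?" + urlencode({"query": query})


query = list_uris_query("http://yoursite/resource/")
print(query)
print(request_url("http://yoursite/sparql", query))
```

Each URI returned would become one `<loc>` entry in the generated sitemap.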
Submit the sitemap location - Sindice http://sindice.com/main/submit
Submit the sitemap location - Google https://www.google.com/webmasters/tools/
Summary
  • Sitemap protocol informs search engines about available pages
  • Supported by Sindice!
  • sitemap4rdf generates Sitemap files by listing URIs in a SPARQL endpoint
  • Open source, Java
  • http://lab.linkeddata.deri.ie/2010/sitemap4rdf/
