SEMANTIC WEB
WITH JAHIA
February 2014

www.sigma.fr
SUMMARY

• WHY ?
• Background
• Web 2.0 is not enough
• WHAT ?
• Definitions
• It’s real
• HOW ?
• JAHIA fits
• Integration

WHY ?

• Background
• Web 2.0 is not enough

Background: who are we?
Thomas Delerm (SIGMA) and Adrien Di Mascio (Logilab) will explain why semantic
web technologies matter in modern web applications and how they help you make the best use of your data.
They will share the recipes that make Jahia an appropriate CMS for the semantic
and linked data web, a.k.a. "Web 3.0".



Adrien DI MASCIO - Semantic Web Director
Company : Logilab



Thomas DELERM - Web Architect
Company : SIGMA
Previously worked in mobile and IPTV content startups

How the web evolved


Web « 1 » was about documents and links

Web « 2.0 » is about social interaction and users

https://web.archive.org/web/19991116151216/http://www4.yahoo.com/

Failures of Web 2.0


All databases and APIs sit in silos → searches are limited



Results are documents, not objects



Are my results up to date and reliable ?

Example: Renault. A customer configuring a car faces more than 10^20 possible combinations [1]

[1] http://www.semweb.pro/talk/2474
Failures of Web 2.0


Web 2.0 is far from perfect:

User tags
– Different spellings for the same thing
– Different meanings for the same spelling (e.g. "Hollande")
– No relationships between tags

You cannot answer, in a single request, a complex query such as “List
10 products on my website whose producer is Samsung and whose price is under $50”
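To make the last point concrete, here is a minimal sketch of how that query becomes a one-liner once products are typed objects rather than free-text documents. The `Product` class, the catalogue contents and the `cheap_products` helper are all hypothetical; a real site would more likely send a SPARQL query to a triple store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Hypothetical structured description of a product."""
    name: str
    producer: str
    price_usd: float


# Illustrative data only.
CATALOGUE = [
    Product("Phone case", "Samsung", 19.0),
    Product("Tablet", "Samsung", 320.0),
    Product("USB cable", "Acme", 5.0),
    Product("Charger", "Samsung", 49.5),
]


def cheap_products(catalogue, producer, max_price, limit=10):
    """Return up to `limit` products by `producer` priced strictly under `max_price`."""
    matches = [p for p in catalogue
               if p.producer == producer and p.price_usd < max_price]
    # Cheapest first, so the cut-off at `limit` is deterministic.
    return sorted(matches, key=lambda p: p.price_usd)[:limit]


for product in cheap_products(CATALOGUE, "Samsung", 50):
    print(product.name, product.price_usd)
```

The filter works only because `producer` and `price_usd` are explicit, machine-readable properties, which is exactly what free-form user tags cannot provide.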

We have a solution


There is always a technical evolution
– From PC to Web : WWW and links

– From Web to Web 2.0 : AJAX (dynamic web sites)

– From Web 2.0 to Web 3.0 : Semantic properties and Linked data

So let’s learn what the semantic web is !
WHAT ?

• Definitions
• It’s real

Semantic Web – (Anti)definitions

Today, the Semantic Web is not:
– Magic
– Natural language processing
– Automatic image processing
– A new protocol
It is a worldwide network of data built upon a set of interoperable standards that
use URLs to identify data and link them together.

No Natural Language Processing
A human reads:

<h1>Semantic Web</h1>
<p>Semantic Web is a worldwide network of data invented by <a
href="http://w3.org/People/Berners-Lee">Tim Berners Lee</a> in
1994.</p>

A machine reads:

<h1>????????????</h1>
<p>???????????????????????????????????????????????????????<a
href="http://w3.org/People/Berners-Lee">???????????????</a> ????????</p>

If only ...
… The machine could read:



SemanticWeb is_a network



SemanticWeb was_created_by TimBernersLee



SemanticWeb was_created_in 1994
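A minimal sketch of what "the machine could read" means in practice: the three statements above stored as (subject, predicate, object) tuples, with a small pattern matcher in which `None` acts as a wildcard. The `match` helper and the in-memory list are illustrative assumptions, not part of any RDF library.

```python
# The three statements from the slide, as plain tuples.
GRAPH = [
    ("SemanticWeb", "is_a", "network"),
    ("SemanticWeb", "was_created_by", "TimBernersLee"),
    ("SemanticWeb", "was_created_in", "1994"),
]


def match(graph, s=None, p=None, o=None):
    """Yield every triple that fits the pattern; None matches anything."""
    for triple in graph:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


# "Who created the Semantic Web?" becomes a pattern with an unknown object.
creator = next(match(GRAPH, s="SemanticWeb", p="was_created_by"))[2]
print(creator)
```

Answering the question needs no text analysis at all: the machine only compares identifiers.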

Annotate your document
Use RDFa or schema.org (the example below uses microdata attributes)

<p itemscope itemtype="Concept">
<span itemprop="name">Semantic Web</span> is a
<span itemprop="description">worldwide network of data</span>
invented by
<a itemprop="creator" href="http://w3.org/People/Berners-Lee">
Tim Berners Lee</a>
in <span itemprop="creation_date">1994</span>.</p>
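As a rough illustration of what a consumer gains from such annotations, the sketch below reads a well-formed version of the annotated paragraph above with Python's standard `html.parser` and collects the `itemprop` values. The `ItempropExtractor` class is hypothetical and far simpler than a real microdata parser: it ignores nesting and `itemscope` boundaries.

```python
from html.parser import HTMLParser

ANNOTATED = (
    '<p itemscope itemtype="Concept">'
    '<span itemprop="name">Semantic Web</span> is a '
    '<span itemprop="description">worldwide network of data</span> invented by '
    '<a itemprop="creator" href="http://w3.org/People/Berners-Lee">Tim Berners Lee</a> '
    'in <span itemprop="creation_date">1994</span>.</p>'
)


class ItempropExtractor(HTMLParser):
    """Collect itemprop values: the href for links, the text content otherwise."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("itemprop")
        if prop is None:
            return
        if tag == "a" and "href" in attributes:
            # For links, the property value is the target URL.
            self.properties[prop] = attributes["href"]
        else:
            self._current = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


extractor = ItempropExtractor()
extractor.feed(ANNOTATED)
extractor.close()
print(extractor.properties)
```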

Publish another representation
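Publish RDF and use HTTP content negotiation
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .  # assumed namespace for the dc: prefix
<http://mysite.com/SemanticWeb>
    a <http://www.w3.org/2004/02/skos/core#Concept> ;
    skos:closeMatch <http://data.bnf.fr/ark:/12148/cb119328992> ;
    dc:creator <http://w3.org/People/Berners-Lee/> ;
    dc:date "1994" .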

More familiar with JSON ? Take a look at JSON-LD
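A minimal sketch of both ideas, under stated assumptions: the same resource expressed as a JSON-LD document, and a deliberately simplified content negotiation routine that picks a representation from the `Accept` header. The `dc:` namespace, the list of available media types and the `negotiate` and `render` helpers are assumptions for illustration; q-values are ignored, which a production server should not do.

```python
import json

RESOURCE = "http://mysite.com/SemanticWeb"

# JSON-LD counterpart of the RDF description above.
JSON_LD = {
    "@context": {
        "skos": "http://www.w3.org/2004/02/skos/core#",
        "dc": "http://purl.org/dc/elements/1.1/",  # assumed namespace for dc:
    },
    "@id": RESOURCE,
    "@type": "skos:Concept",
    "skos:closeMatch": {"@id": "http://data.bnf.fr/ark:/12148/cb119328992"},
    "dc:creator": {"@id": "http://w3.org/People/Berners-Lee/"},
    "dc:date": "1994",
}

# Representations this hypothetical server can produce; the first is the default.
AVAILABLE = ["text/html", "text/turtle", "application/ld+json"]


def negotiate(accept_header, available=AVAILABLE):
    """Return the first media type in Accept that the server offers, else None.

    Simplified: q-values are ignored and only the */* wildcard is understood.
    """
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type == "*/*":
            return available[0]
        if media_type in available:
            return media_type
    return None


def render(accept_header):
    """Return (media type, body); only the JSON-LD renderer is sketched here."""
    chosen = negotiate(accept_header)
    if chosen == "application/ld+json":
        return chosen, json.dumps(JSON_LD, indent=2)
    return chosen, None


media_type, body = render("application/ld+json, text/html;q=0.5")
print(media_type)
print(body)
```

The same URL thus serves a browser (HTML) and a data consumer (Turtle or JSON-LD) without either needing to know about the other.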

Vocabularies, ontologies



An ontology is a structured set of terms and concepts.



Each term and concept is also identified by a URL

 There are quite a few standard ontologies for various domains
(social interactions, libraries, music, events, etc.)

Make it happen now !



RDF is nice



Some database engines store RDF graphs
- You can query them with the SPARQL language



Standardized by W3C



You don't necessarily need to change your technology stack



If your data is structured, publishing RDF is easy
- Choosing an ontology or a vocabulary can be hard
- Making your relational database answer SPARQL queries is hard
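To illustrate why publishing RDF from structured data is the easy part, here is a minimal sketch that turns relational-style rows into N-Triples lines. The `example.org` URIs, the column names and the `rows_to_ntriples` helper are hypothetical, and every value is emitted as a plain string literal, which glosses over the hard part named above: choosing a proper vocabulary and datatypes.

```python
def escape_literal(value):
    """Escape a value for use inside an N-Triples string literal."""
    text = str(value)
    for raw, escaped in (("\\", "\\\\"), ('"', '\\"'), ("\n", "\\n"), ("\r", "\\r")):
        text = text.replace(raw, escaped)
    return text


def rows_to_ntriples(rows, base_uri, vocab_uri, key="id"):
    """Turn rows (dicts) into N-Triples lines.

    Each row becomes a subject URI; each non-key, non-null column becomes
    a predicate with a plain string literal as its object.
    """
    lines = []
    for row in rows:
        subject = f"<{base_uri}{row[key]}>"
        for column, value in row.items():
            if column == key or value is None:
                continue
            lines.append(
                f'{subject} <{vocab_uri}{column}> "{escape_literal(value)}" .'
            )
    return lines


# Illustrative rows, as they might come out of a SQL query.
ROWS = [
    {"id": 1, "title": "Semantic Web", "year": 1994},
    {"id": 2, "title": 'The "linked data" web', "year": None},
]

for line in rows_to_ntriples(ROWS, "http://example.org/doc/", "http://example.org/vocab#"):
    print(line)
```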

It's all about data
Publishing structured data:

Helps search engines
– Better indexing
– Better page ranking

Eases external data integration
– Importing a CSV file requires a preliminary agreement on its structure
– Maintaining data is expensive: reuse published data (DBpedia, Freebase, GeoNames)

Examples
GoodRelations annotations

Schema.org annotations

HOW ?

• Jahia fits
• Integration

Client case : Bpi


One goal: use state-of-the-art Semantic Web technologies, a natural fit since Bpi is a library
(Bibliothèque publique d’information)



3 main needs:
– Enter data easily, both on content items and within content
– Store data in a safe, RDF-friendly manner
– Output data
• On every page for SEO (RDFa)
• In searches
• In exports (RDF)



Good news : Jahia fits !

The choice of Jahia


Input:
- Jahia lets you define clear content definitions (CND files) with
inheritance
- Jahia is content-centric



Enrichment within content: CKEditor



Enrichment on content: contribution or edit (GWT) modes

The choice of Jahia : storage and output
Storage: you need a framework that can abstract different sources of data:
enter JCR
– A single repository for all content
– External data sources are abstracted: LDAP, files, other databases…
Output:
– Graph structure + XML format → well suited to metadata
– JSP views can easily be tailored to special export formats

Input : CKEditor and categories


Make sure text data is stored as plain HTML
- Properties file to map schema.org → HTML code
- In-content schema.org properties → a dedicated CKEditor plugin was created



Three kinds of content categorization
– Categories (closed list)
– Tags (open list)
– Authorities (closed list, linked with BnF)



Next steps
– Is a triple store needed?
– Categorization through automatic crawling?

Content structure


Directories per category



The semantic mapping is transparent: no additional field to fill in



Properties files map each field to its semantic exports (Dublin Core, FOAF…)

Challenges met
– Where to store the metadata of a file → extend jnt:file
– How to create a sub-content while creating its parent → edit the Spring GWT XML
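The field-to-vocabulary mapping described above could look roughly like the sketch below. The file contents, the field names and the `load_mapping` and `export_statements` helpers are all hypothetical: the slides do not show the actual properties files, so this only illustrates the principle that contributors fill in ordinary fields and the semantic terms are derived automatically.

```python
# Hypothetical mapping file: content field = exported term(s).
MAPPING_PROPERTIES = """
# field = vocabulary term(s), comma-separated
title = dc:title
description = dc:description
author = dc:creator, foaf:name
"""


def load_mapping(text):
    """Parse simple 'key = value1, value2' lines, skipping blanks and comments."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        field, _, terms = line.partition("=")
        mapping[field.strip()] = [t.strip() for t in terms.split(",") if t.strip()]
    return mapping


def export_statements(content, mapping):
    """Return (term, value) pairs for every mapped field; unmapped fields are skipped."""
    return [
        (term, value)
        for field, value in content.items()
        for term in mapping.get(field, [])
    ]


mapping = load_mapping(MAPPING_PROPERTIES)
content = {
    "title": "Semantic Web",
    "author": "Tim Berners-Lee",
    "internal_note": "not exported",
}
for term, value in export_statements(content, mapping):
    print(term, "=", value)
```

Keeping the mapping in a separate file means a new vocabulary can be added without touching the content definitions or asking contributors for extra input.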

Vocabularies used

Page                                | Schema.org | OpenGraph | Dublin Core | FOAF
Lists                               | No         | No        | No          | No
Details on short and long contents  | Yes        | Yes       | Yes         | Partial
Details: events, IT resource [file] | Yes        | No        | Yes         | No
Authors, Place                      | No         | No        | Yes         | Yes
In HTML                             | Everywhere | Header    | Header      | Everywhere
Format in HTML                      | RDFa       | Meta      | Meta        | RDFa

In RDF: yes for all four vocabularies (one line per meta for two of them, native for one)
Contributed by: automatic (mapping) for three vocabularies, automatic + manual (Bpi) for one
Output


We chose RDFa because, for now, it is more widely used than microdata



Debate: should enrichment be done manually? Automatically? Through a
mix of both?



The field → dc:xxx mapping will be used to improve search results



“ARK” URIs are used to exchange objects between repositories (internal,
Jahia, external ones such as BnF)
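A minimal sketch of what RDFa output can look like once (term, value) pairs are available for a piece of content. The `rdfa_fragment` helper, the prefix table and the `example.org` URI are assumptions for illustration; the actual Jahia JSP views are not shown in the slides. The `prefix`, `about` and `property` attributes are standard RDFa 1.1.

```python
from html import escape

# Assumed namespace URIs for the two prefixes used below.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def rdfa_fragment(resource_uri, statements):
    """Render (term, value) pairs as an HTML fragment carrying RDFa attributes.

    `about` names the subject, `prefix` declares the vocabularies and each
    `property` attribute turns the visible text into a statement.
    """
    prefix_attr = " ".join(f"{name}: {uri}" for name, uri in PREFIXES.items())
    spans = "".join(
        f'<span property="{escape(term)}">{escape(value)}</span>'
        for term, value in statements
    )
    return (
        f'<div prefix="{escape(prefix_attr)}" about="{escape(resource_uri)}">'
        f"{spans}</div>"
    )


fragment = rdfa_fragment(
    "http://example.org/doc/1",
    [("dc:title", "Web & data"), ("dc:creator", "Tim Berners-Lee")],
)
print(fragment)
```

Because the statements live in the same markup that users see, the page and its metadata cannot drift apart.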

Future




Free your data!
Bring it together
Share it between applications and
externally



This forces you to organize your IT
differently

Future : Facebook


Facebook is gradually promoting
posts that contain Open Graph data [1]



« Facebook testing more uses for
Open Graph » [2]

[1] http://newsroom.fb.com/News/787/News-Feed-FYI-What-Happens-When-You-See-More-Updates-from-Friends (January 21, 2014)
[2] http://allfacebook.com/add-to-my-movies-link_b128387

Future : Web 3.0

Conclusion


“If you’re not paying for it, you are the product” [1]



The Semantic Web is going to be imposed by the internet giants, because they need it
to know you better



Take the first step and enrich your data: don’t miss the train!



Jahia 7 is on board:
– External data provider
– High-quality, extensible editor

[1] http://blogs.law.harvard.edu/futureoftheinternet/2012/03/21/meme-patrol-when-something-online-is-free-youre-not-the-customer-youre-the-product/

Questions & Answers



Webography:
New W3C Blog on Semantic Web & linked data : http://www.w3.org/blog/data/
http://fr.slideshare.net/AntidotNet/time2-market-lyon-13nov2013-slideshare#
http://fr.slideshare.net/terraces/technologies-du-web-smantique-pour-lentreprise-20
http://fr.slideshare.net/AntidotNet/web-smantique-web-de-donnes-web-30-linked-dataquelques-repres-pour-sy-retrouver

Web Semantics with Jahia

  • 2. SUMMARY • WHY ? • Background • Web 2.0 is not enough • WHAT ? • Definitions • It’s real • HOW ? • JAHIA fits • Integration www.sigma.fr
  • 3. WHY ? • Background • Web 2.0 is not enough www.sigma.fr
  • 4. Background : who we are ? Thomas Delerm and Adrien Di Mascio from Logilab will explain the interest of web semantics in modern web applications for the best use of your data. They’ll give the recipes that make Jahia an appropriate CMS for the semantic and linked data web, a.k.a. "web 3.0"  Adrien DI MASCIO - Semantic Web Director Company : Logilab  Thomas DELERM - Web Architect Company : SIGMA Worked in cell and IPTV content startups www.sigma.fr
  • 5. How the web evolved  Web « 1 » was about documents and links  Web « 2.0 » is about social and users https://web.archive.org/web/19991116151216/http://www4.yahoo.com/ www.sigma.fr
  • 6. WHY ? • Background • Web 2.0 is not enough www.sigma.fr
  • 7. Failures of Web 2.0  All the databases and APIs are in “silo”  searches are limited  Results are documents, not objects  Are my results up to date and reliable ? Example : Renault : Too many combinations when you want to buy a car : more than 10^20 [1] [1] http://www.semweb.pro/talk/2474 www.sigma.fr
  • 8. Failures of Web 2.0  Web 2.0 is far from perfect :  User tag – Different orthography – Different meanings for the same orthography (Hollande) – No relationships between tags  You cannot (in one request) answer complex queries like “List on my website 10 products whose producer is Samsung and price under $50” www.sigma.fr
  • 9. We have a solution  There is always a technical evolution – From PC to Web : WWW and links – From Web to Web 2.0 : AJAX (dynamic web sites) – From Web 2.0 to Web 3.0 : Semantic properties and Linked data So let’s learn what the semantic web is ! www.sigma.fr
  • 10. WHAT ? • Definitions • It’s real www.sigma.fr
  • 11. Semantic Web – (Anti)definitions Today, Semantic Web is not: Magic Natural Language Processing Image Automatic Processing A new protocol It's a worldwide network of data built upon a set of interoperable standards that use URLs to identify data and link them together. www.sigma.fr
  • 12. No Natural Language Processing A human reads: <h1>Semantic Web</h1>  <p>Semantic Web is worldwide network of data invented by <a href="http://w3.org/People/Berners-Lee">Tim Berners Lee</a> in 1994.</p> A machine reads: <h1> ????????????</h1>  <p> ?????????????????????????????????????????????????? ?????<a href="http://w3.org/People/BernersLee"> ???????????????</a> ????????</p> www.sigma.fr
  • 13. If only ... … The machine could read:  SemanticWeb is_a network  SemanticWeb was_created_by TimBernersLee  SemanticWeb was_created_in 1994 www.sigma.fr
  • 14. Annotate your document Use rdfa or schema.org <p itemtype="Concept"> <span itemprop="name">Semantic Web</span> is <span itemprop="description">worldwide network of data</span> invented by <a itemprop="creator" href="http://w3.org/People/Berners-Lee"> Tim Berners Lee</a> in <span="creation_date">1994</span>.</p> www.sigma.fr
  • 15. Publish another representation Publish RDF and use HTTP content-negotiation <http://mysite.com/SemanticWeb> a <http://www.w3.org/2004/02/skos/core#Concept>; skos:closeMatch <http://data.bnf.fr/ark:/12148/cb119328992> ; dc:creator <http://w3.org/People/Berners-Lee/> ; dc:date "1994". More familiar with JSON ? Take a look at JSON-LD www.sigma.fr
  • 16. Vocabularies, ontologies  An ontology is a structured set of terms and concepts.  Each term and concept is also identified by a URL  There are quite a few standard ontologies for various domains (social interactions, libraries, music, events, etc.) www.sigma.fr
  • 17. Make it happen now !  RDF is nice  Some database engines store RDF graphs - You can query them with the SPARQL language  Standardized by W3C  You don't necessarily need to change your technology stack  If your data is structured, publishing RDF is easy - Choosing an ontology or a vocabulary can be hard - Make your relational database answer a SPARQL query is hard www.sigma.fr
  • 18. WHAT ? • Definitions • It’s real www.sigma.fr
  • 19. It's all about data Publishing structured data: Helps search engines Better indexation Better page rank Eases external data integration Importing a CSV file requires a preliminary agreement on its structure Maintaining data is expensive, reuse published data (dbpedia, freebase, geonames) www.sigma.fr
  • 21. HOW ? • Jahia fits • Integration www.sigma.fr
  • 22. Client case : Bpi  One goal : use state-of-the art Semantic Web since they are a library (Bibliothèque Publique d’information)  3 main needs: – Input data easily for contents and within contents – Store data in a safe, RDF-friendly manner – Output data • On every page for SEO (RDFa) • In searches • In exports (RDF)  Good news : Jahia fits ! www.sigma.fr
  • 23. The choice of Jahia ➢ Input : - Jahia lets you define clear content definitions (CND files) with inheritance. - Jahia is content-centric ➢ Enrich within contents : CKEditor ➢ On contents : contribute or edit (GWT) modes
  • 24. The choice of Jahia : storage and output Storage : you need a framework that can abstract different sources of data : enter JCR – Unique repository for all content – External data sources are abstracted : LDAP, files, other DBs… Output: – Graph structure + XML format → fit for metadata – JSP views can be easily tailored for special export formats
  • 25. HOW ? • Jahia fits • Integration
  • 26. Input : CKEditor and categories ➢ Make sure text data is stored as plain HTML - Properties file to map schema.org ↔ HTML code - In-content schema.org properties → Created a CKEditor plugin ➢ Triple categorization of contents – Categories (closed list) – Tags (open) – Authorities (closed, linked with BnF) ➢ Next steps – Need for a triple store ? – Categorization through automatic spider browsing ?
  • 27. Content structure ➢ Directories per category ➢ The semantic mapping is transparent : no additional field to fill in ➢ Properties files map each field to its semantic exports (Dublin Core, FOAF, etc.) ➢ Challenges met – Where to store the metadata of a file → extend jnt:file – How to create a sub-content while creating its parent → edit the Spring GWT XML
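The properties-file mapping can be pictured with a short Python sketch. The deck does not show the actual Bpi properties files, so the field names (`title`, `author`, `created`), the mapping entries and the function names below are hypothetical; only the idea of a field-to-vocabulary lookup comes from the slide. The `DC.title` style of `<meta>` name is the usual convention for Dublin Core in HTML headers.

```python
from html import escape

# Hypothetical properties file: content field = exported Dublin Core term.
PROPERTIES = """
# field = exported term
title = DC.title
author = DC.creator
created = DC.date
"""


def load_mapping(text):
    """Parse 'key = value' lines, skipping blank lines and # comments."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        field, term = line.split("=", 1)
        mapping[field.strip()] = term.strip()
    return mapping


def to_meta_tags(content, mapping):
    """Render one <meta> tag per mapped field that the content actually has."""
    tags = []
    for field, term in mapping.items():
        value = content.get(field)
        if value is None:
            continue
        # Escape the value so quotes or angle brackets cannot break the attribute.
        tags.append('<meta name="%s" content="%s">' % (term, escape(str(value), quote=True)))
    return tags


if __name__ == "__main__":
    page = {"title": "Semantic Web with Jahia", "author": "Bpi"}
    for tag in to_meta_tags(page, load_mapping(PROPERTIES)):
        print(tag)
```

Keeping the mapping in a file rather than in the content type is what makes it transparent to contributors: they fill in the usual fields and the export layer decides which vocabulary terms to emit.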
  • 28. Vocabularies used Page Schema.org OpenGraph Dublin Core FOAF Lists Details on short and long contents No Yes No Yes No Yes No Partial Details : events, IT resource [file] Yes No Yes No Authors Place No No Yes Yes In HTML Everywhere Header Header Everywhere Format in HTML RDFa Meta Meta RDFa In RDF Yes Yes, one line per meta Automatic (mapping) Yes, native Contributed By Yes, one line per meta Automatic + Automatic Manual Bpi (mapping) Automatic (mapping)
  • 29. Output ➢ We chose RDFa because it is more widely used for now (than microdata) ➢ Debate : should enrichment be done manually ? Automatically ? Through a mix of both ? ➢ The field → dc:xxx mapping will be used to improve search results ➢ “ARK” URIs are used to exchange objects between repositories (internal, Jahia, external like BnF)
  • 30. Future ➢ Free your data ! ➢ Put it together ➢ Share it between applications and externally → Forces you to organize your IT differently
  • 31. Future : Facebook ➢ Facebook is gradually promoting the posts that contain Open Graph data [1] ➢ « Facebook testing more uses for Open Graph » [2] [1] http://newsroom.fb.com/News/787/News-Feed-FYI-WhatHappens-When-You-See-More-Updates-fromFriends (January 21, 2014) [2] http://allfacebook.com/add-to-my-movies-link_b128387
  • 32. Future : Web 3.0
  • 33. Conclusion ➢ “If you’re not paying for it, you are the product” [1] ➢ Semantic Web is going to be imposed by internet giants because they need it to know you better ➢ Take the first step to enrich your data, don’t miss the train ! ➢ Jahia 7 catches it : – External data provider – Quality, extensible editor [1] http://blogs.law.harvard.edu/futureoftheinternet/2012/03/21/meme-patrol-when-something-online-is-free-youre-not-the-customer-youre-the-product/
  • 34. Questions & Answers ➢ Webography: New W3C Blog on Semantic Web & linked data : http://www.w3.org/blog/data/ http://fr.slideshare.net/AntidotNet/time2-market-lyon-13nov2013-slideshare# http://fr.slideshare.net/terraces/technologies-du-web-smantique-pour-lentreprise-20 http://fr.slideshare.net/AntidotNet/web-smantique-web-de-donnes-web-30-linked-dataquelques-repres-pour-sy-retrouver

Editor's Notes

  1. 19 July 2013, Google: Knowledge Graph expansion. More than a quarter of all searches started showing some kind of Knowledge Graph result after this date. 20 August 2013: Google Hummingbird focuses on conversational and semantic search to try to deliver correct answers to broad-meaning questions.
  2. We deliberately chose not to output semantics on list pages.