Sd sem weboct252010
1. Leveraging the growth of the Semantic Web - from Semantic SEO to .....
San Diego Semantic Web Meetup
Oct 25, 2010
Barbara Starr
Email: bstarr@Ontologica.us
Twitter: @BarbaraStarr
2. So … Let us begin to take a look at how the Semantic Web is being used and leveraged in the real world of late (feel free to add: …..
And of course, who is using it, how, ........
3. Semantic Search/SEO
The major Search Engines & Social Networks are currently leveraging Semantic Web Technology
4. What is Semantic Search
• Semantic Search is basically the notion of improving search
by using metadata or searching on that metadata.
• There are several ways that the Search engines on the web
may use this to enhance search results.
– FIND, rather than SEARCH.
• Searching directly on the metadata can yield specific
answers or results as demonstrated in the following example:
Query
“Barack Obama Birthday”
Results on
10. What is Semantic Search (cont)
• Ran the query “Barack Obama Birthday” on both Google and Bing. Obtained the following:
– Answer engines rather than Search Engines?
11. What is Semantic Search (cont)
• Another aspect of using metadata, such as embedding
metadata or semantic markup in web pages, can be
demonstrated by enhanced displays in search
results (e.g. rich snippets in Google).
Both Google and Yahoo support enhanced displays for RDFa markup.
12. Rich Snippets
• Google now supports Rich snippets for
– People
– Events
– Businesses and organizations
– Reviews
– Recipes
– Products when related to a review
– Breadcrumbs
– Local Search
http://rdf.data-vocabulary.org/#
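To make the idea of embedded markup concrete, here is a minimal sketch of how a page might emit RDFa for a person using the namespace above. The property names (v:name, v:title, v:affiliation) are assumed from the data-vocabulary.org Person type, and the function name and sample values are hypothetical.

```python
from html import escape

# Namespace listed on the slide; the property names used below are an
# assumption based on the data-vocabulary.org Person type.
V_NS = "http://rdf.data-vocabulary.org/#"


def person_rdfa(name: str, title: str, affiliation: str) -> str:
    """Return an HTML fragment describing a person with RDFa attributes."""
    return (
        f'<div xmlns:v="{V_NS}" typeof="v:Person">\n'
        f'  <span property="v:name">{escape(name)}</span>,\n'
        f'  <span property="v:title">{escape(title)}</span> at\n'
        f'  <span property="v:affiliation">{escape(affiliation)}</span>\n'
        f'</div>'
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(person_rdfa("Jane Doe", "Editor", "Example & Co"))
```

Whether a search engine actually shows a rich snippet for such markup is up to the engine; the validators listed later in the deck are the place to check.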
15. Sept 2, 2010
Now see more than twice as many searches with rich snippets in the results in the US, and a four-fold increase globally, compared to one year ago.
18. Social Networks
• While search engines can benefit from access to social networks, social networks can benefit from semantic metadata in web pages
– Example is Facebook’s Open Graph Protocol (also supports RDFa) which allows users to share & like objects (such as products) as opposed to web pages. Enables “Semantic Profiling” of the users by Facebook. (Japanese MIXI now using it)
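As an illustration of describing an object rather than a page, the sketch below renders Open Graph properties as meta tags for a page's head section. The four properties shown (og:title, og:type, og:url, og:image) are the protocol's basic ones; the function name, the product and the URLs are hypothetical placeholders.

```python
from html import escape


def og_meta_tags(properties: dict) -> str:
    """Render Open Graph properties as <meta> tags, one per line."""
    lines = []
    for key, value in properties.items():
        lines.append(
            f'<meta property="og:{escape(key)}" content="{escape(value)}" />'
        )
    return "\n".join(lines)


# Hypothetical product page; every value here is a placeholder.
example = {
    "title": "Example Widget",
    "type": "product",
    "url": "http://www.example.com/widget",
    "image": "http://www.example.com/widget.jpg",
}

if __name__ == "__main__":
    print(og_meta_tags(example))
```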
19. Web Benefits / Uses
• Yahoo stated 15% increase in CTR as a result of
enhanced displays, rich snippets in Google
• Definitive answers enabled by understanding and
leveraging how search engines are searching
directly on metadata
• Semantic Profiling and adoption by social networks
• Embedding semantic markup in web pages and
product pages ultimately makes information “findable”
by search engines, enabling them to provide
improvements such as definitive answers, enhanced
displays, etc
21. Consuming RDFa
• Previously indicated increase of RDFa in general and production of RDFa
• Available consumers/parsers
– Sindice (any23)
– RDFa distiller
Sindice.com
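To show roughly what a consumer does with such markup, here is a deliberately simplified sketch built on Python's standard html.parser. It is not a conformant RDFa parser and is not how any23 or the RDFa distiller work: it ignores subjects, prefixes and nesting, and only collects property/value pairs. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, value) pairs from elements with a `property` attribute.

    Simplification: the value is the `content` attribute if present,
    otherwise the first non-empty text inside the element.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name still waiting for its text

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        prop = attr.get("property")
        if prop is None:
            return
        if "content" in attr:
            self.pairs.append((prop, attr["content"]))
        else:
            self._pending = prop

    def handle_data(self, data):
        if self._pending is not None and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None

    def handle_endtag(self, tag):
        # Drop a pending property if its element closed without any text.
        self._pending = None


def extract_properties(html: str):
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Hypothetical markup for illustration.
sample = (
    '<meta property="og:title" content="Example Widget" />'
    '<div typeof="v:Person"><span property="v:name">Jane Doe</span></div>'
)

if __name__ == "__main__":
    for prop, value in extract_properties(sample):
        print(prop, "=", value)
```

For real pages, a full RDFa processor such as the ones listed above is the safer choice.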
22. Handy Validators
• RDFA VALIDATORS AND TESTERS
• New RDFa Validator: http://check.rdfa.info/
• Sindice Inspector: http://inspector.sindice.com/
• Yahoo Objectfinder: http://developer.search.yahoo.com/help/objectfinder
• Google rich snippets tester: http://www.google.com/webmasters/tools/richsnippets
23. Adopters?
• UK Government
• US Government
• BBC (FIFA World Cup site dynamically generated using linked data)
• Thomson Reuters
• Freebase
• NY Times
• Best Buy
• Google (More to follow http://rdf.data-vocabulary.org/#)
• Yahoo
• Facebook
• Mixi
• Oracle
• Overstock
• Drug research and discovery companies, Pfizer, ….
• Tons more – Just look at the diversity in the LOD data cloud (getting there)
24. Spectrum of Applications
• Semantic Wikis (Semantic MediaWiki)
• Semantics as a Service (e.g. SIRI) – interoperability of web services, underlying service Ontologies
• Enterprise data integration (Anzo,
• Semantics in publishing
– Open Calais now has OpenPublish
– Zemanta, primal pages
– Drupal and other CMS systems
• Contextual Advertising
• Sentiment Analysis (COGITO)
• Semantic Search (documents & structured data sources)
• Semantic Social Networks
25. LOD Cloud Evolution
The rate of growth has been remarkable
Source maintained by: Richard Cyganiak and Anja Jentzsch.
http://lod-cloud.net
34. March 5 - 2009
[Figure: LOD cloud diagram, as of March 2009]
35. March 27 - 2009
[Figure: LOD cloud diagram, as of March 2009]
37. Sept 22 - 2010
[Figure: LOD cloud diagram, as of September 2010]
38. LOD cloud – Sept 22 2010
[Figure: latest LOD cloud diagram, as of September 2010. Legend: Media, Geographic, Publications, Government, Cross-domain, Life sciences, User-generated content]
39. Leveraging Linked Datasets
Pharmaceutical example
• There are many ways to leverage existing information and to perform knowledge discovery within it.
• This example makes use of the AllegroGraph platform and query interface supported by Franz Inc, a Web 3.0 database provider.
• AllegroGraph can be downloaded from their website at http://www.franz.com
40. Leveraging Linked Datasets
Pharmaceutical example
• Facilitates information sharing between knowledge bases and between researchers
• The graphical viewers and browsers provided by Franz enable visualization of relationships between entities (GRUFF displays relationships between entities as well as providing a query interface)
41. Life Sciences Example - AllegroGraph
• Drugs from Drug Bank
• Looked them up in the text of the clinical trials
LinkedCT
• Looked up all side effects in SIDER and
looked them up in the texts in the clinical trials.
• Resulted in about a million new triples.
• Ability to now search for a drug, find all the
clinical trials that mention it and then also
find all the side effects mentioned in the
same trials.
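The lookup steps above can be sketched as a simple text match that emits one triple per mention. This is only an illustration under stated assumptions: the identifiers and trial text are made-up placeholders, the substring matching is far cruder than whatever was actually used, and the predicate names are borrowed from the SPARQL query shown later in this deck.

```python
def link_mentions(trials, drugs, side_effects):
    """Emit (trial, predicate, term) triples for labels found in trial text.

    trials:       {trial_id: free text}
    drugs:        {drug_id: label}
    side_effects: {effect_id: label}
    Matching is a case-insensitive substring test (an assumption).
    """
    triples = []
    for trial_id, text in trials.items():
        lowered = text.lower()
        for drug_id, label in drugs.items():
            if label.lower() in lowered:
                triples.append((trial_id, "franz:discusses-drug", drug_id))
        for effect_id, label in side_effects.items():
            if label.lower() in lowered:
                triples.append(
                    (trial_id, "franz:discusses-side-effect", effect_id)
                )
    return triples


# Hypothetical toy data; identifiers and text are placeholders, not real records.
trials = {
    "trial:A": "Placeholder text mentioning atorvastatin and type 2 diabetes.",
    "trial:B": "Placeholder text mentioning neither term.",
}
drugs = {"drug:atorvastatin": "Atorvastatin"}
side_effects = {"effect:t2d": "Type 2 Diabetes"}

if __name__ == "__main__":
    for triple in link_mentions(trials, drugs, side_effects):
        print(triple)
```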
43. Life Sciences Example - AllegroGraph
Namely, we took a look at information dealing
with:
- drugs
- targets
- diseases
- side-effects
And ran a query to find all clinical trials for
Atorvastatin where a side effect of Atorvastatin
(or Lipitor) is type 2 diabetes
44. Life Sciences Example - AllegroGraph
SPARQL query:
SELECT ?drug ?sideeffect ?trial WHERE {
?drug rdfs:label 'Atorvastatin' .
?sideeffect rdfs:label 'Type 2 Diabetes' .
?trial franz:discusses-drug ?drug .
?trial franz:discusses-side-effect ?sideeffect .
} limit 10
Translated into English, the SPARQL query reads:
“Find every drug, side effect and clinical trial where the label of the drug is Atorvastatin, the label of the side effect is Type 2 Diabetes, and the trial discusses both; restrict output to 10 results.”
Example by: (Jans Aasman – Franz Inc)
Web 3.0’s database
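For readers new to SPARQL, the four triple patterns in the query amount to a join. The sketch below evaluates the same patterns over a small in-memory list of triples in plain Python. It is not how AllegroGraph executes queries; the function name and the toy identifiers are hypothetical, and only the predicate names come from the query itself.

```python
def find_trials(triples, drug_label, effect_label, limit=10):
    """Return (drug, side_effect, trial) rows matching the query's patterns."""

    def subjects(pred, obj):
        # All subjects s such that (s, pred, obj) is in the data.
        return {s for s, p, o in triples if p == pred and o == obj}

    drugs = subjects("rdfs:label", drug_label)
    effects = subjects("rdfs:label", effect_label)

    rows = []
    for s, p, o in triples:
        # ?trial franz:discusses-drug ?drug
        if p == "franz:discusses-drug" and o in drugs:
            for effect in sorted(effects):
                # ?trial franz:discusses-side-effect ?sideeffect
                if (s, "franz:discusses-side-effect", effect) in triples:
                    rows.append((o, effect, s))
    return rows[:limit]  # mirrors "limit 10"


# Hypothetical toy data; identifiers are placeholders.
toy = [
    ("drug:1", "rdfs:label", "Atorvastatin"),
    ("effect:1", "rdfs:label", "Type 2 Diabetes"),
    ("trial:A", "franz:discusses-drug", "drug:1"),
    ("trial:A", "franz:discusses-side-effect", "effect:1"),
    ("trial:B", "franz:discusses-drug", "drug:1"),
]

if __name__ == "__main__":
    print(find_trials(toy, "Atorvastatin", "Type 2 Diabetes"))
```

In the toy data, trial:B mentions the drug but not the side effect, so only trial:A satisfies all four patterns.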
47. Online Commerce
• BEST BUY and other retailers are using semantic technologies to improve visibility of products and services leveraging:
– GoodRelations Ontology for e-Commerce
– RDFa
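As a rough idea of what combining the two looks like, the sketch below generates RDFa for a single offer using GoodRelations terms. It assumes the class and property names gr:Offering, gr:name, gr:hasPriceSpecification, gr:UnitPriceSpecification, gr:hasCurrency and gr:hasCurrencyValue; the function name, product and price are hypothetical, and this is not taken from any retailer's actual markup.

```python
from html import escape

GR_NS = "http://purl.org/goodrelations/v1#"
XSD_NS = "http://www.w3.org/2001/XMLSchema#"


def offering_rdfa(name: str, price: float, currency: str) -> str:
    """Return an RDFa fragment describing one offer with a unit price."""
    return (
        f'<div xmlns:gr="{GR_NS}" xmlns:xsd="{XSD_NS}" typeof="gr:Offering">\n'
        f'  <span property="gr:name">{escape(name)}</span>\n'
        f'  <div rel="gr:hasPriceSpecification">\n'
        f'    <div typeof="gr:UnitPriceSpecification">\n'
        f'      <span property="gr:hasCurrency" content="{escape(currency)}">'
        f'{escape(currency)}</span>\n'
        f'      <span property="gr:hasCurrencyValue" datatype="xsd:float">'
        f'{price:.2f}</span>\n'
        f'    </div>\n'
        f'  </div>\n'
        f'</div>'
    )


if __name__ == "__main__":
    # Placeholder product and price for illustration only.
    print(offering_rdfa("Example Widget", 19.99, "USD"))
```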
53. Summary
• Significant adoption in many arenas and by many of the “major players”
• Growing number of vendors providing services and tools
• Many open source tools & resources (“RDFizers”, SPARQL endpoints, SINDICE – Semantic Web index)
• Technology mature enough at this point to provide competitive advantage in many arenas.