7. Monday, November 21, 11
Old Silos
We in the library and publishing trades force readers, some of whom are authors as
well, to search iteratively for information they want or need or think might exist, in
many different silos, using many different search engines, forms, and vocabularies. We
do not make it easy for them to discover what is locally available, what is more or less
easy to get, or everything that might be available.
No wonder the young and foolish depend upon and believe in Google’s searches.
Google is quick... and, in terms of relevance, very, very dirty.
8.
We give them better interfaces, ones that permit refinement of results, to our holdings at
the title level, BUT...
9.
Simultaneously, we show them many other tools, each excellent in some ways, to
continue their exploration of the literature. No single tool is comprehensive. We do not
refer our clients to the Web, at least not on our own web sites! Our OPACs refer to our
holdings. Indices and abstracts refer our readers to articles in journals that
we may have licensed. SFX and similar services provide readers with links to titles to
which we have subscribed. Neither our OPACs nor the secondary databases refer directly to
more than a tiny percentage of the vast collection of pages that is the World Wide Web.
The Web, of course, refers in fragmentary fashion to information resources we might, I
emphasize, MIGHT have on hand for our readers.
10.
And the results of using other, often very good, discovery tools differ in relevance
ranking, format, and options from the ones we provide for our OPACs, thus adding
confusion.
11.
Some of us provide our readers with lots of databases to search. Too many, really, since
all but a few of our readers are not forensic-level scholars.
12.
Selecting a licensed database is an art in itself!
Once again, notice that we rarely offer a web search engine as an option, and for good
reasons. Nevertheless, the discoverable relevant information resources on the web
apparently are not part of our repertory.
13. !!!
We have not conspired to make the search for relevant information objects difficult. We
just have not yet had the tools, the methods, the vision, and yes, the gumption to try
something new.
14. ATLAS at LHC -- 150*10^6 sensors
Ntl Cntr for Biotech Info
NSF CyberInfrastructure
quake engineering simulation
Here’s a teensy slice of the information and communication environment in which our
faculty and students find themselves. And it gets more complex every day. Alas, the
larger the number of websites indexed by Bing or Google or whatever search engine du
jour, the more likely it is that the returns will be less pointed and less
precisely matched to what the searcher hoped to find.
17. One size fits all???
Does one size fit all?
18.
Not quite. Even Google has silos and uses, as do others, clever interfaces to hide the
fact of the silos.
19.
Given all these silos and search engines, our users, our authors, and readers, and
teachers, and students, people on the street, our nations... need us to find a better way.
Facts about the information objects we have acquired or leased, facts about books,
articles, films, and so forth that we have published, need to be found in the wild, on the
web. Ideally, we librarians and publishers will make the facts about what we have and
what we are making public, for fun or profit, discoverable on the Web.
20. Discovery & Access
... the problems
Let’s dwell on the problems briefly...
21. 1. Too many stovepipe systems
2. Too little precision with inadequate recall
3. Too far removed from the World Wide Web
26. 1. Too many stovepipe systems
The landscape of discovery & access
services is a shambles
It can’t be mapped in any logical way
• not by us (the supposed information pros)
• not by the faculty & students who must navigate the chaos
This state of affairs shouldn’t be a surprise
30. 2. Too little precision
with inadequate recall
Some of the problem ... too many stovepipe systems
• dumbing-down effects of federation often hinder explicit searches
• each interface has its own search-refinement tricks
• numerous, overlapping discovery paths hamper full recall
Most of the problem ...
limitations in the design & execution of infrastructure
that supports discovery & access
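For readers who want the two measures pinned down: precision is the share of retrieved records that are relevant, and recall is the share of relevant records that were actually retrieved. The following is a minimal Python sketch; the record identifiers and counts are made up for illustration and do not come from any real catalog.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single search.

    retrieved: set of record ids the search returned
    relevant:  set of record ids that actually answer the reader's need
    """
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical case: a federated search returns 8 records, only 2 of them
# relevant, while 5 relevant records exist across all the silos.
retrieved = {"r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"}
relevant = {"r1", "r2", "r9", "r10", "r11"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case the reader wades through six irrelevant records and still misses three relevant ones that sit in other silos.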
36. the 1st limiting factor ... ambiguity
Most of our metadata uses a string of bytes
to label a semantic entity [person, place, thing, event, ...]
• discovery based on matching text labels
• not on the gist of semantic entities
For libraries, the fix is authorities
• authoritative forms of strings (names, organizations, titles,
places, events, topics, etc.) work to improve precision and recall
Hold on ... what about cases where no one-to-one relationship exists
between a string-of-text label & the underlying semantic entity?
Take for example the text string: Jaguar
byte string: 4a 61 67 75 61 72
40. Prrrrr
... a rose is a rose is a rose
company
• Ltd.
cars
• XK series, in production since 1996
• E-Type (UK) or XK-E (US), mftg 1961 to 1974
• etc.
hardware & software
• Atari video game console
• Macintosh OS X 10.2
music
• heavy metal band formed in Bristol, England, Dec 1979
• Fender electric guitar, introduced in 1962
• Philadelphia-based singer/songwriter Jaguar Wright
military
• type 140 Jaguar class fast attack craft [torpedo], Germany WWII
• Anglo-French ground attack aircraft
• XF10F prototype swing-wing fighter, early 1950s, Grumman
heroes
• The Jaguar is a superhero published by Archie Comics
• DC Comics' Impact series, ... loosely based on Archie Comics' character
pro football
• Jacksonville
John Giannandrea, CTO, Metaweb
Imagine this keyword search and realize the ambiguity of the term “jaguar”.
Inspired by John Giannandrea, CTO, Metaweb
... from his presentation at PARC in April, 2008
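The ambiguity can be made concrete with a small Python sketch. It prints the bytes behind the label (the slide's byte string begins with 4a, which is the capital J), then contrasts matching on the text label with matching on an identifier. The "ex:" identifiers and the "kind" values are hypothetical, invented only for this illustration.

```python
label = "Jaguar"

# The label is only a string of bytes.
print(label.encode("ascii").hex(" "))

# Hypothetical identifiers for a few of the distinct entities that share the label.
entities = {
    "ex:jaguar-e-type": {"label": "Jaguar", "kind": "car"},
    "ex:atari-jaguar": {"label": "Jaguar", "kind": "video game console"},
    "ex:mac-os-x-10-2": {"label": "Jaguar", "kind": "operating system"},
    "ex:fender-jaguar": {"label": "Jaguar", "kind": "electric guitar"},
}


def match_by_label(text):
    """String matching: every entity whose label equals the text."""
    return sorted(uri for uri, e in entities.items() if e["label"] == text)


def match_by_entity(uri):
    """Identifier matching: exactly one entity, or None if unknown."""
    return entities.get(uri)


print(match_by_label("Jaguar"))             # several candidates, no basis to choose
print(match_by_entity("ex:fender-jaguar"))  # one unambiguous entity
```

The label search cannot tell the guitar from the car; the identifier search never has to.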
44. the 2nd limiting factor
... instance-based metadata
Most of our metadata focuses
on publication artifacts
• identify responsibility for its creation
• list topical headings
For simple cases ... few worries
• as with ambiguity, one-to-one relationships pose few problems
• things work for authors with a few books in several editions
But, as complexity increases,
precision & recall suffer
47. Prolific authors ... search:
Shakespeare’s Hamlet: 811 entries
A Socrates (Stanford Libraries OPAC) keyword search for the terms shakespeare and
hamlet
Wading through search results for authors
like Shakespeare shows clearly the
effects that instance-based metadata
has on precision & recall
Unflagging patience marks the task of
flipping back & forth between hundreds
of brief and full records to sort through
the varied instances of a single entity, e.g.
• critical editions based on primary sources
• 18th & 19th century collections of the plays
• social, historical and literary essays
• histories & critiques of such writings
• video and audio recordings of performances
• reviews and indices of the same
• treatments of stagecraft, costumes, music
• life & works of notables associated with the
plays (e.g., performers, directors)
• other art forms inspired by the plays
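A rough Python sketch of the alternative: if each instance record also carried an identifier for the underlying work, the hundreds of entries could be clustered by entity and by kind instead of read one by one. The records, the "work" identifiers, and the "type" values below are all hypothetical; instance-based metadata typically lacks exactly this work-level key.

```python
from collections import defaultdict

# Hypothetical instance-level records with an assumed work-level identifier.
records = [
    {"id": "rec1", "title": "Hamlet (critical edition)", "work": "ex:hamlet", "type": "edition"},
    {"id": "rec2", "title": "Hamlet (audio recording)", "work": "ex:hamlet", "type": "performance"},
    {"id": "rec3", "title": "Essays on Hamlet", "work": "ex:hamlet", "type": "criticism"},
    {"id": "rec4", "title": "Macbeth", "work": "ex:macbeth", "type": "edition"},
]


def group_by_work(recs):
    """Cluster instance records first by work, then by kind of instance."""
    grouped = defaultdict(lambda: defaultdict(list))
    for r in recs:
        grouped[r["work"]][r["type"]].append(r["id"])
    return {work: dict(types) for work, types in grouped.items()}


clusters = group_by_work(records)
for kind, ids in clusters["ex:hamlet"].items():
    print(kind, ids)
```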
53. 3. Too far removed from the World Wide Web
Together, our metadata & collections
make up a big chunk of the “dark web”
[ info resources that search-engine spiders can’t see ]
It’s clear that visibility on the web promotes
dramatic increases in discovery and access
• Library of Congress & Smithsonian images (Flickr)
• SULAIR’s HighWire Press ( > 2x increase via Google)
The state of affairs is well known ...
55. [Diagram: the academy; publisher (produce), library (provide), scholars & students]
Here is a schematic to suggest how our ecosystem works. It is more complex, of
course, but the basics are embodied here.
56. Once upon a time…the Internet
[Diagram layer: internet]
And here is the way the e-discovery and e-communication environment is developing.
First there was the Internet. Prophets such as Vannevar Bush, Ted Nelson, and Doug
Engelbart showed us the way.
57. Then…the World Wide Web
[Diagram layers: web of pages, internet]
Thanks to another prophet, Tim Berners-Lee, the Internet, a network of communicating
computers, became a web of pages of information. Scholarly journal publishers and some
librarians realized early on that there were functional advantages to scholarship and to
publishing in the web of pages. Yahoo, Google, and others realized that mining the web of
pages by words on those pages could make the rapidly growing web of pages reveal more
through indexing and cataloging the web. Indexing won out, as we now know, over cataloging.
The next thing is the subject of this talk. It is the web of data. It is the web of
relationships constructed and expressed so that both computers and humans can identify and
understand relationships in that web. The web of data lives with the web of pages and is
carried on the Internet, the global carrier.
58. [Diagram layers: web of data (under construction), web of pages, internet]
This web of data is the next big thing in discovering relevant information objects and the
next big thing in empowering individuals, communities, and industries in making better use
of information that they or others create. What distinguishes this web of data, this linked
data environment, is the principle of identifying entities, virtual & real, by statements
of relationships and descriptions in machine readable form. More about this as we go along.
59. [Diagram layers: web of data (under construction), aka Linked Data; web of pages; internet]
We are calling this next phase the Linked Data phase, because it is entirely dependent upon
statements of relationships and descriptions in machine readable form, but this phase may
be only a precursor to another, more complex and more difficult web world to engineer. The
next phase is the Semantic Web, which in theory allows the machine readable relationships
and descriptions to interoperate to satisfy a person’s requirements, albeit without
constant interaction. In short, in the Semantic Web, the machines will understand meaning
and presumably act on it. Scary, eh?
60. Construction Tools
How do we work to alleviate our problems as information professionals, librarians and
publishers?
64. Recipe for creating the web of data
• identify people, places, things, events,
and other entities embedded in the
knowledge resources that a research
university consumes and produces
• tie those facts together with
named connections
• publish the relationships as
crawl-able links on the web
Build/use apps supporting discovery
via the web of data
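The three steps of the recipe can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production workflow: every example.org URI below, including the relationship names, is made up, and N-Triples is chosen only because it is a simple line-based RDF syntax.

```python
# Step 1: identify entities and give each a URI (hypothetical example.org URIs).
BASE = "http://example.org/"
hamlet = BASE + "work/hamlet"
shakespeare = BASE + "person/shakespeare"
stratford = BASE + "place/stratford-upon-avon"

# Step 2: tie the entities together with named connections (also hypothetical).
created_by = BASE + "rel/createdBy"
born_in = BASE + "rel/bornIn"

triples = [
    (hamlet, created_by, shakespeare),
    (shakespeare, born_in, stratford),
]


# Step 3: publish the relationships in a crawlable form.
# Each N-Triples line reads: <subject> <predicate> <object> .
def to_ntriples(statements):
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in statements)


published = to_ntriples(triples)
print(published)
```

Serving text like this at a public address is what lets a crawler, or a discovery app, pick up the relationships.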
65.
Here is a pile of words representing all the words on the web that most search engines
index constantly. Good search engines today can do a lot with this pile. BUT, the search
engines create the perception of relationships, not based on meaning, but on other factors,
such as the number of links to a site containing the words of interest OR the traffic to a
site.
66. From this pile of words, structure!
The Linked Data approach attempts to structure the pile in anticipation of the need for
discovery. That structure is based on meaning, on relationships. I will make this clearer
in the next slides.
67.
Here’s a graph of a very few relationships to Yo-Yo Ma, the great cellist.
68. Linked Data Web
Here’s a graph of relationships to Haggis, just a fun one I could not resist throwing in.
Meaning is provided by understanding relationships.
69. RDF triples & URIs
• RDF triples = subject – predicate – object
– A way to describe objects or even ideas on the web
– An object or idea might have many RDF triples describing it
– Objects or ideas need not exist on the web!
• URIs = Uniform Resource Identifiers
– Allow machine interaction among Web objects
– Various syntactical schemes & protocols used to construct URIs
– One each for the subject, predicate, and object of an RDF triple (the object may
instead be a literal value)
Geek ingredients to the construction of the Linked Data Web. RDF means Resource Description
Framework; each statement is expressed as a simple sentence, though multiple such
statements might attach to a single entity. In fact, we need multiple RDF statements in
this scheme.
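To make the "simple sentence" idea concrete, here is a small Python sketch in which several statements attach to a single entity. The URIs and relationship names are hypothetical, and quoted strings stand in for literal values; real RDF would normally be handled with a dedicated library and published vocabularies.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical URIs; Yo-Yo Ma is the example used earlier in this talk.
MA = "http://example.org/person/yo-yo-ma"
CELLO = "http://example.org/instrument/cello"

triples = [
    Triple(MA, "http://example.org/rel/name", '"Yo-Yo Ma"'),   # object is a literal
    Triple(MA, "http://example.org/rel/plays", CELLO),         # object is another URI
    Triple(CELLO, "http://example.org/rel/name", '"cello"'),
]


def describe(subject, statements):
    """Collect every statement attached to one entity."""
    return [t for t in statements if t.subject == subject]


for t in describe(MA, triples):
    print(t.subject, t.predicate, t.obj)
```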
71. The Linked Data Principles
1. Use URIs as names of things (people, places, times, objects,
ideas...anything really)
2. Use HTTP URIs so that people can look up
those names
3. When someone looks up a URI, provide useful
RDF information
4. Include RDF statements that link to other
URIs so that they can discover related things
The really great aspect of RDFs is that they can refer to ideas, not just to physical or
virtual entities. Any kind of idea could be treated.
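The look-up-and-follow behavior behind principles 2 through 4 can be sketched in Python without touching the network. A dictionary stands in for the web here; the URIs and the "ex:" relationship names are hypothetical, and a real client would issue HTTP requests and parse the RDF that comes back.

```python
from collections import deque

HAMLET = "http://example.org/work/hamlet"
SHAKESPEARE = "http://example.org/person/shakespeare"
STRATFORD = "http://example.org/place/stratford"

# Stand-in for the web: looking up a URI returns the statements published there.
WEB = {
    HAMLET: [(HAMLET, "ex:createdBy", SHAKESPEARE)],
    SHAKESPEARE: [(SHAKESPEARE, "ex:bornIn", STRATFORD)],
    STRATFORD: [],
}


def look_up(uri):
    """Principles 2 and 3: a URI can be looked up and yields useful statements."""
    return WEB.get(uri, [])


def discover(start):
    """Principle 4: follow the links in those statements to find related things."""
    seen, queue = [start], deque([start])
    while queue:
        for _, _, obj in look_up(queue.popleft()):
            if obj.startswith("http://") and obj not in seen:
                seen.append(obj)
                queue.append(obj)
    return seen


print(discover(HAMLET))
```

Starting from the play alone, the sketch reaches the author and then the birthplace, purely by following published links.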
72. Library Metadata
• Library metadata standards closed
• “Passive” metadata, searchable, but…
• In silos
• Readable, but not actionable
• Search results refinable, but final
These are some of the edges of the problem of library metadata.
73. Library Metadata                     | Semantic Web Metadata
• Library metadata standards closed      | • Open
• “Passive” metadata, searchable, but…   | • Dynamic, contextualized
• In silos                               | • In the wild
• Readable, but not actionable           | • Interactive, responsive
• Search results refinable, but final    | • Leading to other queries & views
And here is the comparison between the library metadata scene now and the one we advocate
for the Linked Data/Semantic Web. Library metadata in the Linked Data Web should be freely
available, constantly updated, often reconciled with RDF triple statements from non-library
sources. Library Linked Data should be entirely open on the web.
74. Make Library bibliographic facts into RDFs & URIs; Release them into the wild.
Make Library Linked Data OPEN.
I should add that accounting for physical objects in our collections, locating them, making
our collections auditable, and managing our collections seems to be possible using Linked
Data too, at least in principle.
76. Publishers & Societies making use of Linked Data
• Aggregate content in their own realms & beyond
• Aggregate information about
– Conferences
– Career building & employment opportunities
– Communities in collaboration
– Commercial & other services supporting research with
specimens, source material, processing, trials
– Productive relationships with others
• Provide actionable, constantly updated links in
support of scholars, teachers, and learners
• Provide compelling services tying users to them
Libraries too can use Linked Data to reveal and advertise compelling services offered to
their clients.
77. Semantic Web adopters
Here are some of the big players in the Linked Data / Semantic Web world. The British
Library has released RDFs/URIs for the entire British National Bibliography. The Library of
Congress has released the same for LCSH & Name Authority Files. LCSH includes links to
AGROVOC, RAMEAU, DNB, GLIN Subject Thesaurus, and the National Agriculture Library's Subject
Index. Every Personal and Corporate entry in LC/NAF links to VIAF, the Virtual International
Authority File based at OCLC. The N Y Times 18 months ago made all 500,000 (and growing) of
its index terms available in the wild as RDFs and URIs.
78.
For publishers and libraries...though we should not neglect services.
79. ...if users can find it in their own context
80. [Diagram: Context, Users, Content]
Users = readers, authors, teachers, students
81. [Diagram: Context, Users, Content]
Publishers must make content VISIBLE
I am using the imperative here, because invisible published content means invisible benefit
to the author and/or the publisher.
82.
Here is a recent PLoS article from PLoS Neglected Tropical Diseases.
83.
And here is the semantically enhanced version of this article, enhancements provided by
David Shotton et al. in the form of links to further information, interactive figures,
re-orderable reference list, citations in context and tag trees. These enhancements took 10
man weeks in 2009! However, with the growing ecology of linked data, much of this could be
accomplished by auto-tagging and algorithmic construction of the basic RDFs & URIs for the
unique article. Microdata submitted by some publishers and their supporting services to
schema.org lead to these exciting possibilities.
84. aggregation
Aggregation counts, but think how much more we would get if we could aggregate from
libraries, publishers, and the wild and weird variety of sources on the web.
86. Disambiguation
RDFs and URIs can operate in many languages and relationships can be expressed across
languages, a potential big benefit to research and collaboration in research.
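One way to picture the cross-language point is a single identifier that carries labels in several languages, so that a search in any of them lands on the same entity. The Python sketch below assumes a made-up example.org URI and a hand-built label table; it is an illustration, not a description of any particular library's data.

```python
GERMANY = "http://example.org/place/germany"  # hypothetical URI

# Labels for the same entity, keyed by language code.
LABELS = {
    GERMANY: {"en": "Germany", "de": "Deutschland", "fr": "Allemagne", "es": "Alemania"},
}


def resolve(text):
    """Return the URIs that have a label, in any language, matching the text."""
    wanted = text.casefold()
    return [
        uri
        for uri, by_lang in LABELS.items()
        if wanted in (label.casefold() for label in by_lang.values())
    ]


# A French-language and a German-language query reach the same entity.
print(resolve("Allemagne") == resolve("deutschland"))
```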
87. Web of Data Progress
88. 2007
FOAF = Friend of a Friend. Hundreds of millions of RDFs/URIs. Fortunately they do not take
much space in memory!
89.
This is the 2011 graph of entities supplying RDFs and URIs. Now the population is in the
hundreds of billions, heading to trillions.
90. 2011
http://inkdroid.org/lod-graph/
92. Linked Open Data Value Proposition
• Linked open data (LOD) puts information where people are looking for it – on
the Web;
• LOD can expand discoverability of our content;
• LOD opens opportunities for creative innovation in digital scholarship and
participation;
• LOD allows for open continuous improvement of data;
• LOD creates a store of machine-actionable data on which improved services can
be built;
• Library linked open data might facilitate the breakdown of the tyranny of domain
silos;
• LOD can provide direct access to data in ways that are not currently possible;
• LOD provides unanticipated benefits that will emerge later as the stores of LOD
expand exponentially.
A product of the Stanford/CLIR Linked Data Workshop June 2011.
25 Participants from the British Library, the Bibliothèque nationale de France, the
Deutsche Nationalbibliothek, the Royal Library of Denmark, Aalto University in Finland, the
Library of Congress, the Bibliotheca Alexandrina, the National Institute of Informatics of
Japan, Google, Seme4, Emory, University of Virginia, University of Michigan, California
Digital Library, Knowledge Motifs, CLIR, and Stanford.
93. Google using Stanford bib facts + web resources
This is a movie of a live interaction with Freebase using bibliographic facts from
Stanford, and linked information resources from the web. It shows in a limited way the
potential for discovery and retrieval in the Linked Data Web.
94. BnF using data only from its catalogs & Gallica
This is another movie of the Linked Data prototype based entirely on bibliographic facts
from the BnF catalogs and digital texts in Gallica. There are no other web resources drawn
into this prototype...yet.
97. Value Proposition for LAMs
We in the cultural heritage and knowledge management institutions are discovering better
ways of publishing, sharing, and using information by linking data and helping others do
the same. Through this work, we have come to value and to promote the following practices:
1. Publishing data on the web for discovery and use, rather than preserving it in dark,
more or less unreachable archives that are often proprietary and profit driven;
2. Continuously improving data and Linked Data, rather than waiting to publish “perfect”
data;
3. Structuring data semantically, rather than preparing flat, unstructured data;
4. Collaborating, rather than working alone;
5. Adopting Web standards, rather than domain specific ones;
6. Using open, commonly understood licenses, rather than closed and/or local licenses.
from the Stanford/CLIR Workshop on Linked Data, June 2011
In each couplet, we emphasize the second half, after “rather than”, admitting that
sometimes the first half of the couplet has to be operative.
98. DARPA Internet
This is where we started 2.5 decades ago.
99. World Wide Web
Thanks to Tim Berners-Lee and many others, we advanced in this environment from the early
1990s until today.
100. SOCIAL WEB
We cannot ignore the social web that exists in the current WWW, but think how much more,
some of it scary, could be done in the Linked Data Web with the behaviors of the Social
Web.
101. Linked Data Web
Just that funny reminder of the fundamental nature of the Linked Data Web: expressing
machine actionable relationships.
102. Semantic Web
And in the next web, the Semantic Web, who knows what may be possible.
103. Ubiquitous computing
To the progression of network types, we need to add a couple of enormously important
environmental factors. Ubiquitous computing is a very important one. Having lots of
computers on the net makes the possibility of an open global linked data web very strong.
104. Mobility
And our ability to communicate by voice (how about that Siri?) and by bits/bytes from
everywhere, is, perhaps, just another aspect of ubiquitous computing.
105. [Diagram: Internet, Social Web, Mobile Web, Linked Web, Ubiquitous Computing]
The black box in the upper right corner is the Semantic Web, a level of sophistication yet
to be achieved. The linked data web is at hand, though. Will Librarians and Publishers
join the development of the Linked Open Data web? I certainly think we should.
107. W3C Library Linked Data Incubator Group
http://www.w3.org/2005/Incubator/lld/
A Bibliographic Framework Initiative General Plan for the Digital Age (October 31, 2011)
http://www.loc.gov/marc/transition/news/framework-103111.html
Linked Data Survey & Workshop, June 2011
http://www.clir.org/pubs/archives/linked-data-survey/