The document discusses new metaphors for data papers and data citations. It notes that metaphors are pervasive in thought and language, and that digital objects like files and folders are based on metaphors. It then gives an overview of the California Digital Library and how its environment and focus have changed from preservation to include curation and support for data producers. Forces like rising journal costs, increased research publication, and declining budgets create structural problems for libraries. The document advocates a practical, incremental approach to the complex problem of data curation, including initiatives like DataONE and the use of data papers and citations.
New Insights on Data Curation: Data Papers and Data Citations
1. New Metaphors: Data Papers and Data Citations
27 February 2012
UC Curation Center, California Digital Library
2. Metaphors we live by
“... metaphor is pervasive in everyday life, not just in language but in thought and action. Our ordinary conceptual system, in terms of which we both think and act, is fundamentally metaphorical in nature.”
From Lakoff and Johnson, Metaphors We Live By, 1980
(thanks to Parsons & Fox, Is Data Publication the Right Metaphor?, 2011)
3. Digital = Metaphorical
Everything is a story on top of sequences of bits
• Fonts, files, folders, formatting, phone calls
• Programs, protocols, data, tweets, even bits
Old metaphors can impede technical change
Disruptive technical change is inevitable
4. Roadmap for today’s talk
• Who we are
• What’s changed
• Forced incrementalism
• Data citation
• Traditional articles
• Data papers
• Closing metaphor
6. California Digital Library – born 1997
University of California stakeholders
• 10 campuses
• 226K students, 134K faculty & staff
• 100s of museums, art galleries, observatories, marine centers, botanical gardens
• 5 medical centers
• 5 law schools
• 3 Dept. of Energy national labs
CDL supports the research lifecycle
• Collections
• Digital Special Collections
• Discovery & Delivery
• Publishing Group
• UC Curation Center (UC3)
7. Our environment circa 2002-2008
Focus on preservation
For memory organizations
Infrastructure: static
Services: hosted
Content: museum & library
Sustainability: ?
8. Our environment since 2008
Focus on preservation and curation (lifecycle)
For memory organizations and now data producers
Infrastructure: static + cloud, VM, bitbucket
Services: hosted + partnered, self-serve
Content: museum & library, data, web crawls
Sustainability: ? cost recovery, pay once
9. The Library Reality
• Journal expenditures rising
• Increase in research publication
• Increase in researchers
• Declining budgets
Journal expenditures are outpacing library budgets
10. The Library Reality
• Journal expenditures rising
• Increase in research publication
• Increase in researchers
• Declining budgets
The growth of active, peer reviewed learned journals since 1665 (Mabe, 2003)
11. The Library Reality
• Journal expenditures rising
• Increase in research publication
• Increase in researchers
• Declining budgets
(Mabe 2004, based on data from ISI and NSF)
13. Trends create a structural problem; calls on libraries to do more with less
14. Trends create a structural problem; climb the mountain step by step ...
16. Practical incrementalism for the complex problem of data curation
• Baby steps – data paper/citation metaphors
• Chipping away – making the problem smaller
• DataONE global data network [NSF]
• Merritt data repository
• EZID for creating DOIs, ARKs, and URNs
• Data management plans (DMPTool)
• Web archiving service (WAS) [Library of Congress]
• Open-source Excel add-in [MS Research & GBMF]
18. The scientific record is at risk
Data dissemination is rare, risky, expensive, labor-intensive, domain-specific, and receives little credit as research output
Global Change
Galactic Change
19. What data citation offers
• Credit
• Discovery
• Impact tracking
– Helping data authors verify use of their data and
– Helping identify how others have used the data
• With archiving: re-use and reproducibility
25. Need to save data + processing
Algorithms + Data Structures = Programs
26. Vision for a “data paper”
• Wrap the unfamiliar in a familiar façade
• A “data paper” is minimally a cover sheet and a set of links to archived artifacts
• Cover sheet contains familiar elements: title, date, authors, abstract, and persistent identifier (DOI, ARK, etc.)
• Just enough to permit basic exposure and discovery
– Building a basic data citation
– Indexing by services such as Web of Science, Google Scholar
– Instilling confidence in the identifier’s stability
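A minimal sketch of the cover-sheet idea, to show how little is needed to build a basic data citation. Everything named here is an assumption made for illustration: the `CoverSheet` class, its field names, the `format_citation` helper, the citation layout, and the placeholder identifier and link are hypothetical and do not come from the talk or from any CDL service.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class CoverSheet:
    """Hypothetical cover sheet for a data paper (field names are assumptions)."""
    title: str
    published: date
    authors: List[str]
    abstract: str
    identifier: str  # persistent identifier such as a DOI or ARK
    artifact_links: List[str] = field(default_factory=list)  # links to archived artifacts


def format_citation(sheet: CoverSheet) -> str:
    """Build a basic citation string: Authors (Year). Title. Identifier."""
    authors = "; ".join(sheet.authors)
    return f"{authors} ({sheet.published.year}). {sheet.title}. {sheet.identifier}"


# Placeholder values only; the identifier and URL below are made up.
sheet = CoverSheet(
    title="Example Dataset",
    published=date(2012, 2, 27),
    authors=["Doe, J.", "Roe, R."],
    abstract="Placeholder abstract describing the archived data.",
    identifier="doi:10.5555/EXAMPLE",
    artifact_links=["https://example.org/archive/data.csv"],
)
print(format_citation(sheet))
```

The sketch keeps the cover sheet separate from the archived artifacts it links to, so the citation depends only on the familiar elements (title, date, authors, identifier) and not on the format of the data itself.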
27. Data Papers at the CDL
UC Curation Center
Publishing Services Program
• Merritt curation repository
• Online journals, with peer review
• EZID: Persistent id management and resolution (ARKs, DOIs, et al.)
• Scholarly communication: grey literature to post-prints
• Search and display tools (XTF)
29. Data paper: envisioned outcomes
• Familiar look and feel eases adoption and indexing
• Attribution motivates deposit
• Stable storage and ids lead to citation and impact
• Data products enter the record instead of being lost
• Data journals spring up around disciplines
30. Metaphors we close with
“Our ordinary conceptual system, in terms of which we both think and act, is fundamentally metaphorical in nature.”
OTOH, “the more things change the more they remain the same”
31. Questions?
John.Kunze@ucop.edu
California Digital Library
http://www.cdlib.org/
“Data Paper” Paper: http://escholarship.org/uc/item/9jw4964t