This document summarizes a presentation on research data metrics from the NISO Altmetrics Working Group B. It discusses various metrics for research data, including citations of datasets and metadata, full-text search of datasets, downloads, and usage statistics. It also describes projects from DataCite and the Making Data Count initiative that are working to develop standard metrics for research data and make them available via APIs. Future work discussed includes analyzing networks of linked datasets and second-order citations.
NISO Working Group Connection LIVE! Research Data Metrics Landscape
1. NISO Working Group Connection LIVE!
Research Data Metrics Landscape:
An update from the NISO Altmetrics Working Group B: Output Types &
Identifiers
Monday, November 16, 2015
Presenters:
Kristi Holmes, PhD, Director, Galter Health Sciences Library, Northwestern University
Mike Taylor, Senior Product Manager, Informetrics, Elsevier
Philippe Rocca-Serra, PhD, Technical Project Leader, University of Oxford e-Research Centre
Tom Demeranville, THOR Senior Project Officer & ORCID Software Engineer
Martin Fenner, Technical Director, DataCite
Dr. Sarah Callaghan, Senior Researcher and Project Manager, British Atmospheric Data Centre
Dr. Melissa Haendel, Associate Professor, Ontology Development Group, OHSU Library, Dept of
Medical Informatics and Clinical Epidemiology, Oregon Health & Science University
http://www.niso.org/news/events/2015/wg_connections_live/altmetrics_wgb/
13. Goals
What metrics for research data do researchers and data managers want?
Do data repositories make these metrics available?
If not, build services to collect these metrics for the DataONE repository network.
14. How interested would you be to know each of the following about the impact of your data?
http://doi.org/10.1038/sdata.2015.39
http://www.dx.doi.org/10.5060/D8H59D
18. Metadata of articles
References are part of the metadata deposited to CrossRef.
The Cited-by service aggregates these citations for CrossRef DOIs.
Work is underway to exchange DOI <-> DOI links between CrossRef and DataCite (see the sketch below).
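As a hedged illustration of what those deposited references look like in practice: the CrossRef REST API returns any deposited reference list per DOI. The article DOI and the dataset-prefix filter below are illustrative assumptions, not part of the talk.

```python
# Minimal sketch: list deposited references for one article DOI and flag
# likely dataset DOIs. The DOI and the 10.5061 (Dryad) prefix filter are
# illustrative assumptions; many publishers do not deposit references.
import requests

article_doi = "10.1371/journal.pone.0115253"  # hypothetical example article

resp = requests.get(f"https://api.crossref.org/works/{article_doi}")
resp.raise_for_status()
work = resp.json()["message"]

for ref in work.get("reference", []):       # present only if deposited
    doi = ref.get("DOI", "")
    if doi.lower().startswith("10.5061/"):  # crude dataset-DOI heuristic
        print("possible dataset citation:", doi)
```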
26. Future Work
• Collect data citations from CrossRef
• Analyze usage statistics in more detail and provide input to COUNTER and NISO
• Analyze the network graph, e.g. linked datasets and second-order citations (see the sketch after this list)
• Turn the research project into a service, including integration of client applications for search and reporting
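To make the network-graph bullet concrete, here is a minimal sketch of second-order citation counting over DOI-to-DOI links; the link records and DOIs are toy data standing in for what CrossRef/DataCite would supply.

```python
# Toy sketch: walk two hops of DOI-to-DOI citation links out from a dataset.
# LINKS stands in for real CrossRef/DataCite link records (all DOIs invented).
import networkx as nx

LINKS = {
    "10.5061/dryad.example": ["10.1234/article-a", "10.1234/article-b"],
    "10.1234/article-a": ["10.1234/article-c"],
}

def citing_dois(doi):
    """Return DOIs of works recorded as citing `doi` (toy lookup)."""
    return LINKS.get(doi, [])

def second_order_citations(dataset_doi):
    graph = nx.DiGraph()
    first = set(citing_dois(dataset_doi))
    for article in first:
        graph.add_edge(article, dataset_doi)   # article cites the dataset
        for second in citing_dois(article):
            graph.add_edge(second, article)    # second-order citation
    return graph, [n for n in graph if n != dataset_doi and n not in first]

graph, second = second_order_citations("10.5061/dryad.example")
print("second-order citations:", second)       # ['10.1234/article-c']
```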
27. Introducing the Metadata Model v1
Philippe Rocca-Serra, PhD, University of Oxford e-Research Centre, on behalf of the WG3 Metadata WG
Supported by the NIH grant 1U24 AI117966-01 to the University of California, San Diego
28. A trans-NIH funding initiative established to enable biomedical research as a digital research enterprise:
• Facilitate broad use of biomedical digital assets by making them discoverable, accessible, and citable
• Conduct research and develop the methods, software, and tools needed to analyze biomedical Big Data
• Provide a catalog to enable researchers to find and cite research datasets, and ease the use of community standards to annotate datasets
29. Lucila Ohno-Machado (PI), Jeff Grethe
31. Pilot applications that ‘dock’ with the prototype and
community-driven activities via Working Groups:
1. BD2K Centers of Excellence Collaboration
2. Data Identifiers Recommendation
3. Metadata Specifications
4. Use Cases and Testing Benchmarks
5. Dataset Citation Metrics
6. Criteria for Being Included in the DDI
7. Machine Actionable Licenses
8. Ranking Algorithm
9. End User Evaluation Criteria
10. Repository Collaboration
11. Outreach Meeting: Repository Operators
12. Standard-driven Curation Best Practices
13. Evaluation of Harvesting and NLP Pilot Projects
All this by August 2017!
32. Joint effort with the BD2K Center for Expanded Data Annotation and Retrieval (CEDAR)
Synergies with the BD2K cross-centers Metadata WG (co-chaired by M. Musen/CEDAR and G. Alter/bioCADDIE) and ELIXIR activities
33. WG3 Metadata - Goals
Define a set of metadata specifications that support the intended capability of the Data Discovery Index prototype (being designed by the bioCADDIE Core Development Team) as outlined in the White Paper:
• Core metadata, designed to be future-proofed for progressive extensions (phase 1: May-July 2015), followed by a test and implementation phase
• Domain-specific metadata for more specialized data types (phase 2)
Use cases and competency questions have been used throughout the process to define the appropriate boundaries and level of granularity: which queries will be answered in full, which only partially, and which are out of scope.
34. WG3 Metadata - work to date
With contributions and comments from several WG3 members and colleagues, in particular: Joan Starr, George Alter, Ian Fore, Kevin Read, Stian Soiland-Reyes, Muhammad Amith, Michel Dumontier…
35. Standard Operating Procedure (SOP)
Contains lists of material reviewed:
• data discovery initiatives and metadata initiatives
• existing meta-models for representing metadata elements
Outlines the approach used to identify metadata descriptors:
• via use cases and competency questions (top-down approach)
• by mapping generic and life-science-specific metadata schemas (bottom-up approach)
Listed in the BioSharing collection for bioCADDIE.
The results of both approaches have been compared and converged on the core set of metadata.
37. List of Metadata Schema considered
• schema.org
• datacite
• hcls dataset descriptors
• biosample
• geo miniml
• prideml
• isatab/magetab
• ga4gh metadata schema
• sra xml
• bioproject
• cdisc sdm / element of bridge model
39. Use Cases and Derived Metadata: selected competency questions
• A representative set from the use-cases workshop, the white paper, community submissions, and Phil Bourne
• Questions have been abstracted; key metadata elements have been highlighted, color-coded and categorized
• As the set of core and extended metadata elements is defined, it will become clearer which questions the Data Discovery Index will be able to answer in full and which only in part
41. Processing use cases
• All use cases on an equal footing
• Term binning: Material, Process, Information, Property
• Relation identification
42. Core metadata elements and initial model
The combined approaches have delivered a set of core metadata elements; in phase two, these will be progressively extended to domain-specific elements as needed.
We aim for maximum coverage of the use cases with a minimal number of data elements, but we foresee that not all questions can be answered in full.
[Figure: Initial Set of Metadata Elements]
44. Everything is on GitHub
45. Formal specifications
Metadata schema in JSON:
• https://github.com/biocaddie/WG3-MetadataSpecifications/tree/master/json-schemas
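To make the idea concrete, here is a hedged sketch of how such a JSON Schema could be applied; the schema and record below are simplified stand-ins invented for illustration, not the actual files in that repository.

```python
# Minimal sketch, assuming a simplified stand-in schema (the real bioCADDIE
# JSON Schemas live in the repository linked above).
import jsonschema

core_schema = {
    "type": "object",
    "required": ["identifier", "title", "creators"],
    "properties": {
        "identifier": {"type": "string"},
        "title": {"type": "string"},
        "creators": {"type": "array", "items": {"type": "string"}},
    },
}

dataset_record = {  # hypothetical dataset description
    "identifier": "doi:10.1234/example",
    "title": "RNA-seq of mouse liver",
    "creators": ["K. Smith"],
}

# Raises jsonschema.ValidationError if the record does not conform.
jsonschema.validate(dataset_record, core_schema)
print("record conforms to the core schema")
```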
46. What’s next ?
With this work phase 1 has been completed
We have entered the evaluation phase
the model will be implemented and tested by the
bioCADDIE Development Team with a number of data
sources
the results will inform the activities in phase 2, where
the metadata elements and the model may be
revised, simplified and/or enriched, as needed
Supported by the NIH grant 1U24 AI117966-01 to the University of
47. Take Home Message
• Primary goal: provide a general-purpose metadata schema that allows harvesting of key experimental and data descriptors from a variety of resources and enables indexing to support data discovery
– relations between authors, datasets, publications and funding sources
– nature of biological signal, nature of perturbation, …
48. Outstanding issues
• prioritizing the use cases
• defining mechanisms to deal with domain-specific, granular data
• moving into phase 2 and devising data ingesters
– ETL activities
– interaction with other modeling efforts
• incorporating feedback from users and developers
50. ORCID, Metrics and Project THOR
Tom Demeranville
Senior Technical Officer – Project THOR
NISO Webinar, November 2015
51. What is ORCID?
ORCID is an infrastructure that provides unique person identifiers. ORCID is a hub for linking identifiers for people with their activities. ORCID is researcher-centric, with 1.7 million registered identifiers.
ORCID records are managed by the researchers themselves.
ORCID is open source, community-governed and non-profit.
ORCID has a public API that allows querying of non-private data. ORCID has a member API that enables updating and notifications. ORCID IDs are associated with over 4 million unique DOIs.
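As a hedged illustration of that public API, the sketch below fetches the works attached to one iD using today's v3.0 endpoint (the API available in 2015 was an earlier version); the iD is ORCID's well-known example record.

```python
# Sketch against the current public API (v3.0; the 2015-era API differed).
import requests

orcid_id = "0000-0002-1825-0097"  # ORCID's documented example record

resp = requests.get(
    f"https://pub.orcid.org/v3.0/{orcid_id}/works",
    headers={"Accept": "application/json"},
)
resp.raise_for_status()

# Works are grouped by external identifier; print the first summary of each.
for group in resp.json().get("group", []):
    summary = group["work-summary"][0]
    title = summary["title"]["title"]["value"]
    print(f'{summary.get("type")}: {title}')
```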
52. 347 members, 4 national consortia, over 200 integrations
[Pie chart, members by sector: research institutions 68%, publishers 12%, funders 5%, associations 6%, repositories 9%]
[Pie chart, members by region: Europe 58%, North America 26%, Pacific 7%, Asia 5%, MEA 3%, Latin America 1%]
53. What ORCID isn’t
orcid.o16 November
2015
5
7
ORCID is not a CRIS system
ORCID is not a researcher profile system
ORCID is not a research activity metadata
store
54. Research outputs
• ORCID includes links to publications, patents, datasets, software and more.
• ORCID uses the CASRAI Output vocabulary for work types.
• ORCID references over 20 other output identifiers (more are being added!)
57. ORCID and Metrics
• ORCID improves the quality of research information and makes gathering and disseminating it easier.
• Other services use ORCID IDs to improve their data.
• ORCID IDs are found in DOI metadata, funder systems, publishers, CRIS systems, national reporting frameworks and more.
• Institutions can discover researcher-curated standard and non-standard outputs, or be notified when they are added.
58. Project THOR
http://project-thor.eu
An EC-funded, 2.5-year H2020 project to:
• establish seamless integration between articles, data, and researchers across the research lifecycle
• make persistent identifier use for people and research artefacts the default
Both human and technical in scope.
61. What THOR are up to
http://project-thor.eu
Research - Deciding what needs to be done
Integration - Doing what needs to be done
Outreach - Getting others involved
Sustainability - Making sure it lasts
63. Organisation identifiers
http://project-thor.eu
Community-driven consensus on requirements is needed; we need a way forward.
THOR will help by convening meetings with all interested parties in the community, including research institutions, funders, data centres, publishers, standards bodies, and existing organisation-identifier and other identifier providers.
80. Bibliometrics for Data – what counts and what doesn't?
Sarah Callaghan
sarah.callaghan@stfc.ac.uk
@sorcha_ni
NISO Working Group Connections LIVE! Research Data Metrics Landscape: An update from the NISO Altmetrics Working Group B: Output Types & Identifiers
Monday, November 16, 11:00 a.m. - 1:00 p.m. (ET)
81. Who are we and why do we care about data?
The UK's Natural Environment Research Council (NERC) funds six data centres which between them have responsibility for the long-term management of NERC's environmental data holdings.
We deal with a variety of environmental measurements, along with the results of model simulations in:
• Atmospheric science
• Earth sciences
• Earth observation
• Marine science
• Polar science
• Terrestrial & freshwater science, hydrology and bioinformatics
• Space weather
82. Data, Reproducibility and Science
Science should be reproducible: other people doing the same experiments in the same way should get the same results.
Observational data is not reproducible (unless you have a time machine!).
Therefore we need access to the data to confirm the science is valid!
http://www.flickr.com/photos/31333486@N00/1893012324/sizes/o/in/photostream/
83. It used to be "easy"…
Suber cells and mimosa leaves. Robert Hooke, Micrographia, 1665.
The Scientific Papers of William Parsons, Third Earl of Rosse, 1800-1867.
…but datasets have gotten so big, it's not useful to publish them in hard copy anymore.
84. Hard copy of the Human Genome at the Wellcome Collection
85. Creating a dataset is hard work!
"Piled Higher and Deeper" by Jorge Cham, www.phdcomics.com
Managing and archiving data so that it's understandable by other researchers is difficult and time-consuming too.
We want to reward researchers for putting that effort in!
90. Some examples of data (just from the Earth Sciences)
1. Time series, some still being updated, e.g. meteorological measurements
2. Large 4D synthesised datasets, e.g. climate, oceanographic, hydrological and numerical weather prediction model data generated on a supercomputer
3. 2D scans, e.g. satellite data, weather radar data
4. 2D snapshots, e.g. cloud camera
5. Traces through a changing medium, e.g. radiosonde launches, aircraft flights, ocean salinity and temperature
6. Datasets consisting of data from multiple instruments as part of the same measurement campaign
7. Physical samples, e.g. fossils
91. What is a Dataset?
DataCite's definition (http://www.datacite.org/sites/default/files/Business_Models_Principles_v1.0.pdf):
Dataset: "Recorded information, regardless of the form or medium on which it may be recorded including writings, films, sound recordings, pictorial reproductions, drawings, designs, or other graphic representations, procedural manuals, forms, diagrams, work flow, charts, equipment descriptions, data files, data processing or computer programs (software), statistical records, and other research data." (from the U.S. National Institutes of Health (NIH) Grants Policy Statement, via DataCite's Best Practice Guide for Data Citation)
In my opinion, a dataset is something that is:
• the result of a defined process
• scientifically meaningful
• well-defined (i.e. a clear definition of what is in the dataset and what isn't)
93. Data centre metrics – produced 15th July 2014

Metric | Breakdown | CEDA numbers | Notes
Number of discovery dataset records in the DCS | Quarterly | NEODC 26; BADC 242; UKSSDC 11 | Compliance with NERC data management policy. Reflects how many data sets NERC has. The number of dataset discovery records visible from the NERC data discovery service.
Web site visits | Quarterly | BADC: 61,600; NEODC: 10,200 | Active use and visibility of the data centre. Site visits from standard web log analysis systems, such as webaliser. Sensible web crawler filters should have been applied.
Web site page views | Quarterly | BADC: 219,900; NEODC: 25,800 | See web visits notes.
Queries closed this period | Quarterly | 362 helpdesk queries; 838 dataset applications | Active use and visibility of the data centre. Queries marked as resolved within the quarter. A query is a request for information, a problem or an ad hoc data request.
Queries received in period | Quarterly | 388 helpdesk queries; 860 dataset applications | Active use and visibility of the data centre. See closed query notes.
94. Data centre metrics – produced 15th July 2014 (continued)

Metric | Breakdown | CEDA numbers | Notes
Percent queries dealt with in 3 working days | Quarterly | 84.06% (11.57% resolved after 3 days); 87.67% (10.23% resolved after 3 days) | Responsiveness. See closed query notes.
Queries receiving initial response within 1 working day | Quarterly | Helpdesk: 93.57%; dataset applications: 97.91% | Responsiveness. See closed query notes.
Identifiable users actively downloading | None | Over year to date: BADC 4,065; NEODC 362 | Use and visibility of the data centre. An estimate of the number of users using data access services over the year.
Number of metadata records in data centre web site | None | BADC: 240; NEODC: 33 | INSPIRE compliance. Reflects how many data sets NERC has.
Number of datasets available to view via the data centre web site | None | (Metric in development) | INSPIRE compliance. Usable services.
Number of datasets available to download via the data centre web site | None | (Metric in development) | INSPIRE compliance. Usable services.
95. Data centre metrics – produced 15th July 2014 (continued)

Metric | Breakdown | CEDA numbers | Notes
NERC funded data centre staff (FTE) | None | 14 (estimate for FY 14/15) | Data management costs. Efficiency. Number of full-time equivalent posts employed to perform data centre functions.
Direct costs of Data Stewardship in data centre | None | (Reportable at end of financial year) | Data management costs. Efficiency. Cost to NERC.
Capital expenditure directly related to Data Stewardship at data centre | None | (Reportable at end of financial year) | Data management costs. Efficiency.
Direct receipts from data licenses and sales | None | £0 (CEDA does not charge for data) | Commercial value of data products and services.
Number of projects with Outline Data Management Plans | None | (Metric in development) | Means of tracking projects' adoption of good DM practice. Outline DMP is at proposal stage.
Number of projects with Full Data Management Plans | None | (Metric in development) | Means of tracking projects' adoption of good DM practice. Full DMP is at funded stage.
Users by area | n/a | UK 2,534 (61%); Europe 494 (12%); Rest of the world 1,024 (25%); Unknown 79 (2%) | Active use. Visibility of the data centre internationally. Percentage of user base in terms of geographical spread.
Users by institute type | n/a | University 2,934 (71%); Government 694 (17%); NERC 160 (4%); Other 277 (7%); Commercial 42 (1%); School 35 (1%) | Active use. Visibility of the data centre sectorially. Percentage of user base in terms of the users' host institute type.
96. After the data is downloaded, what happens then?
Short answer: we don't know!!
Unless the data user comes back to tell us, or we stumble across a paper which:
• cites us
• or mentions us in a way that we can find
• and tells us which dataset the authors used.
This is why we're working with other groups (like CODATA, FORCE11, RDA, DataCite, Thomson Reuters, …) to promote data citation.
97. How we (NERC) cite data
We use digital object identifiers (DOIs) as part of our dataset citation because:
• they are actionable, interoperable, persistent links for (digital) objects
• scientists are already used to citing papers using DOIs (and they trust them)
• academic journal publishers are starting to require that datasets be cited in a stable way, i.e. using DOIs
• we have a good working relationship with the British Library and DataCite
NERC's guidance on citing data and assigning DOIs can be found at:
http://www.nerc.ac.uk/research/sites/data/doi.asp
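A dataset DOI minted through DataCite can also be turned into a citation programmatically. A minimal sketch, assuming the current DataCite REST API (which postdates this talk) and using a well-known Dryad dataset DOI as a stand-in for a NERC one:

```python
# Minimal sketch, assuming the current DataCite REST API; swap in a NERC
# dataset DOI. The Dryad DOI below is a widely cited example dataset.
import requests

doi = "10.5061/dryad.8515"

resp = requests.get(f"https://api.datacite.org/dois/{doi}")
resp.raise_for_status()
attrs = resp.json()["data"]["attributes"]

creators = "; ".join(c.get("name", "") for c in attrs.get("creators", []))
print(f'{creators} ({attrs.get("publicationYear")}). '
      f'{attrs["titles"][0]["title"]}. {attrs.get("publisher")}. '
      f"https://doi.org/{doi}")
```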
98. [Screenshot: dataset catalogue page, which is also the DOI landing page, showing the dataset citation and a clickable link to the dataset in the archive]
101. Data metrics – the state of the art!
Data citation isn't common practice (unfortunately), and data citation counts don't exist yet.
To count how often BADC data is used we have to:
1. Search Google Scholar for "BADC" and "British Atmospheric Data Centre"
2. Scan the results and weed out false positives
3. Read the papers to figure out what datasets the authors are talking about (if we can)
4. Count the mentions and citations (if any)
http://www.lol-cat.org/little-lovely-lolcat-and-big-work/
We're working with DataCite and Thomson Reuters to get data citation counts.
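A first step toward automating that manual search can be sketched against the CrossRef works API (an assumption on my part, not CEDA's actual workflow). It matches only deposited metadata, not full text, so the hits are a floor and still need the manual weeding of steps 2-4:

```python
# Hedged sketch: count candidate papers whose CrossRef metadata mentions the
# data centre. Not a citation count; results need manual review.
import requests

resp = requests.get(
    "https://api.crossref.org/works",
    params={"query.bibliographic": "British Atmospheric Data Centre", "rows": 5},
)
resp.raise_for_status()
message = resp.json()["message"]

print("candidate mentions:", message["total-results"])
for item in message["items"]:
    print("-", item.get("title", ["(untitled)"])[0])
```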
102. Altmetrics and social media for data?
We are mainly focussing on citation as a first step, as it's the most commonly accepted by researchers.
We have a social media presence, @CEDAnews, mainly used for announcements about service availability.
We definitely want ways of showing our funders that we provide a good service to our users and the research community.
And we want to be able to tell our depositors what impact their data has had!
103. RDA/WDS WG Bibliometrics Survey Results: Mostly Expected
Citations are the preferred metric, downloads next. Standards are missing. Culture change is needed.
[Bar chart, "What do you currently use to evaluate the impact of data?": Nothing; data citation counts; downloads; social media (likes/shares/tweets); mentions in peer-reviewed papers; hits in search engines; mentions in blogs; bookmarks in Zotero and/or Mendeley; other (please specify)]
[Pie chart, "Are the methods you use to evaluate impact adequate for your needs?": Yes 31.5%, No 68.5%]
104. Other projects in the data metrics space
1. CASRAI data level metrics
2. PLOS Making Data Count
3. NISO altmetrics
4. Jisc Giving Researchers Credit for their Data
105. Next steps for Bibliometrics for Data WG
Will be based on:
• WG survey results (presented at RDA P4 and P5)
• Spreadsheet of metrics being collected by repositories (still open for contributions! http://bit.ly/1MpyW4K)
• Shared results from other projects: understanding the challenges and answering the questions posed in the case statement
• Preliminary analysis of data DOI resolutions
• Supporting and evaluating tools from other projects
• Preliminary guidance for the community: "minimal" rather than "best" practice, to get people discussing the issues and coming up with solutions!
106. Thanks! Any questions?
sarah.callaghan@stfc.ac.uk
@sorcha_ni
http://citingbytes.blogspot.co.uk/
Image credit: Borepatch http://borepatch.blogspot.com/2010/06/its-not-what-you-dont-know-that-hurts.html
"Publishing research without data is simply advertising, not science" – Graham Steel
http://blog.okfn.org/2013/09/03/publishing-research-without-data-is-simply-advertising-not-science/
107. Getting (and giving) credit for all that we do
Melissa Haendel
NISO Research Data Metrics Landscape: An update from the NISO Altmetrics Working Group B: Output Types & Identifiers
11.16.2015
@ontowonka
114. Many contributions don't lead to authorship
[Figure: BD2K co-authorship network; D. Eichmann; N. Vasilevsky]
20% of key personnel are not adequately profiled using publications.
115. Some contributions are anonymous
Examples: data deposition, anonymous review.
Image credit: http://disruptiveviews.com/is-your-data-anonymous-or-just-encrypted/
116. The Research Life Cycle
[Cycle diagram: EXPERIMENT, CONSULT, PUBLISH, DATA, FUND]
118. Evidence of meaningful impact
Diverse outputs, diverse impacts, diverse roles: each a critical component of the research process.
Example outputs:
• Measurement instruments
• Continuing education materials
• Cost-effective interventions
• Changes in delivery of healthcare services
• Quality measure guidelines
• Gray literature
• New experimental methods, data models, databases, software tools
• New diagnostic criteria
• New standards of care
• Biological materials, animal models
• Consent documents
• Clinical/practice guidelines
https://becker.wustl.edu/impact-assessment
http://nucats.northwestern.edu/
119. Attribution workshop results: >500 scholarly products
EXAMPLE OUTPUTS related to software:
Outputs: binary redistribution package (installer), algorithm, data analytic software tool, analysis scripts, data cleaning, APIs, codebook (for content analysis), source code, software to make metadata for libraries, archives and museums, program codes (for modeling), commentary in code (for open source: a need to attribute code authors and commentators/enhancers/hackers, who can document what they did and why), computer language (a syntax to describe a set of operations or activities), software patch (set of changes to code to fix bugs, add features, etc.), digital workflow (automated sequence of programs, steps to an outcome), software library (non-stand-alone code that can be incorporated into something larger), software application (computer code that accomplishes something)
Roles: catalog, design, develop, test, hacker, bug finder, software developer, software engineer, developer, programmer, system administrator, execute, document, software package maintainer, project manager, database administrator
121. Modeling & implementation
VIVO-ISF: a suite of ontologies that integrates and extends community standards
122. Credit extends beyond the original contribution
Stacy creates mouse1. Kristi creates mouse2.
Karen performs RNAseq analysis on mouse1 and mouse2 to generate dataset3, which she subsequently curates and analyzes.
Karen writes publication pmid:12345 about the results of her analysis.
Karen explicitly credits Stacy as an author, but not Kristi.
123. Credit is connected
Credit to Stacy is asserted, but credit to Kristi can be inferred.
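A toy sketch of that inference (names and predicates invented for illustration, not the VIVO-ISF/openRIF vocabulary): represent the asserted facts as triples and walk the derivation links to recover Kristi's unasserted credit.

```python
# Toy triples encoding the slide-122 story; only Stacy's credit is asserted.
TRIPLES = [
    ("Stacy", "created", "mouse1"),
    ("Kristi", "created", "mouse2"),
    ("dataset3", "derived_from", "mouse1"),
    ("dataset3", "derived_from", "mouse2"),
    ("pmid:12345", "derived_from", "dataset3"),
    ("Stacy", "author_of", "pmid:12345"),  # asserted credit
]

def contributors(artifact):
    """Everyone whose creations the artifact (transitively) derives from."""
    found = set()
    for s, p, o in TRIPLES:
        if s == artifact and p == "derived_from":
            found |= {c for c, p2, o2 in TRIPLES if p2 == "created" and o2 == o}
            found |= contributors(o)
    return found

# Kristi's credit is inferred from the derivation chain: {'Stacy', 'Kristi'}
print(contributors("pmid:12345"))
```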
124. Introducing openRIF
The Open Research Information Framework
[Diagram: openRIF encompassing VIVO-ISF, eagle-i, and SciENcv]
125. Ensuring an openRIF that meets community needs
Interoperability:
• a domain-configurable suite of ontologies to enable interoperability across systems
• a community of developers, tools, data providers, and end-users
126. Developing a computable research ecosystem
Research information is scattered amongst:
• research networking tools
• citation databases (e.g., PubMed)
• award databases (e.g., NIH RePORTER)
• curated archives (e.g., GenBank)
• text (the research literature), where it is locked up
Map the SciENcv data model to VIVO-ISF/openRIF; enable bi-directional data exchange; integrate SciENcv and ORCID data into CTSAsearch.
CTSAsearch (David Eichmann): http://research.icts.uiowa.edu/polyglot/
127. Thank you!
Join the FORCE11 Attribution Working Group at: https://www.force11.org/group/attributionwg
Join the openRIF listserv at: http://group.openrif.org
128. Identifying those scholarly outputs
Identifiers for things that are not publications or documents need to get beyond thinking about DOIs.