1. | 1
Nikhil Joshi, MS
Consultant, Research Data Management
Research Solution Sales | Elsevier
n.joshi.1@Elsevier.com
Council of Graduate Schools Annual Meeting
December 7, 2018
Effective Research Data
Management
2. | 2
• Refers to the results of observations or
experimentation that validate research findings:
data that often underlie, but exist outside of,
research articles
• Can include, but is not limited to: raw data,
processed data, software, algorithms, protocols,
methods, and materials that are not already
published as part of a journal article
Research Data Defined
3. | 3
Driving Questions for Research Data Management
(RDM) for Maximizing Data Sharing Outcomes
1. What methods and metrics exist and which could be better
developed for measuring data sharing, re-use, and re-
analysis, and success/value with data sharing?
2. How could effects of data sharing on reproducibility of science
be measured, or integrated into existing attempts to measure
rigor and reproducibility?
3. How can we design a future data sharing ecosystem that
incorporates the capacity for easier analysis and mid-stream
adjustment?
4. | 4
Open Data: A Key Element of Open Science
Open Access
Improving access and
sharing of research
publications
Research Data
Improving access to
and use of research
data
Research Integrity
Improve reproducibility
and transparency of
research
Science & Society
Encouraging citizen
involvement &
translating science for
the public
Metrics
Developing metrics
which show the full
impact of research
Open Science has the
overarching goal of enhanced
research performance, that
aims to make science more:
• Accessible
• Collaborative
• Transparent
• Effective
• Efficient
Through:
• Encouraging a culture of
openness and sharing
• Leveraging and developing
new technologies
• Developing and adapting
reward and metric systems
5. | 5
Frameworks for Effective Open Research Data
https://www.force11.org/group/fairgroup/fairprinciples
Maslow’s Hierarchy
of Research Data
6. | 6
Differing priorities with regard to RDM practices exist between
faculty and students
Managing operation of
the lab
Funder compliance
Data storage transfer &
access
Keeping with disciplinary
norms
Assistance w/long term
storage & preservation
Ethics in research
DMPs
Upskilling opportunities for
students
Collaborating with data
librarians
Broad sharing of
research datasets
Breadth of data science &
management skills needed
for career
Integration of systems with
existing workflows
Writing DMPs
Documentation & versioning
Sources: Open Data IGERT, Developing a Data Management Course, Managing Research Data: Grad Student & Postdoc experiences
Faculty priority
Student priority
7. | 7
Goal | Metric | How to measure
Research Data is Saved:
1. Stored (i.e. safely available in a long-term repository) | Nr of datasets stored in long-term storage | Mendeley Data (& 20+ repositories indexed), DANS (dark archiving)
2. Published (i.e. long-term preserved, accessible via web, have a GUID, citable, with proper metadata) | Nr of datasets published, in some form | Scholix, ScienceDirect, Scopus
3. Linked, to articles or other datasets | Nr of datasets linked to articles | Scholix, Scopus
4. Validated, by a reviewer/curated | Nr of datasets in curated databases/peer reviewed in data articles | ScienceDirect, DataSearch (across curated DBs)
Research Data is Seen and Used:
5. Discovered | Nr of datasets viewed in databases/websites/search engines | DataSearch, metrics from other search engines/repositories
6. Identified | DOI is resolved | DataCite - DOI resolution & minting
7. Mentioned | Social media and news mentions | Plum and Newsflo
8. Cited | Nr of datasets cited in articles | Scopus
9. Downloaded | Downloaded from repositories | Downloads from Mendeley Data or other repositories
10. Reused | Mention of usage in article or other dataset | ScienceDirect, access to other data repositories
Source: https://rdmi.uchicago.edu/papers/08212017144742_deWaard082117.pdf
Credit for Sharing and Reuse of Research Data should be defined
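As a rough illustration of the four "Research Data is Saved" goals (Stored, Published, Linked, Validated), the counts could be tallied from a list of dataset records. This is only a sketch under stated assumptions: the DatasetRecord class and its field names are hypothetical, not part of any Elsevier or Mendeley Data API, and "Published" is approximated here as "has a persistent identifier".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical record for one dataset; all field names are illustrative."""
    title: str
    in_long_term_storage: bool = False   # goal 1: Stored
    doi: Optional[str] = None            # goal 2: Published (assumption: has a GUID)
    linked_articles: List[str] = field(default_factory=list)  # goal 3: Linked
    curated: bool = False                # goal 4: Validated


def saved_metrics(records):
    """Count how many datasets meet each of the four 'Saved' goals."""
    return {
        "stored": sum(r.in_long_term_storage for r in records),
        "published": sum(r.doi is not None for r in records),
        "linked": sum(bool(r.linked_articles) for r in records),
        "validated": sum(r.curated for r in records),
    }


# Made-up example records, for illustration only.
example_records = [
    DatasetRecord("A", True, "10.1234/example.a", ["article-1"], True),
    DatasetRecord("B", True),
    DatasetRecord("C"),
]
example_metrics = saved_metrics(example_records)
```

The "Seen and Used" goals (5 to 10) depend on external services such as search engines and citation indexes, so they are left out of this sketch.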
8. | 8
On average, all researchers & institutions benefit from the
greater impact of published datasets
Source: SciVal: publications for Pennsylvania, October 2018
9. | 9
Challenges exist regarding data ownership
https://data.mendeley.com/datasets/bwrnfb4bvh/1
Data sharing survey (with 1167 respondents):
• Although 69% of respondents found that sharing data was
very important in their field
• And 73% wanted to have access to other people’s data,
• Only 37% believed there was credit in doing so,
• And only 25% felt they had adequate training to properly
share their data with others.
The main barriers for sharing data were:
• privacy concerns,
• ethical issues,
• intellectual property rights issues
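To put the survey percentages in terms of people, they can be converted to approximate respondent counts. This is a sketch only: the counts below are derived from the rounded percentages and the 1167 total quoted on this slide, so they are estimates, not figures reported by the survey. The dictionary labels are shorthand introduced here.

```python
RESPONDENTS = 1167  # total respondents quoted on the slide

# Percentages as quoted on the slide; labels are illustrative shorthand.
REPORTED_PERCENT = {
    "sharing is very important": 69,
    "want access to others' data": 73,
    "see credit in sharing": 37,
    "feel adequately trained": 25,
}


def approx_count(percent, total=RESPONDENTS):
    """Nearest whole number of respondents for a whole-number percentage."""
    return (percent * total + 50) // 100


approx_counts = {label: approx_count(p) for label, p in REPORTED_PERCENT.items()}

for label, count in approx_counts.items():
    print(f"{label}: about {count} of {RESPONDENTS}")
```

On these estimates, roughly 850 respondents wanted access to other people's data while fewer than 300 felt trained to share their own.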
10. | 10
Source: JISC: How and why you should manage your research data: a guide for researchers, Caroline Ingram, Published: 7 January 2016
SoftwareX
Data Rescue &
Software Rescue
Reproducibility Papers
Data Management
Plans
Inputs in the Research Data Cycle
11. | 11
Elsevier believes RDM needs a holistic approach
All forms of research data,
which includes everything
needed to reproduce and
reuse
Raw data Processed data
Machine &
environment settings
Protocols, methods,
workflows
Scripts, analyses, algorithms
12. | 12
Re-using research data improves outcomes for the research life cycle
• This means improving the research data lifecycles: (1) within the lab and (2) to the world at large
• This also means keeping track of the institutional data lifecycles, and (3) reporting on them
Three interlocking data cycles should be captured
3. ‘Metrics on data’
Monitoring and
reporting on institutional
data
• Benchmark • Rank
Evaluate
• Manage • Preserve
Institution
Find
Topic
Design
Identify
gaps
Plan &
Fund
Discover data,
people, methods &
protocols
Collect, analyze
& visualize
Prepare, reproduce,
re-use & benchmark
Store &
Share
Publish
Disseminate
1. Lab data
Execute
Research
2. Open data: data publicly available
13. | 13
Solutions should integrate with the broader RDM
ecosystem via open APIs
existing integration
planned integration
Index
datasets
metadata
Mint DOIs Import/export datasets,
notebooks, experiments
Repository
indexed by
OpenAIRE
Zenodo indexed
by DataSearch
Publish links
between
articles and
datasets
Datasets indexed by
DataSearch
Long-term
preservation
of published
datasets
+ 30 repositories
Integrate with
machine
readable
DMPs
Open API with
any other tool
14. | 14
How we deliver:
1. Open system & open APIs; modular
approach enables integrations across many
research data solutions
2. Data remains owned by institution
3. System is integrated with the researcher
workflows: we make it simple & obvious
4. Your researchers maintain much of their
existing workflow
Mendeley Data
Benefits for researchers:
• Prevent re-work: save time searching,
collecting and sharing data
• Comply with funders' mandates
• Improve impact: increase data reuse
Benefits for institutions:
• Keep track of your data inside and outside your
institution
• Showcase institutional research outputs
• Improve collaborations within/across
institutions
15. | 15
Cross-platform tracking of data
Repositories &
ELNs
Researcher
Find & re-use data
Manage
active data
Run adoption campaigns and keep track of data
Institutions, labs,
research offices
Funding agencies
Grant applications, performance reporting
Mandates for sharing &
publication of research data
Data Journals
Share & publish
open data
Data Manager
Data Monitor
Receive recommendations
Collect information about data
Data Search
16. Mendeley Data Search enables
researchers to discover data:
• 22 repositories indexed to date, growing all
the time (ambition is 100+)
• Keyword search within data files
• In-line file previews
• Filter search results by specific author,
institution, journal, subject category
And retrieve active data:
• Researchers can navigate your institution’s
locally held data
• Project collaborators can retrieve project data
through powerful keyword search and filtering
Unlike other search solutions, Data Search:
• Deeply indexes data (not just metadata),
making it easier and faster to find relevant data
• Allows researchers to preview data, making it
easier and faster to find relevant data
Data Search powers all modules
Retrieve active data, discover public data
17. data.mendeley.com
Data Repository
Store results in a trusted data repository
Store up to 100 GB of data per
dataset in many formats
Describe how experiment
can be reproduced
Long-term storage
Link back to protocols
Create DOI
for Citation
(or university prefix)
Keep track of
versions of dataset
On your S3
Or on DANS
On your (local) S3 or on Elsevier cloud
Metadata:
Dublin Core and Google Science Datasets markup
Open licences & indexed in OpenAIRE
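For readers unfamiliar with Dublin Core or DOIs, a dataset description along these lines might look as follows. This is a minimal sketch, not the repository's actual metadata format: the record values and the DOI are placeholders, and make_record and doi_url are hypothetical helpers. Only the fifteen Dublin Core element names and the https://doi.org/ resolver prefix are taken as given.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def doi_url(doi):
    """Return the resolver URL for a DOI string."""
    return "https://doi.org/" + doi


def make_record(**fields):
    """Build a metadata dict, rejecting names that are not Dublin Core elements."""
    unknown = set(fields) - DUBLIN_CORE_ELEMENTS
    if unknown:
        raise ValueError("not Dublin Core elements: " + ", ".join(sorted(unknown)))
    return dict(fields)


# Placeholder values for illustration; the DOI below is not a real dataset.
example_record = make_record(
    title="Example dataset",
    creator="A. Researcher",
    date="2018-12-07",
    type="Dataset",
    rights="CC BY 4.0",
    identifier=doi_url("10.1234/example.1"),
)
```

Keeping the DOI in the identifier element means the same record carries both the description and the citable link.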
18. With Mendeley Data Manager,
researchers can:
• Share data privately in your research
group, or project
• Also works for collaborators outside
the institution (they can take part in
projects but not start new projects)
• Gather research data from all your
data sources as it’s generated,
including ELNs, instruments etc
• Annotate research data with detailed,
subject-specific metadata (helped by
automated annotation tools)
• Curate data according to project or
institutional workflows
• Prepare to publish data on your
repository of choice
• Open APIs allow: tailored upload
forms, automated workflows, and
workflows to download, analyse and
re-upload data files
Manager helps researchers move from raw files to datasets
Data Manager
Active research data collaboration and workflow tool, which enables research
groups to gather/organize, annotate and share data all in one place.
Note: leftmost active/external data column
will be completed before June 2018
19. • Achieve credibility, visibility and integrity of key research outputs
• Keep track of your data inside and outside your institution
• Maintain visibility of events in the research data management space
• Improve adoption of data sharing tools by researchers
• Communicate the value of data sharing to researchers during the
research process
Research
article
published
Share,
publish or
link data
Monitor
progress and
provide
guidance
Generate
dashboards
Initial
inquiry
about data
Data Monitor
Proactively engage with researchers in the RDM space
20. | 20
For more information, please visit: About RDM, Open Data: The researcher perspective, Mendeley
Data platform
Thank you
Nikhil Joshi, Consultant, Research Data Management
Research Solution Sales | Elsevier
n.joshi.1@Elsevier.com
(917) 435-4806
21. | 21
UMAMI Framework for Data Sharing
• Uptake: integration throughout the research
workflow/across the research data lifecycle
• Metadata: enables search & discovery, linking
between systems, citation standards
• Archiving: sustainable/trustworthy repositories
• Metrics: recognition and credit at points of
sharing and re-use
• Intellectual Property: who owns the data
(funder, institution, researcher); concerns about
being scooped
22. | 22
The Mendeley Data Platform
Notebook
Mendeley Data
Platform
• Comply with funders'
mandates
• Showcase institutional
research outputs
• Prevent re-work: save
time searching,
collecting and sharing
data
• Increase data reuse,
avoid duplication of
efforts
• Open system
Pre-integrated with
Elsevier's ecosystem of
research solutions
A modular, cloud-based platform designed for research
institutions, to manage the entire lifecycle of research data.
Search
Monitor Repository
Manager
23. | 23
Mendeley Data Platform for Institutions:
Module | Use case | Features
MD - Notebook (Hivebench) | Collect research data in a structured way | Effectively manage experiments between collaborators, online or in local storage, inside and outside of the institution (private cloud); reporting and monitoring at institutional level.
MD - Repository (MD) | Store and preserve research data outcomes | Store, archive, preserve, manage data; archive data when researchers leave; collaborate beyond institution.
MD - Repository (MD) | Showcase institutional data | Showcase data inside & outside the institution, link with Pure showcasing.
MD - Manager | Manage research data within project/department | Track and manage all research data stored and shared in MD Repository or other repositories (e.g. Dropbox); curate metadata.
MD - Search (DataSearch) | Discover data & prevent re-work | Search and index institutional data, whether in MDM or other (e.g. Zenodo, DSpace etc.) repositories; expose institutional data to the outside world.
MD - Monitor | Engage with researchers & increase uptake | Engage with the researchers in a scalable way, at the right time. Identify data stored by researchers in repositories inside and outside institution.
MD - Admin | Report on institutional data management | Report on activities by all connected modules (Repository, Search, Manager, and Notebook). Create metrics & tracking of data created by the institution.
MD - Admin | Administration | Overall admin & reporting dashboard: assign roles, permissions, etc.