The document summarizes an iODaV Data Workshop held at JKUAT in Kenya on open data and the JORD policy. It discusses why open data is important for reproducibility, innovation and scientific discovery. It outlines the FAIR principles for open data and metadata to make data findable, accessible, interoperable and reusable. It also discusses opportunities and challenges of open data for universities, including developing skills and infrastructure. Finally, it provides examples of open data initiatives at JKUAT including developing an open data policy, the iODaV program, contributions to national ICT policies, and the digital health applied research centre.
JKUAT iODaV Workshop Promotes Open Data Principles & FAIR Research
1. iODaV Data Workshop.
iPiC Centre, JKUAT Main Campus, JUJA 19th September 2017
Open Data & JORD Policy
Prof Joseph Muliaro Wafula PhD, FCCS, FCSK.
Chair, iODaV & Director, iCEOD
Jomo Kenyatta University of Agriculture and Technology
Kenya
3. Why Data especially in this digital era?
• Science demands that you support your arguments with
evidence/data.
• Open research data are essential for reproducibility and self-correction.
• Academic publishing has not kept up with the age of digital data.
• Danger of a replication / evidence / credibility gap.
• Open data foster innovation and accelerate scientific discovery
through reuse of data.
Data for research should be intelligently open: accessible, assessable,
intelligible, usable.
FAIR: Findable, Accessible, Interoperable, Reusable.
Publications and data should be Open and available concurrently:
it is argued that failing to make data concurrently open is scientific
malpractice.
Science International Accord on Open Data in a Big Data World:
http://www.science-international.org/ (JKUAT has signed this
accord)
4. Open Data Guiding Principles-FAIR
• FAIR Data
• Findable: have sufficiently rich metadata and a unique and persistent identifier.
• Accessible: retrievable by humans and machines through a standard protocol;
open and free; authentication and authorization where necessary.
• Interoperable: metadata use a ‘formal, accessible, shared, and broadly
applicable language for knowledge representation’.
• Reusable: metadata provide rich and accurate information; clear usage license;
detailed provenance.
• FAIR Guiding Principles for scientific data management and
stewardship, http://dx.doi.org/10.1038/sdata.2016.18
• Guiding Principles for FAIR Data: https://www.force11.org/node/6062
5. FAIR Principles
• To be Findable:
• F1. (meta)data are assigned a globally unique and
persistent identifier
• F2. data are described with rich metadata (defined by R1
below)
• F3. metadata clearly and explicitly include the identifier
of the data it describes
• F4. (meta)data are registered or indexed in a searchable
resource
• To be Accessible:
• A1. (meta)data are retrievable by their identifier using a
standardized communications protocol
• A1.1 the protocol is open, free, and universally
implementable
• A1.2 the protocol allows for an authentication and
authorization procedure, where necessary
• A2. metadata are accessible, even when the data are no
longer available
• (Wilkinson, M. D., et al., The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data, 2016, http://dx.doi.org/10.1038/sdata.2016.18)
• To be Interoperable:
• I1. (meta)data use a formal, accessible, shared, and
broadly applicable language for knowledge representation.
• I2. (meta)data use vocabularies that follow FAIR principles
• I3. (meta)data include qualified references to other
(meta)data
• To be Reusable:
• R1. (meta)data are richly described with a plurality of
accurate and relevant attributes
• R1.1. (meta)data are released with a clear and accessible
data usage license
• R1.2. (meta)data are associated with detailed provenance
• R1.3. (meta)data meet domain-relevant community
standards
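The principles above can be illustrated with a minimal sketch of a metadata completeness check. The FAIR principles do not prescribe a schema, so every field name here (`identifier`, `indexed_in`, `access_url`, `provenance`, and so on), the example record, and its placeholder DOI and URL are assumptions made for illustration only. The check covers a handful of principles that can be tested mechanically; it is not a full FAIR assessment.

```python
# Hypothetical field names: the FAIR principles do not mandate a schema.
# Each check maps a principle label to a simple test on a metadata record.
FAIR_CHECKS = {
    "F1 persistent identifier": lambda r: bool(r.get("identifier")),
    "F2 rich metadata": lambda r: bool(r.get("title")) and bool(r.get("description")),
    "F4 indexed in a searchable resource": lambda r: bool(r.get("indexed_in")),
    "A1 standard retrieval protocol": lambda r: str(r.get("access_url", "")).startswith(
        ("http://", "https://")
    ),
    "R1.1 usage license": lambda r: bool(r.get("license")),
    "R1.2 provenance": lambda r: bool(r.get("provenance")),
}


def missing_fair_fields(record):
    """Return the labels of the checks that the record fails, in order."""
    return [label for label, check in FAIR_CHECKS.items() if not check(record)]


# Illustrative record; the DOI and URL are placeholders, not real resources.
example_record = {
    "identifier": "doi:10.0000/hypothetical-dataset",
    "title": "Maize yield survey",
    "description": "Plot-level yields with collection dates.",
    "indexed_in": "institutional repository catalogue",
    "access_url": "https://repository.example.org/datasets/1",
    "license": "CC-BY-4.0",
}

print(missing_fair_fields(example_record))  # ['R1.2 provenance']
```

A check like this could run before a dataset is deposited, so that gaps such as missing provenance are flagged to the researcher early.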
6. Opportunities & Challenges for JKUAT/PAUSTI
Open and FAIR Research Data Presents Major Opportunities for Universities:
Research intensive universities will be data intensive universities.
Supporting researchers’ use of data is a key strategic mission and enabler: world
class research environment includes support for data stewardship.
A university’s reputation is increasingly built on all research outputs and wider
societal and economic impact: data is core to this.
Development of significant data collections of research intensive universities.
Leading departments / research groups will be characterised by excellence in
data, by Open FAIR data collections.
The contribution to research of both the individual researcher
and the institution will increasingly be measured on the basis of data outputs as
well as research articles.
Policies are less and less ambiguous: data stewardship and research data
management (RDM) are necessary for grant funding success.
Avoid reputational damage through data loss.
Challenges:
Policy development: unpicking Open and FAIR data (JKUAT has JORD).
Supporting data through the lifecycle.
Culture and incentives: what’s in it for us?
Skills gaps: training and support.
Technical systems and infrastructure.
Developing culture of conscious data stewardship: what to keep and what to discard.
Supporting the long term stewardship of research data.
Sustainability and finance.
7. Boundaries of Open
For data created with public funds or where there is a strong
demonstrable public interest, Open should be the default.
As Open as Possible, as Closed as Necessary.
Proportionate exceptions for:
Legitimate commercial interests (sectoral variation)
Privacy (‘safe data’ vs Open data – the anonymisation problem)
Public interest (e.g. endangered species, archaeological sites)
Safety, security and dual use (impacts contentious)
All these boundaries are fuzzy and need to be understood better!
There is a need to evolve policies, practices and ethics around
closed, shared, and open data.
9. Incentives: Data Citation
Out of Cite, Out of Mind
http://bit.ly/out_of_cite
Joint Declaration of Data Citation Principles: https://www.force11.org/datacitation
Background and Developments: http://bit.ly/data_citation_principles
International Series of Data Citation Workshops: http://bit.ly/data-citation-workshops
CODATA Task Group on Data Citation Principles and Practices
If publications are the stars and planets of the scientific universe,
data are the ‘dark matter’, influential but largely unobserved
in our mapping process.
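As a rough illustration of what citing data can look like in practice, the sketch below builds a citation string from the elements commonly needed to credit and locate a dataset: creators, year, title, repository, and a persistent identifier. The class name, field names, output layout, and all example values (including the placeholder DOI) are assumptions for illustration; they are not a format prescribed by the Joint Declaration or by any particular repository.

```python
from dataclasses import dataclass


@dataclass
class DatasetCitation:
    """Hypothetical container for the elements of a dataset citation."""

    creators: list      # e.g. ["Doe, J.", "Roe, A."]
    year: int
    title: str
    repository: str
    identifier: str     # persistent identifier, e.g. a DOI URL
    version: str = ""   # optional, supports citing a specific version

    def to_text(self):
        """Render the citation as one line of text (illustrative layout)."""
        authors = "; ".join(self.creators)
        version = f" (Version {self.version})" if self.version else ""
        return (
            f"{authors} ({self.year}). {self.title}{version} [Data set]. "
            f"{self.repository}. {self.identifier}"
        )


# Placeholder values only; this dataset and DOI do not exist.
citation = DatasetCitation(
    creators=["Doe, J.", "Roe, A."],
    year=2017,
    title="Example survey data",
    repository="Example Repository",
    identifier="https://doi.org/10.0000/example",
)
print(citation.to_text())
```

Keeping the persistent identifier in every citation is what lets data reuse be counted alongside article citations.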
10. Open Data Policy
Key Objectives:
1. Promote Data publication, preservation and reuse.
2. Promote multi-disciplined research capabilities and activities that are
ICT enabled
3. Accelerate ICT innovation through equipping innovators with
requisite skills and credible and quality data
4. Change culture of keeping data private to public by default
11. The long end of the tail … has individual scientists’ data
• Much of this revolution is taking place at the top end,
at the head and neck.
• Although ‘big data’ is all the rage, the vast majority
of data sets created through research fall into the
“Long Tail”.
Source: Wagging the Long Tail, Kathleen Shearer et al., 2014
13. Data-driven Innovation
Successfully capitalizing on the data revolution requires public policies
and strategies designed to allow data-driven innovation to
flourish (WB, 2013).
These policies and strategies will remove barriers and stimulate the release,
use and impact assessment of open data (Rininta et al., 2015).
Diagram: Open Data Policy, Strategy, Action Plan
14. Open Data Initiative(ODI) 1: JORD Policy
http://www.jkuat.ac.ke/directorates/iceod/wp-content/uploads/2017/06/JORD-Policy-ISO-ref-April-2016.pdf
JKUAT, with the support of CODATA, developed and implemented an open
research data (JORD) policy (February 2016).
ROI:
• Encouragement of diverse studies and opinion
• Promotion of new areas of work not envisioned by the initial investigators
• Development of new products and services
• Strengthening the credibility of scholarly publications
15. ODI 2: Innovative Open Data and Visualization (iODaV)-JKUAT and PAUSTI
The specific objectives of the AFRICA ai JAPAN Project Sub-Task Force
are listed at:
http://www.jkuat.ac.ke/wp-content/uploads/2017/02/Innovation-Research-Grants-AFRICA-ai-JAPAN-Project.pdf
iODaV components:
• Open Research Data-based Innovation
• Data Analytics
• Data, Info & Scientific Visualization
• Smart Learning-ThinkBoard S/W
• Open Data Principles, Stds & JORD
• Reuse of Research Data
16. ODI 3: DRAFT NATIONAL INFORMATION & COMMUNICATIONS TECHNOLOGY
(ICT) POLICY JUNE 2016 (http://icta.go.ke/pdf/National-ICT-Policy-20June2016.pdf)
• Article 5.10 – Data Centre: The government will:
(a) Promote Data Centre infrastructure buildout carried out in cognizance of globally
approved standards for purposes of ensuring quality of service under open access,
carrier neutral model;
(b) Develop incentives to ensure and protect investment in the field of data centre;
(c) Facilitate the development and enactment of legislation on localization to support
growth in IT service consumption – as an engine to spur data centre growth;
(d) Ensure that Data is processed fairly and lawfully in accordance with the rights of
citizens and obtained only for specific, lawful purposes.
In support of the Kenya Open Data Initiative (http://www.opendata.go.ke/)
17. ODI 3….2
• Article 7.1- Digital Content
(a) Adopting Open Data principles: - in order to share historical/archive data that can be
a rich source for the creative and broadcast industry;
(b) Promoting Animation Labs (A-Lab):- Government will support incubation labs focused
on animation & film production that is largely computer generated;
(c) Content Ratings: - The Government will develop policies and legislation that take into
consideration age appropriate content that upholds national values.
(d) Copyright Protection:- Government will recognise digital content as copyright
material and will actively protect the rights of copyright owners through law
enforcement to prevent digital content piracy.
18. ODI 3….3
• 15.4 Information Security
The government will develop information security policies and
guidelines to ensure protection of the confidentiality, integrity and
availability of information
19. ODI 4: DIGITAL HEALTH APPLIED RESEARCH CENTRE
(DHARC) -JKUAT
• DHARC is one of the deliverables of the HIGDA Project, a 5-year USAID-funded project
that started in October 2016.
• DHARC: implementation of interoperability solutions informed by Open Data
principles and standards.
• DHARC will join a network of interoperability labs which have been established in
Canada (2007), South Africa (2010), and the Philippines (2016)
• It will provide examples of how key components (DHIS2, DATIM, MFL ver2, AMRS
and other mHealth solutions) interoperate, providing guidelines-based care
workflows, policies, and M&E mandates.
20. In 6: Open Data Policy Development
• Open Data policy development needs to be based on the following three pillars:
1. C-context
2. C-content
3. I-impact
21. Policy Context Pillar
Key factors include:
Level of Gov organization
Key motivations, policy objectives
Open data platform launch
Resource allocation & economic context
Legislation
Social, cultural & Political context
Drivers for open data
Forces against Opening data
22. Policy Content Pillar
Key factors include:
Licensing
Access fee
Data restriction
Data presentation
Contact with user
Amount published
Processing before publishing
Cost of opening
Types of Data
Data Formats & stds
Data quality
Provision of metadata
23. Policy Impact Pillar
Key factors include:
Re-use of published data
Possible predicted risks
Benefits aligned with motivation
Public value
Transparency & accountability
Economic growth
Entrepreneurial open data use/ innovation
Efficiency
Environmental sustainability
Inclusion of the marginalized
24. Key Strategic Pillars of Sustainable Open Data Programs
Support open data infrastructure build based on open data
policies, standards, and supportive legal and licensing frameworks
Make data publishing and access available and easy
Create feedback channels for data users
Prioritize datasets that users want
Address quality issues of datasets
Protect privacy rights
Provide clear, consistent, and useful metadata
25. Open data implementation best practices
• Have an open data policy (e.g. JORD-JKUAT)
• Ensure easy to understand content & formatting
• Release high-value and high-impact data first
• Ensure compatibility and interoperability of systems (e.g. Kenya
Health sector DHARC project –USAID/JKUAT)
• Establish data ownership
• Involve stakeholders
• Plan for open data advocacy (e.g. KALRO)
• Implement interaction and feedback mechanism
• Build communities of data producers and users
• Organize training programs
• Organize hackathons (e.g. CODATA, the AFRICA ai JAPAN Project, IBM,
JKUAT and USAID have sponsored hackathons on agriculture and health
sector open data to promote innovation and data use in Kenya)