SlideShare uma empresa Scribd logo
1 de 23
A revolution in
open science?
- open data & the
role of libraries
- Geoffrey Boulton
LIBER
Munich
June 2013
Open communication of data: the source of a
scientific revolution and of scientific progress
Henry Oldenburg
Problems & opportunities in the data deluge
1020 bytes
Available storage
The challenge to Oldenburg’s principle:
- a crisis of replicability and credibility?
A fundamental principle: the data providing the evidence
for a published concept MUST be concurrently
published, together with the metadata
But what about the vast data volumes that are not used to
support publication as well as those that are?
The opportunity:
new knowledge from data
(some technical opportunities & priorities)
Exploiting the potential
of linked data requires:
• data integration
• dynamic data
Solutions/agreements are
needed for:
• provenance
• persistent identifiers
• standards
• data citation formats
• algorithm integration
• file-format translation
• software-archiving
• automated data reading
• metadata generation
• timing of data release
BUT - its not just sharing and
integrating data – its also what we do with it!
Jim Gray - “When you go and look at what scientists are
doing, day in and day out, in terms of data analysis, it is
truly dreadful. We are embarrassed by our data!”
….. and we need a new breed of informatics-trained
data scientist as the new librarians of the post-
Gutenberg world
but will they be in the Library?
So what are the priorities?
1. Ensuring valid reasoning
2. Innovative manipulation to create new information
3. Effective management of the data ecology
4. Education & training in data informatics & statistics
Benefits of open communication of data
• Greater benefit to the individual than hugging their own data (e.g.
bioinformatics)
• Permits faster responses to emergencies (e.g. Hamburg 2011 E. Coli outbreak)
• Deterrent to fraud (system integrity as well as personal integrity)
• Stimulates novel and creative collaborations (e.g. Tim Gowers & “crowd-sourcing
in mathematics)
• Support for citizen science movement ( e.g. Galaxy Zoo, Fold-it, Ash-Tag, etc)
• Demands by citizens for access (e.g. “Climategate”)
• Efficient responses to global challenges
Benefits 1 : data sharing in ethos & practice
– the example of bio-informatics
ELIXIR Hub (European Bioinformatic Institute) and ELIXIR Nodes provide
infrastructure for data, computing, tools, standards and training.
• E-coli outbreak spread through
several countries affecting 4000 people
• Strain analysed and genome
released under an open data license.
• Two dozen reports in a week with
interest from 4 continents
• Crucial information about strain’s
virulence and resistance
Benefits 2: Response to Gastro-intestinal
infection in Hamburg
“Scientific fraud is rife: it's time to stand up for good science”
“Science is broken”
Examples:
 psychology academics making up data,
 anaesthesiologist Yoshitaka Fujii with 172 faked articles
 Nature - rise in biomedical retraction rates overtakes rise in published papers
Cause:
Rewards and pressures promote extreme behaviours, and normalise malpractice
(e.g. selective publication of positive novel findings)
Cures:
Open data for replication
Transparent peer review
Not just personal integrity – but system integrity
Benefits 3: Response to fraud
Mathematics related discussions
Tim Gowers
- crowd-sourced mathematics
An unsolved problem posed on
his blog.
32 days – 27 people – 800
substantive contributions
Emerging contributions rapidly
developed or discarded
Problem solved!
“Its like driving a car whilst
normal research is like pushing
it”
What inhibits such processes?
- The criteria for credit and
promotion.
Benefits 4: Opening-up
science:
e.g. crowd-sourcing
• Openly collected science is already helping policy
makers.
• AshTag app allows users to submit photos and
locations of sightings to a team who will refer them on
to the Forestry Commission, which is leading efforts to
stop the disease's spread with the Department for
Environment, Food and Rural Affairs (Defra).
Chalara spread: 1992-2012
Benefits 5: Citizen Science
•
Benefits 6: The Economy
Benefits 7: Openness to public scrutiny
Benefits 8: Global challenges
Boundaries of openness?
Openness should be the default position, with
proportional exceptions for:
• Legitimate commercial interests (sectoral
variation)
• Privacy (completely anonymised data is
impossible)
• Safety, security & dual use (impacts
contentious)
All these boundaries are fuzzy
Openness of data per se has little value.
Open science is more than disclosure
For effective communication, replication and re-purposing we
need intelligent openness. Data and meta-data must be:
• Accessible
• Intelligible
• Assessable
• Re-usable
Only when these four criteria are fulfilled are data
properly open.
But, intelligent openness must be audience sensitive.
Intelligent openness to fellow citizens normally makes far
greater demands than openness to peers – we should
prioritise “PUBLIC INTEREST SCIENCE” for the former.
Responsibilities & actions
• Scientists: - changing the mindset
• Learned Societies: - influencing their communities
• Universities/Insts: - responsibility for the knowledge they produce?
- incentives & promotion criteria
- proactive, not just compliant
- strategies (e.g. the library)
- management processes
• Funders of research: - mandate intelligent openness
- accept diverse outputs
- cost of open data is a cost of science
- strategic funding for technical solutions
(a priority for international collaboration)
• Publishers: - mandate concurrent open deposition
A data management ecology?
The role of the top-down
and the bottom-up?
Massive data loss
Sum (little science data) >
Sum (big science data)?
Can libraries rise to the challenges of a post-
Gutenberg world?
“Libraries do the wrong things, employ the wrong people”
People
• Funders mandate novel customers – the public
• Can they attract data scientists?
• Support for researchers & students
Things
• Reversing centralisation
• A data repository – directory - metadata – background
• Dynamic data
• Selection problem
• Compliant or proactive?
A taxonomy of openness
Inputs Outputs
Open access
Administrative
data (held by
public
authorities e.g.
prescription
data)
Public Sector
Research data
(e.g. Met
Office weather
data)
Research
Data (e.g.
CERN,
generated in
universities)
Research
publications
(i.e. papers in
journals)
Open data
Open science
Collecting the
data
Doing
research
Doing science
openly
Researchers - Citizens - Citizen scientists – Businesses – Govt & Public sector
Science as a public enterprise
A realiseable aspiration: all scientific
literature open & online,
all data open & online, and for them to
interoperate
… but, this is a process, not an event!
www.royalsociety.org

Mais conteúdo relacionado

Mais procurados

Healthcare confidentiality training.2013bev
Healthcare confidentiality training.2013bevHealthcare confidentiality training.2013bev
Healthcare confidentiality training.2013bev
blk70130
 
Ethical issues of hiv and aids in health
Ethical issues of hiv and aids in healthEthical issues of hiv and aids in health
Ethical issues of hiv and aids in health
Babli Gupta
 
Donor education for donor recruitment and donor retention
Donor education for donor recruitment and donor retentionDonor education for donor recruitment and donor retention
Donor education for donor recruitment and donor retention
Rashmi Sd
 

Mais procurados (19)

Healthcare confidentiality training.2013bev
Healthcare confidentiality training.2013bevHealthcare confidentiality training.2013bev
Healthcare confidentiality training.2013bev
 
Facturadores medicos
Facturadores medicosFacturadores medicos
Facturadores medicos
 
Why you should donate blood?
Why you should donate blood?Why you should donate blood?
Why you should donate blood?
 
Confidentiality
ConfidentialityConfidentiality
Confidentiality
 
Improving the Health Outcomes of Both Patients AND Populations
Improving the Health Outcomes of Both Patients AND PopulationsImproving the Health Outcomes of Both Patients AND Populations
Improving the Health Outcomes of Both Patients AND Populations
 
RBITC Blood donation ppt 2013 b
RBITC Blood donation ppt 2013 bRBITC Blood donation ppt 2013 b
RBITC Blood donation ppt 2013 b
 
Patient Centered Care | Unit 1 Lecture
Patient Centered Care | Unit 1 LecturePatient Centered Care | Unit 1 Lecture
Patient Centered Care | Unit 1 Lecture
 
ICD-10 Presentation Takes Coding to New Heights
ICD-10 Presentation Takes Coding to New HeightsICD-10 Presentation Takes Coding to New Heights
ICD-10 Presentation Takes Coding to New Heights
 
Ethical issues of hiv and aids in health
Ethical issues of hiv and aids in healthEthical issues of hiv and aids in health
Ethical issues of hiv and aids in health
 
Bioethics defined
Bioethics definedBioethics defined
Bioethics defined
 
Medical ethics and public health
Medical ethics and public healthMedical ethics and public health
Medical ethics and public health
 
Blood donation
Blood donationBlood donation
Blood donation
 
Patients’ privacy and confidentiality
Patients’ privacy and confidentialityPatients’ privacy and confidentiality
Patients’ privacy and confidentiality
 
Overview of the Health Equity Framework Through Social Justice
Overview of the Health Equity Framework Through Social JusticeOverview of the Health Equity Framework Through Social Justice
Overview of the Health Equity Framework Through Social Justice
 
El Secreto Profesional
El Secreto ProfesionalEl Secreto Profesional
El Secreto Profesional
 
Hippocratic oath ppt
Hippocratic oath pptHippocratic oath ppt
Hippocratic oath ppt
 
Confidentiality
ConfidentialityConfidentiality
Confidentiality
 
Donor education for donor recruitment and donor retention
Donor education for donor recruitment and donor retentionDonor education for donor recruitment and donor retention
Donor education for donor recruitment and donor retention
 
Health Care Reform in the United States
Health Care Reform in the United StatesHealth Care Reform in the United States
Health Care Reform in the United States
 

Semelhante a A Revolution in Open Science: Open Data and the Role of Libraries (Professor Geoffrey Boulton at LIBER 2013)

Presentation of science 2.0 at European Astronomical Society
Presentation of science 2.0 at European Astronomical SocietyPresentation of science 2.0 at European Astronomical Society
Presentation of science 2.0 at European Astronomical Society
osimod
 
Science20brussels osimo april2013
Science20brussels osimo april2013Science20brussels osimo april2013
Science20brussels osimo april2013
osimod
 

Semelhante a A Revolution in Open Science: Open Data and the Role of Libraries (Professor Geoffrey Boulton at LIBER 2013) (20)

Why science needs open data – Jisc and CNI conference 10 July 2014
Why science needs open data – Jisc and CNI conference 10 July 2014Why science needs open data – Jisc and CNI conference 10 July 2014
Why science needs open data – Jisc and CNI conference 10 July 2014
 
Science as an Open Enterprise – Geoffrey Boulton
Science as an Open Enterprise – Geoffrey BoultonScience as an Open Enterprise – Geoffrey Boulton
Science as an Open Enterprise – Geoffrey Boulton
 
From Open Data to Open Science, by Geoffrey Boulton
 From Open Data to Open Science, by Geoffrey Boulton From Open Data to Open Science, by Geoffrey Boulton
From Open Data to Open Science, by Geoffrey Boulton
 
Presentation of science 2.0 at European Astronomical Society
Presentation of science 2.0 at European Astronomical SocietyPresentation of science 2.0 at European Astronomical Society
Presentation of science 2.0 at European Astronomical Society
 
The Biodiversity Informatics Landscape
The Biodiversity Informatics LandscapeThe Biodiversity Informatics Landscape
The Biodiversity Informatics Landscape
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon Hodson
 
Open Science Governance and Regulation/Simon Hodson
Open Science Governance and Regulation/Simon HodsonOpen Science Governance and Regulation/Simon Hodson
Open Science Governance and Regulation/Simon Hodson
 
The biodiversity informatics landscape: a systematics perspective
The biodiversity informatics landscape: a systematics perspectiveThe biodiversity informatics landscape: a systematics perspective
The biodiversity informatics landscape: a systematics perspective
 
eROSA Policy WS2: Second Stakeholder Workshop
eROSA Policy WS2: Second Stakeholder WorkshopeROSA Policy WS2: Second Stakeholder Workshop
eROSA Policy WS2: Second Stakeholder Workshop
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has Changed
 
How to overcome obstacles to data publication: Issues, requirements, and good...
How to overcome obstacles to data publication: Issues, requirements, and good...How to overcome obstacles to data publication: Issues, requirements, and good...
How to overcome obstacles to data publication: Issues, requirements, and good...
 
Science20brussels osimo april2013
Science20brussels osimo april2013Science20brussels osimo april2013
Science20brussels osimo april2013
 
Delivering biodiversity knowledge in the information age
Delivering biodiversity knowledge in the information ageDelivering biodiversity knowledge in the information age
Delivering biodiversity knowledge in the information age
 
The Era of Open
The Era of OpenThe Era of Open
The Era of Open
 
Safari syndrome
Safari syndromeSafari syndrome
Safari syndrome
 
The SAFARI syndrome. Implementing CRIS and open science
The SAFARI syndrome. Implementing CRIS and open scienceThe SAFARI syndrome. Implementing CRIS and open science
The SAFARI syndrome. Implementing CRIS and open science
 
Real-time applications of Data Science.pptx
Real-time applications  of Data Science.pptxReal-time applications  of Data Science.pptx
Real-time applications of Data Science.pptx
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptx
 
Open Data in a Big Data World: easy to say, but hard to do?
Open Data in a Big Data World: easy to say, but hard to do?Open Data in a Big Data World: easy to say, but hard to do?
Open Data in a Big Data World: easy to say, but hard to do?
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptx
 

Mais de LIBER Europe

Mais de LIBER Europe (20)

LIBER Europe Covid-19 Research Libraries Survey - December 2020
LIBER Europe Covid-19 Research Libraries Survey - December 2020LIBER Europe Covid-19 Research Libraries Survey - December 2020
LIBER Europe Covid-19 Research Libraries Survey - December 2020
 
LIBER Webinar: Turning FAIR Data Into Reality
LIBER Webinar: Turning FAIR Data Into RealityLIBER Webinar: Turning FAIR Data Into Reality
LIBER Webinar: Turning FAIR Data Into Reality
 
Copyright Reform: EU Legislative Process & LIBER Advocacy
Copyright Reform: EU Legislative Process & LIBER AdvocacyCopyright Reform: EU Legislative Process & LIBER Advocacy
Copyright Reform: EU Legislative Process & LIBER Advocacy
 
LIBER Webinar: Supporting Data Literacy
LIBER Webinar: Supporting Data LiteracyLIBER Webinar: Supporting Data Literacy
LIBER Webinar: Supporting Data Literacy
 
Applying Bourdieu's Field Theory to MLS Curricula Development. Charlotte Nord...
Applying Bourdieu's Field Theory to MLS Curricula Development. Charlotte Nord...Applying Bourdieu's Field Theory to MLS Curricula Development. Charlotte Nord...
Applying Bourdieu's Field Theory to MLS Curricula Development. Charlotte Nord...
 
Growing a Culture for Change at The University of Manchester Library. Penny H...
Growing a Culture for Change at The University of Manchester Library. Penny H...Growing a Culture for Change at The University of Manchester Library. Penny H...
Growing a Culture for Change at The University of Manchester Library. Penny H...
 
Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...
Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...
Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...
 
The GND initiative 2017-2021: Developing a Backbone for the Web of Cultural a...
The GND initiative 2017-2021: Developing a Backbone for the Web of Cultural a...The GND initiative 2017-2021: Developing a Backbone for the Web of Cultural a...
The GND initiative 2017-2021: Developing a Backbone for the Web of Cultural a...
 
The Role of Libraries in the Adoption of Research Data Management. Ingeborg V...
The Role of Libraries in the Adoption of Research Data Management. Ingeborg V...The Role of Libraries in the Adoption of Research Data Management. Ingeborg V...
The Role of Libraries in the Adoption of Research Data Management. Ingeborg V...
 
LibChain – Open, Verifiable and Anonymous Access Management. Juan Cabello, P...
 LibChain – Open, Verifiable and Anonymous Access Management. Juan Cabello, P... LibChain – Open, Verifiable and Anonymous Access Management. Juan Cabello, P...
LibChain – Open, Verifiable and Anonymous Access Management. Juan Cabello, P...
 
From Open Access to Open Data: Collaborative Work in the University Libraries...
From Open Access to Open Data: Collaborative Work in the University Libraries...From Open Access to Open Data: Collaborative Work in the University Libraries...
From Open Access to Open Data: Collaborative Work in the University Libraries...
 
The Perks and Challenges of Drawing Maps and Walking at the Same Time
The Perks and Challenges of Drawing Maps and Walking at the Same TimeThe Perks and Challenges of Drawing Maps and Walking at the Same Time
The Perks and Challenges of Drawing Maps and Walking at the Same Time
 
TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...
TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...
TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...
 
Text and Data Mining : Making the Most of a Copyright Exception. Julien Roche...
Text and Data Mining : Making the Most of a Copyright Exception. Julien Roche...Text and Data Mining : Making the Most of a Copyright Exception. Julien Roche...
Text and Data Mining : Making the Most of a Copyright Exception. Julien Roche...
 
Adoption and Integration of Persistent Identifiers in European Research Infor...
Adoption and Integration of Persistent Identifiers in European Research Infor...Adoption and Integration of Persistent Identifiers in European Research Infor...
Adoption and Integration of Persistent Identifiers in European Research Infor...
 
Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...
Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...
Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...
 
COUNTER Standards for Open Access: The Value of Measuring/The Measuring of Va...
COUNTER Standards for Open Access: The Value of Measuring/The Measuring of Va...COUNTER Standards for Open Access: The Value of Measuring/The Measuring of Va...
COUNTER Standards for Open Access: The Value of Measuring/The Measuring of Va...
 
Enabling the Exchange and use of Data in Agriculture
Enabling the Exchange and use of Data in AgricultureEnabling the Exchange and use of Data in Agriculture
Enabling the Exchange and use of Data in Agriculture
 
GDPR - Thoughts on the EU Data Protection Regulation, Research and Libraries
GDPR - Thoughts on the EU Data Protection Regulation, Research and LibrariesGDPR - Thoughts on the EU Data Protection Regulation, Research and Libraries
GDPR - Thoughts on the EU Data Protection Regulation, Research and Libraries
 
Research Data Services and Data Collections: Library Synergies for Economic R...
Research Data Services and Data Collections: Library Synergies for Economic R...Research Data Services and Data Collections: Library Synergies for Economic R...
Research Data Services and Data Collections: Library Synergies for Economic R...
 

Último

Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
PECB
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 

Último (20)

Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 

A Revolution in Open Science: Open Data and the Role of Libraries (Professor Geoffrey Boulton at LIBER 2013)

  • 1. A revolution in open science? - open data & the role of libraries - Geoffrey Boulton LIBER Munich June 2013
  • 2. Open communication of data: the source of a scientific revolution and of scientific progress Henry Oldenburg
  • 3. Problems & opportunities in the data deluge 1020 bytes Available storage
  • 4. The challenge to Oldenburg’s principle: - a crisis of replicability and credibility? A fundamental principle: the data providing the evidence for a published concept MUST be concurrently published, together with the metadata But what about the vast data volumes that are not used to support publication as well as those that are?
  • 5. The opportunity: new knowledge from data (some technical opportunities & priorities) Exploiting the potential of linked data requires: • data integration • dynamic data Solutions/agreements are needed for: • provenance • persistent identifiers • standards • data citation formats • algorithm integration • file-format translation • software-archiving • automated data reading • metadata generation • timing of data release
  • 6. BUT - its not just sharing and integrating data – its also what we do with it! Jim Gray - “When you go and look at what scientists are doing, day in and day out, in terms of data analysis, it is truly dreadful. We are embarrassed by our data!” ….. and we need a new breed of informatics-trained data scientist as the new librarians of the post- Gutenberg world but will they be in the Library? So what are the priorities? 1. Ensuring valid reasoning 2. Innovative manipulation to create new information 3. Effective management of the data ecology 4. Education & training in data informatics & statistics
  • 7. Benefits of open communication of data • Greater benefit to the individual than hugging their own data (e.g. bioinformatics) • Permits faster responses to emergencies (e.g. Hamburg 2011 E. Coli outbreak) • Deterrent to fraud (system integrity as well as personal integrity) • Stimulates novel and creative collaborations (e.g. Tim Gowers & “crowd-sourcing in mathematics) • Support for citizen science movement ( e.g. Galaxy Zoo, Fold-it, Ash-Tag, etc) • Demands by citizens for access (e.g. “Climategate”) • Efficient responses to global challenges
  • 8. Benefits 1 : data sharing in ethos & practice – the example of bio-informatics ELIXIR Hub (European Bioinformatic Institute) and ELIXIR Nodes provide infrastructure for data, computing, tools, standards and training.
  • 9. • E-coli outbreak spread through several countries affecting 4000 people • Strain analysed and genome released under an open data license. • Two dozen reports in a week with interest from 4 continents • Crucial information about strain’s virulence and resistance Benefits 2: Response to Gastro-intestinal infection in Hamburg
  • 10. “Scientific fraud is rife: it's time to stand up for good science” “Science is broken” Examples:  psychology academics making up data,  anaesthesiologist Yoshitaka Fujii with 172 faked articles  Nature - rise in biomedical retraction rates overtakes rise in published papers Cause: Rewards and pressures promote extreme behaviours, and normalise malpractice (e.g. selective publication of positive novel findings) Cures: Open data for replication Transparent peer review Not just personal integrity – but system integrity Benefits 3: Response to fraud
  • 11. Mathematics related discussions Tim Gowers - crowd-sourced mathematics An unsolved problem posed on his blog. 32 days – 27 people – 800 substantive contributions Emerging contributions rapidly developed or discarded Problem solved! “Its like driving a car whilst normal research is like pushing it” What inhibits such processes? - The criteria for credit and promotion. Benefits 4: Opening-up science: e.g. crowd-sourcing
  • 12. • Openly collected science is already helping policy makers. • AshTag app allows users to submit photos and locations of sightings to a team who will refer them on to the Forestry Commission, which is leading efforts to stop the disease's spread with the Department for Environment, Food and Rural Affairs (Defra). Chalara spread: 1992-2012 Benefits 5: Citizen Science
  • 14. Benefits 7: Openness to public scrutiny
  • 15. Benefits 8: Global challenges
  • 16. Boundaries of openness? Openness should be the default position, with proportional exceptions for: • Legitimate commercial interests (sectoral variation) • Privacy (completely anonymised data is impossible) • Safety, security & dual use (impacts contentious) All these boundaries are fuzzy
  • 17. Openness of data per se has little value. Open science is more than disclosure For effective communication, replication and re-purposing we need intelligent openness. Data and meta-data must be: • Accessible • Intelligible • Assessable • Re-usable Only when these four criteria are fulfilled are data properly open. But, intelligent openness must be audience sensitive. Intelligent openness to fellow citizens normally makes far greater demands than openness to peers – we should prioritise “PUBLIC INTEREST SCIENCE” for the former.
  • 18. Responsibilities & actions • Scientists: - changing the mindset • Learned Societies: - influencing their communities • Universities/Insts: - responsibility for the knowledge they produce? - incentives & promotion criteria - proactive, not just compliant - strategies (e.g. the library) - management processes • Funders of research: - mandate intelligent openness - accept diverse outputs - cost of open data is a cost of science - strategic funding for technical solutions (a priority for international collaboration) • Publishers: - mandate concurrent open deposition
  • 19. A data management ecology? The role of the top-down and the bottom-up? Massive data loss Sum (little science data) > Sum (big science data)?
  • 20. Can libraries rise to the challenges of a post- Gutenberg world? “Libraries do the wrong things, employ the wrong people” People • Funders mandate novel customers – the public • Can they attract data scientists? • Support for researchers & students Things • Reversing centralisation • A data repository – directory - metadata – background • Dynamic data • Selection problem • Compliant or proactive?
  • 21. A taxonomy of openness Inputs Outputs Open access Administrative data (held by public authorities e.g. prescription data) Public Sector Research data (e.g. Met Office weather data) Research Data (e.g. CERN, generated in universities) Research publications (i.e. papers in journals) Open data Open science Collecting the data Doing research Doing science openly Researchers - Citizens - Citizen scientists – Businesses – Govt & Public sector Science as a public enterprise
  • 22. A realiseable aspiration: all scientific literature open & online, all data open & online, and for them to interoperate … but, this is a process, not an event!

Notas do Editor

  1. You have just been discussing global challenges; major targets for modern science – about science for policy. I will talk about some of the processes of science that will be vital if those challenges are to be effectively met – about policy for science. I restrict my talk to science itself, Chas Bountra will talk about one of the main domains of application – that of the commercial.
  2. But first a little helpful history. This is Henry Oldenberg, the first secretary of the newly formed Royal Society in the early 1660s. Henry was an inveterate correspondent, with those we would now call scientists both in Europe and beyond. Rather than keep this correspondence private, he thought it would be a good idea to publish it, and persuaded the new Society to do so by creating the Philosophical Transactions, which remains a top-flight journal to the present day. But he demanded two things of his correspondents: that they should submit in the vernacular and not Latin; and that evidence (data) that supported a concept must be published together with the concept. It permitted others to scrutinize the logic of the concept, the extent to which it was supported by the data and permitted replication and re-use. Open publication of concept and evidence is the basis of “scientific self-correction”, which historians of science argue were the crucial building blocks on which the scientific revolution of the 18th and 19th centuries was built and remain fundamental to the progress of science, and its value to humanity as the most reliable means of acquiring knowledge. Openness to scrutiny by scientific peers is the most powerful form of peer review.
  3. But Oldenberg’s world has changed. The last 20 years has seen a revolution in the rate at which data can be acquired, in the volume and complexity that can be stored and in the immediacy of ubiquitous communication. We have an unprecedented data storm. This poses challenges and opportunities in the way science is done.
  4. The fundamental challenge is to scientific self-correction. Journals can no longer contain the data, and neither scientists nor journals have taken the obvious step of having data relevant to a publication concurrently available in an electronic database. (example of last year’s Nature paper revealing that only 11% of results in 50 benchmark papers in pre-clinical oncology were replicable – a crisis of credibility?)But enormous data volumes are created by publicly funded research that are never used as the basis for publication. There is a powerful argument that open sharing of such data has enormous potential.
  5. The opportunity is, as some have argued, of another scientific revolution of a scale and significance equivalent to that of the 18th and 19th centuries. Integrating data from a large number of cognate databases, ensuring dynamic data that is automatically up-dated, and using powerful techniques for database interrogation and data and text mining permit us to ask questions in new ways with the potential to gain a more profound understanding of deep data relationships and the information and knowledge that they potentially contain, as promised by the vision of a semantic web.
  6. Henry Oldenburg: the scientific journal and the process of peer review  Henry Oldenburg (1619-1677) was a German theologian who became the first Secretary of the Royal Society.  He corresponded with the leading scientists of Europe, and believed that rather than waiting for entire books to be published, letters were much better suited to quick communication of facts or new discoveries.  He invited people to write to him, even laymen who were not involved with science but had discovered some item of knowledge. He no longer required that science be conveyed in Latin, but in any vernacular language. From these letters the idea of printing scientific papers or articles in a scientific journal was born. In creating the Philosophical Transactions of the Royal Society in 1665, he wrote:   "It is therefore thought fit to employ the [printing] press, as the most proper way to gratify those [who] . . . delight in the advancement of Learning and profitable Discoveries [and who are] invited and encouraged to search, try, and find out new things, impart their knowledge to one another, and contribute what they can to the Grand Design of improving Natural Knowledge . . . for the Glory of God . . . and the Universal Good of Mankind."  Oldenburg also initiated the process of peer review of submissions by asking three of the Society’s Fellows who had more knowledge of the matters in question than he, to comment on submissions prior to making the decision about whether to publish.REFAaron Klug, (2000) “Address of the President, Sir Aaron Klug, O.M., P.R.S., Given at the Anniversary Meeting on 30 November 1999”, Notes Rec. R. Soc. Lond. 2000 54, 99-108.Marie Boas Hall, Henry Oldenburg: Shaping the Royal Society (Oxford: Oxford University Press 2002).
  7. Open communication of data offers many benefits:  Data sharing in specific scientific communities offers greater benefits to the individual than does hugging their own data (e.g. bio-informatics).Data sharing permits faster, more efficient response to emergencies (e.g. the 2011 Hamburg-centred e-coli infection).Mandating open data concurrent with publication has the potential to deter fraud (examples of invented data) and malpractice (mention clinical trials) – stress system integrity as well as personal integrity.Openness stimulates novel, highly creative and efficient modes of scientific collaboration (e.g. crowd sourcing and Tim Gowers).The stimulus it offers to the “citizen science movement”, which has the potential in the next decade or so to fundamentally change the social dynamics of science.A response to the increasing demand from many citizens to interrogate for themselves the evidence for a particular policy that may have major impacts on the lives of individuals and society, and after all they pay, through their taxes for publicly funded scienceAnd finally, and crucially, it offers more efficient and speedier means of addressing many modern science-related challenges (e.g. climate change; energy; infectious pandemics etc)
  8. Henry Oldenburg: the scientific journal and the process of peer review  Henry Oldenburg (1619-1677) was a German theologian who became the first Secretary of the Royal Society.  He corresponded with the leading scientists of Europe, and believed that rather than waiting for entire books to be published, letters were much better suited to quick communication of facts or new discoveries.  He invited people to write to him, even laymen who were not involved with science but had discovered some item of knowledge. He no longer required that science be conveyed in Latin, but in any vernacular language. From these letters the idea of printing scientific papers or articles in a scientific journal was born. In creating the Philosophical Transactions of the Royal Society in 1665, he wrote:   "It is therefore thought fit to employ the [printing] press, as the most proper way to gratify those [who] . . . delight in the advancement of Learning and profitable Discoveries [and who are] invited and encouraged to search, try, and find out new things, impart their knowledge to one another, and contribute what they can to the Grand Design of improving Natural Knowledge . . . for the Glory of God . . . and the Universal Good of Mankind."  Oldenburg also initiated the process of peer review of submissions by asking three of the Society’s Fellows who had more knowledge of the matters in question than he, to comment on submissions prior to making the decision about whether to publish.REFAaron Klug, (2000) “Address of the President, Sir Aaron Klug, O.M., P.R.S., Given at the Anniversary Meeting on 30 November 1999”, Notes Rec. R. Soc. Lond. 2000 54, 99-108.Marie Boas Hall, Henry Oldenburg: Shaping the Royal Society (Oxford: Oxford University Press 2002).
  9. Henry Oldenburg: the scientific journal and the process of peer review  Henry Oldenburg (1619-1677) was a German theologian who became the first Secretary of the Royal Society.  He corresponded with the leading scientists of Europe, and believed that rather than waiting for entire books to be published, letters were much better suited to quick communication of facts or new discoveries.  He invited people to write to him, even laymen who were not involved with science but had discovered some item of knowledge. He no longer required that science be conveyed in Latin, but in any vernacular language. From these letters the idea of printing scientific papers or articles in a scientific journal was born. In creating the Philosophical Transactions of the Royal Society in 1665, he wrote:   "It is therefore thought fit to employ the [printing] press, as the most proper way to gratify those [who] . . . delight in the advancement of Learning and profitable Discoveries [and who are] invited and encouraged to search, try, and find out new things, impart their knowledge to one another, and contribute what they can to the Grand Design of improving Natural Knowledge . . . for the Glory of God . . . and the Universal Good of Mankind."  Oldenburg also initiated the process of peer review of submissions by asking three of the Society’s Fellows who had more knowledge of the matters in question than he, to comment on submissions prior to making the decision about whether to publish.REFAaron Klug, (2000) “Address of the President, Sir Aaron Klug, O.M., P.R.S., Given at the Anniversary Meeting on 30 November 1999”, Notes Rec. R. Soc. Lond. 2000 54, 99-108.Marie Boas Hall, Henry Oldenburg: Shaping the Royal Society (Oxford: Oxford University Press 2002).
  10. Henry Oldenburg: the scientific journal and the process of peer review  Henry Oldenburg (1619-1677) was a German theologian who became the first Secretary of the Royal Society.  He corresponded with the leading scientists of Europe, and believed that rather than waiting for entire books to be published, letters were much better suited to quick communication of facts or new discoveries.  He invited people to write to him, even laymen who were not involved with science but had discovered some item of knowledge. He no longer required that science be conveyed in Latin, but in any vernacular language. From these letters the idea of printing scientific papers or articles in a scientific journal was born. In creating the Philosophical Transactions of the Royal Society in 1665, he wrote:   "It is therefore thought fit to employ the [printing] press, as the most proper way to gratify those [who] . . . delight in the advancement of Learning and profitable Discoveries [and who are] invited and encouraged to search, try, and find out new things, impart their knowledge to one another, and contribute what they can to the Grand Design of improving Natural Knowledge . . . for the Glory of God . . . and the Universal Good of Mankind."  Oldenburg also initiated the process of peer review of submissions by asking three of the Society’s Fellows who had more knowledge of the matters in question than he, to comment on submissions prior to making the decision about whether to publish.REFAaron Klug, (2000) “Address of the President, Sir Aaron Klug, O.M., P.R.S., Given at the Anniversary Meeting on 30 November 1999”, Notes Rec. R. Soc. Lond. 2000 54, 99-108.Marie Boas Hall, Henry Oldenburg: Shaping the Royal Society (Oxford: Oxford University Press 2002).
  11. Openness of itself has no value unless it is “intelligent openness”, where data are:Accessible – they can be foundIntelligible – they can be understoodAssessable – e.g. does the creator have an interest in a particular outcome?Re-useable – sufficient meta-data to permit re-use and re-purposing.These should be standard criteria for an open data regime.But we must recognise that the amount of meta and background data required for intelligent openness to fellow citizens is usually far greater than that required for openness to scientific peers. If all data were to be intelligently open to fellow citizens on the basis that that have ultimately paid for it, science would stop tomorrow. A way forward would be to make a much greater effort to make data intelligently open in what we could call “public interest science”, including those issues that frequently arise in public debate or concern.