Data accessibility and the 
role of informatics in 
predicting the biosphere 
Alex Hardisty 
Director of Informatics Projects, 
School of Computer Science & Informatics 
Coordinator, FP7 BioVeL project www.biovel.eu 
email: hardistyar@cardiff.ac.uk 
/alexhardisty (occasionally!) 
Structuring the biodiversity informatics community at the European level and beyond 
Biodiversity Informatics Horizons 2013 
180 experts conclude that there is 
“a growing need for predictive biosphere modelling” 
• Integration: Make better use of what we have 
• Cooperation: Data from the whole world is needed 
• Promotion: Europe is well placed to offer leadership 
What if …? 
Imagine if we could … 
… Predict community level dynamics of 
ecosystems (i.e., behaviours) at scales 
from local to global, based on the 
ecology and biology of all individual 
organisms … 
e.g., Ecosystems: Time to model all life on Earth. Purves et al., 
Nature 493 (2013) 
Image: StuartMiles / FreeDigitalPhotos.net
Imagine if we could … 
… Measure and calculate “Essential Biodiversity Variables” … 
… for any geographic area (continental, regional, local), by any person 
anywhere, using data for that area that may be held by any (research) 
infrastructure. Not only that, but also learn how to forecast EBVs
Depend on collaboration to deliver the evidence, i.e., based 
on synthesis and modelling of 
• Increasingly large amounts of data from multiple sources 
(environmental, taxonomic, genomic and ecological) 
• Gathered by manual observation and automated sensors, 
digitisation, nextgen sequencing and remote sensing 
Beyond the abilities of any one individual or any single 
research community to collect, observe or generate. 
Variety, Velocity and Volume of “Big Data” 
Photo: Smokestacks against skyline and sunset, Estonia. © Curt Carnemark / World Bank Photo Collection
From an informatics perspective, how close are we to that?
9 research infrastructures from around the world exhibit “a satisfactory level of potential interoperability”
[Figure: charts scored from 0% to 100% under the headings Data, Technology and General. Axis labels: Topical coverage; Data sharing and QC; Data types; Data source tracking; Data citation tracking; Data integration; User applications & interfaces; Funding; Access policy; GIS; Standards; Software architecture; Programming languages; Authentication; Authorization; Middleware; Computing infrastructure; Service logic; Geographical coverage; Infrastructure topology; Native interoperability and enablers; Merging of science & policy needs; Merging of science & industry needs; Engagement of citizens; Licensing and business model]
A computational challenge: Greater than that of weather 
forecasting; greater than that of climate prediction? 
Image from climateprediction.net 
Harfoot MBJ, Newbold T, Tittensor DP, Emmott S, et al. (2014) Emergent Global 
Patterns of Ecosystem Structure and Function from a Mechanistic General 
Ecosystem Model. PLoS Biol 12(4): e1001841. doi:10.1371/journal.pbio.1001841 
For 1km resolution, “… 3 
to 6 orders of magnitude 
larger, … an exascale 
problem” 
Jack K. Horner 
Independent consultant & 
Adviser to KU Biodiversity Institute
The situation today can be 
likened to meteorology in the 
1950s, 60s and 70s (and 
later in climatology) when 
the emergence of numerical 
weather prediction drove 
demand for: 
• New observations 
• The emergence of a global 
infrastructure for acquiring, 
mobilising and normalising 
data, and 
• Better models of global 
atmospheric behaviour 
Accessible data is useful data, not just for research 
Global policies/reports 
Regional 
policies/reports 
National 
policies/reports 
Data and information 
Direct provision of data/information 
Indirect provision through reports 
Assessment processes 
Green accounting etc 
Diagram courtesy of EC FP7 EU BON project
To be able to predict the biosphere we need to 
mobilise data and make it accessible 
It’s a journey towards 
• Global data, covering the whole planet. There are 
significant gaps everywhere today 
• Making all our small-scale, local data – which often 
characterises the current day practice of field 
ecology – global 
That is to say, we have to mobilise, clean, normalise 
and quality assure many small sets of data that 
together can give us the global data we need to 
calibrate models 
We are achieving that for certain classes of data but 
it is not without its difficulties 
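The mobilise, clean, normalise and quality-assure steps above can be sketched in a few lines of Python. This is a minimal illustration only: the field names (`species`, `lat`, `lon`), the dataset names and the quality checks are assumptions made for the example, not a community standard or the approach of any particular infrastructure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Occurrence:
    species: str   # normalised scientific name
    lat: float     # decimal degrees
    lon: float     # decimal degrees
    source: str    # which local dataset the record came from


def normalise(raw: dict, source: str) -> Optional[Occurrence]:
    """Clean one raw record; return None if it fails basic quality checks."""
    try:
        species = " ".join(str(raw["species"]).split()).capitalize()
        lat, lon = float(raw["lat"]), float(raw["lon"])
    except (KeyError, TypeError, ValueError):
        return None  # missing or unparseable field
    if not species or not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None  # empty name or impossible coordinates
    return Occurrence(species, lat, lon, source)


def mobilise(datasets: dict) -> list:
    """Merge many small datasets, dropping bad records and exact duplicates."""
    seen, merged = set(), []
    for source, records in datasets.items():
        for raw in records:
            occ = normalise(raw, source)
            if occ is None:
                continue
            key = (occ.species, occ.lat, occ.lon)
            if key not in seen:
                seen.add(key)
                merged.append(occ)
    return merged


# Hypothetical local datasets, invented for this example.
local_datasets = {
    "survey_a": [{"species": "  rana   temporaria", "lat": "51.48", "lon": "-3.18"}],
    "survey_b": [
        {"species": "Rana temporaria", "lat": 51.48, "lon": -3.18},  # duplicate of survey_a
        {"species": "Bufo bufo", "lat": 123.0, "lon": 0.0},          # impossible latitude
        {"species": "Bufo bufo", "lat": 52.0, "lon": -1.0},
    ],
}
global_records = mobilise(local_datasets)
```

Each merged record keeps a `source` field, so the origin of every data point remains traceable after the small sets have been combined.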
Issues arise in each of the 4 stages 
of mobilising data for synthesis 
• Data acquisition 
– Standardised measurement protocols 
• Data curation 
– Assigning the right metadata and persistent identifiers 
– Finding a home for the data – and putting it there 
• Data discovery and access 
– Finding relevant data 
– Machine-readable access to data, i.e., a web service (WS) front-end 
• Data processing / analysis, including re-use 
– Owners want attribution 
– Tracking provenance and following licensing conditions 
– Problems at every step, on every workflow run 
http://envri.eu/rm
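To make the four stages concrete, the following sketch carries one dataset from acquisition through curation, discovery and processing. It is an illustration under stated assumptions: the `Dataset` class, its fields, the identifier string and the protocol name are all hypothetical, and it does not implement the ENVRI Reference Model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    values: List[float]
    protocol: str                      # acquisition: measurement protocol used
    identifier: Optional[str] = None   # curation: persistent identifier
    license: Optional[str] = None      # curation: licence recorded as metadata
    owner: Optional[str] = None        # who must be attributed on re-use
    provenance: List[str] = field(default_factory=list)


def curate(ds: Dataset, identifier: str, license: str, owner: str) -> Dataset:
    """Curation: attach metadata and a persistent identifier."""
    ds.identifier, ds.license, ds.owner = identifier, license, owner
    ds.provenance.append(f"curated as {identifier}")
    return ds


def discover(catalogue: List[Dataset], protocol: str) -> List[Dataset]:
    """Discovery and access: find curated datasets by their metadata."""
    return [ds for ds in catalogue if ds.protocol == protocol and ds.identifier]


def process(ds: Dataset) -> dict:
    """Processing: refuse unlicensed data, log the step, keep the attribution."""
    if ds.license is None:
        raise PermissionError(f"{ds.identifier}: no licence, re-use terms unknown")
    ds.provenance.append("mean computed")
    return {
        "mean": sum(ds.values) / len(ds.values),
        "attribution": ds.owner,
        "source": ds.identifier,
    }


# All names and identifiers below are hypothetical.
ds = Dataset(values=[2.0, 4.0], protocol="hypothetical-transect-v1")
curate(ds, "hypothetical:pid/0001", "CC0-1.0", "Example Field Station")
found = discover([ds], "hypothetical-transect-v1")
result = process(found[0])
```

The point of the sketch is that attribution, licence and provenance travel with the data object, so every later workflow run can check and report them.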
See also: 
“Showing you this map of aggregated bullfrog occurrences would be illegal” 
http://peterdesmet.com/posts/illegal-bullfrogs.html 
“Our analysis of the licenses of all 11.000+ GBIF registered datasets shows a 
bleak picture. Very few GBIF registered datasets can be easily and legally 
used, let alone without restrictions. This is mainly due to data being 
published with no or a non-standard license.” 
Peter Desmet and Bart Aelterman, 22nd Nov 2013, peterdesmet.com
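The kind of licence audit described in the quote can be sketched as a simple classification into "standard", "non-standard" and "none". This is not the authors' actual analysis: the set of licences treated as standard and the dataset entries are assumptions chosen for illustration.

```python
from collections import Counter

# Assumption: which licence identifiers count as "standard" for this example.
STANDARD_LICENSES = {"CC0-1.0", "CC-BY-4.0"}


def classify(license_text):
    """Return 'none', 'standard' or 'non-standard' for a dataset's licence field."""
    if not license_text or not license_text.strip():
        return "none"
    if license_text.strip() in STANDARD_LICENSES:
        return "standard"
    return "non-standard"


# Hypothetical registry entries: dataset key -> licence field as published.
datasets = {
    "ds-1": "CC0-1.0",
    "ds-2": "Free for research, ask first",
    "ds-3": "",
    "ds-4": None,
}

summary = Counter(classify(text) for text in datasets.values())
easily_usable = [key for key, text in datasets.items() if classify(text) == "standard"]
```

Only datasets in the "standard" group can be combined by a machine without someone reading free-text terms, which is why missing and non-standard licences block aggregation at scale.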
Data re-use: Owners want attribution 
Example 1) Taxonomic data refinement Workflow 
BioSTIF 
CoL 3 levels of attribution 
• complete work 
• contributing database of the record 
• expert who provides taxonomic 
scrutiny of the individual record. 
Tool 
license(s) 
GBIF data use agreement 
• Respect restrictions of access to sensitive data. 
• Identifier of ownership of data must be retained with every data record (through the workflow) 
• Publicly acknowledge the Data Publishers whose biodiversity data they have used. 
• Any additional terms and conditions of use set by the Data Publisher.
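The requirement that an ownership identifier stays with every record through the workflow can be illustrated as follows. The record layout (`name`, `owner_id`, `publisher`) and all values are hypothetical; this sketch is not part of the GBIF agreement or of any BioVeL workflow.

```python
def refine_names(records):
    """Taxonomic refinement step: tidy the name, copy every other field unchanged."""
    return [
        {**rec, "name": " ".join(rec["name"].split()).capitalize()}
        for rec in records
    ]


def ownership_retained(before, after):
    """True if each output record still carries the owner id of its input record."""
    return len(before) == len(after) and all(
        b["owner_id"] == a.get("owner_id") for b, a in zip(before, after)
    )


def acknowledgement(records):
    """Build one acknowledgement line naming every data publisher used."""
    publishers = sorted({rec["publisher"] for rec in records})
    return "Data published by: " + "; ".join(publishers)


# Hypothetical records, invented for this example.
records = [
    {"name": "  puma   CONCOLOR", "owner_id": "hypothetical:owner/1", "publisher": "Example Museum"},
    {"name": "lynx lynx", "owner_id": "hypothetical:owner/2", "publisher": "Example Survey"},
]
refined = refine_names(records)
credit_line = acknowledgement(refined)
```

A check such as `ownership_retained` could run after each step, so that a tool which silently drops the identifier is caught before results are published.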
More problems at every step, on every run 
Example 2) Niche Modelling Workflow 
Workflow phases: Create model, Model test, Model projection 
Workflow steps shown in the diagram: 
• High quality occurrence data set 
• Select layers with environmental factors that are likely to influence the distribution of the species 
• Select algorithm 
• Select parameter values for the chosen algorithm 
• Assemble the model on openModeller service 
• Test the performance of the parameter in the model 
• Test performance of the distribution prediction on the model 
• Project Model with original layers 
• Select prediction layers 
• Project Model with prediction layers 
• Changing algorithm, parameter values, and set of layers 
• Visualize and publish results 
Issues annotated on the diagram: 
• License on algorithm 
• License on software 
• Licenses on environmental data layers 
• Permissions to use 
• AuthN/AuthZ 
• Moving data from one service to another 
• 3rd party software 
• All issues associated with publication
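The create, test and project phases can be illustrated with a toy "environmental envelope" model in plain Python. This is a deliberately simple stand-in: it does not call the openModeller service or reproduce any of its algorithms, and the layer names, cell ids and values are invented for the example.

```python
def create_model(occurrences, layers):
    """Create model: record the min/max of each layer at known occurrence cells."""
    return {
        name: (min(layer[c] for c in occurrences), max(layer[c] for c in occurrences))
        for name, layer in layers.items()
    }


def predict(model, layers, cell):
    """A cell is predicted suitable if every layer value lies inside the envelope."""
    return all(lo <= layers[name][cell] <= hi for name, (lo, hi) in model.items())


def evaluate(model, layers, presences, absences):
    """Model test: fraction of held-out cells classified correctly."""
    hits = sum(predict(model, layers, c) for c in presences)
    hits += sum(not predict(model, layers, c) for c in absences)
    return hits / (len(presences) + len(absences))


def project(model, layers):
    """Model projection: set of cells predicted suitable under the given layers."""
    cells = next(iter(layers.values())).keys()
    return {c for c in cells if predict(model, layers, c)}


# Hypothetical environmental layers: layer name -> {cell id: value}.
original_layers = {
    "temperature": {"a": 10, "b": 12, "c": 25, "d": 11},
    "rainfall": {"a": 800, "b": 900, "c": 300, "d": 850},
}
prediction_layers = {
    "temperature": {"a": 13, "b": 12, "c": 20, "d": 11},
    "rainfall": {"a": 800, "b": 820, "c": 400, "d": 850},
}

model = create_model(["a", "b"], original_layers)
accuracy = evaluate(model, original_layers, presences=["d"], absences=["c"])
suitable_now = project(model, original_layers)
suitable_later = project(model, prediction_layers)
```

Even in this toy version, each input (occurrence set, layers, algorithm) is a separate component, which is where the licence and permission questions listed above would apply in a real run.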
In a recent EU BON study 
Only 35% of surveyed datasets 
(wider scope than just GBIF) are 
accessible under an open license or 
waiver, without restriction on use 
For 29 scientific questions relating to 
needs of European environmental 
policy, the availability of datasets to 
answer the questions is in the range 
‘satisfactory’ (3) to ‘poor’ (2) 
Multiple initiatives to make data more accessible; 
some are general purpose 
https://rd-alliance.org/ 
… builds the social and technical bridges that enable open sharing of data … 
researchers and innovators openly sharing data across technologies, disciplines, 
and countries to address the grand challenges of society. 
http://www.datafairport.org/ 
… successful community supported conventions, policies and practices for data 
identifiers, formats, checklists and vocabularies that enable data interoperability, 
citation and stewardship. 
ORCID and DataCite initiatives to uniquely identify (respectively) scientists and data sets
Some are more domain specific 
Promoting free and open access 
to biodiversity information 
A framework to focus 
effort and investment 
to deliver biodiversity 
knowledge more 
effectively 
www.biodiversityinformatics.org/ 
www.bouchout-declaration.org
A shared and maintained multi-purpose network of 
computationally-based processing services in an open 
data domain 
Image: CoolDesign / FreeDigitalPhotos.net 
With 78 contributors, we 
published the whitepaper, 
April 2013 - since viewed 
more than 34,000 times.
Building a heterogeneous Service Network 
Users’ workflows and 
applications 
Sustained Service and 
Data Providers 
GBIF, CoL, OBIS, WoRMS, 
EMBL-EBI, BGBM, CRIA, EoL, 
BHL, ALA, LTER, etc. & more. 
www.biodiversitycatalogue.org 
Recognised and stable 
Infrastructure Providers 
National, EGI.eu, PRACE, 
commercial, EUDAT, etc.
Preparing the next, coordinated steps 
Diagram from LinkD Concept Note, September 2014
What do we want to do in LinkD? 
Develop the highly responsive digital framework required to enable high 
throughput research and support science of scale towards the long term vision of 
modelling Life on Earth 
LinkD: Science of Scale for Life on Earth 
ELODINS ENVRI+ 
From slides by Vince Smith, LinkD proposal coordinator, Natural History Museum, London
Take home message: “It’s a journey” 
• Accessible data is the enabler of “in-silico” science 
that leads towards predicting the biosphere 
• A shared multi-purpose network of processing 
services, sitting on top of open data is the route to 
interoperability 
• Working together as a community is essential 
Photo: A lone farmer walks among rice paddies. © DFATD-MAECD/Tick Collins

Mais conteúdo relacionado

Mais procurados

Aaas Data Intensive Science And Grid
Aaas Data Intensive Science And GridAaas Data Intensive Science And Grid
Aaas Data Intensive Science And Grid
Ian Foster
 
David Park APAN Slid..
David Park APAN Slid..David Park APAN Slid..
David Park APAN Slid..
Videoguy
 

Mais procurados (20)

Big Data as a Catalyst for Collaboration & Innovation
Big Data as a Catalyst for Collaboration & InnovationBig Data as a Catalyst for Collaboration & Innovation
Big Data as a Catalyst for Collaboration & Innovation
 
There is No Intelligent Life Down Here
There is No Intelligent Life Down HereThere is No Intelligent Life Down Here
There is No Intelligent Life Down Here
 
Accelerating Science, Technology and Innovation Through Open Data and Open Sc...
Accelerating Science, Technology and Innovation Through Open Data and Open Sc...Accelerating Science, Technology and Innovation Through Open Data and Open Sc...
Accelerating Science, Technology and Innovation Through Open Data and Open Sc...
 
Aaas Data Intensive Science And Grid
Aaas Data Intensive Science And GridAaas Data Intensive Science And Grid
Aaas Data Intensive Science And Grid
 
Highlights from NIH Data Science
Highlights from NIH Data ScienceHighlights from NIH Data Science
Highlights from NIH Data Science
 
Turning FAIR into Reality: Briefing on the EC’s report on FAIR data
Turning FAIR into Reality: Briefing on the EC’s report on FAIR dataTurning FAIR into Reality: Briefing on the EC’s report on FAIR data
Turning FAIR into Reality: Briefing on the EC’s report on FAIR data
 
What is eScience, and where does it go from here?
What is eScience, and where does it go from here?What is eScience, and where does it go from here?
What is eScience, and where does it go from here?
 
SWOT Analysis - What Does it Tell Us?
SWOT Analysis - What Does it Tell Us?SWOT Analysis - What Does it Tell Us?
SWOT Analysis - What Does it Tell Us?
 
Australia's Environmental Predictive Capability
Australia's Environmental Predictive CapabilityAustralia's Environmental Predictive Capability
Australia's Environmental Predictive Capability
 
EGI Engage: Impact & Results
EGI Engage: Impact & ResultsEGI Engage: Impact & Results
EGI Engage: Impact & Results
 
Bioinformatics in the Era of Open Science and Big Data
Bioinformatics in the Era of Open Science and Big DataBioinformatics in the Era of Open Science and Big Data
Bioinformatics in the Era of Open Science and Big Data
 
Big Data
Big Data Big Data
Big Data
 
Cri big data
Cri big dataCri big data
Cri big data
 
Towards the Digital Research Enterprise
Towards the Digital Research EnterpriseTowards the Digital Research Enterprise
Towards the Digital Research Enterprise
 
UK e-Infrastructure for Research - UK/USA HPC Workshop, Oxford, July 2015
UK e-Infrastructure for Research - UK/USA HPC Workshop, Oxford, July 2015UK e-Infrastructure for Research - UK/USA HPC Workshop, Oxford, July 2015
UK e-Infrastructure for Research - UK/USA HPC Workshop, Oxford, July 2015
 
Implications of the Fourth Paradigm
Implications of the Fourth ParadigmImplications of the Fourth Paradigm
Implications of the Fourth Paradigm
 
I o dav data workshop prof wafula final 19.9.17
I o dav data workshop prof wafula final 19.9.17I o dav data workshop prof wafula final 19.9.17
I o dav data workshop prof wafula final 19.9.17
 
Data sharing for development: a case of Infrastructural development in Uganda...
Data sharing for development: a case of Infrastructural development in Uganda...Data sharing for development: a case of Infrastructural development in Uganda...
Data sharing for development: a case of Infrastructural development in Uganda...
 
David Park APAN Slid..
David Park APAN Slid..David Park APAN Slid..
David Park APAN Slid..
 
The NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAGThe NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAG
 

Semelhante a Data accessibility and the role of informatics in predicting the biosphere

Ontology Tutorial: Semantic Technology for Intelligence, Defense and Security
Ontology Tutorial: Semantic Technology for Intelligence, Defense and SecurityOntology Tutorial: Semantic Technology for Intelligence, Defense and Security
Ontology Tutorial: Semantic Technology for Intelligence, Defense and Security
Barry Smith
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
Carole Goble
 

Semelhante a Data accessibility and the role of informatics in predicting the biosphere (20)

A Data Biosphere for Biomedical Research
A Data Biosphere for Biomedical ResearchA Data Biosphere for Biomedical Research
A Data Biosphere for Biomedical Research
 
NIST Big Data Public Working Group NBD-PWG
NIST Big Data Public Working Group NBD-PWGNIST Big Data Public Working Group NBD-PWG
NIST Big Data Public Working Group NBD-PWG
 
Open Science - Global Perspectives/Simon Hodson
Open Science - Global Perspectives/Simon HodsonOpen Science - Global Perspectives/Simon Hodson
Open Science - Global Perspectives/Simon Hodson
 
Open Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon HodsonOpen Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon Hodson
 
Open Data in a GIS-perspective - Dr. Joep Crompvoets
Open Data in a GIS-perspective - Dr. Joep CrompvoetsOpen Data in a GIS-perspective - Dr. Joep Crompvoets
Open Data in a GIS-perspective - Dr. Joep Crompvoets
 
Dealing with Semantic Heterogeneity in Real-Time Information
Dealing with Semantic Heterogeneity in Real-Time InformationDealing with Semantic Heterogeneity in Real-Time Information
Dealing with Semantic Heterogeneity in Real-Time Information
 
big_data_casestudies_2.ppt
big_data_casestudies_2.pptbig_data_casestudies_2.ppt
big_data_casestudies_2.ppt
 
SemWeb 4 Gov – opportunities and challenges
SemWeb 4 Gov – opportunities and challengesSemWeb 4 Gov – opportunities and challenges
SemWeb 4 Gov – opportunities and challenges
 
Rdaeu russia_fg_1_july2014_final
Rdaeu  russia_fg_1_july2014_finalRdaeu  russia_fg_1_july2014_final
Rdaeu russia_fg_1_july2014_final
 
Ontology Tutorial: Semantic Technology for Intelligence, Defense and Security
Ontology Tutorial: Semantic Technology for Intelligence, Defense and SecurityOntology Tutorial: Semantic Technology for Intelligence, Defense and Security
Ontology Tutorial: Semantic Technology for Intelligence, Defense and Security
 
Turning FAIR data into reality
Turning FAIR data into realityTurning FAIR data into reality
Turning FAIR data into reality
 
Software Sustainability Institute
Software Sustainability InstituteSoftware Sustainability Institute
Software Sustainability Institute
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
 
Sinnott Paper
Sinnott PaperSinnott Paper
Sinnott Paper
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon Hodson
 
Open Data is not Enough
Open Data is not EnoughOpen Data is not Enough
Open Data is not Enough
 
Critique and Reflections on Open Data Initiatives
Critique and Reflections on  Open Data  InitiativesCritique and Reflections on  Open Data  Initiatives
Critique and Reflections on Open Data Initiatives
 
Towards a big data roadmap for europe
Towards a big data roadmap for europeTowards a big data roadmap for europe
Towards a big data roadmap for europe
 
Enabling the physical world to the Internet and potential benefits for agricu...
Enabling the physical world to the Internet and potential benefits for agricu...Enabling the physical world to the Internet and potential benefits for agricu...
Enabling the physical world to the Internet and potential benefits for agricu...
 
What is Data Commons and How Can Your Organization Build One?
What is Data Commons and How Can Your Organization Build One?What is Data Commons and How Can Your Organization Build One?
What is Data Commons and How Can Your Organization Build One?
 

Mais de Alex Hardisty

10th e concertation-brussels-06march2013-v2
10th e concertation-brussels-06march2013-v210th e concertation-brussels-06march2013-v2
10th e concertation-brussels-06march2013-v2
Alex Hardisty
 
Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3
Alex Hardisty
 
TextofKeynote-EGIforum-15-Sep2010
TextofKeynote-EGIforum-15-Sep2010TextofKeynote-EGIforum-15-Sep2010
TextofKeynote-EGIforum-15-Sep2010
Alex Hardisty
 

Mais de Alex Hardisty (16)

openDS - A new standard for digital specimens
openDS - A new standard for digital specimensopenDS - A new standard for digital specimens
openDS - A new standard for digital specimens
 
Global Research Infrastructures for Biodiversity and Ecosystems Research
Global Research Infrastructures for Biodiversity and Ecosystems ResearchGlobal Research Infrastructures for Biodiversity and Ecosystems Research
Global Research Infrastructures for Biodiversity and Ecosystems Research
 
Approach and outcome of the Biodiversity Virtual e-Laboratory (BioVeL) project
Approach and outcome of the Biodiversity Virtual e-Laboratory (BioVeL) projectApproach and outcome of the Biodiversity Virtual e-Laboratory (BioVeL) project
Approach and outcome of the Biodiversity Virtual e-Laboratory (BioVeL) project
 
Constructing bottomup
Constructing bottomupConstructing bottomup
Constructing bottomup
 
Mapping Research Infrastructures with the ENVRI Reference Model
Mapping Research Infrastructures with the ENVRI Reference ModelMapping Research Infrastructures with the ENVRI Reference Model
Mapping Research Infrastructures with the ENVRI Reference Model
 
BioVeL at IBERGRID e-Infrastructures and biodiversity workshop, 19th Septembe...
BioVeL at IBERGRID e-Infrastructures and biodiversity workshop, 19th Septembe...BioVeL at IBERGRID e-Infrastructures and biodiversity workshop, 19th Septembe...
BioVeL at IBERGRID e-Infrastructures and biodiversity workshop, 19th Septembe...
 
Biodiversity Informatics Horizons 2013 - Introduction and Scope
Biodiversity Informatics Horizons 2013 - Introduction and ScopeBiodiversity Informatics Horizons 2013 - Introduction and Scope
Biodiversity Informatics Horizons 2013 - Introduction and Scope
 
Hardistyroberts190313opt 130319072407-phpapp02
Hardistyroberts190313opt 130319072407-phpapp02Hardistyroberts190313opt 130319072407-phpapp02
Hardistyroberts190313opt 130319072407-phpapp02
 
10th e concertation-brussels-06march2013-v2
10th e concertation-brussels-06march2013-v210th e concertation-brussels-06march2013-v2
10th e concertation-brussels-06march2013-v2
 
Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3
 
Biodiversity Virtual e-Laboratory (BioVeL)
Biodiversity Virtual e-Laboratory (BioVeL)Biodiversity Virtual e-Laboratory (BioVeL)
Biodiversity Virtual e-Laboratory (BioVeL)
 
E cconcertation lyon-22-sep2011-v3
E cconcertation lyon-22-sep2011-v3E cconcertation lyon-22-sep2011-v3
E cconcertation lyon-22-sep2011-v3
 
AH-XLDBEurope-position-09 jun2011
AH-XLDBEurope-position-09 jun2011AH-XLDBEurope-position-09 jun2011
AH-XLDBEurope-position-09 jun2011
 
XldbEuropeEdinburgh-09-jun2011
XldbEuropeEdinburgh-09-jun2011XldbEuropeEdinburgh-09-jun2011
XldbEuropeEdinburgh-09-jun2011
 
TextofKeynote-EGIforum-15-Sep2010
TextofKeynote-EGIforum-15-Sep2010TextofKeynote-EGIforum-15-Sep2010
TextofKeynote-EGIforum-15-Sep2010
 
EGIforum-Amsterdam-15-Sep2010
EGIforum-Amsterdam-15-Sep2010EGIforum-Amsterdam-15-Sep2010
EGIforum-Amsterdam-15-Sep2010
 

Último

9953056974 ,Low Rate Call Girls In Adarsh Nagar Delhi 24hrs Available
9953056974 ,Low Rate Call Girls In Adarsh Nagar  Delhi 24hrs Available9953056974 ,Low Rate Call Girls In Adarsh Nagar  Delhi 24hrs Available
9953056974 ,Low Rate Call Girls In Adarsh Nagar Delhi 24hrs Available
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls In Bloom Boutique | GK-1 ☎ 9990224454 High Class Delhi NCR 24 Hour...
Call Girls In Bloom Boutique | GK-1 ☎ 9990224454 High Class Delhi NCR 24 Hour...Call Girls In Bloom Boutique | GK-1 ☎ 9990224454 High Class Delhi NCR 24 Hour...
Call Girls In Bloom Boutique | GK-1 ☎ 9990224454 High Class Delhi NCR 24 Hour...
rajputriyana310
 
Contact Number Call Girls Service In Goa 9316020077 Goa Call Girls Service
Contact Number Call Girls Service In Goa  9316020077 Goa  Call Girls ServiceContact Number Call Girls Service In Goa  9316020077 Goa  Call Girls Service
Contact Number Call Girls Service In Goa 9316020077 Goa Call Girls Service
sexy call girls service in goa
 
VIP Call Girls Valsad 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Valsad 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Valsad 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Valsad 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
E Waste Management
E Waste ManagementE Waste Management
E Waste Management
Dr. Salem Baidas
 

Último (20)

BOOK Call Girls in (Dwarka) CALL | 8377087607 Delhi Escorts Services
BOOK Call Girls in (Dwarka) CALL | 8377087607 Delhi Escorts ServicesBOOK Call Girls in (Dwarka) CALL | 8377087607 Delhi Escorts Services
BOOK Call Girls in (Dwarka) CALL | 8377087607 Delhi Escorts Services
 
Call Girls Ramtek Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Ramtek Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Ramtek Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Ramtek Call Me 7737669865 Budget Friendly No Advance Booking
 
9953056974 ,Low Rate Call Girls In Adarsh Nagar Delhi 24hrs Available
9953056974 ,Low Rate Call Girls In Adarsh Nagar  Delhi 24hrs Available9953056974 ,Low Rate Call Girls In Adarsh Nagar  Delhi 24hrs Available
9953056974 ,Low Rate Call Girls In Adarsh Nagar Delhi 24hrs Available
 
VIP Model Call Girls Wagholi ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Wagholi ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Wagholi ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Wagholi ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
VIP Model Call Girls Viman Nagar ( Pune ) Call ON 8005736733 Starting From 5K...
VIP Model Call Girls Viman Nagar ( Pune ) Call ON 8005736733 Starting From 5K...VIP Model Call Girls Viman Nagar ( Pune ) Call ON 8005736733 Starting From 5K...
VIP Model Call Girls Viman Nagar ( Pune ) Call ON 8005736733 Starting From 5K...
 
Call Girls In Bloom Boutique | GK-1 ☎ 9990224454 High Class Delhi NCR 24 Hour...
Call Girls In Bloom Boutique | GK-1 ☎ 9990224454 High Class Delhi NCR 24 Hour...Call Girls In Bloom Boutique | GK-1 ☎ 9990224454 High Class Delhi NCR 24 Hour...
Call Girls In Bloom Boutique | GK-1 ☎ 9990224454 High Class Delhi NCR 24 Hour...
 
Proposed Amendments to Chapter 15, Article X: Wetland Conservation Areas
Proposed Amendments to Chapter 15, Article X: Wetland Conservation AreasProposed Amendments to Chapter 15, Article X: Wetland Conservation Areas
Proposed Amendments to Chapter 15, Article X: Wetland Conservation Areas
 
Call Girls Magarpatta Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Magarpatta Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Magarpatta Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Magarpatta Call Me 7737669865 Budget Friendly No Advance Booking
 
Contact Number Call Girls Service In Goa 9316020077 Goa Call Girls Service
Contact Number Call Girls Service In Goa  9316020077 Goa  Call Girls ServiceContact Number Call Girls Service In Goa  9316020077 Goa  Call Girls Service
Contact Number Call Girls Service In Goa 9316020077 Goa Call Girls Service
 
Call Girls In Okhla DELHI ~9654467111~ Short 1500 Night 6000
Call Girls In Okhla DELHI ~9654467111~ Short 1500 Night 6000Call Girls In Okhla DELHI ~9654467111~ Short 1500 Night 6000
Call Girls In Okhla DELHI ~9654467111~ Short 1500 Night 6000
 
Call On 6297143586 Pimpri Chinchwad Call Girls In All Pune 24/7 Provide Call...
Call On 6297143586  Pimpri Chinchwad Call Girls In All Pune 24/7 Provide Call...Call On 6297143586  Pimpri Chinchwad Call Girls In All Pune 24/7 Provide Call...
Call On 6297143586 Pimpri Chinchwad Call Girls In All Pune 24/7 Provide Call...
 
(NEHA) Call Girls Navi Mumbai Call Now 8250077686 Navi Mumbai Escorts 24x7
(NEHA) Call Girls Navi Mumbai Call Now 8250077686 Navi Mumbai Escorts 24x7(NEHA) Call Girls Navi Mumbai Call Now 8250077686 Navi Mumbai Escorts 24x7
(NEHA) Call Girls Navi Mumbai Call Now 8250077686 Navi Mumbai Escorts 24x7
 
Call Girls Budhwar Peth Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Budhwar Peth Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Budhwar Peth Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Budhwar Peth Call Me 7737669865 Budget Friendly No Advance Booking
 
VIP Model Call Girls Hadapsar ( Pune ) Call ON 8005736733 Starting From 5K to...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 8005736733 Starting From 5K to...VIP Model Call Girls Hadapsar ( Pune ) Call ON 8005736733 Starting From 5K to...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 8005736733 Starting From 5K to...
 
Verified Trusted Kalyani Nagar Call Girls 8005736733 𝐈𝐍𝐃𝐄𝐏𝐄𝐍𝐃𝐄𝐍𝐓 Call 𝐆𝐈𝐑𝐋 𝐕...
Verified Trusted Kalyani Nagar Call Girls  8005736733 𝐈𝐍𝐃𝐄𝐏𝐄𝐍𝐃𝐄𝐍𝐓 Call 𝐆𝐈𝐑𝐋 𝐕...Verified Trusted Kalyani Nagar Call Girls  8005736733 𝐈𝐍𝐃𝐄𝐏𝐄𝐍𝐃𝐄𝐍𝐓 Call 𝐆𝐈𝐑𝐋 𝐕...
Verified Trusted Kalyani Nagar Call Girls 8005736733 𝐈𝐍𝐃𝐄𝐏𝐄𝐍𝐃𝐄𝐍𝐓 Call 𝐆𝐈𝐑𝐋 𝐕...
 
VIP Call Girls Valsad 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Valsad 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Valsad 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Valsad 7001035870 Whatsapp Number, 24/07 Booking
 
VIP Model Call Girls Bhosari ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Bhosari ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Bhosari ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Bhosari ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
E Waste Management
E Waste ManagementE Waste Management
E Waste Management
 
(AISHA) Wagholi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(AISHA) Wagholi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(AISHA) Wagholi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(AISHA) Wagholi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
VVIP Pune Call Girls Koregaon Park (7001035870) Pune Escorts Nearby with Comp...
VVIP Pune Call Girls Koregaon Park (7001035870) Pune Escorts Nearby with Comp...VVIP Pune Call Girls Koregaon Park (7001035870) Pune Escorts Nearby with Comp...
VVIP Pune Call Girls Koregaon Park (7001035870) Pune Escorts Nearby with Comp...
 

Data accessibility and the role of informatics in predicting the biosphere

  • 1. Data accessibility and the role of informatics in predicting the biosphere Alex Hardisty Director of Informatics Projects, School of Computer Science & Informatics Coordinator, FP7 BioVeL project www.biovel.eu email: hardistyar@cardiff.ac.uk /alexhardisty (occasionally!) 1
  • 2. Structuring the biodiversity informatics community at the European level and beyond Biodiversity Informatics Horizons 2013 180 experts conclude that there is “a growing need for predictive biosphere modelling” • Integration: Make better use of what we have • Cooperation: Data from the whole world is needed • Promotion: Europe is well placed to offer leadership 2
  • 3. What if …? Imagine if we could … … Predict community level dynamics of ecosystems (i.e., behaviours) at scales from local to global, based on the ecology and biology of all individual organisms … e.g., Ecosystems: Time to model all life on Earth. Purves et al., Nature 493 (2013) Image: StuartMiles / FreeDigitalPh3otos.net
  • 4. Imagine if we could … … Measure and calculate “Essential Biodiversity Variables” … … for any geographic area (continental, regional, local), by any person anywhere, using data for that area that may be held by any (research) infrastructure. Not only that, but also learn how to forecast EBVs 4
  • 5. Depend on collaboration to deliver the evidence, i.e., based on synthesis and modelling of • Increasingly large amounts of data from multiple sources (environmental, taxonomic, genomic and ecological) • Gathered by manual observation and automated sensors, digitisation, nextgen sequencing and remote sensing Beyond the abilities of any one individual or any single research community to collect, observe or generate. Variety, Velocity and Volume of “Big Data” 5 Photo: Smokestacks against skyline and sunset, Estonia. © Curt Carnemark / World Bank Photo Collection
  • 6. From informatics perspective, how close are we to that? Topical coverage 100% Data sharing and QC 100% 0% Data types Data source tracking Data citation tracking Data integration User applications & interfaces Funding Access policy Technology GIS Standards Data 9 research infrastructures from around the world exhibit “a satisfactory level of potential interoperability” Software architecture 100% 0% Programming languages Authentication Authorization Middleware Computing infrastructure Standards Technology Service logic 0% Geographical coverage Infrastructure topology Native interoperability and enablers Merging of science & policy needs Merging of science & industry needs Engagement of citizens Licensing and business model General 6
  • 7. A computational challenge: greater than that of weather forecasting; greater than that of climate prediction? For 1 km resolution, “… 3 to 6 orders of magnitude larger, … an exascale problem” (Jack K. Horner, independent consultant and adviser to KU Biodiversity Institute). Image from climateprediction.net. Reference: Harfoot MBJ, Newbold T, Tittensor DP, Emmott S, et al. (2014) Emergent Global Patterns of Ecosystem Structure and Function from a Mechanistic General Ecosystem Model. PLoS Biol 12(4): e1001841. doi:10.1371/journal.pbio.1001841
  • 8. The situation today can be likened to meteorology in the 1950s, 60s and 70s (and later to climatology), when the emergence of numerical weather prediction drove demand for: • New observations • A global infrastructure for acquiring, mobilising and normalising data • Better models of global atmospheric behaviour
  • 9. Accessible data is useful data, not just for research. [Diagram labels: Global policies/reports; Regional policies/reports; National policies/reports; Data and information; Direct provision of data/information; Indirect provision through reports; Assessment processes; Green accounting etc.] Diagram courtesy of EC FP7 EU BON project
  • 10. To be able to predict the biosphere, we need to mobilise data and make it accessible.
  • 11. It’s a journey towards: • Global data, covering the whole planet. There are significant gaps everywhere today • Making global all our small-scale, local data, which often characterises the current-day practice of field ecology. That is to say, we have to mobilise, clean, normalise and quality-assure many small sets of data that together can give us the global data we need to calibrate models. We are achieving that for certain classes of data, but it is not without its difficulties.
  • 12. Issues arise in each of the 4 stages of mobilising data for synthesis: • Data acquisition – Standardised measurement protocols • Data curation – Assigning the right metadata and persistent identifiers – Finding a home for the data, and putting it there • Data discovery and access – Finding relevant data – Machine-readable access to data, i.e., a web service (WS) front-end • Data processing / analysis, including re-use – Owners want attribution – Tracking provenance and following licensing conditions – Problems at every step, on every workflow run. http://envri.eu/rm
  • 13. “Our analysis of the licenses of all 11.000+ GBIF registered datasets shows a bleak picture. Very few GBIF registered datasets can be easily and legally used, let alone without restrictions. This is mainly due to data being published with no or a non-standard license.” Peter Desmet and Bart Aelterman, 22nd Nov 2013, peterdesmet.com. See also: “Showing you this map of aggregated bullfrog occurrences would be illegal”, http://peterdesmet.com/posts/illegal-bullfrogs.html
  • 15. Data re-use: Owners want attribution. Example 1) Taxonomic data refinement workflow (diagram labels: BioSTIF, CoL). 3 levels of attribution: • complete work • contributing database of the record • expert who provides taxonomic scrutiny of the individual record. Tool license(s). GBIF data use agreement: • Respect restrictions of access to sensitive data • Identifier of ownership of data must be retained with every data record (through the workflow) • Publicly acknowledge the Data Publishers whose biodiversity data they have used • Any additional terms and conditions of use set by the Data Publisher
  • 16. More problems at every step, on every run. Example 2) Niche Modelling Workflow, in three phases: Create model; Model test; Model projection. [Workflow diagram steps: High quality occurrence data set; Select layers with environmental factors that are likely to influence the distribution of the species; Select algorithm; Select parameter values for the chosen algorithm; Assemble the model on openModeller service; Test the performance of the parameter in the model; Test performance of the distribution prediction on the model; Changing algorithm, parameter values, and set of layers; Select prediction layers; Project Model with original layers; Project Model with prediction layers; Visualize and publish results.] Issues: • License on algorithm • License on software • Licenses on environmental data layers • Permissions to use • AuthN/AuthZ • Moving data from one service to another • 3rd party software • All issues associated with publication
  • 17. In a recent EU BON study: • Only 35% of surveyed datasets (wider scope than just GBIF) are accessible under an open license or waiver, without restriction on use • For 29 scientific questions relating to needs of European environmental policy, the availability of datasets to answer the questions is in the range ‘satisfactory’ (3) to ‘poor’ (2)
  • 18. Multiple initiatives to make data more accessible; some are general purpose: • https://rd-alliance.org/ “… builds the social and technical bridges that enable open sharing of data … researchers and innovators openly sharing data across technologies, disciplines, and countries to address the grand challenges of society.” • http://www.datafairport.org/ “… successful community supported conventions, policies and practices for data identifiers, formats, checklists and vocabularies that enable data interoperability, citation and stewardship.” • ORCID and DataCite initiatives to uniquely identify scientists and data sets, respectively
  • 19. Some are more domain specific: • Promoting free and open access to biodiversity information • A framework to focus effort and investment to deliver biodiversity knowledge more effectively. www.biodiversityinformatics.org/ www.bouchout-declaration.org
  • 20. A shared and maintained multi-purpose network of computationally-based processing services in an open data domain. With 78 contributors, we published the whitepaper in April 2013; it has since been viewed more than 34,000 times. Image: CoolDesign / FreeDigitalPhotos.net
  • 21. Building a heterogeneous Service Network: • Users’ workflows and applications • Sustained Service and Data Providers: GBIF, CoL, OBIS, WoRMS, EMBL-EBI, BGBM, CRIA, EoL, BHL, ALA, LTER and more. www.biodiversitycatalogue.org • Recognised and stable Infrastructure Providers: National, EGI.eu, PRACE, commercial, EUDAT, etc.
  • 22. Preparing the next, coordinated steps. Diagram from LinkD Concept Note, September 2014
  • 23. What we want to do in LinkD? Develop the highly responsive digital framework required to enable high-throughput research and support science of scale, towards the long-term vision of modelling Life on Earth. LinkD: Science of Scale for Life on Earth. [Diagram labels: ELODINS, ENVRI+.] From slides by Vince Smith, LinkD proposal coordinator, Natural History Museum, London
  • 24. Take home message: “It’s a journey” • Accessible data is the enabler of “in-silico” science that leads towards predicting the biosphere • A shared multi-purpose network of processing services, sitting on top of open data, is the route to interoperability • Working together as a community is essential. Photo: A lone farmer walks among rice paddies. © DFATD-MAECD/Tick Collins
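
The attribution and licensing issues raised on slides 12, 13 and 15 can be made concrete with a small sketch. The Python below is not part of the original slides: the record schema, the function names, the sample data and the allow-list of licences are all assumptions chosen for illustration. It shows a workflow step that keeps the ownership identifier with every record, and then derives the publishers to acknowledge and the datasets whose licence is missing or non-standard.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: a small allow-list of standard open licences/waivers.
# The slides do not say which licences count as "standard".
STANDARD_OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0"}


@dataclass(frozen=True)
class OccurrenceRecord:
    """One occurrence record plus the attribution it must carry (hypothetical schema)."""
    species: str
    latitude: float
    longitude: float
    publisher: str           # data publisher to acknowledge publicly
    dataset_id: str          # identifier of ownership, kept with every record
    license: Optional[str]   # None models "published with no licence"


def within_latitude_band(records: List[OccurrenceRecord],
                         south: float, north: float) -> List[OccurrenceRecord]:
    """Workflow step: filter by latitude while passing whole records through,
    so publisher and dataset_id are never stripped along the way."""
    return [r for r in records if south <= r.latitude <= north]


def acknowledgements(records: List[OccurrenceRecord]) -> List[str]:
    """Sorted, de-duplicated publishers whose data survived the workflow."""
    return sorted({r.publisher for r in records})


def license_problems(records: List[OccurrenceRecord]) -> List[str]:
    """Dataset ids whose licence is missing or not in the allow-list."""
    return sorted({r.dataset_id for r in records
                   if r.license not in STANDARD_OPEN_LICENSES})


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        OccurrenceRecord("Lithobates catesbeianus", 51.0, 4.4,
                         "Publisher A", "ds-001", "CC0-1.0"),
        OccurrenceRecord("Lithobates catesbeianus", 45.0, 5.0,
                         "Publisher B", "ds-002", None),
    ]
    kept = within_latitude_band(sample, 40.0, 60.0)
    print("Acknowledge:", acknowledgements(kept))
    print("Check licence before re-use:", license_problems(kept))
```

Passing whole records between steps, rather than bare coordinates, is one simple way to satisfy the “retain the identifier with every data record” condition; a real workflow would also need to track the tool and algorithm licences listed on slide 16.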

Editor's Notes

  1. (s)
  2. Inspired by roadmap publications such as GBIO and the White paper. Mandated by European and global societal challenges. Supported by the maturity of the available foundational e-Infrastructures. Science of Scale: to maximize the efficiency of the available data, services and tools. This is what the Commission calls science 2.0. In short, it means using economies of scale in data collection and associated infrastructure to do big things.