Personal views on what Research Infrastructures really need for data - a more comprehensive version of the 5-minute presentation I gave at XLDB-Europe, 8-10 June 2011 in Edinburgh
XldbEuropeEdinburgh-09-jun2011
1. What does Research Infrastructure
really need for Data?
An e-Science infrastructure for biodiversity and ecosystem science
ENVRI
Common Operations of Environmental Research Infrastructures
Alex Hardisty
School of Computer Science & Informatics
2. What is LifeWatch?
• European Research Infrastructure for
understanding biodiversity as a whole
interacting system
– Exploring patterns of biodiversity and
processes of biodiversity across space/time
• A geospatial data e-Infrastructure
– Distributed observatories / sensors
– Data mgmt., processing and analytical tools
– Computational capability and capacity
– Collaborative environments
– Support, training, partnering, fellowship
portal.lifewatch.eu www.lifewatch.eu
3. Challenge of SCALE: > 25,000 users
• 1800 terrestrial Long-Term Ecological Research (LTER) sites: increasingly sensor instrumented
• >200 Marine reference and focal sites, with more to come: increasingly sensor instrumented
• Hundreds of millions of specimens in natural science collections: >275m now indexed, increasing at 20% p.a.
Plus: all kinds of small, personal, group, and departmental datasets that need to get published
4. Challenge of HETEROGENEITY: Interconnected nature of biodiversity ideas, outputs, repositories
From Peterson et al (2010), Syst Biodivers 8(2), 159-168
From Guralnick and Hill (2010), http://www.slideshare.net/robgur/ievobio-keynote-talk-2010
5. ENVRI
Common solutions to common challenges
faced by ESFRI environmental infrastructures
(left to right, top to bottom)
Global ocean observing infrastructure
Svalbard arctic Earth observing system
Aircraft for global observing system
Tropospheric research aircraft
Polar research icebreaker
Biodiversity and ecosystem research
Multidisciplinary seafloor observatory
Upgrade of incoherent scatter facility
Plate observing system
Integrated carbon observation system
Source: EC
6. ENVRI
Data generators
Data users
• Data transfer
– Fast data transmission
– Operation at remote sites
• User functionalities
– Virtual Environments & Collaborative organisations
– Security & Protection
– Data discovery & Navigation
– Data submission tools
– (meta) data tagging tools
• Operational Community-specific Services
– Semantic Interoperability
– Workflow Generator
– Knowledge management
– Virtualisation
• Data Services
– Persistent storage capacity
– 24/7 operation
– Preservation & Sustainability (digital asset management)
– Authenticity
– Certification & Integrity
– GUIDs
Source: W.Los, UvA
7. ENVRI
What do RIs REALLY need for data?
• Common solutions to common problems
– adopted by each infrastructure through its construction phase
• Common Reference Model providing multiple ‘views’ of RI:
– Science business / enterprise view, Information view,
Computational / services view, Engineering view, Technology
view
• Standards, Standards, Standards
– Data capture from distributed sensors, Metadata definition,
Management of high volume data, Execution of workflows,
Visualization of data, Provenance and annotation,
Interoperability between assets
• Common tools e.g., for data discovery and access
– in a federation of distributed data repositories and
interoperating infrastructures
8. ENVRI
• Report of the High
Level Expert Group
on Scientific Data
• Neelie Kroes, EC
Vice-President for
the Digital Agenda
– “... use it as a
reference point
when discussing
the priorities of
EU research
investments.”