NextGEOSS: The Next Generation European Data Hub and Cloud Platform for Earth Observation
1. NextGEOSS: The Next Generation European Data Hub and Cloud Platform for Earth Observation
Bente Bye, Wolfgang Ksoll, Nuno Catarino, Marie-Francoise
Voidrot, Erwin Goor, Julian Meyer-Arnek, Pedro Goncalves,
and Nuno Grosso
EGU 2018, Vienna, 9 April 2018
2. Agenda
● The project NextGEOSS
● Data flow
● Data sources
● Data consumers: pilots
● CKAN
● Experiences - lessons learned
4. NextGEOSS at a glance
User Feedback Mechanism
Enabling users to efficiently deliver and find fit-for-purpose
GEOSS data and information
Advanced Discovery Tools
Increased discoverability of Earth observations and related
information for thematic areas.
5. NextGEOSS at a glance
Community Enhancement
Developing solutions with the communities for the
communities, creating relevant tools tailored to meet
community specific needs.
6. NextGEOSS at a glance
Open, Inclusive, and Agile
Development Strategy
The NextGEOSS approach and methodology are aligned with
the EU openness policies and the GEO open data sharing
policy. Multiple releases allow extensive collaboration.
7. NextGEOSS at a glance
NextGEOSS Project Facts
H2020 project*
Who: 27 partners from 13 countries
Period: 2016 - 2020
Budget: 10M EURO
*NextGEOSS is a winning answer to the H2020 SC5-20-2016 call
8. Open Source Technology – Earth Observation Science – Benefits Management – Sustainability
[Architecture diagram: data flow from the data sources through the CKAN data hub to pilots and end users. Labels recovered from the diagram:]
● Data sources (open data): Copernicus (Sentinel 1-5, Marine, Land, Atmosphere), citizen, commercial provider
● Providers: 1 - Sentinel, 2 - GOME-2, 3 - Proba-V, 4 - CMEMS, …
● Harvesting: connectors harvest resources / raw data and metadata
● CKAN Data Hub: CKAN Core, SPARQL extension, RDF extension, Zookeeper, SolrCloud search
● Metadata + tags: Dublin Core, DCAT, GeoDCAT-AP, ISO, iTag
● Interfaces: CKAN API, web access/GUI, OpenSearch, Data Discovery Guide, Data Ingestion Guide
● User feedback: NiMMbus
● Pilots/Apps/Cloud: Data Cube, other apps, GEO DAB
● Innovative research: Agriculture/Forestry, Biodiversity, Space + Security, Cold Regions, Air Pollution, Disaster Risk Reduction
● Business: Territorial Planning, Food Security, Smart Cities, Energy, Grid Operating, Solar Mapping
● Business innovation / sustainability: search for business cases, market study, VCM, benefit assessment, sustainability report, Sustainable Development Goals of the UN
● End user communities: civil society, government, business
● Work package labels: WP2.1, WP2.2, WP3, WP4+5+6+7, WP8; requirements
9. User Feedback with NiMMbus (UofB, Barcelona)
Example of integrating an external program
1. User feedback on a particular dataset starts in the NextGEOSS data hub
2. NiMMbus is called as an external program (see login)
3. The feedback form is filled in in NiMMbus
4. In the last step, the feedback data are available in both NiMMbus and NextGEOSS
10. Data Sources
● Sources
○ Satellites: Copernicus, Sentinel, ESA (photos, radar, laser, …)
○ In Situ
○ Civil Society
● All Open Data
● The EU wants to update the PSI Directive to cover not only public-service
data but also publicly funded data, e.g. in public transport, energy and
research (open access data)
● These sources produce a tremendous amount of data in real time
(10,000 datasets a day)
● How can the data be searched in a way that supports turning data into
knowledge?
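To put the figure of 10,000 datasets a day into perspective, the short sketch below converts it into an average arrival rate and estimates how long a harvester would need to clear a backlog after an outage. The harvester capacity and outage length are hypothetical values chosen for illustration only; they are not NextGEOSS measurements.

```python
DATASETS_PER_DAY = 10_000        # figure quoted on the slide
SECONDS_PER_DAY = 24 * 60 * 60


def arrival_rate_per_second(datasets_per_day: float = DATASETS_PER_DAY) -> float:
    """Average number of new datasets per second."""
    return datasets_per_day / SECONDS_PER_DAY


def catch_up_hours(outage_hours: float,
                   harvest_capacity_per_day: float,
                   datasets_per_day: float = DATASETS_PER_DAY) -> float:
    """Hours needed to clear the backlog that built up during an outage.

    harvest_capacity_per_day is a hypothetical harvester throughput.
    New datasets keep arriving while the backlog is processed, so only
    the surplus capacity reduces the backlog.
    """
    if harvest_capacity_per_day <= datasets_per_day:
        raise ValueError("capacity must exceed the arrival rate to catch up")
    backlog = datasets_per_day * outage_hours / 24
    surplus_per_hour = (harvest_capacity_per_day - datasets_per_day) / 24
    return backlog / surplus_per_hour


if __name__ == "__main__":
    print(f"{arrival_rate_per_second():.3f} datasets per second on average")
    # Assumed scenario: 6 h outage, harvester able to process 15,000 datasets/day.
    print(f"{catch_up_hours(6, 15_000):.1f} hours to catch up")
```

Under these assumed numbers, a harvester with only 50 % headroom needs twice the outage duration to recover, which is why spare capacity matters more than the average rate suggests.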
11. Pilots
● Innovative Pilots
○ IP1: Agricultural Monitoring
○ IP2: Biodiversity
○ IP3: Space & Security
○ IP4: Cold Regions
○ IP5: Air Pollution in Mega Cities
○ IP6: Disaster Risk Reduction
● Business Pilots
○ BP1: Territorial Planning
○ BP2: Food Security
○ BP3: Smart Cities
○ BP4.1/2: Energy*
NOA
● Pilots are numerically intensive applications running in the cloud
● Time series are calculated from the sources, e.g. in
agriculture for crop performance optimization
● Machine learning is applied, e.g. in Biodiversity pilots
● The Smart Cities pilot brings together data from different
stakeholders such as civil society, government and business
● The Energy pilots create, e.g., solar maps for cities
● The Air Pollution pilot measures, e.g., NOx in cities
12. CKAN - Open Source and Standard Metadata
● Open source on GitHub (https://github.com/NextGeoss)
● Harvesting metadata
● Searchable via web GUI or API (OpenSearch, RDF),
tagging with iTag
● Metadata standards:
○ “Normal” standards: Dublin Core, DCAT, GeoDCAT-AP,
ISO
○ Community metadata standards: Essential
Variables for Biodiversity, Climate, Ocean
[Diagram: CKAN data hub (WP2.1) with CKAN Core, SPARQL and RDF extensions, Zookeeper and SolrCloud search; interfaces: CKAN API, web access/GUI, OpenSearch, Data Discovery Guide, Data Ingestion Guide; metadata + tags: Dublin Core, DCAT, GeoDCAT-AP, ISO, iTag]
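The API-based search mentioned on this slide can be sketched with the standard CKAN Action API call `package_search`. This is a minimal illustration, assuming the data hub exposes the stock CKAN endpoint; the base URL `https://catalogue.example.org` is a placeholder, not the real NextGEOSS address, and the helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url: str, query: str, rows: int = 10, start: int = 0) -> str:
    """Build a CKAN Action API URL for a full-text dataset search."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def extract_titles(response_body: str) -> list:
    """Return dataset titles from a package_search JSON response."""
    payload = json.loads(response_body)
    if not payload.get("success"):
        raise RuntimeError("CKAN reported an unsuccessful request")
    datasets = payload["result"]["results"]
    # Fall back to the machine-readable name when no title is set.
    return [d.get("title") or d["name"] for d in datasets]


def search(base_url: str, query: str, rows: int = 10) -> list:
    """Query a CKAN instance over HTTP and return the matching titles."""
    with urlopen(build_search_url(base_url, query, rows), timeout=30) as resp:
        return extract_titles(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder host: replace with a real CKAN instance before running search().
    print(build_search_url("https://catalogue.example.org", "Sentinel-2", rows=5))
```

Paging through large result sets works by increasing `start` in steps of `rows`; the total number of matches is reported in `result.count` of the response.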
13. Experiences - Lessons learned
● Faster discovery across many sources and easier access through a single point of access
● Because of the large number of datasets, harvesters have to be designed carefully to keep up with the load
● The IT architecture has to be scalable
● End-to-end tests (sources - data hub - application in the cloud) bring quality assurance to the
individual parts
● Few sources offer standards-based metadata, so connectors have to be programmed
individually
● Metadata are not standardized enough (spelling, language, meaning), which causes
quality issues
● Traditional metadata standards such as Dublin Core, DCAT, GeoDCAT or ISO are not enough:
some communities need special metadata (Essential Biodiversity Variables, Climate, Ocean,
…)
● Licences: public domain (German: gemeinfrei) is free of rights, so no licence is possible. But
administrations are creative in inventing licences (>100) for open data.
Sources therefore often do not state a licence at all: too complicated
See also: Theoretical Availability versus Practical Accessibility: The Critical Role of
Metadata Management in Open Data Portals, http://www.mdpi.com/2071-1050/10/2/545
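The points about individually programmed connectors and inconsistent metadata can be illustrated with a small sketch that maps one provider-specific record onto common field names and normalizes its tags. The field names, the synonym table and the sample record are all hypothetical; a real connector would need one such mapping per source.

```python
import unicodedata

# Hypothetical mapping: provider-specific field -> common (Dublin Core style) field.
FIELD_MAP = {
    "productName": "title",
    "abstract": "description",
    "keywords": "tags",
    "licence": "license",
}

# Hypothetical synonym table covering spelling and language variants.
TAG_SYNONYMS = {
    "landwirtschaft": "agriculture",
    "agricultural": "agriculture",
    "air polution": "air pollution",
}


def normalize_tag(tag: str) -> str:
    """Lower-case, collapse whitespace and map known variants to one spelling."""
    cleaned = unicodedata.normalize("NFKC", tag).strip().lower()
    cleaned = " ".join(cleaned.split())
    return TAG_SYNONYMS.get(cleaned, cleaned)


def to_common_record(source_record: dict) -> dict:
    """Translate one provider record into the common schema."""
    record = {}
    for source_field, value in source_record.items():
        target = FIELD_MAP.get(source_field)
        if target is not None:          # fields without a mapping are dropped
            record[target] = value
    record["tags"] = sorted({normalize_tag(t) for t in record.get("tags", [])})
    # Many sources state no licence; make that explicit instead of guessing one.
    record.setdefault("license", "not specified")
    return record


if __name__ == "__main__":
    sample = {
        "productName": "Example product",
        "keywords": [" Landwirtschaft", "Agriculture", "air  polution"],
        "sensor": "MSI",
    }
    print(to_common_record(sample))
```

A lookup table only covers variants that are already known; differences in meaning, as opposed to spelling or language, still need review by the community concerned.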
16. IP1: Time Series Analysis for Agricultural Monitoring
Pilot Scope
• Scale up Time Series analysis tools to huge amounts of HR EO-data
• SAT EO-data & in-situ data
Pilot Objectives
• Extend Proba-V MEP & Copernicus Global Land Time Series Viewer with Sentinel-2-derived VGT indices
• REST and/or WPS end-points → WP3
• Extend prototype of Agro STAC (Spatial Temporal Catalogue for Agronomy) from FP-7 SIGMA → towards
operations
• Temporal and attribute accuracy on WM(T)S: guidelines and prototype
Challenges
• Integrate with processing chains & data on public clouds
• Transfer to operations (in-situ)
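As an illustration of the kind of time series analysis this pilot scales up, the sketch below computes a vegetation index per observation date and smooths it with a moving average. NDVI is used here only as a well-known example of a vegetation index; the slide does not say which indices the pilot derives, and the reflectance values are invented.

```python
import math


def ndvi(red: float, nir: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red)."""
    denominator = nir + red
    if denominator == 0:
        return math.nan
    return (nir - red) / denominator


def moving_average(values: list, window: int = 3) -> list:
    """Centred moving average; the window shrinks at both ends of the series."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


if __name__ == "__main__":
    # Invented (date, red, NIR) surface reflectances for a single field.
    observations = [
        ("2018-03-01", 0.10, 0.30),
        ("2018-03-11", 0.09, 0.38),
        ("2018-03-21", 0.08, 0.45),
        ("2018-03-31", 0.07, 0.52),
    ]
    series = [ndvi(red, nir) for _, red, nir in observations]
    for (date, _, _), value in zip(observations, moving_average(series)):
        print(date, round(value, 3))
```

In an operational setting the same per-pixel computation would run over whole image stacks, which is where the scaling challenge named on the slide arises.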
17. BP2: Crop Monitoring supporting Food Security
Pilot Scope
• Use of Sentinel-2 for crop monitoring in collaboration with industry
• Data fusion between Proba-V 100 m and Sentinel-2
Pilot Objectives
• Deploy and run HR processing chain for Vegetation Parameters on public cloud: on-demand & subscription
• Develop dynamic dashboard: integration of time series analysis
• Demonstrations & training for users from Agro and Insurance sector
Challenges
• Convenient & scalable processing of large amounts of Sentinel-2 data
• Data analytics
• Data fusion of Proba-V and Sentinel-2
18. IP2: Biodiversity
Pilot Scope
• Remote Sensing-enabled Essential Biodiversity Variables (RS-EBVs) for habitat
mapping and monitoring
Pilot Objectives
• Demonstrate the value of a European Data Hub for the creation of RS-EBVs, leading to a
GEOhub for EBVs that links the key policy/user network groups (GEO-BON, CBD and IPBES) with the
space agencies.
• Demonstrate the use of the European Data Hub for high-resolution RS-EBVs in habitat mapping
(distribution, suitability and probability) in order to support the European Environment Agency (EEA)
and its Topic Centre for Biological Diversity (ETC-BD). The integration of EO data with in-situ observations
(vegetation relevés) will play an important role.
Challenges
• Incorporation of several RS-EBVs (e.g. phenology) to improve the distribution mapping of EUNIS habitats.
• How far can we integrate different aspects of the developed habitat modelling method (data & models) into
Cloud Sandbox Solution?
19. IP5: Air Pollution, Urban Growth, Health Risks in Megacities
Pilot Scope
• Analysis of air pollution trends, urban growth rates and health risk indicators
for megacities by integrating EO data with the NextGEOSS infrastructure
• New inputs from Sentinel-3, -5P, CAMS, WDC/RSAT
Pilot Objectives
• Develop a multi-sensor approach to analyse air pollution variability in megacities
linked to urban growth rates
• Develop a tool to analyse local trends and health risks using the NextGEOSS
infrastructure
• Exploit Copernicus data and services (Sentinel-3, -5P, CAMS)
• Strengthen the link to the health community
Challenges
• Integrate with Copernicus data hubs and processing chains
20. BP3: Smart Cities
Pilot Scope
• The pilot is based on work developed in the ESPRESSO H2020 support action. Smart cities use ISO 37120;
the pilot examines how that standard maps onto the SDGs for EO and how smart city sensors can be
integrated into in-situ EO
Pilot Objectives
• Mapping ISO 37120 and SDG, sensor integration in GC
Challenges
• Sensor standards in Smart Cities and standards in in-situ EO