The Helmholtz Association in Germany is developing a federated research data infrastructure called the Helmholtz Data Federation (HDF). The HDF aims to support data management across scientific communities in Germany through innovative software, user support, and hardware. It is intended to be a building block for both a national research data infrastructure in Germany (NFDI) and the European Open Science Cloud (EOSC). The HDF focuses on three key elements: data management software, excellent user support and joint research and development, and advanced storage and analysis hardware.
1. KIT – The Research University in the Helmholtz Association
Steinbuch Centre for Computing
www.kit.edu
EOSC and national providers
Achim Streit (achim.streit@kit.edu)
2.
Helmholtz Data Federation (HDF)
The Helmholtz Association is developing a federated research data infrastructure in Germany based on scientific use cases from communities across the German science system
High strategic relevance in the context of the "Information & Data Science" strand of the further development of the Helmholtz Association
Building block for a national research data infrastructure (NFDI) and the European Open Science Cloud (EOSC)
Three elements: innovative software for data management and FedIDM/AAI, excellent user support and joint R&D, leading-edge storage and analysis hardware
Disclaimer: HDF is not a federation of data per se (as the name might imply) – it is a federation of data management systems
3. 22.11.2018 Steinbuch Centre for Computing
Stimulus
What service(s) should the EOSC provide to attract national providers to open up their national facilities and services to international research?
Federated AAI as the basis – broad roll-out in all scientific/public institutions
Simple services that 99% of the scientists really need, e.g.:
Google-Docs-like functionality
Dropbox-like functionality (with a certain volume limit)
Doodle-like functionality
Video/teleconferencing functionality
Data repositories
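The federated AAI mentioned above typically rests on identity providers issuing signed tokens (e.g. OIDC JSON Web Tokens) that carry the user's identity and group entitlements across institutional borders. As a minimal sketch, assuming JWT-style tokens (the issuer URL, subject, and entitlement claim names below are illustrative, not taken from the HDF), the claims segment can be unpacked like this:

```python
import base64
import json

def decode_jwt_claims(token: str) -> dict:
    """Decode the claims segment of a JWT-style AAI token.

    Illustrative only: a real federated AAI client must also verify the
    token's signature against the issuer's published keys before
    trusting any claim.
    """
    _, payload, _ = token.split(".")
    # JWT segments are base64url-encoded without padding; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

# Build a toy token (header.payload.signature) the way an identity
# provider would; issuer, subject, and entitlement values are made up.
header = _b64url(json.dumps({"alg": "none"}).encode())
claims = _b64url(json.dumps({
    "iss": "https://login.example.org",
    "sub": "researcher-42",
    "entitlements": ["urn:group:example-community"],
}).encode())
token = f"{header}.{claims}."

print(decode_jwt_claims(token)["sub"])  # → researcher-42
```

In a federation, the relying service only needs to trust the issuer and evaluate such claims; it never handles the user's home-institution credentials, which is what lets national facilities open up to international users.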
4.
Stimulus – cont'd
Data-intensive services for "large-scale" scientific data (probably not needed by 99% of all scientists, however "large-scale" is defined)
Data storage & archival (investments needed! EuroHPC does not help at all)
Data transfer services based on fast WAN connections
High-throughput analytics services next to the data storage (I'm not talking about HPC systems as in PRACE)
Compliant with the legal framework (e.g. GDPR)
Based on solid funding – no short-term (e.g. 3-year) project funding
In conjunction with national initiatives