1. The African Open Science Platform
ICT Infrastructure in Support of Data
Sharing
Presented by Ina Smith
Project Manager African Open Science Platform
Academy of Science of South Africa (ASSAf)
WACREN 2018 Conference, 15 March 2018
3. Square Kilometre Array (SKA)
• Data collection on a massive scale
• Telescope array to consist of 250,000 radio
antennas between Australia & SA
• Investment in machine learning and artificial
intelligence software tools to enable data
analysis
• 400+ engineers and technicians in
infrastructure, fibre optics, data collection
• Supercomputers to process data (IBM)
• To come: supercomputer with 3x the power of
world’s current fastest computer (Tianhe-2) to
cope with SKA data
4. “Construction of the SKA is due to begin in 2018 and finish
sometime in the middle of the next decade. Data
acquisition will begin in 2020, requiring a level of
processing power and data management know-how that
outstretches current capabilities.
Astronomers estimate that the project will generate
35,000-DVDs-worth of data every second. This is
equivalent to “the whole world wide web every day,”
said Fanaroff.”
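As a rough sanity check on the figure quoted above, the arithmetic can be sketched in Python. The slide does not say which DVD capacity is meant, so the 4.7 GB single-layer capacity (decimal units) used here is an assumption, and the results are order-of-magnitude estimates only.

```python
# Assumption (not from the slide): one DVD = 4.7 GB, single-layer, decimal units.
DVD_BYTES = 4.7e9
DVDS_PER_SECOND = 35_000   # figure quoted on the slide
SECONDS_PER_DAY = 86_400


def ska_data_rate(dvds_per_second=DVDS_PER_SECOND, dvd_bytes=DVD_BYTES):
    """Return (bytes per second, bytes per day) for the given DVD rate."""
    per_second = dvds_per_second * dvd_bytes
    return per_second, per_second * SECONDS_PER_DAY


per_second, per_day = ska_data_rate()
print(f"{per_second / 1e12:.1f} TB/s")   # 164.5 TB/s
print(f"{per_day / 1e18:.1f} EB/day")    # 14.2 EB/day
```

Under that assumption the quoted rate works out to roughly 165 terabytes per second, or about 14 exabytes per day.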
6. • African genomic research; Central node at University of
Cape Town
• Using NetMap to monitor connectivity
• Data transfer: Africa Globus Online (668,622 files transferred
between Rhodes University & UCT; 140 TB data transferred
from USA to SA)
• Challenges: slow & unstable Internet, unreliable power
supply, continent-wide obsolete computer infrastructure that
ranges from medium-scale server infrastructure to a small
number of workstations, with multiple operating systems, lack
of centralized, secure data storage
• Other: database of participants (H3APRDB, REDCap), data
analysis incl. Galaxy, Job Management System, eBiokits,
REDCap, WebProtege, Pipelines for data execution, data
repository (European Genome-Phenome Archive)
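To illustrate why slow or unstable connectivity matters for a transfer of the size mentioned above (140 TB), here is a minimal Python sketch of the transfer-time arithmetic. The link speeds are hypothetical values chosen for illustration, not figures from the project, and 140 TB is read as decimal terabytes.

```python
def transfer_days(size_bytes, link_bits_per_second, efficiency=1.0):
    """Days needed to move size_bytes over a link at the given rate.

    efficiency is the fraction of nominal bandwidth actually achieved, in (0, 1].
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = size_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 86_400


SIZE_BYTES = 140e12  # 140 TB from the slide, decimal terabytes assumed

# Hypothetical link speeds, for illustration only.
for label, bps in [("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
    print(f"{label}: {transfer_days(SIZE_BYTES, bps):.1f} days")
```

At a sustained 1 Gbit/s the transfer takes about 13 days; at 100 Mbit/s it takes about 130 days. The efficiency parameter can be lowered to model an unstable link.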
7. Open Science Defined
“Open Science is the practice of science in
such a way that others can collaborate and
contribute, where research data, lab notes
and other research processes are freely
available, under terms that enable reuse,
redistribution and reproduction of the
research and its underlying data and
methods.” - FOSTER Project, funded by the European
Commission
8. Benefits of open data
• Provide evidence for research conducted
• Collaboration advances science, discovery
• Predict trends & informed decisions
• Drive development, service delivery
• More entrepreneurs – using data in innovative
ways, create jobs
• Have potentially far more outcomes when
open, higher impact
• Democratising research & data towards
achieving 2030 Sustainable Development
Goals
10. Original Research Data Lifecycle image from University of California, Santa Cruz
http://guides.library.ucsc.edu/datamanagement/
[Figure: research data lifecycle, annotated with the labels Repositories, Tools, Gold/Green OA, Plan, Policy & Infrastructure]
11. “Several open science activities are
underway across Africa, but a great deal
will be gained if, in the context of
developing inter-regional links, these
activities were to be coordinated and
developed through such a coordinating
initiative.” - CODATA
15. African Open Science Platform
• Platform = opportunity to engage in dialogue,
create awareness, connect all, provide
continental view
• Funded by SA Dept. of Science & Technology
through National Research Foundation
• 3 years (1 Nov. 2016 – 31 Oct. 2019)
• Managed by Academy of Science of South
Africa (ASSAf)
• Through ASSAf hosting ICSU Regional Office for
Africa (ICSU ROA)
• Direction from CODATA
http://africanopenscience.org.za/
16. Accord on Open Data in a
Big Data World
• Proposes
comprehensive set of
principles
• FAIR Principles
• Values of open data in
emerging scientific
culture of big data
• Need for an
international
framework
• Provides framework &
plan for African data
science capacity
mobilization initiative
Call to Endorse
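The FAIR Principles named above (Findable, Accessible, Interoperable, Reusable) can be illustrated with a minimal metadata check. This is only a sketch: the field names and the mapping to principles are hypothetical, and are not taken from the Accord or from any metadata standard.

```python
# Hypothetical mapping from each FAIR principle to metadata fields that
# would support it. Field names are illustrative, not from any standard.
FAIR_FIELDS = {
    "Findable": ["identifier", "title"],
    "Accessible": ["access_url"],
    "Interoperable": ["format"],
    "Reusable": ["license"],
}


def missing_fair_fields(record):
    """Return {principle: [missing fields]} for fields absent or empty."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [f for f in fields if not record.get(f)]
        if missing:
            gaps[principle] = missing
    return gaps


# Made-up example record with no license information.
example = {
    "identifier": "doi:10.0000/example",
    "title": "Rainfall 2017",
    "access_url": "https://example.org/data/1",
    "format": "text/csv",
}
print(missing_fair_fields(example))  # {'Reusable': ['license']}
```

A record that passes such a check is not thereby FAIR; the sketch only shows how each principle translates into concrete information a dataset description needs to carry.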
17. Key Stakeholders
• Global Network of Science Academies (IAP)
• International Council for Science (ICSU)
• The World Academy of Sciences (TWAS)
• Research Data Alliance (RDA)
• NRENs (Internet Service Providers for Education)
• Association of African Universities (AAU)
• Network of African Science Academies (NASAC)
• African Research Councils (incl. DIRISA, funders)
• African Universities
• African Governments
• Other
21. Click to view Initiatives/Country
https://www.targetmap.com/viewer.aspx?reportId=56245
Please note: this is just a preview; the data are still to be cleaned,
updated and corrected.
23. Infrastructure Framework
• Purpose: Create awareness &
guide development of a cyber-
infrastructure strategy & action
plan, promote policies &
strategies
• NRENs – Level 6 Elaborated
Service Offering
An NREN Capability Maturity Model – Duncan
Greaves (2015, Tertiary Education Network)
• Richly connected at high speed to
many other networks/resources
• Deep culture of collaboration
24. Proposed NREN Service
Catalogue in support of Data
• Grid & cloud computing
resources/middleware – access:
• Scientific applications, complex data sets,
computing facilities
• User controlled light paths, videoconferencing,
federated identity services, security, data storage
and archives, connecting e-resources e.g. electron
& astronomical microscopes, medical imaging,
simulators, sensor networks, accelerators,
supercomputers, state-of-the-art affordable
bandwidth on demand, computing power,
capacity building, dedicated point-to-point
Internet Protocol circuits, data storage (data
centres)
25. • Disciplines: Engineering, IT, Economics,
Physics, Biology, Environmental Studies,
Public Health, Town Planning (Smart Cities),
Population Studies
• Research Areas: Climate change,
environmental impact, extreme weather
events, biodiversity, food security, malaria,
infectious diseases and pandemics
26. Data in Africa
• Tunisian Computing Centre el Khawarizmi
manages Data Centre
• Kenya Education Network (KENET) provides
access to domain names, data center, cloud
computing & science gateways, capacity
building, security services
• Data Intensive Research Initiative for South Africa
(DIRISA) – component of SA National Cyber-
Infrastructure System
• Open Data for Africa platform (African
Development Bank (AfDB)) – to boost access to
quality data for managing & monitoring
development results in African countries, incl.
African Action Plan 2063 & 2030 SDGs
27. • High Performance Computing (HPC) facilities:
Botswana, Lesotho, Mozambique, SA,
Tanzania, Zambia, Zimbabwe
• South Africa: Data Intensive Research
Cloud Infrastructure Initiatives – ARC,
SADIRC, Ilifu (cloud for researchers working
in astronomy and bioinformatics in Western
Cape & research data management
system)
28. Africa Data Consensus Study
• Adopted in March 2015 at High Level
Conference on Data Revolution
• Strategy for implementing data revolution in
Africa
• Plan of action to be guided by United Nations
Economic Commission for Africa (UNECA),
African Union Commission (AUC), African
Development Bank (AfDB), supported by UN
Development Programme (UNDP), UN
Population Fund (UNFPA)
• Implemented in collaboration with partner
institutions from public & private sectors, civil
society organisations
29. SADC Cyber-Infrastructure Framework
• Towards strategy and action plan,
implementation plan and governance
structure
• Support strategic plans on Science,
Technology, Innovation
• Guide on creating an enabling environment
to harness science, technology and
innovation
• Impact socio-economic development &
industrialization
• Enhance education in developing & using
technologies
• Support collaborative research development &
innovation
30. • Cyber-infrastructure is a key driver for a
knowledge based economy
• Comprises technologies, skills, people
and policies which support generation,
analysis, transport, sharing, stewardship of
information (incl. data)
• Framework provides Roadmap towards
Cyber-infrastructure Strategy
32. Components
• Research and Education Networks (RENs)
• Computation resources & services (HPC etc)
• Data – tools & facilities to enable efficient
data driven discoveries, technologies,
innovations
• HR-capacity development to enable:
• CI specialists to roll-out services & infrastructure
• Beneficiaries to fully benefit from CI services
• Policies to enable optimum establishment &
utilization of CI
34. Closing Remarks
• Exploit data for the benefit of society (Minister
Naledi Pandor)
• Collaboration in research is key, based on
reliable infrastructure & high speed
connectivity
• Increasing need for data gathering,
transmission, analysis on a massive scale
• Infrastructure Frameworks to be adopted,
developed in support of data sharing,
research collaboration
• NRENs are key stakeholders in making
collaboration and sharing of data possible
• Build capacity within NRENs