3. hpc.nih.gov
The NIH intramural program’s large-scale high-performance computing resource, dedicated entirely to biomedical computing
• High availability and high data durability
• Designed for general-purpose scientific computing (not dedicated to any single application type)
• Dedicated staff with expertise in high-performance computing and computational biology
Biowulf: the NIH Intramural Program HPC system
Globus Transfers since 2014
[Charts: inbound and outbound data volume per year (TB, 0–250), 2015–2019, transferred via the 8 DTNs vs. a single host]
Globus Transfers in the last year
• ~3 PB of biomedical data transferred
• 450 unique users
• 2,000 unique hosts
High points:
• 24 million files in Oct 2018
• 300 TB in March 2019
• NIH site license
• ~20 endpoints at NIH
Data Sharing via Globus
• Many NIH researchers have outside and international collaborators
• Globus shares support these collaborations
• Oct 2018: used the Globus SDK to get a list of user shares
• 1900+ user shares via Globus on the NIH HPC systems!
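The SDK query behind that count can be sketched as follows. This is a minimal sketch, not the actual NIH script: the `endpoint_search` call with `filter_scope="shared-by-me"` is from the Globus Python SDK's `TransferClient`, while the tallying helper, the sample owner names, and the auth setup are illustrative assumptions.

```python
from collections import Counter

def count_shares_by_owner(endpoint_docs):
    """Tally shared endpoints per owner.

    Each item is a dict shaped like a Globus Transfer endpoint
    document; only the 'owner_string' field is used here.
    """
    return Counter(doc["owner_string"] for doc in endpoint_docs)

# The live query needs a Globus auth token, so it is shown as a
# comment only (a sketch, assuming the globus_sdk package):
#
#   import globus_sdk
#   tc = globus_sdk.TransferClient(authorizer=...)  # authorizer from an OAuth2 flow
#   shares = list(tc.endpoint_search(filter_scope="shared-by-me"))
#   print(sum(count_shares_by_owner(shares).values()), "shares")
```

Summing the counter's values over all users would give a site-wide total like the 1900+ shares reported above.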
Data Sharing via Globus
NCI Sequencing Core Facility
- serves 150 labs and collaborators
NICHD Sequencing Facility
- serves 11 labs
- 10,000 samples sequenced and shared since 2014
- 150 TB of data shared off the NIH HPC systems in 2018
- additional data shared off their own Globus endpoint
- transfers ~15 TB/year
Wishlist
• Admin ability to delete endpoints
• Admin ability to prohibit ‘world-write’ shared endpoints (and maybe ‘world-read’ as well)
• Admin ability to get the ‘create date’ for a share
• Users who set up a shared endpoint would like to know when their data has been downloaded