1. “The Synergy Between CHASE-CI and CineGrid”
Opening Talk
CineGrid/CHASE-CI Workshop
Calit2’s Qualcomm Institute
University of California, San Diego
May 15, 2018
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor,
Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD
http://lsmarr.calit2.net
2. DOE ESnet’s Science DMZ
Creates a Separate Network for Big Data Applications
• A Science DMZ Integrates 4 Key Concepts Into a Unified Whole:
– A network architecture designed for high-performance applications,
with the science network distinct from the general-purpose network
– The use of dedicated systems as data transfer nodes (DTNs)
– Performance measurement and network testing systems that are
regularly used to characterize and troubleshoot the network
– Security policies and enforcement mechanisms that are tailored for
high performance science environments
http://fasterdata.es.net/science-dmz/
“Science DMZ” Term Coined in 2010
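A DTN’s job is to sustain these line rates end to end; in production, PRP measures this with perfSONAR and dedicated transfer tools, not the snippet below. As a purely illustrative sketch of what a memory-to-memory throughput test measures, here is a minimal timed loopback transfer (all names are mine, not ESnet tooling):

```python
import socket
import threading
import time

def recv_all(port, nbytes, result):
    """Accept one TCP connection and time the receipt of nbytes."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen(1)
    conn, _ = srv.accept()
    received = 0
    start = time.perf_counter()
    while received < nbytes:
        chunk = conn.recv(1 << 20)  # 1 MiB read buffer
        if not chunk:
            break
        received += len(chunk)
    result["seconds"] = time.perf_counter() - start
    result["bytes"] = received
    conn.close()
    srv.close()

def measure_throughput(nbytes=64 * (1 << 20), port=50321):
    """Send nbytes over loopback; return the achieved rate in Gb/s."""
    result = {}
    server = threading.Thread(target=recv_all, args=(port, nbytes, result))
    server.start()
    time.sleep(0.2)  # crude wait for the server socket to start listening
    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"\x00" * nbytes)
    client.close()
    server.join()
    return result["bytes"] * 8 / result["seconds"] / 1e9
```

A real Science DMZ test replaces loopback with the WAN path and adds regular packet-loss measurement, which is what perfSONAR automates.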
3. Based on Community Input and on ESnet’s Science DMZ Concept,
NSF Has Made Over 200 Campus-Level Awards in 44 States
Source: Kevin Thompson, NSF
4. The Pacific Research Platform (PRP) Interconnects Campus DMZs:
CENIC and Pacific Wave are the Optical Backplane
NSF CC*DNI Grant
$5M 10/2015-10/2020
PI: Larry Smarr, UC San Diego Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS
• Tom DeFanti, UC San Diego Calit2/QI
• Philip Papadopoulos, UCSD SDSC
• Frank Wuerthwein, UCSD Physics & SDSC
• FIONA PCs [a.k.a. ESnet DTNs]:
– ~$8,000 Big Data PC with:
– 1 CPU
– 10/40 Gbps Network Interface Cards
– 3 TB SSDs or 100+ TB Disk Drive
– Extensible for Higher Performance to:
– +Up to 38 Intel CPUs
– +Up to 8 GPUs [4M GPU Core Hours/Week]
– +NVMe SSDs for 100Gbps Disk-to-Disk
– +Up to 160 TB Disks for Data Posting
– $700 10 Gbps FIONAs Being Tested
• FIONettes are $250 FIONAs
– 1Gbps NIC With USB-3 for Flash Storage or SSD
Big Data Science Data Transfer Nodes (DTNs):
Flash I/O Network Appliances (FIONAs)
Phil Papadopoulos, SDSC &
Tom DeFanti, Joe Keefe & John Graham, Calit2
Key Innovation: UCSD Designed Flash I/O Network Appliances (FIONAs)
To Provide Disk-to-Disk Data Transfer at Full Speed on 10/40/100G Networks
FIONA—10/40G, $8,000
FIONette—1G, $250
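The NVMe line in the spec follows from simple arithmetic: a link moving N gigabits per second needs N/8 gigabytes per second of sustained storage bandwidth at each end. A quick check (the spinning-disk comparison is my assumption, not from the slide):

```python
def disk_rate_needed(line_rate_gbits):
    """GB/s of sustained disk I/O needed to fill a given line rate."""
    return line_rate_gbits / 8

# 100 Gb/s disk-to-disk requires 12.5 GB/s of storage bandwidth,
# which is NVMe-array territory; a single spinning disk manages
# roughly 0.2 GB/s sequential (assumed figure for comparison).
for rate in (10, 40, 100):
    print(f"{rate:>3} Gb/s link -> {disk_rate_needed(rate):.2f} GB/s of disk")
```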
6. Running Kubernetes/Rook/Ceph on PRP
Allows Us to Deploy a Distributed PB+ of Storage for Posting Science Data
[Network map: PRP Kubernetes cluster of FIONA8 nodes at Calit2, SDSC, SDSU, Caltech, UCAR, UCI, UCR, USC, UCLA, Stanford, UCSB, UCSC, and Hawaii; most sites connected at 40G with 160 TB of disk, several at 100G with 6.4T NVMe; sdx-controller (controller-0) at Calit2]
Software Stack: Rook/Ceph - Block/Object/FS; Swift API Compatible with SDSC, AWS, and Rackspace; Kubernetes; CentOS 7
March 2018, John Graham, UCSD
7. Operational Metrics: Containerized Trace Route Tool
Allows Realtime Visualization of Status of Network Links
All Kubernetes Nodes on PRP
Source: Dmitry Mishin (SDSC), John Graham (Calit2)
This node graph shows UCR as the source of the flow to the mesh
8. Operational Metrics: Containerized perfSONAR MaDDash Dashboards
For Realtime Measurements of PRP Number of Paths and Packet Loss
Source: Dmitry Mishin (SDSC), John Graham (Calit2)
9. Data Transfer Rates From 40 Gbps DTN in UCSD Physics Building,
Across Campus on PRISM DMZ, Then to Chicago’s Fermilab Over CENIC/ESnet
Based on This Success,
Würthwein Will Upgrade 40G DTN to 100G
For Bandwidth Tests & Kubernetes Integration
With OSG, Caltech, and UCSC
Source: Frank Würthwein, OSG, UCSD/SDSC, PRP
10. Global Scientific Instruments Will Produce Ultralarge Datasets Continuously
Requiring Dedicated Optic Fiber and Supercomputers
Large Synoptic Survey Telescope
3.2 Gpixel Camera
Tracks ~40B Objects,
Creates 1-10M Alerts/Night
Within 1 Minute of Observing
1000 Supernovas Discovered/Night
2×100 Gb/s
“First Light”
In 2019
Talk by Shaw Dong, UCSC
Yesterday
11. PRP to Include NSF’s Ocean Observatory Initiative:
Fiber Optic Ocean Observatory on Seafloor Off Washington/Oregon
[Map: cabled observatory off the Washington/Oregon coast (45°N to 47°30’N, 130°W to 127°30’W), reaching PRP via Pacific Wave through the Seattle GigaPOP and Portland; Neptune Canada to the north]
Sea-Bottom Electro-Optical Cable: 8,000 Volts, 10 Gbps Optics
Axial Volcano: 140 Scientific Instruments
Slide Courtesy, John Delaney, UWash
12. Being There - Remote Live High Definition Video
of Deep Sea Hydrothermal Vents
http://novae.ocean.washington.edu/story/Ashes_CAMHD_Liv
Mushroom Hydrothermal Vent
on Axial Seamount
1 Mile Below Sea Level
Picture Created
From 40 HD Frames
14 Minutes Live HD Video
On-Line Every 3 Hours
15 feet
Slide Courtesy, John Delaney, UWash
13. John Delaney Viewing High Res Mosaic of
Mushroom Hydrothermal Vent on Axial Seamount on Calit2’s VROOM
Photo by Tom DeFanti, Calit2
July 26, 2017
14. Video of the Live ROV Feed,
Controlled From a Laptop in Calit2’s VROOM
15. PRP Now Enables Distributed Virtual Reality
40G FIONAs, 20×40G PRP-Connected:
WAVE @UC San Diego ← PRP → WAVE @UC Merced
Transferring 5 CAVEcam Images from UCSD to UC Merced:
2 Gigabytes Now Takes 2 Seconds (8 Gb/sec)
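The quoted rate checks out: 2 gigabytes is 16 gigabits, and 16 Gb in 2 seconds is 8 Gb/s.

```python
def transfer_rate_gbps(gigabytes, seconds):
    """Effective transfer rate in gigabits per second (decimal GB)."""
    return gigabytes * 8 / seconds

# 5 CAVEcam images totaling 2 GB, moved UCSD -> UC Merced in 2 s:
print(transfer_rate_gbps(2, 2))  # -> 8.0
```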
16. Collaboration on Atmospheric Water in the West
Between UC San Diego and UC Irvine
Big Data Collaboration Between:
CW3E - Director: F. Martin Ralph, Website: cw3e.ucsd.edu
CHRS - Director: Soroosh Sorooshian, UC Irvine, Website: http://chrs.web.uci.edu
Source: Scott Sellars, CW3E
17. Major Speedup in Scientific Workflow Using the PRP
[Diagram: data flows from Calit2’s FIONA at UC Irvine over the Pacific Research Platform (10-100 Gb/s) to Calit2’s FIONA with GPUs and SDSC’s COMET at UC San Diego]
Complete Workflow Time: 20 Days → 20 Hours → 20 Minutes!
Source: Scott Sellars, CW3E
18. Convert Global Precipitation Maps to
a Database of Precipitation Spacetime Objects
• CONNected objECT (CONNECT) Algorithm, Developed at UCI-CHRS
– Team: Wei Chu, Scott Sellars, Phu Nguyen, Xiaogang Gao, Kuo-lin Hsu, and Soroosh Sorooshian
– Most algorithms do not track precipitation events over their full life cycle
[Diagram: 5 mm/hr rainfall objects in a longitude × time data hypercube, tracked at t=1 through t=5]
Data:
1. 60N-60S latitude, 0-360 longitude
2. Hourly time step
3. March 1st, 2000 to January 1st, 2011
Object Criteria:
1. Each voxel must have ≥1 mm/hr
2. Each object must exist for ≥24 hours
3. 6-voxel connectivity
CONNECT: Object Segmentation → Object Storage (PostgreSQL)
Database Indexes:
1. Object ID Number
2. Latitude (of each voxel in object)
3. Longitude (of each voxel in object)
4. Time (hour)
Source: Sellars et al. 2013, 2015
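The segmentation step described on this slide (threshold each voxel, merge 6-connected voxels into objects, keep only objects that persist long enough) can be sketched as a flood fill over the (time, lat, lon) hypercube. This is an illustrative reimplementation of the idea, not the published UCI-CHRS CONNECT code:

```python
from collections import deque

def segment_objects(cube, threshold=1.0, min_duration=24):
    """Label 6-connected precipitation objects in a (time, lat, lon) cube.

    A voxel joins an object if its rain rate is >= threshold (mm/hr);
    objects spanning fewer than min_duration time steps are discarded.
    Sketch of CONNECT's segmentation step, not the published code.
    """
    nt, ny, nx = len(cube), len(cube[0]), len(cube[0][0])
    labels = [[[0] * nx for _ in range(ny)] for _ in range(nt)]
    neighbors = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                 (0, -1, 0), (0, 0, 1), (0, 0, -1)]  # 6-connectivity
    objects, next_id = {}, 1
    for t in range(nt):
        for y in range(ny):
            for x in range(nx):
                if cube[t][y][x] < threshold or labels[t][y][x]:
                    continue
                # Flood-fill (BFS) one 6-connected object
                voxels, queue = [], deque([(t, y, x)])
                labels[t][y][x] = next_id
                while queue:
                    ct, cy, cx = queue.popleft()
                    voxels.append((ct, cy, cx))
                    for dt, dy, dx in neighbors:
                        p, q, r = ct + dt, cy + dy, cx + dx
                        if (0 <= p < nt and 0 <= q < ny and 0 <= r < nx
                                and cube[p][q][r] >= threshold
                                and not labels[p][q][r]):
                            labels[p][q][r] = next_id
                            queue.append((p, q, r))
                # Keep only objects that persist long enough
                if len({v[0] for v in voxels}) >= min_duration:
                    objects[next_id] = voxels
                next_id += 1
    return objects
```

CONNECT then writes each object’s ID, voxel latitudes/longitudes, and hours to PostgreSQL; here the returned dict of object ID → voxel list stands in for that table.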
19. Using Machine Learning to Determine
the Precipitation Object Starting Locations
*Sellars et al., 2017 (in prep)
20. UC San Diego Jaffe Lab (SIO) Scripps Plankton Camera
Off the SIO Pier with Fiber Optic Network
21. Over 1 Billion Images So Far!
Requires Machine Learning for Automated Image Analysis and Classification
Phytoplankton: Diatoms
Zooplankton: Copepods
Zooplankton: Larvaceans
Source: Jules Jaffe, SIO
“We are using the FIONAs for image processing...
this includes doing Particle Tracking Velocimetry
that is very computationally intense.” - Jules Jaffe
22. New NSF CHASE-CI Grant Creates a Community Cyberinfrastructure:
Adding a Machine Learning Layer Built on Top of the Pacific Research Platform
Caltech
UCB
UCI UCR
UCSD
UCSC
Stanford
MSU
UCM
SDSU
NSF Grant for High Speed “Cloud” of 256 GPUs
For 30 ML Faculty & Their Students at 10 Campuses
for Training AI Algorithms on Big Data
NSF Program Officer: Mimi McClure
23. CHASE-CI’s ML Researchers Are Exploring Mapping
Machine Learning Algorithm Families Onto Novel Architectures
Qualcomm
Institute
1. Deep & Recurrent Neural Networks (DNN, RNN)
2. Reinforcement Learning (RL)
3. Variational Autoencoder (VAE) and Markov Chain Monte Carlo (MCMC)
4. Support Vector Machine (SVM)
5. Sparse Signal Processing (SSP) and Sparse Bayesian Learning (SBL)
6. Latent Variable Analysis (PCA, ICA)
26. Google Has Designed and Deployed
a Non-von Neumann (NvN) TensorFlow Accelerator
Calit2 Is Negotiating Access for CHASE-CI
27. Partnering with Cloud Vendors
Adds Non von Neumann Processors to CHASE-CI
• Microsoft is Putting FPGAs into Their Data Centers
to Accelerate Critical Applications
• Microsoft is Providing Access for Research Purposes to
432 FPGAs in the Texas Advanced Computing Center
• TACC is Joining PRP
www.microsoft.com/en-us/research/project/project-catapult/
28. Intel is Positioned to Integrate Multicore CPUs
With GPUs, FPGAs, and ML Accelerators
29. The Second National Research Platform Workshop
Bozeman, MT August 6-7, 2018
Announced in Internet2 Closing Keynote:
Larry Smarr “Toward a National Big Data Superhighway”
on Wednesday, April 26, 2017
Co-Chairs:
Larry Smarr, Calit2
Inder Monga, ESnet
Ana Hunsinger, Internet2
Local Host: Jerry Sheehan, MSU