1. “The Pacific Research Platform”
CISE Distinguished Lecture
National Science Foundation
Arlington, VA
January 28, 2016
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information
Technology
Harry E. Gruber Professor,
Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD
2. Abstract
Research in data-intensive fields is increasingly multi-investigator and multi-campus,
depending on ever more rapid access to ultra-large heterogeneous and widely distributed
datasets. The Pacific Research Platform (PRP) is a multi-institutional extensible
deployment that establishes a science-driven high-capacity data-centric “freeway
system.” The PRP spans all 10 campuses of the University of California, as well as the
major California private research universities, four supercomputer centers, and several
universities outside California. Fifteen multi-campus data-intensive application teams act
as drivers of the PRP, providing feedback over the five years to the technical design staff.
These application areas include particle physics, astronomy/astrophysics, earth sciences,
biomedicine, and scalable multimedia, providing models for many other applications. The
PRP partnership extends the NSF-funded campus cyberinfrastructure awards to a regional
model for high-speed, data-intensive networking, making it easier for researchers to move
data between their labs and their collaborators' sites, supercomputer centers, or data
repositories, and enabling that data to traverse multiple heterogeneous networks without
performance degradation over campus, regional, national, and international distances.
3. Vision: Creating a West Coast “Big Data Freeway”
Connected by CENIC/Pacific Wave to Internet2 & GLIF
Use Lightpaths to Connect
All Data Generators and Consumers,
Creating a “Big Data” Freeway
Integrated With High Performance Global Networks
“The Bisection Bandwidth of a Cluster Interconnect,
but Deployed on a 20-Campus Scale.”
This Vision Has Been Building for 25 Years
4. Launching the Nation’s Information Infrastructure:
NSFnet Supernetwork and the Six NSF Supercomputers
NSFNET 56 Kb/s Backbone (1986-88)
Backbone sites: NCSA, PSC, NCAR, CTC, JVNC, SDSC
Supernetwork Backbone:
56 kbps is Nearly 50 Times Faster than a 1200 bps PC Modem!
5. Interactive Supercomputing End-to-End Prototype:
Using Analog Communications to Prototype the Fiber Optic Future
“We’re using satellite technology…
to demo what it might be like to have
high-speed fiber-optic links between
advanced computers
in two different geographic locations.”
― Al Gore, Senator
Chair, US Senate Subcommittee on Science, Technology and Space
SIGGRAPH 1989 (Illinois to Boston)
“What we really have to do is eliminate distance between
individuals who want to interact with other people and
with other computers.”
― Larry Smarr, Director, NCSA
6. NSF’s OptIPuter Project: Using Supernetworks
to Meet the Needs of Data-Intensive Researchers
OptIPortal–
Termination
Device
for the
OptIPuter
Global
Backplane
Calit2 (UCSD, UCI), SDSC, and UIC Leads—Larry Smarr PI
Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST
Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent
2003-2009
$13,500,000
In August 2003,
Jason Leigh and his
students used
RBUDP to blast data
from NCSA to SDSC
over the
TeraGrid DTFnet,
achieving 18 Gbps file
transfer out of the
available 20Gbps
LS Slide 2005
7. Integrated “OptIPlatform” Cyberinfrastructure System:
A 10Gbps Lightpath Cloud
Diagram: Over the National LambdaRail, a 10G lightpath cloud connects campus optical switches to data repositories & clusters, HPC, instruments, HD/4k video cameras and images, HD/4k telepresence, and end-user OptIPortals
LS Slide 2009
8. Two New Calit2 Buildings Provide
New Laboratories for “Living in the Future”
• “Convergence” Laboratory Facilities
– Nanotech, BioMEMS, Chips, Radio, Photonics
– Virtual Reality, Digital Cinema, HDTV, Gaming
• Over 1000 Researchers in Two Buildings
– Linked via Dedicated Optical Networks
UC Irvine
www.calit2.net
Preparing for a World in Which
Distance is Eliminated…
9. So Why Don’t We Have a National
Big Data Cyberinfrastructure?
“Research is being stalled by ‘information overload,’ Mr. Bement said, because
data from digital instruments are piling up far faster than researchers can
study. In particular, he said, campus networks need to be improved. High-speed
data lines crossing the nation are the equivalent of six-lane superhighways, he
said. But networks at colleges and universities are not so capable. “Those
massive conduits are reduced to two-lane roads at most college and university
campuses,” he said. Improving cyberinfrastructure, he said, “will transform the
capabilities of campus-based scientists.”
-- Arden Bement, Director, National Science Foundation, May 2005
10. The “Golden Spike” UCSD Experimental Optical Core:
NSF Quartzite Grant
NSF Quartzite Grant 2004-2007
Phil Papadopoulos, PI
11. The “Golden Spike” UCSD Experimental Optical Core:
Ready to Couple Users to CENIC L1, L2, L3 Services
Source: Phil Papadopoulos, SDSC/Calit2
(Quartzite PI, OptIPuter co-PI)
Funded by NSF MRI Grant
Diagram components: Lucent, Glimmerglass, Force10, OptIPuter border router, Cisco 6509, CENIC L1/L2 services
Goals by 2008:
>= 60 endpoints at 10 GigE
>= 30 packet switched
>= 30 switched wavelengths
>= 400 connected endpoints
Approximately 0.5 Tbps arrives at the "optical" center of the hybrid campus switch
12. Rapid Evolution of 10GbE Port Prices
Makes Campus-Scale 10Gbps CI Affordable
2005: Chiaro, $80K/port (60 ports max)
2007: Force10, $5K/port (40 ports max)
2009: Arista, $500/port (48-port switch); ~$1,000/port (300+ port chassis)
2010: Arista, $400/port (48-port switch)
• Port Pricing is Falling
• Density is Rising – Dramatically
• Cost of 10GbE Approaching Cluster HPC Interconnects
Source: Philip Papadopoulos, SDSC/Calit2
13. DOE ESnet’s Science DMZ: A Scalable Network
Design Model for Optimizing Science Data Transfers
• A Science DMZ integrates 4 key concepts into a unified whole:
– A network architecture designed for high-performance applications,
with the science network distinct from the general-purpose network
– The use of dedicated systems for data transfer
– Performance measurement and network testing systems that are
regularly used to characterize and troubleshoot the network
– Security policies and enforcement mechanisms that are tailored for
high performance science environments
http://fasterdata.es.net/science-dmz/
Science DMZ
Coined 2010
The DOE ESnet Science DMZ and the NSF “Campus Bridging” Taskforce Report Formed the Basis
for the NSF Campus Cyberinfrastructure Network Infrastructure and Engineering (CC-NIE) Program
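One of the four Science DMZ elements above is routine performance measurement of the path between data transfer nodes. A minimal sketch of an active throughput check, assuming iperf3 is installed on both ends and a server is already listening on the remote node; in practice a perfSONAR deployment typically automates such tests. The hostname is a placeholder, not an actual PRP endpoint.

import json
import subprocess

# Placeholder remote DTN running "iperf3 -s"; not a real PRP host.
REMOTE_DTN = "dtn.example.edu"

def measure_gbps(host: str, seconds: int = 10) -> float:
    """Run iperf3 against `host` and return received throughput in Gbps."""
    out = subprocess.run(
        ["iperf3", "-c", host, "-t", str(seconds), "-J"],  # -J: JSON output
        capture_output=True, text=True, check=True,
    ).stdout
    bps = json.loads(out)["end"]["sum_received"]["bits_per_second"]
    return bps / 1e9

if __name__ == "__main__":
    print(f"{REMOTE_DTN}: {measure_gbps(REMOTE_DTN):.1f} Gbps")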
14. Based on Community Input and on ESnet’s Science DMZ Concept,
NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways
Red: 2012 CC-NIE Awardees
Yellow: 2013 CC-NIE Awardees
Green: 2014 CC*IIE Awardees
Blue: 2015 CC*DNI Awardees
Purple: Multiple-Time Awardees
Source: NSF
15. The Pacific Research Platform Creates
a Regional End-to-End Science-Driven “Big Data Freeway System”
NSF CC*DNI Grant
$5M 10/2015-10/2020
PI: Larry Smarr, UC San Diego Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS
• Tom DeFanti, UC San Diego Calit2
• Philip Papadopoulos, UC San Diego SDSC
• Frank Wuerthwein, UC San Diego Physics and SDSC
16. Creating a “Big Data” Freeway on Campus:
NSF-Funded CC-NIE Grants Prism@UCSD and CHERuB
Prism@UCSD, Phil Papadopoulos, SDSC, Calit2, PI (2013-15)
CHERuB, Mike Norman, SDSC PI
CHERuB
17. Big Data Compute/Storage Facility -
Interconnected at Over 1 Trillion Bits Per Second
Diagram: The SDSC supercomputers Comet (2 PF VM SC) and Gordon (Big Data SC) and the Oasis Data Store (6,000 TB, > 800 Gbps) are interconnected through an Arista router that can switch 576 10 Gbps light paths; each system attaches via 128 parallel 10 Gbps optical light paths (128 x 10 Gbps = 1.3 Tbps)
Source: Philip Papadopoulos, SDSC/Calit2
18. FIONA – Flash I/O Network Appliance:
Linux PCs Optimized for Big Data
UCOP Rack-Mount Build:
FIONAs Are
Science DMZ Data Transfer Nodes &
Optical Network Termination Devices
UCSD CC-NIE Prism Award & UCOP
Phil Papadopoulos & Tom DeFanti
Joe Keefe & John Graham
$8,000 build: Intel Xeon Haswell E5-1650 v3 (6-core), 128 GB RAM, 3.8 TB SATA SSD, 10/40GbE Mellanox NIC
$20,000 build: 2x Intel Xeon Haswell E5-2697 v3 (14-core each), 256 GB RAM, 3.8 TB SATA SSD, 2x 40GbE Chelsio+Mellanox NICs, NVIDIA Tesla K80 GPU
RAID drives: 0 to 112 TB (add ~$100/TB)
John Graham, Calit2’s QI
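A quick sanity check on the numbers in the build list above: how long does it take to fill or drain the 3.8 TB SSD at the quoted interface rates? This is back-of-the-envelope arithmetic only, assuming ideal, fully utilized links with no protocol or storage overhead.

# Time to move the FIONA's 3.8 TB SSD contents at the quoted NIC rates,
# assuming ideal, fully utilized links (no protocol or storage overhead).
SSD_TB = 3.8

def minutes_to_move(terabytes: float, gbps: float) -> float:
    bits = terabytes * 8e12          # TB -> bits (decimal units)
    return bits / (gbps * 1e9) / 60

for label, rate in [("10GbE", 10), ("40GbE", 40), ("2x40GbE", 80)]:
    print(f"{label:>8}: {minutes_to_move(SSD_TB, rate):6.1f} minutes")
# Roughly 51 min at 10 Gbps, 13 min at 40 Gbps, 6 min at 2x40GbE.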
19. How Prism@UCSD Transforms Big Data Microbiome Science
– PRP Does This on a Sub-National Scale
Diagram: The Knight Lab FIONA (12 cores/GPU, 128 GB RAM, 3.5 TB SSD, 48 TB disk, 10 Gbps NIC) connects through Prism@UCSD and CHERuB (100 Gbps) to Gordon, Data Oasis (7.5 PB, 200 GB/s), the Knight 1024 cluster in the SDSC co-lo, and a 64-Mpixel data analysis wall running Emperor and other vis tools; link speeds range from 10 and 40 Gbps up to 120 Gbps and 1.3 Tbps
20. FIONAs as
Uniform DTN End Points
Map (as of October 2015): Existing DTNs and FIONA DTNs
UC FIONAs Funded by
UCOP “Momentum” Grant
21. Ten Week Sprint to Demonstrate
the West Coast Big Data Freeway System: PRPv0
Presented at CENIC 2015
March 9, 2015
FIONA DTNs Now Deployed to All UC Campuses
And Most PRP Sites
22. PRP Allows for Multiple Secure Independent
Cooperating Research Groups
• Any Particular Science Driver Is Composed of
Scientists and Resources at a Subset of
Campuses and Resource Centers
• We Term These Science Teams with
the Resources and Instruments they Access as
Cooperating Research Groups (CRGs).
• Members of a Specific CRG Trust One Another,
But They Do Not Necessarily Trust Other CRGs
23. PRP Timeline
• PRPv1 (Years 1 and 2)
– A Layer 3 System
– Completed In 2 Years
– Tested, Measured, Optimized, With Multi-domain Science Data
– Bring Many Of Our Science Teams Up
– Each Community Thus Will Have Its Own Certificate-Based Access
to Its Specific Federated Data Infrastructure.
• PRPv2 (Years 3 to 5)
– Advanced IPv6-Only Version with Robust Security Features
– e.g. Trusted Platform Module Hardware and SDN/SDX Software
– Support Rates up to 100Gb/s in Bursts And Streams
– Develop Means to Operate a Shared Federation of Caches
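The PRPv1 bullets above mention certificate-based access to each community's federated data infrastructure. A minimal illustration of what that can look like from a researcher's side, assuming an HTTPS data service that requires an X.509 client certificate; the URL and file paths are placeholders, not actual PRP endpoints.

import requests

# Hypothetical federated data endpoint requiring an X.509 client certificate;
# the URL and certificate paths below are placeholders.
DATA_URL = "https://data.example-crg.org/datasets/sample.h5"
CLIENT_CERT = ("/etc/grid-security/usercert.pem", "/etc/grid-security/userkey.pem")
CA_BUNDLE = "/etc/grid-security/certificates/ca-bundle.pem"

resp = requests.get(DATA_URL, cert=CLIENT_CERT, verify=CA_BUNDLE, stream=True)
resp.raise_for_status()
with open("sample.h5", "wb") as f:
    for chunk in resp.iter_content(chunk_size=1 << 20):  # stream in 1 MiB chunks
        f.write(chunk)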
24. Pacific Research Platform
Multi-Campus Science Driver Teams
• Jupyter Hub
• Biomedical
– Cancer Genomics Hub/Browser
– Microbiome and Integrative ‘Omics
– Integrative Structural Biology
• Earth Sciences
– Data Analysis and Simulation for Earthquakes and Natural Disasters
– Climate Modeling: NCAR/UCAR
– California/Nevada Regional Climate Data Analysis
– CO2 Subsurface Modeling
• Particle Physics
• Astronomy and Astrophysics
– Telescope Surveys
– Galaxy Evolution
– Gravitational Wave Astronomy
• Scalable Visualization, Virtual Reality, and Ultra-Resolution Video
25. PRP First Application: Distributed IPython/Jupyter Notebooks:
Cross-Platform, Browser-Based Application Interleaves Code, Text, & Images
IJulia
IHaskell
IFSharp
IRuby
IGo
IScala
IMathics
Ialdor
LuaJIT/Torch
Lua Kernel
IRKernel (for the R language)
IErlang
IOCaml
IForth
IPerl
IPerl6
Ioctave
Calico Project
•kernels implemented in Mono,
including Java, IronPython, Boo,
Logo, BASIC, and many others
IScilab
IMatlab
ICSharp
Bash
Clojure Kernel
Hy Kernel
Redis Kernel
jove, a kernel for io.js
IJavascript
Calysto Scheme
Calysto Processing
idl_kernel
Mochi Kernel
Lua (used in Splash)
Spark Kernel
Skulpt Python Kernel
MetaKernel Bash
MetaKernel Python
Brython Kernel
IVisual VPython Kernel
Source: John Graham, QI
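The names above are community-maintained Jupyter kernels; the notebook front end is language-agnostic and simply launches whichever kernels are registered on a node. A small sketch, assuming the jupyter_client package is installed, that lists the kernels available on a given FIONA or JupyterHub host:

from jupyter_client.kernelspec import KernelSpecManager

# Enumerate the kernels registered on this machine; each kernelspec maps a
# name (e.g. "python3", "ir", "julia-1.10") to a language and launch command.
specs = KernelSpecManager().get_all_specs()
for name, info in sorted(specs.items()):
    spec = info["spec"]
    print(f"{name:20s} {spec['language']:12s} {spec['display_name']}")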
26. GPU JupyterHub:
2 x 14-core CPUs
256GB RAM
1.2TB FLASH
3.8TB SSD
Nvidia K80 GPU
Dual 40GbE NICs
And a Trusted
Platform Module
40Gbps
GPU JupyterHub:
1 x 18-core CPUs
128GB RAM
3.8TB SSD
Nvidia K80 GPU
Dual 40GbE NICs
And a Trusted
Platform Module
PRP UC-JupyterHub Backbone (UCB and UCSD)
Next Step: Deploy Across PRP
Source: John Graham, Calit2
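A minimal, illustrative jupyterhub_config.py for a hub like the ones above. The slides do not give the actual PRP configuration, so every path and value here is an assumption; GPU access for user sessions is typically granted through the spawner's environment when a container-based spawner with the NVIDIA runtime is used.

# jupyterhub_config.py -- illustrative sketch only; actual PRP settings are
# not given in the slides, so all paths and values here are assumptions.
c = get_config()  # noqa: F821  (injected by JupyterHub when it loads this file)

c.JupyterHub.bind_url = "https://0.0.0.0:8000"        # public HTTPS endpoint
c.JupyterHub.ssl_cert = "/etc/jupyterhub/hub.crt"
c.JupyterHub.ssl_key = "/etc/jupyterhub/hub.key"

c.Spawner.default_url = "/lab"                         # open JupyterLab by default
c.Spawner.mem_limit = "16G"                            # per-user memory cap
c.Spawner.environment = {
    # Expose the node's GPUs (e.g. the Tesla K80) to user environments.
    "NVIDIA_VISIBLE_DEVICES": "all",
}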
27. …community resources. This facility depends on a range of common services, support activities, software,
and operational principles that coordinate the production of scientific knowledge through the DHTC
model. In April 2012, the OSG project was extended until 2017; it is jointly funded by the Department of
Energy and the National Science Foundation.
OSG Federates Clusters in 40 of the 50 States:
A Major XSEDE Resource
Source: Miron Livny, Frank Wuerthwein, OSG
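The DHTC model described above is built on HTCondor: a workload is split into many independent jobs that run opportunistically wherever federated cycles are free. A minimal submission sketch using the HTCondor Python bindings (the newer Schedd.submit API); the script name and resource requests are placeholders, not an actual OSG workflow.

import htcondor  # HTCondor Python bindings (the scheduler behind OSG's DHTC model)

# Sketch of a high-throughput workload: 100 independent jobs, each processing
# one chunk of data. Executable, arguments, and resource requests are placeholders.
sub = htcondor.Submit({
    "executable": "/usr/bin/python3",
    "arguments": "analyze.py --chunk $(ProcId)",
    "output": "out.$(ProcId).txt",
    "error": "err.$(ProcId).txt",
    "log": "jobs.log",
    "request_cpus": "1",
    "request_memory": "2GB",
    "request_disk": "4GB",
})

schedd = htcondor.Schedd()                 # local submit point
result = schedd.submit(sub, count=100)     # queue 100 jobs in one cluster
print("Submitted cluster", result.cluster())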
28. Open Science Grid Has Seen Huge Growth Over the Last Decade -
Currently Federating Over 130 Clusters
Crossed
100 Million
Core-Hours/Month
In Dec 2015
Over 1 Billion
Data Transfers
Moved
200 Petabytes
In 2015
Supported Over
200 Million Jobs
In 2015
Source: Miron Livny, Frank Wuerthwein, OSG
ATLAS
CMS
29. PRP Prototype of LambdaGrid Aggregation of OSG Software & Services
Across California Universities in a Regional DMZ
• Aggregate Petabytes of Disk Space
& PetaFLOPs of Compute,
Connected at 10-100 Gbps
• Transparently Compute on Data
at Their Home Institutions & Systems
at SLAC, NERSC, Caltech, UCSD, & SDSC
Sites: SLAC, UCSD & SDSC, UCSB, UCSC, UCD, UCR, CSU Fresno, UCI, Caltech
PRP Builds on SDSC's LHC-UC Project
Pie chart: OSG Hours 2015 by Science Domain (ATLAS, CMS, other physics, other sciences)
Source: Frank Wuerthwein, UCSD Physics; SDSC; co-PI PRP
30. Status of LHC Science
on PRP
• Established Data & Compute Federations on PRP for ATLAS/CMS
– UCI ATLAS Researchers First to do Science on PRP
– UCSD Shipped FIONAs that Implement This Functionality
and are Being Integrated into Local Infrastructure at UCI, UCR, UCSC
• Integrated Comet as Overflow Resource for LHC Science on PRP
– Received XRAC Allocation
– Operating via OSG Interface to Comet Virtual Cluster
• Plans for 2016
– Ship FIONAs to UCSB, UCD
– Work with UCR, UCSB, UCSC, UCD to Reach Same Routine Science Use as UCI
– Deploy Inbound Data Cache in Front of Comet
Source: Frank Wuerthwein,
UCSD Physics; SDSC; co-PI PRP
31. Two Automated Telescope Surveys
Creating Huge Datasets Will Drive PRP
Survey 1: 300 images per night, 100 MB per raw image, 30 GB per night raw, 120 GB per night processed
Survey 2: 250 images per night, 530 MB per raw image, 150 GB per night raw, 800 GB per night processed
When processed at NERSC, nightly data volumes increase by roughly 4x or more
Source: Peter Nugent, Division Deputy for Scientific Engagement, LBL
Professor of Astronomy, UC Berkeley
Precursors to
LSST and NCSA
PRP Allows Researchers
to Bring Datasets from NERSC
to Their Local Clusters
for In-Depth Science Analysis
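A back-of-the-envelope check on why a dedicated high-speed path matters for the nightly volumes quoted above: the time to move the data from NERSC to a campus cluster on an ideal, fully utilized 10 Gbps link versus a typical shared 1 Gbps campus connection (arithmetic only; no protocol overhead assumed).

# Time to move the nightly survey volumes quoted above, assuming ideal,
# fully utilized links with no protocol overhead.
def hours(gigabytes: float, gbps: float) -> float:
    return gigabytes * 8 / gbps / 3600   # GB -> Gb, then divide by rate

volumes_gb = {"30 GB raw": 30, "120 GB processed": 120,
              "150 GB raw": 150, "800 GB processed": 800}

for label, gb in volumes_gb.items():
    print(f"{label:>18}: {hours(gb, 1):5.2f} h at 1 Gbps, "
          f"{hours(gb, 10):5.2f} h at 10 Gbps")
# The 800 GB processed night drops from ~1.8 hours at 1 Gbps to ~11 minutes at 10 Gbps.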
32. Dark Energy Spectroscopic Instrument
Goals + Technologies
Source: Peter Nugent, Division Deputy for Scientific Engagement, LBL
Professor of Astronomy, UC Berkeley
1 meter diameter corrector
5000 fiber-robot army
200,000 meters fiber optics
10 spectrographs x 3 cameras
Target survey covers 14,000 sq. deg. of sky and is 10 times deeper than SDSS; ~1 PB of data.
Collaborators throughout the UC system (and the world) will use these data for DESI target
selection, as well as many other science topics, including the hunt for Planet 9.
33. DESI Goals:
35 Million Galaxy + Quasar Redshift Survey
DESI targets:
4 million Luminous Red Galaxies (LRGs)
17 million Emission-Line Galaxies (ELGs)
0.7 million Lyman-alpha QSOs + 1.7 million QSOs
10 million Bright Galaxy Sample (BGS) galaxies
SDSS-III/BOSS has only mapped
1.5 million galaxies + 160,000 quasars
Source: Peter Nugent, Division Deputy for Scientific Engagement, LBL
Professor of Astronomy, UC Berkeley
34. Global Scientific Instruments Will Produce Ultralarge Datasets Continuously
Requiring Dedicated Optic Fiber and Supercomputers
Square Kilometer Array Large Synoptic Survey Telescope
https://tnc15.terena.org/getfile/1939
www.lsst.org/sites/default/files/documents/DM%20Introduction%20-%20Kantor.pdf
Tracks ~40B Objects,
Creates 10M Alerts/Night
Within 1 Minute of Observing
2x40Gb/s
35. Dan Cayan
USGS Water Resources Discipline
Scripps Institution of Oceanography, UC San Diego
With much support from Mary Tyree, Mike Dettinger, Guido Franco, and other colleagues
NCAR Upgrading to 10Gbps Link from Wyoming and Boulder to CENIC/PRP
Sponsors:
California Energy Commission
NOAA RISA program
California DWR, DOE, NSF
Planning for climate change in California:
substantial shifts on top of already high climate variability
SIO Campus Climate Researchers Need to Download
Results from NCAR Remote Supercomputer Simulations
to Make Regional Climate Change Forecasts
36. HPWREN Users and Public Safety Clients
Gain Redundancy and Resilience from PRP Upgrade
San Diego Countywide
Sensors and Camera
Resources
UCSD & SDSU
Data & Compute
Resources
UCI & UCR
Data Replication
and PRP FIONA Anchors
as HPWREN Expands
Northward
10X Increase During Wildfires
• PRP 10G Link UCSD to SDSU
– DTN FIONAs Endpoints
– Data Redundancy
– Disaster Recovery
– High Availability
– Network Redundancy
Data From Hans-Werner Braun
Source: Frank Vernon,
Greg Hidley, UCSD
37. PRP Backbone Sets Stage for 2016 Expansion
of HPWREN into Orange and Riverside Counties
• Anchor to CENIC at UCI
– PRP FIONA Connects to
CalREN-HPR Network
– Potential Data Replication Site
• Potential Future UCR Anchor
• Camera and Relay Sites at:
– Santiago Peak
– Sierra Peak
– Lake View
– Bolero Peak
– Modjeska Peak
– Elsinore Peak
– Sitton Peak
– Via Marconi
Collaborations through COAST –
County of Orange Safety Task Force
Source: Frank Vernon,
Greg Hidley, UCSD
38. Collaboration Between EVL’s CAVE2
and Calit2’s VROOM Over 10Gb Wavelength – Extend Across PRP
EVL
Calit2
Source: NTT Sponsored ON*VECTOR Workshop at Calit2 March 6, 2013
39. Optical Fibers Link Australian and US
Big Data Researchers, as Well as Korea, Japan, and the Netherlands
40. The Future of Supercomputing
Will Need More Than von Neumann Processors
“High Performance Computing Will Evolve
Towards a Hybrid Model,
Integrating Emerging Non-von Neumann Architectures,
with Huge Potential in Pattern Recognition,
Streaming Data Analysis,
and Unpredictable New Applications.”
Horst Simon, Deputy Director,
U.S. Department of Energy’s
Lawrence Berkeley National Laboratory
41. Calit2’s Qualcomm Institute Has Established a Pattern Recognition Lab
On the PRP, For Machine Learning on non-von Neumann Processors
“On the drawing board are collections of 64, 256, 1024, and 4096 chips.
‘It’s only limited by money, not imagination,’ Modha says.”
Source: Dr. Dharmendra Modha
Founding Director, IBM Cognitive Computing Group
August 8, 2014
42. Is the Release of Google’s TensorFlow
as Transformative as the Release of C?
https://exponential.singularityu.org/medicine/big-data-machine-learning-with-jeremy-howard/
From Programming Computers
Step by Step To Achieve a Goal
To Showing the Computer
Some Examples of
What You Want It to Achieve
and Then Letting the Computer
Figure It Out On Its Own
--Jeremy Howard, Singularity Univ.
2015
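Howard's point, stated concretely: instead of coding a rule such as y = 2x - 1 step by step, you hand the model example input/output pairs and let it fit the rule itself. A minimal sketch using TensorFlow's current Keras API (not the 2015-era graph/session API of the original release); the toy data and model are illustrative only.

import numpy as np
import tensorflow as tf

# Instead of programming the rule y = 2x - 1 step by step, show the model
# example inputs and outputs and let it discover the rule on its own.
xs = np.array([[-1.0], [0.0], [1.0], [2.0], [3.0], [4.0]], dtype=np.float32)
ys = 2.0 * xs - 1.0                      # the answers the model must learn

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),            # a single linear unit suffices here
])
model.compile(optimizer="sgd", loss="mean_squared_error")
model.fit(xs, ys, epochs=500, verbose=0)

print(model.predict(np.array([[10.0]], dtype=np.float32)))   # close to 19.0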
43. AI is Advancing at an Amazing Pace:
Deep Learning Algorithms Working on Massive Datasets
1.5 Years!
Training on 30M Moves,
Then Playing Against Itself
45. The Pacific Research Platform Creates
a Regional End-to-End Science-Driven “Big Data Freeway System”
Opportunities for Collaboration
with Many Other Regional Systems
Editor's Notes
Campus Cyberinfrastructure – Network Infrastructure and Engineering (CC-NIE)
Campus Cyberinfrastructure – Infrastructure, Innovation, and Engineering (CC-IIE)
Campus Cyberinfrastructure – Data, Networking, and Innovation (CC-DNI) NSF 15-534 incorporates Data Infrastructure Building Blocks (CC-DNI-DIBBs) – Multi-Campus / Multi-Institution Model Implementation from Program Solicitation NSF 14-530