1. March 20, 2014
XSEDE: A Digital Ecosystem Enhancing
Productivity for All Science & Engineering
John Towns
PI and Project Director, XSEDE
Director, Collaborative Cyberinfrastructure Programs, NCSA
jtowns@ncsa.illinois.edu
4. Motivation for XSEDE:
• Scientific advancement across multiple disciplines
requires a variety of resources and services
• XSEDE is about increased productivity of the
community and providing expanded capabilities
– leads to more science
– is sometimes the difference between a feasible project
and an impractical one
– lowers barriers to adoption
• XSEDE provides a comprehensive eScience
infrastructure composed of expertly managed and
evolving advanced heterogeneous digital resources and
services integrated into a general-purpose
infrastructure
5. Boundary Conditions and Principles for XSEDE
• XSEDE inherited TeraGrid environment
• XSEDE inherited TG community and their expectations
• Point of view has changed
– not an HPC/CS/tech play
– about productivity and creating the environment necessary to be
productive
• focus on the success of researchers!
• Finally figured out that the project must define a solution that is
designed to evolve!
– technologically and organizationally!
• Identify the greatest needs and start there
– Don’t forget what you have learned – both good and bad!
• Oh yeah… the researchers don’t care about your existence (per se)
– they care about access to resources, services and support
6. XSEDE – accelerating scientific discovery
• XSEDE’s Vision:
a world of digitally enabled researchers,
engineers, and scholars participating in
multidisciplinary collaborations to tackle society’s
grand challenges
• XSEDE’s Mission:
to substantially enhance the productivity of a
growing community of researchers, engineers,
and scholars through access to advanced digital
services that support open research
7. XSEDE’s Strategic Goals
• Deepen and extend the use of the advanced digital research
services ecosystem
– deepen use by existing researchers, engineers, and scholars
– extend use to new communities
– prepare the current and next generation via education, training, and
outreach
– raise the general awareness of the value of advanced digital services
• Advance the advanced digital research services ecosystem
– create an open and evolving e-infrastructure
– enhance the array of technical expertise and support services offered
• Sustain the advanced digital research services ecosystem
– assure and maintain a reliable and secure infrastructure
– provide excellent user support services
– operate an effective and innovative virtual organization
8. What is XSEDE?
• An ecosystem of advanced digital services accelerating
scientific discovery
– support a growing portfolio of resources and services
• advanced computing, high-end visualization, data analysis, and
other resources and services
• interoperability with other infrastructures
• A virtual organization (partnership!) providing
– dynamic distributed infrastructure
– support services, and technical expertise to enable
researchers, engineers, and scholars
• addressing the most important and challenging problems facing
the nation and world
• A project funded by the National Science Foundation
9. XSEDE Factoids: high order bits
• 5 year, US$121M project
– plus US$9M, 5 year Technology Investigation Service
• separate award from NSF
– option for additional 5 years of funding upon major review after
PY3
• No funding for major hardware
– coordination, support and creating a national/international
cyberinfrastructure
– coordinate allocations, support, training and documentation for
>$100M of concurrent project awards from NSF
• ~140 FTE / ~250 individuals funded across 20 partner
institutions
– this requires solid partnering!
10. Total Research Funding Supported by XSEDE
in CY2013
US$750 million in research
supported by XSEDE
in CY2013
11. Innovation: proactively looking to expand
scope of capabilities
• Striking a balance between providing stable, reliable
services and fostering innovation
– both in what we are doing and how we do it
• Campus Bridging use cases are mostly for capabilities
we have not traditionally supported
• Novel and Innovative Projects team is seeking out new
communities and identifying new capabilities
necessary to support them
• Architecture design processes explicitly support
innovation by the project
– more importantly, facilitate innovation by the community
Our goal is to deliver new capabilities – and thus new science – faster
12. Convenience requirements will always increase
Each generation of users
requires more convenience
than the former: thus we must
always be adding new layers of
software while maintaining and
extending existing reliability
and capability.
Change is the only Constant
– Heraclitus, 535 BC-475 BC
No, his mind is not for rent
To any god or government.
Always hopeful, yet discontent,
He knows changes aren't permanent,
But change is.
– Rush, “Tom Sawyer”
13. What do you mean by “Advanced Digital
Services?”
• Often use the terms “resources” and “services”
– these should be interpreted very broadly
– most are likely not operated by XSEDE
• Examples of resources
– compute engines: HPC, HTC (high throughput computing), campus,
departmental, research group, project, …
– data: simulation output, input files, instrument data, repositories, public
databases, private databases, …
– instruments: telescopes, beam lines, sensor nets, shake tables, microscopes, …
– infrastructure: local networks, wide-area networks, …
• Examples of services
– collaboration: wikis, forums, telepresence, …
– data: data transport, data management, sharing, curation, provenance, …
– access/use: authentication, authorization, accounting, …
– coordination: meta-queuing, …
– support: helpdesk, consulting, ECSS, training, …
– And many more: education, outreach, community building, …
14. Some Unexpected Challenges:
XSEDE is a socio-technical ecosystem
• Highly distributed organization
– challenges in managing a project that involves
staff at 20 partner institutions
• A completely virtual organization
– breaking new ground from an organizational
structure and management point of view
• Highly distributed engineering project
– developing new methodologies to adapt
traditional practices to the unusual context of
XSEDE
15. XSEDE offers access to a variety of
resources
• Leading-edge distributed memory systems
• Very large shared memory systems
• High throughput systems, including Open
Science Grid (OSG)
• Visualization engines
• Accelerators like GPUs and Xeon Phis
Many scientific problems have components that
call for use of more than one architecture.
16. XSEDE User Portal: THE User Site
portal.xsede.org
• XSEDE User Portal (XUP) is designed to be the only site
a user needs to use XSEDE
• XUP presents information relevant to users
– user info is easier to find
– XUP also provides dynamic data about XSEDE systems
– capabilities to manage usage, files, data
• As a user you can
– request an allocation, and manage allocations
– sign up for training
– request help
– manage files and data, and much more!
• The portal provides single sign-on to all XSEDE resources
17. XSEDE offers more in-depth support
Extended Collaborative Support Service
• Support people who understand the discipline as
well as the systems (perhaps more than one support
person working with a project).
• 37 FTEs, spread over >70 people at more than half a
dozen sites.
• Distributed support
– makes it easier to find the right expert for the project
– allows us to cover many more disciplines than if every site
had to staff the common applications
– support does not have to move when the platform changes
19. Current XSEDE Visualization and Data
Resources
• Visualization
– Longhorn @ TACC
• 20.7 TF Dell/NVIDIA
cluster
• 18.7 TB disk
• Storage
– Ranch @ TACC
• 40 PB tape
– HPSS @ NICS
• 12 PB tape
– Data Supercell @ PSC
• 4 PB disk
– Data Oasis @ SDSC
• 4 PB tape
https://www.xsede.org/web/xup/resource-monitor#advanced_vis_systems
https://www.xsede.org/web/xup/resource-monitor#storage_systems
20. Approach to Other Infrastructures:
Active Interactions
• OSG is a significant CI in the US – Level 2 Service Provider in XSEDE
– the nation’s premier high-throughput computing infrastructure
• complement traditional HPC resources inherited from TeraGrid
– ties to CI (eScience infrastructure) providers internationally
• PRACE is a significant HPC CI in Europe
– PRACE represents both large scale HPC and distributed resources
• subsumed DEISA in 2011
– joint Summer School series
– working on joint call for collaborations support later this calendar year
• EGI is a significant HTC CI in Europe
– initiating organizational benchmarking effort
– identifying collaborating research teams spanning XSEDE-EGI
• HPC Wales
– Champions programs, Science Gateways
– training content
21. Objectives for Coming Year+
Accelerating the realization of the XSEDE vision
• Deliver new or improved software, services and capabilities on a
regular basis
– XSEDE Wide Area Filesystem; Global Federated Filesystem; enhanced
single sign-on; science gateway APIs; Canonical Use Case components
• Campus Bridging will promote "XSEDE Compatible" cluster build tools
and use of Globus Online and GFFS for data movement and access
• Incorporate the third cadre of under-represented students into the
XSEDE Scholars program
• Expand Champions Program to include Regional, Student, and Domain
Champions
• Redesign and implement a new allocations request system
• Complete baseline architecture and expanded set of defined Use
Cases
• Develop joint activities with industry
• Further develop relationships with other resource, service and
infrastructure providers