Delivering a Campus Research Data
Service with Globus
MAGIC Meeting
Ian Foster
May 7, 2014
Give me your data,
your terabytes,
Your huddled files
yearning to
breathe free …
Building campus research
data services
“It’s deja vu all over again.”
Yogi Berra
Globus Toolkit
Globus Online
Globus
What is Globus (today)?
Big data transfer
and sharing…
…simply, securely, and fast…
…directly from your own
storage systems
Reliable, secure, high-performance
file transfer and synchronization
• “Fire-and-forget”
transfers
• Automatic fault
recovery
• Seamless security
integration
• Powerful GUI
and APIs
Data Source
Data Destination
1. User initiates transfer request
2. Globus moves and syncs files
3. Globus notifies user
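The three steps on this slide (request, move with automatic fault recovery, notify) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the Globus implementation or API: `TransferTask`, `run_transfer`, `make_flaky_copier`, the endpoint names, and the retry limit are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TransferTask:
    """Hypothetical record of one transfer request (not a Globus API type)."""
    source: str
    destination: str
    files: List[str]
    status: str = "ACTIVE"
    attempts: int = 0
    notifications: List[str] = field(default_factory=list)


def run_transfer(task: TransferTask,
                 copy_file: Callable[[str, str, str], None],
                 max_attempts: int = 3) -> TransferTask:
    """Copy every file, retry the ones that failed, then notify the user."""
    pending = list(task.files)
    while pending and task.attempts < max_attempts:
        task.attempts += 1
        failed = []
        for name in pending:
            try:
                copy_file(task.source, task.destination, name)
            except IOError:
                failed.append(name)  # picked up again on the next pass
        pending = failed
    task.status = "SUCCEEDED" if not pending else "FAILED"
    done = len(task.files) - len(pending)
    task.notifications.append(
        f"Transfer {task.status}: {done} of {len(task.files)} files "
        f"after {task.attempts} attempt(s)"
    )
    return task


def make_flaky_copier(fail_once):
    """Simulated mover: each file in fail_once fails on its first try only."""
    remaining = set(fail_once)
    copied = []

    def copy_file(src, dst, name):
        if name in remaining:
            remaining.discard(name)
            raise IOError(f"transient fault copying {name}")
        copied.append(name)

    return copy_file, copied


# Step 1: the user submits a request and walks away ("fire-and-forget").
copier, copied = make_flaky_copier({"b.dat"})
task = run_transfer(
    TransferTask("lab-server", "hpc-cluster", ["a.dat", "b.dat"]), copier
)
# Step 3: the user is notified once the task reaches a final state.
print(task.notifications[-1])
```

The retry loop only revisits files that failed, which is one simple way to model recovery from a transient fault without restarting the whole transfer.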
Simple, secure sharing off existing
storage systems
Data Source
1. User A selects file(s) to share, selects user or group, and sets permissions
2. Globus tracks shared files; no need to move files to cloud storage!
3. User B logs in to Globus and accesses shared file
• Easily share large data
with any user or group
• No cloud storage
required
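The sharing flow above amounts to recording access rules against files that stay where they are. The sketch below models that idea only; `SharedEndpoint`, its methods, the `"r"`/`"rw"` permission strings, and the user and path names are assumptions made for illustration and do not reflect the actual Globus sharing API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SharedEndpoint:
    """Hypothetical in-place share: it records rules and never copies files."""
    owner: str
    root: str
    groups: Dict[str, Set[str]] = field(default_factory=dict)
    rules: List[Tuple[str, str, str]] = field(default_factory=list)

    def share(self, path: str, principal: str, permissions: str) -> None:
        """Step 1: the owner picks a path, a user or group, and 'r' or 'rw'."""
        if permissions not in ("r", "rw"):
            raise ValueError("permissions must be 'r' or 'rw'")
        # Store the path as a directory-style prefix ending in '/'.
        self.rules.append((path.rstrip("/") + "/", principal, permissions))

    def can_access(self, user: str, path: str, want: str = "r") -> bool:
        """Step 3: decide whether a logged-in user may read or write a path."""
        if user == self.owner:
            return True
        for prefix, principal, permissions in self.rules:
            is_member = (principal == user
                         or user in self.groups.get(principal, set()))
            in_scope = (path + "/").startswith(prefix)
            if is_member and in_scope and want in permissions:
                return True
        return False


# User A shares one folder, read-only, with a group that contains User B.
endpoint = SharedEndpoint(owner="userA", root="/data",
                          groups={"collaborators": {"userB"}})
endpoint.share("/project/results", "collaborators", "r")

print(endpoint.can_access("userB", "/project/results/run1.h5"))       # read
print(endpoint.can_access("userB", "/project/results/run1.h5", "w"))  # write
```

Step 2 of the slide corresponds to the `rules` list: only the rule is stored, so nothing has to be uploaded to separate cloud storage.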
15,000
registered users
8,000
active endpoints
(in the past year)
3 billion
files transferred
Globus is enabling…
Study of the structure
and evolution of
galaxies, the nature
of dark energy, and
cosmological history
of the universe
Sloan Digital Sky Survey
Source: University of Utah
Joel Brownstein
University of Utah
Globus is enabling…
Development
of numerical
simulations of
severe storms
for improved
responsiveness
to weather
events
Weather Research and Forecasting Model
Source: UCAR
Ann Syrowski
University of Illinois
Globus is enabling…
Pediatric brain
research by
enhancing
analysis of
genetic material
in pursuit of the
underlying
cause
Communication impairment by genetic variants
Source: Wikimedia Commons
William Dobyns
U. Washington
Globus is increasingly used to build
campus-wide data services
Source: University of Nebraska
Holland Computing Center
Enable campus computing
facilities to better utilize
high performance network
infrastructure
Typical deployment
Science
DMZ
+
Globus
[Figure: UNL Science DMZ network diagram. Labels: Omaha Core and Lincoln Core routers; Lincoln Core Switch (CMS and HPC clusters); Holland Computing Center; 100 Gigabit capable East and West Campus border routers; Internet2 via GPN and via CIC (100 Gigabit composite traffic); WDM; campus networks (firewalls + IDS); campus network researchers; DYNES equipment; 10x CMS data transfer nodes; Omaha HPC clusters; perfSONAR and Bro IDS additions; Center for Brain Imaging and Behavior; future redundant I2 path (2015+); links of 10, 2x 10, 4x 10, 10x 10, and 100 Gigabit.]
Source:
University of Nebraska
Holland Computing Center
Instruments are increasingly driving the
need for broader data service deployments
Next Gen Sequencer
Light Sheet Microscope
MRI
Advanced Light Source
Globus enables users to manage data as
research requirements scale up or down
Research Computing HPC Cluster
Lab Server
Campus Home Filesystem
Desktop Workstation
Personal Laptop
XSEDE Resource
Public Cloud
Globus product
development
highlights in 2013-14
Sharing generally available
Much improved Web UI
Globus Connect Server
• Native RPM and Debian packaging
• Improved configuration management
• Multi-server setup
• OAuth support
Management console: “Flight Control”
Amazon S3 Endpoints
85
U.S. campuses
We are a non-profit, delivering a
production-grade service to the
non-profit research community
Our challenge:
Sustainability
Globus Provider Subscriptions
• Managed Endpoints
– Priority support
– Management console
– Usage reports
– Mass Storage System optimization
– Host shared endpoints
– Integration support
• Plus Subscriptions
– Create and manage shared endpoints
– Personal transfers
• Branded Web Site
• Alternate Identity Provider (InCommon is standard)
https://www.globus.org/provider-plans
NET+ Globus
• Internet2 members get discounted
Globus Provider subscriptions
• Completing “Service Validation” phase
– Sponsors:
Cornell, U. Michigan, Yale, U. Missouri, and
U. Chicago
• Available to “Early Adopters” soon
Bridging the gap to sustainability
• $500,000 from Sloan Foundation
• Recognition of what it takes to
“cross the chasm”
• Funds non-R&D
activities
– User Support
– Operations
– Marketing
Globus Behind the Scenes
Identity, Group, Profile
Management Services
…
Sharing Service
Transfer Service
Globus Toolkit
Globus Connect
Globus Platform-as-a-Service
Identity, Group, Profile
Management Services
…
Sharing Service
Transfer Service
Globus Toolkit
Globus APIs
Globus Connect
globus
genomics
Flexible, scalable, affordable
genomics analysis
for all biologists
+
Data management
PaaS
Next-gen sequence
analysis SaaS
+
Scalable IaaS
Globus Genomics on AWS
Exome: $3 – $20
Whole Genome: $20 – $50
RNA-Seq: <$5
Alternatives cost 10-20x as much
Dobyns Lab
Exome analysis
20x speed-up
Next: 50x
Cox Lab
Consensus variant calling
134 samples; 4 days
<0.01% Mendel error rate
Next: 13,000 samples
Campus Data Service User Stories
• “I need a good place to store / backup / archive
my (big) research data, at a reasonable price.”
• “I need to easily, quickly, and reliably move or
mirror portions of my data to other places.”
• “I need a way to easily and securely share my
data with my colleagues at other institutions.”
• “I want to publish my data.”
• “I want to discover published data.”
An all-too-familiar tale …
Data is:
Identified
Described
Curated
Verifiable
Accessible
Preserved
What does it mean to publish?
I can:
Search
Browse
Access
the data
What does it mean to discover?
Globus
data
publication
services
Announcing…
Metadata
Access Control
License
Storage
Curation
Workflow
Policies
Collection
Teeing Up a Few Terms …
[Diagram: three Datasets, each pairing Data with Metadata, grouped within a Community.]
Argonne Storage
Univ. of Chicago Argonne IIT UIUC
Demo Scenario
Scientist, Argonne Curator, Shared Endpoint
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
5. Search
6. Download
Login with Campus or Globus Identity
Start a New Submission
Describe Submission
Dublin Core + Scientific Metadata
Assemble Dataset and Transfer to
Submission Endpoint
Grant Submission License
Recap: Globus Data Publication
• SaaS for publishing large research data
• Bring your own storage
• Extensible metadata
• Publication and curation workflows
• Public and restricted collections
• Rich discovery model
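The recap bullets (extensible metadata, curation workflows, public and restricted collections, discovery) can be pictured as a small data model. What follows is a rough sketch, not the Globus data publication service: the class names, the state names, the choice of three required Dublin Core elements, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumed baseline: three Dublin Core elements every submission must fill in.
REQUIRED_FIELDS = ("title", "creator", "date")


@dataclass
class Dataset:
    """Data plus the metadata that describes it."""
    files: List[str]
    metadata: Dict[str, str]
    state: str = "SUBMITTED"


@dataclass
class Collection:
    """Hypothetical collection with its own metadata and access policy."""
    name: str
    restricted: bool = False
    extra_fields: Tuple[str, ...] = ()  # extensible, domain-specific metadata
    datasets: List[Dataset] = field(default_factory=list)

    def missing_fields(self, dataset: Dataset) -> List[str]:
        required = REQUIRED_FIELDS + tuple(self.extra_fields)
        return [name for name in required if not dataset.metadata.get(name)]

    def submit(self, dataset: Dataset) -> None:
        """Describe step: reject submissions with incomplete metadata."""
        missing = self.missing_fields(dataset)
        if missing:
            raise ValueError(f"missing metadata: {', '.join(missing)}")
        self.datasets.append(dataset)

    def curate(self, dataset: Dataset, approve: bool) -> None:
        """Curation step: a curator publishes or rejects the submission."""
        dataset.state = "PUBLISHED" if approve else "REJECTED"

    def search(self, term: str) -> List[Dataset]:
        """Discovery step: match the term against published metadata only."""
        term = term.lower()
        return [d for d in self.datasets
                if d.state == "PUBLISHED"
                and any(term in value.lower() for value in d.metadata.values())]


# Made-up sample values, for illustration only.
collection = Collection(name="demo-collection", extra_fields=("instrument",))
dataset = Dataset(
    files=["scan_001.h5"],
    metadata={"title": "Tomography scan", "creator": "A. Scientist",
              "date": "2014-05-07", "instrument": "microscope-1"},
)
collection.submit(dataset)
before = collection.search("tomography")   # not yet curated
collection.curate(dataset, approve=True)
after = collection.search("tomography")    # now discoverable
print(len(before), len(after))
```

Keeping the extra metadata fields on the collection, rather than on each dataset, is one way to let each community extend the schema without changing the shared baseline.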
Curation Workflow
Submission is now Published with DOI
Search Published Datasets by
Collection
Search Published Datasets across
Collections
Discovering a Published Dataset
Find the Published Dataset
Download the Published Dataset
Locally Downloaded Dataset
Looking for 3-5 early adopters
Summer:
Use and
provide
feedback
on alpha
Fall:
Test beta on
your campus
Winter:
Celebrate
General
Availability
Spring:
Tell us about it
at GlobusWorld
2015!
Thank you to our sponsors!
U.S. Department of Energy

Natural Polymer Based NanomaterialsAArockiyaNisha
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
 

Último (20)

GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 

Globus status and publication plans

Editor's Notes

  1. Review what the Globus team has done over the past year. Announce an exciting new capability.
  2. Joel Brownstein is the data archivist of the Sloan Digital Sky Survey-IV. Transfers daily telescope observations to the University of Utah, where they have a large cluster to run their various data reduction pipelines. Uses the Globus command-line interface within their Python API. Joel has moved more than 70 TB of data so far.
  3. Ann develops numerical simulations of severe storms using the Weather Research and Forecasting (WRF) model. Uses several HPC facilities throughout the country. Moved more than 100 TB of data using Globus, 50 TB last January alone! Moves data between various XSEDE resources, NCSA's mass storage system, and PSC's data archiver.
  4. Collects tissue samples from young patients and their families and then extracts, sequences, and analyzes the genetic material to understand the underlying cause of disease. Uses Globus to move NGS data to and from public clouds where he runs analysis pipelines. More on Bill’s work later in this talk (under Globus Genomics).
  5. Can use standard tools such as apt and yum to deploy. Uses a configuration file. Allows incremental config changes. Multiple I/O nodes. ID node (MyProxy). Web node (OAuth).
  6. Allows site administrators to monitor traffic to and from their site. Ultimately will allow for control.
  7. Geoffrey Moore
  8. Highlight CI Connect. Highlight XSEDE’s planned adoption of user, group, and profile management.
  9. Highlight CI Connect; coming up in Rob Gardner’s talk. Highlight XSEDE’s planned adoption of user, group, and profile management.
  10. Competitive TCO. Alternatives are campus computing cores and commercial sequence analysis services.
  11. Collection is a set of Datasets. Dataset is data + metadata. Collection is within a Community. Policies on a Collection: metadata, access control, curation workflow, license, storage.
  12. Demo scenario: A scientist, referred to throughout as “the Scientist” and associated with the user Blaiszik, has just published a paper on his nanoscale materials research. He now wants to publish the data associated with this publication. Using the Globus publication system, he selects the Argonne community and the Center for Nanoscale Materials (CNM) collection, and chooses to publish his data. He describes the submission with both publication (Dublin Core) and scientific metadata. The CNM collection has been preconfigured with its own storage provided at Argonne. As part of this submission, a unique endpoint is created for “the Scientist” so that only he can write to it. “The Scientist” assembles his dataset on this endpoint by transferring files from one or more locations. He can assemble this dataset over a long period of time and can return to the submission workflow when he is happy with the submission. The CNM collection has also been preconfigured with a workflow requiring that an Argonne curator approve the submission. A curator, referred to throughout as “the Curator” and associated with the user Chard, is able to view and edit the metadata and files of the dataset. Once approved, the submission is published in the CNM collection with a DOI. Other users (with permission to view the collection) can then discover published datasets by their DOI or by using the Globus discovery interface to find datasets by their metadata. These users can choose to browse published datasets and download datasets to other resources (including local resources).
  13. Users can log in using any of their linked Globus identities, e.g., campus credentials (via InCommon), Google account, XSEDE account, ...
  14. The first step of submission is to select a collection. In this case “the Scientist” selects the “Center for Nanoscale Materials”, as this is the department through which he conducted his research. Note: “the Scientist” can only see collections he is allowed to publish to.
  15. “The Scientist” must first describe the dataset he is publishing. Two types of metadata are required for submission to the CNM collection: 1) Dublin Core and 2) scientific metadata. These metadata requirements are defined by the collection and can be configured depending on the domain. Additional pages can also be defined. Here, “the Scientist” enters information about the authors, their ORCIDs (unique researcher identifiers), the submission title, the date of publication, the accompanying publication to which this dataset is related, and the DOI for that publication. Note: “the Scientist” has missed an ORCID for one of his co-authors.
  16. Using the familiar Globus interface, “the Scientist” is able to select files from multiple sources and transfer them to his unique submission endpoint (publish#submission_11). This submission endpoint is created on shared Argonne storage resources, but is initially accessible only to “the Scientist”. The dataset may be assembled over any period of time. “The Scientist” can create new files and folders on the endpoint and can arrange these files in any hierarchy. At the completion of the submission, the permissions on the endpoint will be changed such that the dataset is immutable. “The Scientist” will be given read access to the dataset; collection curators will also be given read access to the data so that they can view the contents.
  17. Having verified the submission, “the Scientist” must grant the submission license. This license is again configured by the collection (i.e., each collection can customize its own license), and allows the submitting user to grant rights to the collection (CNM) and the Globus system to manage and disseminate the dataset based on the agreed-upon policies.
  18. The Argonne CNM collection has defined a workflow that requires a curator to view and approve all submissions. The curation workflow enables the curator to view the submitted files and to edit the submitted metadata.
  19. At this point, the dataset is published in the collection with a unique DOI (a handle in this case) that other researchers can use to reference it. Access to the dataset (both metadata and files) is changed to reflect the policies of the collection. Access may be restricted to particular users or groups of users, or it may be made public for any user to access.
  20. “The Researcher” chooses to search for all published data in the CNM collection. The results show a brief summary of each published dataset including information about the publication time, collection, summary of number of files, name, authors, description and a set of keyword tags as well as key-value tags. Each of these fields can be used to search for a particular dataset.
  21. Knowing that other collections may well have datasets of interest, “The Researcher” may broaden the search context to all accessible collections and search for datasets related to “Li-ion” and “autonomic”. Here, the results show datasets from two collections: the CNM and the Chemical Sciences and Engineering collection (red boxes). Results are ranked according to their relevance to the search.
  22. Going further, “The Researcher” can use different query types, such as key-value and range queries. In this case, “The Researcher” searches for energy density > 1500 and microcapsules, and finds the dataset previously published in this demo, whose key-value pair energy-density:2000 satisfies the range query.
  23. Having found the desired published dataset, “The Researcher” can navigate to the summary page.
  24. The summary page shows a summary of the dataset and the list of files. “The Researcher” can choose to download individual files, browse the dataset using Globus, or download the entire dataset. Ability to view the dataset and download files is governed by the access control on the collection and permissions associated with “The Researcher”.
  25. Finally, “The Researcher” can view the downloaded dataset on their desktop PC.
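
The data model in note 11 (a Collection is a set of Datasets within a Community, with policies such as a curation workflow) and the combined keyword and range query in note 22 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Globus publication API: the class names, the `publish` and `search` helpers, the dataset title, the file name, and the DOI string are all hypothetical placeholders. Only the collection name, the "microcapsules" keyword, and the energy-density values (2000 stored, > 1500 queried) come from the notes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Dataset:
    """A dataset is data plus metadata."""
    title: str
    files: List[str]
    keywords: List[str] = field(default_factory=list)
    key_values: Dict[str, float] = field(default_factory=dict)
    doi: Optional[str] = None  # set only once the dataset is published


@dataclass
class Collection:
    """A collection is a set of datasets within a community."""
    name: str
    community: str
    requires_curation: bool = True  # policy: a curator must approve
    datasets: List[Dataset] = field(default_factory=list)

    def publish(self, dataset: Dataset, doi: str, curator_approved: bool) -> None:
        # Enforce the collection's curation policy before publishing.
        if self.requires_curation and not curator_approved:
            raise PermissionError("submission awaits curator approval")
        dataset.doi = doi
        self.datasets.append(dataset)


def search(collections: List[Collection], keyword: str,
           key: str, minimum: float) -> List[Dataset]:
    """Keyword match combined with a range query (value > minimum)."""
    return [
        d
        for c in collections
        for d in c.datasets
        if keyword in d.keywords
        and d.key_values.get(key, float("-inf")) > minimum
    ]


cnm = Collection(name="Center for Nanoscale Materials", community="Argonne")
submission = Dataset(
    title="Example submission",   # hypothetical title
    files=["data.csv"],           # hypothetical file name
    keywords=["microcapsules"],
    key_values={"energy-density": 2000},
)
# The DOI below is a placeholder, not a real identifier.
cnm.publish(submission, doi="10.0000/placeholder", curator_approved=True)

hits = search([cnm], keyword="microcapsules", key="energy-density", minimum=1500)
print([d.title for d in hits])
```

Keeping the curation requirement as a flag on the collection mirrors the notes' point that policies are configured per collection rather than per dataset.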