SlideShare uma empresa Scribd logo
1 de 80
Baixar para ler offline
Enabling Secure Data Discoverability
Rachana Ananthakrishnan – ranantha@uchicago.edu
Brigitte Raumann - braumann@uchicago.edu
Vas Vasiliadis - vas@uchicago.edu
November 14, 2021
Agenda
• Introduction and Motivation
• The Modern Research Data Portal Design Pattern
• Brokering Access to Services using Globus Auth
• Making Data Findable with Globus Search
• Deploying a Simple Research Data Portal
• Enabling Secure Data Collaboration
• Making Data Discoverable at Scale
- Hands-on exercise
- Live demonstration
Introduction and Motivation
What’s the common theme?
4
The brilliance “arms race”...
K. Wille, The Physics of Particle Accelerators: An Introduction, Oxford University Press, Oxford, UK (2000); J. B. Parise and G. E. Brown, Jr., Elements, 2, 37-42 (2006)
Some challenges…
• Increasing data rates, heterogeneity
• Continuum of computing resources
• Differing workflows across instruments
The state of research data management
How do we...
...search?
...discover?
…share?
Distribution Store
Data Portal
Advanced Computing Facility
Instrument Facility
A common data flow pattern
Image Analysis
3
Search/Discovery
5
Science!
6
Imaging
1 Acquisition
2
Description/Identification
4
v
Globus services for research data management
Unified Data Access Data Transfer and Sharing Platform-as-a-Service
Reliable Automation Publication & Discovery Remote Execution (future)
The Modern Research Data
Portal Design Pattern
docs.globus.org/mrdp
Why we use portals
• Different experiments (beamlines, electron
microscopes, biology, etc) generate data with
different types, size and experimental information
• Processing, curation, and cataloguing need to
happen as soon as possible so data are not lost
• Standardize secure access between users
• Work toward FAIR datasets to enable more science
Advantages of a portal
• Make data FAIRer
• Track lots of (heterogeneous) data
• Facilitate discovery
– Free text search in Globus Search
– Filtering on specific values
– User Friendly GUI
• Enforce appropriate access controls
– Public/private, group-, subject-level ACLs
• Integrate with other (Globus) services
• Customize for your research environment
MRDP: Key elements
Science DMZ
Fast, clean data path
Data Transfer Nodes
Purpose-built data movers
Globus Platform
Secure, reliable data
orchestration
Globus Connect
Storage system enabler
14
Globus Portal
Framework
Data discovery and access
…makes your
storage system a
Globus endpoint
Globus Connectors support diverse systems
What’s wrong with my LRDP?
17
L(egacy)RDP architecture
18
Source: ESnet Science Engagement team
MRDP network architecture
19
Source:
ESnet
Science
Engagement
team
Globus
Platform
Services
Relevant Globus platform capabilities
• Data transfer and sharing
• Data description (metadata) and discovery
• Data (and compute) task orchestration
• Authentication and Authorization
21
Brokering Access to
Services using Globus Auth
Globus Auth: Foundational IAM service
Brokers authentication and authorization among…
– End-users
– Identity providers: enterprise, external (federated identities)
– Services: resource servers with REST APIs
– Apps: web, mobile, desktop, command line clients
– Services acting as clients to other services
• OAuth 2.0 Authorization Framework (a.k.a. OAuth2)
• OpenID Connect Core 1.0 (a.k.a. OIDC)
23
Several authentication models supported
• Application acting as user with consent
– Authorization code grant
• Application authenticating as itself
– Client credentials grant
• Application able to manage tokens for offline or long
running tasks
– Refresh tokens
Data transfer and sharing
• Move data to collection à Submit Transfer task
• Make data accessible à Set guest collection access rule
• Grant user/app access à Add/confirm Group membership
25
Groups
service
Transfer
service
GET /groups/my_groups
POST /endpoint/{endpoint_id}/access
POST /transfer
Using guest collections in your data portal
• Create a guest collection; requires authentication
– Cannot be completely automated – must ”log in”
– Create once and automate rest of the steps
• Grant the application Access Manager role
– Allows the application to manage permissions on the collection
– Set for application identity: appclientid@clients.auth.globus.org
• Grant roles for management of endpoint and tasks
Accessing Globus
platform services
27
jupyter.demo.globus.org
Making Data Findable with
Globus Search
Data description and discovery
• Metadata store with fine-
grained visibility controls
• Schema agnostic
à dynamic schemas
• Simple search using URL
query parameters
• Complex search using
search request document
29
docs.globus.org/api/search
Search
Index
Distinct access policies
may be applied to
Data and Metadata
…(ideally) using
permissions on
guest collections
…using
permissions on
metadata elements
Data ingest with Globus Search
31
Search
Index
POST /index/{index_id}/ingest'
{
"ingest_type": "GMetaList",
"ingest_data": {
"gmeta": [
{
"id": "filetype",
"subject”: "https://search.api.globus.org/abc.txt",
"visible_to": ["public"],
"content": {
"metadata-schema/file#type": "file”
}
},
...
]
}
- Bulk create and update
- Task model for ingest at scale
Data ingest with Globus Search
32
Search
Index
POST /index/{index_id}/ingest'
{
"ingest_type": "GMetaList",
"ingest_data": {
"gmeta": [
{
"id": ”weight",
"subject": "https://search.api.globus.org/abc.txt",
"visible_to": ["urn:globus:auth:identity:46bd0f56-
e24f-11e5-a510-131bef46955c"],
"content": {
"metadata-schema/file#size": ”37.6",
"metadata-schema/file#size_human": ”<50lb”
}
},
...
]
}
Visibility limited to Globus Auth identity
- Single user
- Globus Group
- Registered client application
Data discovery with Globus Search
33
{
"@datatype": "GSearchResult",
"@version": "2017-09-01",
"count": 1,
"gmeta": [
{
"@datatype": "GMetaResult",
"@version": "2019-08-27",
"entries": [
{ ... }
],
"subject": "https://..."
}
],
"offset": 0,
"total": 1
}
GET /index/{index_id}/search?q=type%3Ahdf5
Search
Index
Simple query
Data discovery with Globus Search
34
POST /index/{index_id}/search
Search
Index
Complex query
{
"filters": [
{
"type": "range",
"field_name": ”pubdate",
"values": [
{
"from": "*",
"to": "2020-12-31"
}
]
}
],
"facets": [
{
"name": "Publication Date",
"field_name": "pubdate",
...
}
]
}
Filter
Facets
Boosts
Sort
Limit
Cancer Registry Records for Research (CR3)
• Create network of federated cancer registries
– Deploy similar infrastructure at other cancer registries
– Enable queries across multiple registries
• Federation via Globus: network scale ßà local control
– Data owners input/export data sets, apply QC, set access policies
– Registry data remain at the institution where they were generated
– Identities are provided/authenticated by the institution, not Globus
– System scale depends on data owners providing storage resources
CR3 requirements
• Search Index
– Only de-identified data in search index
– No record-level for researchers
• Portal
– Fine-grained access control
– Researchers must use a specific identity
– Access must be logged
– Render graphs based on search results
– Faceted search in real time
CR3
Discovery
Portal
Cohort
aggregate
counts
Login with
UPMC/Pitt
credentials
Globus
Search (GS)
Globus
Auth (GA)
UPMC/Pitt
Identity
Providers
Authentication
Auth
initiated to
GA
Cohort
search
initiated to
GS
Researcher
Cohort
aggregate
counts
returned
CR3 Architecture
Globus
Transfer (GT)
Registry Staff
Data transfer from registrar to
researcher mediated by GT
Manage
authorization
Elasticsearch
Request
Service
Cancer Registry De-identified
Data Index (minimal criteria
data: e.g., staging)
SEER Registry
Medical Center Registry
State Registry
SEER Registry
Medical Center Registry
State Registry
CR3 Portal (simulated data)
Federated logon using Globus Auth
with Pitt/UPMC as identity providers
Dynamically updating
charts as facets change
Variable facets based on
source registry index
Google-like text search with
facets for filtering
Developed using a framework based
on the Globus Modern Research
Data Portal* design pattern
(docs.globus.org/mrdp)
* PeerJ Articles:cs-144 https://peerj.com/articles/cs-144/
Working with Globus
Search
39
jupyter.demo.globus.org
Ingesting search
metadata
40
github.com/globus/searchable-files-demo
Deploying a Simple
(but fully functional and extensible)
Research Data Portal
Globus Search
Evolving the MRDP design pattern
Enabling discoverability:
MRDP + Faceted Search
Input form
Automated
Extraction
Ingest metadata, set
visibility policies
Bulk ingest
MRDP
Implementation
• Django framework with integrated Globus services
• Python package bootstraps deployment
– Install from PyPi using pip
– github.com/globus/django-globus-portal-framework
• Reference Globus Portal Framework deployment
ALCF Community Data Co-Op: acdc.alcf.anl.gov
Exemplar portal
The ALCF Data Co-op
44
acdc.alcf.anl.gov/
Portal Core Functionality
• User authentication
• Django-based framework
– Portal URL mappings
– Token loading
• Service calls to Globus Search
• Manage request lifecycle
• Post process search requests
User authentication
• Scopes are configured in the portal
• Users authenticate with Globus using standard flow
– Python Social Auth used for Authentication backend
• User tokens are saved in the database
• Future requests authorized with user access tokens
– Searches use Search bearer token
Portal service calls use the Globus SDK
• Globus Portal Framework loads tokens from database
• Globus service object instantiated with token
• Call to Globus service(s)
• Portal renders result in templates
Globus Portal Framework URLs
• URLs span three categories
– Index Selection
– Index Search page
– Search Subject detail page
• Supports multiple Globus Search indices
• Search page links to multiple result subjects
• Each subject has a unique URL
Format of a URL
An index is configuration driven
• A Search index is configured in portal settings
• Add Globus Search index UUID
• Add a name
• Add facets
• Add fields
• Start searching!
Lifecycle of a request
• User makes a query
• Portal sends request to Globus Search
– Request contains user bearer token
• Portal receives response
• Portal does processing on response
– Parse Dates, build URL for Globus webapp, etc.
• Portal renders data into templates
• User receives a search page
Securing Apps with Globus Auth
• Select the appropriate standard OAuth flow:
• Native App (with refresh tokens – extend expiration)
– Clients can’t keep a secret; tokens stored in plain text
– Authenticate as user identity à get authorization code
– Examples: Jupyter Notebooks, Globus Timer CLI
• Authorization Code Grant
– Tokens stored securely
– Authenticate as user identity à auth code returned via callback
– Examples: JupyterHub secured with Globus Auth
• Confidential Client
– ClientID and Secret stored securely
– Authenticate as application
– Example: Data portal, custom webapps
52
Authorization Code Grant
53
Client
(Web Portal,
Application)
Globus service
(Resource Server)
Globus Auth
(Authorization Server)
5. Authenticate using client id and
secret, send authorization code
Browser (User)
1. Access
portal
2. Redirect
user
3. User authenticates
and consents
4. Authorization
code
6. Access token(s)
7. Authenticate with access token(s),
giving client authority to invoke the
requested service
Identity
Provider
Client credential grant
54
1. Authenticate with app
client id and secret
2. Access Tokens
Application,
Science Gateway,
Data Portal
(Client)
3. Authenticate as app
with access tokens to invoke
service (on behalf of authorized
user, within a given scope)
Globus Transfer
(Resource Server)
Globus Auth
(Authorization Server)
Creating your
own data portal
55
bit.ly/globus-sc21
Source: github.com/globus/django-globus-portal-framework
Docs: django-globus-portal-framework.readthedocs.io/en/stable/
Step 0: Application registration
• Set callback URL
• Get client ID and secret
• Consents implement
least privileges principle
56
developers.globus.org
Portal deployment
• (Already available on your EC2 instance)
• Clone the repo
• Configure settings
• For production use, add robust WSGI/ASGI server
• Future: containers
Portal configuration
• Review settings.py and check that portal runs
• Update settings.py
– Add search index to SEARCH_INDEXES
– Include search scope in Globus Auth setup
• Configure search (facets, fields, …)
• Optional: Configure templates
Enabling Secure Data
Collaboration
Globus Groups simplify permissions management
• Grant group access to
collection(s)
• Restrict search visibility
using group
• Make portal client a group
administrator
• Check authenticated user’s
group membership
• Add/remove user to/from
group
Adding a search
index to the portal
61
Making Data Discoverable at
Scale
Globus Automation Capabilities
Timer Service
Scheduled and recurring transfers
(a.k.a. Globus cron)
Command Line Interface
Ad hoc scripting and integration
Globus Flows service
Comprehensive task (data and
compute) orchestration with human in
the loop interactions
Globus Timer Service
Use case: Data replication
• For backup: initiated by user or system back up
• Automated transfer of data from science instrument
65
Recurring transfers
with sync option
Copy /ingest
Daily @ 3:30am
The Globus Timer service
• Scheduled/recurring file transfers
• Supports all Globus transfer and sync options
• Service with a command line interface
• Example: NIH – hpc.nih.gov/storage/globus_cron.html
66
Using the Globus Timer service
67
$ globus–timer session {login, logout, whoami}
$ globus–timer job transfer 
--name example–job 
--label "Timer Transfer Job" 
--interval 28800 
--start '2020–01–01T12:34:56' 
--source–endpoint ddb59aef–6d04–11e5–ba46–22000b92c6ec 
--dest–endpoint ddb59af0–6d04–11e5–ba46–22000b92c6ec 
--item ~/file1.txt ~/new_file1.txt false 
--item ~/file2.txt ~/new_file2.txt false
Timer options in
the Globus
webapp
Scheduled transfers
to data portal
endpoint(s)
Globus Timer CLI: pypi.org/project/globus-timer-cli
69
Globus Command Line
Interface (CLI)
Globus Command Line Interface
Open source, uses
the Python SDK
Automated
management of
data portal endpoint(s)
72
Globus SDK: globus-sdk python.readthedocs.io/en/stable/
Globus CLI: docs.globus.org/cli/
Globus Flows Service
Managed automation of tasks
• Flows: A platform service for defining, applying, and
sharing distributed research automation flows
• Flows comprise Actions
• Action Providers: Called by Flows to perform tasks
• Triggers*: Start flows based on events
* In development
Automation with Globus Flows
• Built on AWS Step Functions
– Simple JSON-based state machine
language
– Conditions, loops, fault tolerance, etc.
– Propagates state through the flow
• Standardized API for integrating
custom event and action services
– Actions: synchronous or asynchronous
– Custom Web forms prompt for user input
• Actions secured with Globus Auth
Extending the ecosystem: Action providers
76
• Action Provider is a
service endpoint
– Run
– Status
– Cancel
– Release
– Resume
• Action Provider Toolkit
action-provider-
tools.readthedocs.io/en/latest
Search
Transfer
Notification
ACLs Identifier
Delete
Ingest
User
Form
Describe Xtract
funcX Web
Form
Custom built
Globus Provided
Automation services ecosystem
GET /provider_url/
POST /provider_url/run
GET /provider_url/action_id/status
GET /provider_url/action_id/cancel
GET /provider_url/action_id/status
Create Action
Providers
Define and
deploy flows
{ “StartAt”: ”ToProject”,
”States” : {
”ToProject” : { … },
”SetPermission” : { …},
“ProcessData” : { … } … }}
Run flows
Working with
Globus Flows
Try it: demo.gladier.org/gladier-demo/upload-file
Run flows: app.globus.org/flows/library
Docs: docs.globus.org/globus-automation-services
78
Coming soon: Globus Trigger service
• Trigger–Action platform
• Predefined triggers and
actions to create rules
• Globus processes triggers
and reliably executes actions
bit.ly/globus-sc21
docs.globus.org
github.com/globus
outreach@globus.org
support@globus.org

Mais conteúdo relacionado

Mais procurados

Automating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with GlobusAutomating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with Globus
Globus
 

Mais procurados (20)

Automating Research Data Flows with Globus (CHPC 2019 - South Africa)
Automating Research Data Flows with Globus (CHPC 2019 - South Africa)Automating Research Data Flows with Globus (CHPC 2019 - South Africa)
Automating Research Data Flows with Globus (CHPC 2019 - South Africa)
 
Connecting Your System to Globus (APS Workshop)
Connecting Your System to Globus (APS Workshop)Connecting Your System to Globus (APS Workshop)
Connecting Your System to Globus (APS Workshop)
 
Automating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with GlobusAutomating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with Globus
 
Gateways 2020 Tutorial - Instrument Data Distribution with Globus
Gateways 2020 Tutorial - Instrument Data Distribution with GlobusGateways 2020 Tutorial - Instrument Data Distribution with Globus
Gateways 2020 Tutorial - Instrument Data Distribution with Globus
 
Gateways 2020 Tutorial - Automated Data Ingest and Search with Globus
Gateways 2020 Tutorial - Automated Data Ingest and Search with GlobusGateways 2020 Tutorial - Automated Data Ingest and Search with Globus
Gateways 2020 Tutorial - Automated Data Ingest and Search with Globus
 
Gateways 2020 Tutorial - Large Scale Data Transfer with Globus
Gateways 2020 Tutorial - Large Scale Data Transfer with GlobusGateways 2020 Tutorial - Large Scale Data Transfer with Globus
Gateways 2020 Tutorial - Large Scale Data Transfer with Globus
 
"What's New With Globus" Webinar: Spring 2018
"What's New With Globus" Webinar: Spring 2018"What's New With Globus" Webinar: Spring 2018
"What's New With Globus" Webinar: Spring 2018
 
Data Orchestration at Scale (GlobusWorld Tour West)
Data Orchestration at Scale (GlobusWorld Tour West)Data Orchestration at Scale (GlobusWorld Tour West)
Data Orchestration at Scale (GlobusWorld Tour West)
 
Globus Portal Framework (APS Workshop)
Globus Portal Framework (APS Workshop)Globus Portal Framework (APS Workshop)
Globus Portal Framework (APS Workshop)
 
GlobusWorld 2021 Tutorial: Building with the Globus Platform
GlobusWorld 2021 Tutorial: Building with the Globus PlatformGlobusWorld 2021 Tutorial: Building with the Globus Platform
GlobusWorld 2021 Tutorial: Building with the Globus Platform
 
Instrument Data Orchestration with Globus Search and Flows
Instrument Data Orchestration with Globus Search and FlowsInstrument Data Orchestration with Globus Search and Flows
Instrument Data Orchestration with Globus Search and Flows
 
Introduction to the Globus Platform (APS Workshop)
Introduction to the Globus Platform (APS Workshop)Introduction to the Globus Platform (APS Workshop)
Introduction to the Globus Platform (APS Workshop)
 
Gateways 2020 Tutorial - Introduction to Globus
Gateways 2020 Tutorial - Introduction to GlobusGateways 2020 Tutorial - Introduction to Globus
Gateways 2020 Tutorial - Introduction to Globus
 
GlobusWorld 2020 Keynote
GlobusWorld 2020 KeynoteGlobusWorld 2020 Keynote
GlobusWorld 2020 Keynote
 
NIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasNIH Data Commons Architecture Ideas
NIH Data Commons Architecture Ideas
 
Globus: Beyond File Transfer
Globus: Beyond File TransferGlobus: Beyond File Transfer
Globus: Beyond File Transfer
 
Mining a Large Web Corpus
Mining a Large Web CorpusMining a Large Web Corpus
Mining a Large Web Corpus
 
Simplified Research Data Management with the Globus Platform
Simplified Research Data Management with the Globus PlatformSimplified Research Data Management with the Globus Platform
Simplified Research Data Management with the Globus Platform
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of Science
 
20160922 Materials Data Facility TMS Webinar
20160922 Materials Data Facility TMS Webinar20160922 Materials Data Facility TMS Webinar
20160922 Materials Data Facility TMS Webinar
 

Semelhante a Enabling Secure Data Discoverability (SC21 Tutorial)

Advanced Computing Meets Data FAIRness
Advanced Computing Meets Data FAIRnessAdvanced Computing Meets Data FAIRness
Advanced Computing Meets Data FAIRness
Globus
 
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
Ian Foster
 
Data in Context Interest Group Sessions @ RDA 3rd Plenary, Dublin (March 26-2...
Data in Context Interest Group Sessions @ RDA 3rd Plenary, Dublin (March 26-2...Data in Context Interest Group Sessions @ RDA 3rd Plenary, Dublin (March 26-2...
Data in Context Interest Group Sessions @ RDA 3rd Plenary, Dublin (March 26-2...
Brigitte Jörg
 

Semelhante a Enabling Secure Data Discoverability (SC21 Tutorial) (20)

Advanced Computing Meets Data FAIRness
Advanced Computing Meets Data FAIRnessAdvanced Computing Meets Data FAIRness
Advanced Computing Meets Data FAIRness
 
Building Data Portals and Science Gateways with Globus
Building Data Portals and Science Gateways with GlobusBuilding Data Portals and Science Gateways with Globus
Building Data Portals and Science Gateways with Globus
 
Building Research Applications with Globus PaaS
Building Research Applications with Globus PaaSBuilding Research Applications with Globus PaaS
Building Research Applications with Globus PaaS
 
Scalable Data Management: Automation and the Modern Research Data Portal
Scalable Data Management: Automation and the Modern Research Data PortalScalable Data Management: Automation and the Modern Research Data Portal
Scalable Data Management: Automation and the Modern Research Data Portal
 
Working with Globus Platform Services
Working with Globus Platform ServicesWorking with Globus Platform Services
Working with Globus Platform Services
 
Working with Globus Platform Services and Portals
Working with Globus Platform Services and PortalsWorking with Globus Platform Services and Portals
Working with Globus Platform Services and Portals
 
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
 
Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)
Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)
Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)
 
Leveraging the Globus Platform (GlobusWorld Tour - Columbia University)
Leveraging the Globus Platform (GlobusWorld Tour - Columbia University)Leveraging the Globus Platform (GlobusWorld Tour - Columbia University)
Leveraging the Globus Platform (GlobusWorld Tour - Columbia University)
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data Mesh
 
"Data in Context" IG sessions @ RDA 3rd Plenary
"Data in Context" IG sessions @  RDA 3rd Plenary"Data in Context" IG sessions @  RDA 3rd Plenary
"Data in Context" IG sessions @ RDA 3rd Plenary
 
Data in Context Interest Group Sessions @ RDA 3rd Plenary, Dublin (March 26-2...
Data in Context Interest Group Sessions @ RDA 3rd Plenary, Dublin (March 26-2...Data in Context Interest Group Sessions @ RDA 3rd Plenary, Dublin (March 26-2...
Data in Context Interest Group Sessions @ RDA 3rd Plenary, Dublin (March 26-2...
 
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMeshThe Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
 
CNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data CommonsCNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data Commons
 
Publishing and Serving Machine Learning Models with DLHub
Publishing and Serving Machine Learning Models with DLHubPublishing and Serving Machine Learning Models with DLHub
Publishing and Serving Machine Learning Models with DLHub
 
Data council sf amundsen presentation
Data council sf    amundsen presentationData council sf    amundsen presentation
Data council sf amundsen presentation
 
Introduction to Globus for New Users
Introduction to Globus for New UsersIntroduction to Globus for New Users
Introduction to Globus for New Users
 
RO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research ObjectsRO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research Objects
 
Meetup SF - Amundsen
Meetup SF  -  AmundsenMeetup SF  -  Amundsen
Meetup SF - Amundsen
 
Building a modern in-house analytics pipeline
Building a modern in-house analytics pipelineBuilding a modern in-house analytics pipeline
Building a modern in-house analytics pipeline
 

Mais de Globus

Mais de Globus (20)

Advanced Globus System Administration Topics
Advanced Globus System Administration TopicsAdvanced Globus System Administration Topics
Advanced Globus System Administration Topics
 
Instrument Data Automation: The Life of a Flow
Instrument Data Automation: The Life of a FlowInstrument Data Automation: The Life of a Flow
Instrument Data Automation: The Life of a Flow
 
Reliable, Remote Computation at All Scales
Reliable, Remote Computation at All ScalesReliable, Remote Computation at All Scales
Reliable, Remote Computation at All Scales
 
Best Practices for Data Sharing Using Globus
Best Practices for Data Sharing Using GlobusBest Practices for Data Sharing Using Globus
Best Practices for Data Sharing Using Globus
 
An Introduction to Globus for Researchers
An Introduction to Globus for ResearchersAn Introduction to Globus for Researchers
An Introduction to Globus for Researchers
 
Introduction to Research Automation with Globus
Introduction to Research Automation with GlobusIntroduction to Research Automation with Globus
Introduction to Research Automation with Globus
 
Globus for System Administrators
Globus for System AdministratorsGlobus for System Administrators
Globus for System Administrators
 
Introduction to Globus for System Administrators
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System Administrators
 
Introduction to Data Transfer and Sharing for Researchers
Introduction to Data Transfer and Sharing for ResearchersIntroduction to Data Transfer and Sharing for Researchers
Introduction to Data Transfer and Sharing for Researchers
 
Introduction to the Globus Platform for Developers
Introduction to the Globus Platform for DevelopersIntroduction to the Globus Platform for Developers
Introduction to the Globus Platform for Developers
 
Introduction to the Command Line Interface (CLI)
Introduction to the Command Line Interface (CLI)Introduction to the Command Line Interface (CLI)
Introduction to the Command Line Interface (CLI)
 
Automating Research Data with Globus Flows and Compute
Automating Research Data with Globus Flows and ComputeAutomating Research Data with Globus Flows and Compute
Automating Research Data with Globus Flows and Compute
 
Automating Research Data Flows and Introduction to the Globus Platform
Automating Research Data Flows and Introduction to the Globus PlatformAutomating Research Data Flows and Introduction to the Globus Platform
Automating Research Data Flows and Introduction to the Globus Platform
 
Advanced Globus System Administration
Advanced Globus System AdministrationAdvanced Globus System Administration
Advanced Globus System Administration
 
Introduction to Globus for System Administrators
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System Administrators
 
Globus Automation
Globus AutomationGlobus Automation
Globus Automation
 
Advanced Globus System Administration
Advanced Globus System AdministrationAdvanced Globus System Administration
Advanced Globus System Administration
 
Introduction to Globus
Introduction to GlobusIntroduction to Globus
Introduction to Globus
 
Introduction to Globus for System Administrators
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System Administrators
 
Advanced Globus System Administration
Advanced Globus System AdministrationAdvanced Globus System Administration
Advanced Globus System Administration
 

Último

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
masabamasaba
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
masabamasaba
 

Último (20)

%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaS
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 

Enabling Secure Data Discoverability (SC21 Tutorial)

  • 1. Enabling Secure Data Discoverability Rachana Ananthakrishnan – ranantha@uchicago.edu Brigitte Raumann - braumann@uchicago.edu Vas Vasiliadis - vas@uchicago.edu November 14, 2021
  • 2. Agenda • Introduction and Motivation • The Modern Research Data Portal Design Pattern • Brokering Access to Services using Globus Auth • Making Data Findable with Globus Search • Deploying a Simple Research Data Portal • Enabling Secure Data Collaboration • Making Data Discoverable at Scale - Hands-on exercise - Live demonstration
  • 5. The brilliance “arms race”... K. Wille, The Physics of Particle Accelerators: An Introduction, Oxford University Press, Oxford, UK (2000); J. B. Parise and G. E. Brown, Jr., Elements, 2, 37-42 (2006)
  • 6. Some challenges… • Increasing data rates, heterogeneity • Continuum of computing resources • Differing workflows across instruments
  • 7. The state of research data management How do we... ...search? ...discover? …share?
  • 8. Distribution Store Data Portal Advanced Computing Facility Instrument Facility A common data flow pattern Image Analysis 3 Search/Discovery 5 Science! 6 Imaging 1 Acquisition 2 Description/Identification 4 v
  • 9. Globus services for research data management Unified Data Access Data Transfer and Sharing Platform-as-a-Service Reliable Automation Publication & Discovery Remote Execution (future)
  • 10.
  • 11. The Modern Research Data Portal Design Pattern docs.globus.org/mrdp
  • 12. Why we use portals • Different experiments (beamlines, electron microscopes, biology, etc) generate data with different types, size and experimental information • Processing, curation, and cataloguing need to happen as soon as possible so data are not lost • Standardize secure access between users • Work toward FAIR datasets to enable more science
  • 13. Advantages of a portal • Make data FAIRer • Track lots of (heterogeneous) data • Facilitate discovery – Free text search in Globus Search – Filtering on specific values – User Friendly GUI • Enforce appropriate access controls – Public/private, group-, subject-level ACLs • Integrate with other (Globus) services • Customize for your research environment
  • 14. MRDP: Key elements Science DMZ Fast, clean data path Data Transfer Nodes Purpose-built data movers Globus Platform Secure, reliable data orchestration Globus Connect Storage system enabler 14 Globus Portal Framework Data discovery and access
  • 15. …makes your storage system a Globus endpoint
  • 16. Globus Connectors support diverse systems
  • 17. What’s wrong with my LRDP? 17
  • 18. L(egacy)RDP architecture 18 Source: ESnet Science Engagement team
  • 21. Relevant Globus platform capabilities • Data transfer and sharing • Data description (metadata) and discovery • Data (and compute) task orchestration • Authentication and Authorization 21
  • 22. Brokering Access to Services using Globus Auth
  • 23. Globus Auth: Foundational IAM service Brokers authentication and authorization among… – End-users – Identity providers: enterprise, external (federated identities) – Services: resource servers with REST APIs – Apps: web, mobile, desktop, command line clients – Services acting as clients to other services • OAuth 2.0 Authorization Framework (a.k.a. OAuth2) • OpenID Connect Core 1.0 (a.k.a. OIDC) 23
  • 24. Several authentication models supported • Application acting as user with consent – Authorization code grant • Application authenticating as itself – Client credentials grant • Application able to manage tokens for offline or long running tasks – Refresh tokens
  • 25. Data transfer and sharing • Move data to collection à Submit Transfer task • Make data accessible à Set guest collection access rule • Grant user/app access à Add/confirm Group membership 25 Groups service Transfer service GET /groups/my_groups POST /endpoint/{endpoint_id}/access POST /transfer
  • 26. Using guest collections in your data portal • Create a guest collection; requires authentication – Cannot be completely automated – must ”log in” – Create once and automate rest of the steps • Grant the application Access Manager role – Allows the application to manage permissions on the collection – Set for application identity: appclientid@clients.auth.globus.org • Grant roles for management of endpoint and tasks
  • 28. Making Data Findable with Globus Search
  • 29. Data description and discovery • Metadata store with fine- grained visibility controls • Schema agnostic à dynamic schemas • Simple search using URL query parameters • Complex search using search request document 29 docs.globus.org/api/search Search Index
  • 30. Distinct access policies may be applied to Data and Metadata …(ideally) using permissions on guest collections …using permissions on metadata elements
  • 31. Data ingest with Globus Search 31 Search Index POST /index/{index_id}/ingest' { "ingest_type": "GMetaList", "ingest_data": { "gmeta": [ { "id": "filetype", "subject”: "https://search.api.globus.org/abc.txt", "visible_to": ["public"], "content": { "metadata-schema/file#type": "file” } }, ... ] } - Bulk create and update - Task model for ingest at scale
  • 32. Data ingest with Globus Search 32 Search Index POST /index/{index_id}/ingest' { "ingest_type": "GMetaList", "ingest_data": { "gmeta": [ { "id": ”weight", "subject": "https://search.api.globus.org/abc.txt", "visible_to": ["urn:globus:auth:identity:46bd0f56- e24f-11e5-a510-131bef46955c"], "content": { "metadata-schema/file#size": ”37.6", "metadata-schema/file#size_human": ”<50lb” } }, ... ] } Visibility limited to Globus Auth identity - Single user - Globus Group - Registered client application
  • 33. Data discovery with Globus Search 33 { "@datatype": "GSearchResult", "@version": "2017-09-01", "count": 1, "gmeta": [ { "@datatype": "GMetaResult", "@version": "2019-08-27", "entries": [ { ... } ], "subject": "https://..." } ], "offset": 0, "total": 1 } GET /index/{index_id}/search?q=type%3Ahdf5 Search Index Simple query
  • 34. Data discovery with Globus Search 34 POST /index/{index_id}/search Search Index Complex query { "filters": [ { "type": "range", "field_name": ”pubdate", "values": [ { "from": "*", "to": "2020-12-31" } ] } ], "facets": [ { "name": "Publication Date", "field_name": "pubdate", ... } ] } Filter Facets Boosts Sort Limit
  • 35. Cancer Registry Records for Research (CR3) • Create network of federated cancer registries – Deploy similar infrastructure at other cancer registries – Enable queries across multiple registries • Federation via Globus: network scale ßà local control – Data owners input/export data sets, apply QC, set access policies – Registry data remain at the institution where they were generated – Identities are provided/authenticated by the institution, not Globus – System scale depends on data owners providing storage resources
  • 36. CR3 requirements • Search Index – Only de-identified data in search index – No record-level for researchers • Portal – Fine-grained access control – Researchers must use a specific identity – Access must be logged – Render graphs based on search results – Faceted search in real time
  • 37. CR3 Discovery Portal Cohort aggregate counts Login with UPMC/Pitt credentials Globus Search (GS) Globus Auth (GA) UPMC/Pitt Identity Providers Authentication Auth initiated to GA Cohort search initiated to GS Researcher Cohort aggregate counts returned CR3 Architecture Globus Transfer (GT) Registry Staff Data transfer from registrar to researcher mediated by GT Manage authorization Elasticsearch Request Service Cancer Registry De-identified Data Index (minimal criteria data: e.g., staging)
  • 38. SEER Registry Medical Center Registry State Registry SEER Registry Medical Center Registry State Registry CR3 Portal (simulated data) Federated logon using Globus Auth with Pitt/UPMC as identity providers Dynamically updating charts as facets change Variable facets based on source registry index Google-like text search with facets for filtering Developed using a framework based on the Globus Modern Research Data Portal* design pattern (docs.globus.org/mrdp) * PeerJ Articles:cs-144 https://peerj.com/articles/cs-144/
  • 41. Deploying a Simple (but fully functional and extensible) Research Data Portal
  • 42. Globus Search Evolving the MRDP design pattern Enabling discoverability: MRDP + Faceted Search Input form Automated Extraction Ingest metadata, set visibility policies Bulk ingest MRDP
  • 43. Implementation • Django framework with integrated Globus services • Python package bootstraps deployment – Install from PyPi using pip – github.com/globus/django-globus-portal-framework • Reference Globus Portal Framework deployment ALCF Community Data Co-Op: acdc.alcf.anl.gov
  • 44. Exemplar portal The ALCF Data Co-op 44 acdc.alcf.anl.gov/
  • 45. Portal Core Functionality • User authentication • Django-based framework – Portal URL mappings – Token loading • Service calls to Globus Search • Manage request lifecycle • Post process search requests
  • 46. User authentication • Scopes are configured in the portal • Users authenticate with Globus using standard flow – Python Social Auth used for Authentication backend • User tokens are saved in the database • Future requests authorized with user access tokens – Searches use Search bearer token
  • 47. Portal service calls use the Globus SDK • Globus Portal Framework loads tokens from database • Globus service object instantiated with token • Call to Globus service(s) • Portal renders result in templates
  • 48. Globus Portal Framework URLs • URLs span three categories – Index Selection – Index Search page – Search Subject detail page • Supports multiple Globus Search indices • Search page links to multiple result subjects • Each subject has a unique URL
  • 49. Format of a URL
  • 50. An index is configuration driven • A Search index is configured in portal settings • Add Globus Search index UUID • Add a name • Add facets • Add fields • Start searching!
  • 51. Lifecycle of a request • User makes a query • Portal sends request to Globus Search – Request contains user bearer token • Portal receives response • Portal does processing on response – Parse Dates, build URL for Globus webapp, etc. • Portal renders data into templates • User receives a search page
  • 52. Securing Apps with Globus Auth • Select the appropriate standard OAuth flow: • Native App (with refresh tokens – extend expiration) – Clients can’t keep a secret; tokens stored in plain text – Authenticate as user identity à get authorization code – Examples: Jupyter Notebooks, Globus Timer CLI • Authorization Code Grant – Tokens stored securely – Authenticate as user identity à auth code returned via callback – Examples: JupyterHub secured with Globus Auth • Confidential Client – ClientID and Secret stored securely – Authenticate as application – Example: Data portal, custom webapps 52
  • 53. Authorization Code Grant 53 Client (Web Portal, Application) Globus service (Resource Server) Globus Auth (Authorization Server) 5. Authenticate using client id and secret, send authorization code Browser (User) 1. Access portal 2. Redirect user 3. User authenticates and consents 4. Authorization code 6. Access token(s) 7. Authenticate with access token(s), giving client authority to invoke the requested service Identity Provider
  • 54. Client credential grant 54 1. Authenticate with app client id and secret 2. Access Tokens Application, Science Gateway, Data Portal (Client) 3. Authenticate as app with access tokens to invoke service (on behalf of authorized user, within a given scope) Globus Transfer (Resource Server) Globus Auth (Authorization Server)
  • 55. Creating your own data portal 55 bit.ly/globus-sc21 Source: github.com/globus/django-globus-portal-framework Docs: django-globus-portal-framework.readthedocs.io/en/stable/
  • 56. Step 0: Application registration • Set callback URL • Get client ID and secret • Consents implement least privileges principle 56 developers.globus.org
  • 57. Portal deployment • (Already available on your EC2 instance) • Clone the repo • Configure settings • For production use, add robust WSGI/ASGI server • Future: containers
  • 58. Portal configuration • Review settings.py and check that portal runs • Update settings.py – Add search index to SEARCH_INDEXES – Include search scope in Globus Auth setup • Configure search (facets, fields, …) • Optional: Configure templates
  • 60. Globus Groups simplify permissions management • Grant group access to collection(s) • Restrict search visibility using group • Make portal client a group administrator • Check authenticated user’s group membership • Add/remove user to/from group
  • 61. Adding a search index to the portal 61
  • 63. Globus Automation Capabilities Timer Service Scheduled and recurring transfers (a.k.a. Globus cron) Command Line Interface Ad hoc scripting and integration Globus Flows service Comprehensive task (data and compute) orchestration with human in the loop interactions
  • 65. Use case: Data replication • For backup: initiated by user or system back up • Automated transfer of data from science instrument 65 Recurring transfers with sync option Copy /ingest Daily @ 3:30am
  • 66. The Globus Timer service • Scheduled/recurring file transfers • Supports all Globus transfer and sync options • Service with a command line interface • Example: NIH – hpc.nih.gov/storage/globus_cron.html 66
  • 67. Using the Globus Timer service 67 $ globus–timer session {login, logout, whoami} $ globus–timer job transfer --name example–job --label "Timer Transfer Job" --interval 28800 --start '2020–01–01T12:34:56' --source–endpoint ddb59aef–6d04–11e5–ba46–22000b92c6ec --dest–endpoint ddb59af0–6d04–11e5–ba46–22000b92c6ec --item ~/file1.txt ~/new_file1.txt false --item ~/file2.txt ~/new_file2.txt false
  • 68. Timer options in the Globus webapp
  • 69. Scheduled transfers to data portal endpoint(s) Globus Timer CLI: pypi.org/project/globus-timer-cli 69
  • 71. Globus Command Line Interface Open source, uses the Python SDK
  • 72. Automated management of data portal endpoint(s) 72 Globus SDK: globus-sdk python.readthedocs.io/en/stable/ Globus CLI: docs.globus.org/cli/
  • 74. Managed automation of tasks • Flows: A platform service for defining, applying, and sharing distributed research automation flows • Flows comprise Actions • Action Providers: Called by Flows to perform tasks • Triggers*: Start flows based on events * In development
  • 75. Automation with Globus Flows • Built on AWS Step Functions – Simple JSON-based state machine language – Conditions, loops, fault tolerance, etc. – Propagates state through the flow • Standardized API for integrating custom event and action services – Actions: synchronous or asynchronous – Custom Web forms prompt for user input • Actions secured with Globus Auth
  • 76. Extending the ecosystem: Action providers 76 • Action Provider is a service endpoint – Run – Status – Cancel – Release – Resume • Action Provider Toolkit action-provider- tools.readthedocs.io/en/latest Search Transfer Notification ACLs Identifier Delete Ingest User Form Describe Xtract funcX Web Form Custom built Globus Provided
  • 77. Automation services ecosystem GET /provider_url/ POST /provider_url/run GET /provider_url/action_id/status GET /provider_url/action_id/cancel GET /provider_url/action_id/status Create Action Providers Define and deploy flows { “StartAt”: ”ToProject”, ”States” : { ”ToProject” : { … }, ”SetPermission” : { …}, “ProcessData” : { … } … }} Run flows
  • 78. Working with Globus Flows Try it: demo.gladier.org/gladier-demo/upload-file Run flows: app.globus.org/flows/library Docs: docs.globus.org/globus-automation-services 78
  • 79. Coming soon: Globus Trigger service • Trigger–Action platform • Predefined triggers and actions to create rules • Globus processes triggers and reliably executes actions