Globus provides a platform and services for simplifying data management and sharing for science gateways and applications. It offers fast and reliable file transfers between any storage systems, secure data sharing without copying data, and APIs and SDKs for building applications. Globus uses OAuth authentication and supports a variety of interfaces like CLI, Python SDK, and Jupyter notebooks to enable access.
3. Globus is …
a non-profit service
developed and operated by
4. Our mission is to…
increase the efficiency and
effectiveness of researchers
engaged in data-driven
science and scholarship
through sustainable software
8. Fast, reliable file transfer …from any to any system
1. User-initiated or automated transfer request
2. Globus transfers files reliably, securely
3. Optional notifications
Diagram labels: instrument, lab server, compute facility; globally accessible multi-tenant service
• Fire-and-forget transfers
• Optimized speed
• Assured reliability
• Unified view of storage
• Browser, REST API, CLI
9. Secure data sharing …from any storage
1. Select files to share, select user or group, and set access permissions
2. Collaborator logs into Globus and accesses shared files; no local account required; download via Globus
Globus controls access to shared files on existing storage
Diagram labels: on-prem or public cloud storage; laptop, server, compute facility; globally accessible multi-tenant service
• Fine-grained access control “overlay” on storage system
• Share with any identity, email, group
• No need to stage data just for sharing
10. Conceptual architecture: Hybrid SaaS
• Globus control domain: single, globally accessible multi-tenant service
• Subscriber control domain: subscriber owned and administered storage system running Globus “connector” software
• CONTROL channel: between the Globus service and the source and destination endpoints
• DATA channel: between source endpoint and destination endpoint; no data relay or staging via Globus cloud service
11. Endpoints (Collections)
• Storage abstraction
– All transfers happen between two endpoints
• Collection (end user) ~= Endpoint (sysadmin)
• Useful endpoints for testing/demonstration
– Globus Tutorial Endpoint 1 and 2
– ESnet Read-Only *
– DME Datasets *, DME PerfTest *
17. Globus Command Line Interface (CLI)
• Native application: docs.globus.org/cli
• Open source, uses Python SDK
• globus login – get access and refresh tokens
– Tokens stored locally in ~/.globus.cfg
• Service (transfer/auth) invocation uses tokens
• globus logout – delete tokens
docs.globus.org/cli/examples
18. Simple CLI automation examples
• Syncing a directory
– bash script; calls the Globus CLI
– Python module; run as script or import as module
• Staging data for distribution
– bash and Python variants
• Removing directories after files are transferred
– Python script
github.com/globus/automation-examples
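As a rough illustration of the "Python module; run as script or import as module" pattern, the sketch below only assembles a `globus transfer` command line for a recursive directory sync and prints it. The endpoint UUIDs and paths reuse the demo values from the editor's notes; the `build_sync_command` and `run_sync` helpers and the `nightly-sync` label are hypothetical, and the `--recursive`, `--sync-level`, and `--label` options are assumed to match your installed CLI version (check `globus transfer --help`). It is not taken from the automation-examples repository.

```python
import shlex
import subprocess


def build_sync_command(src_ep, src_path, dst_ep, dst_path,
                       sync_level="checksum", label=None):
    """Assemble (but do not run) a `globus transfer` command for a directory sync."""
    cmd = ["globus", "transfer", "--recursive", "--sync-level", sync_level]
    if label:
        cmd += ["--label", label]
    # The CLI addresses a location as ENDPOINT_ID:PATH.
    cmd += [f"{src_ep}:{src_path}", f"{dst_ep}:{dst_path}"]
    return cmd


def run_sync(cmd):
    """Run the command; assumes the CLI is installed and `globus login` was done."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


SYNC_CMD = build_sync_command(
    "af7bda53-6d04-11e5-ba46-22000b92c6ec", "/~/abinitio",
    "924a32b0-6a2a-11e6-83a8-22000b97daec", "/globus/perftest/uchicago",
    label="nightly-sync",
)

if __name__ == "__main__":
    # Print the command for review instead of submitting a transfer.
    print(shlex.join(SYNC_CMD))
```

Because the command is built as a list, the same function can be imported by another script and passed to `run_sync` once the printed command looks right.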
20. Globus serves as…
A platform for building science
gateways, web portals and
other applications in support of
research and education
21. Globus platform services
• Identity and Access Management (IAM): Auth, Groups
• Data Services: Connect, Transfer, Manifest*
• Search
• Identifiers (collaboration with DataCite)
• Flows*
* In development/early release; contact us for access
22. Globus Auth addresses security challenges
• Make it easy for developers to provide login for
their apps (web, mobile, desktop, command line)
• …and protect all REST API communications
o App → Globus service (MRDP, Jupyter Notebook)
o App → non-Globus service (graph service in MRDP)
o Service → Service
• …while
– Not introducing yet another identity
– Providing a platform to consolidate existing identities
– Providing a least-privilege security model (via consents)
– Being web friendly and language/framework agnostic
23. Based on widely used web standards
• OAuth 2.0 Authorization Framework (a.k.a. OAuth2)
• OpenID Connect Core 1.0 (a.k.a. OIDC)
• Access via OAuth2 and OIDC libraries of your choice
– Google OAuth Client Libraries, Apache mod_auth_openidc, etc.
– Globus Python SDK
docs.globus.org/api/auth
24. Fundamental Concepts
• Scopes: APIs that the client is requesting access to
– Scope syntax for OpenID Connect: openid, email, profile
– Scope syntax for services: https://auth.globus.org/scopes/<service-name>:<scope-name>
– A service can have multiple scopes
• Consents: authorize client to access a service, within
limited scope, on the resource owner’s (user’s) behalf
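To make the scope idea concrete, here is a minimal sketch of how a client might assemble the `scope` value it sends when asking for consent. OAuth 2.0 sends requested scopes as a single space-delimited string. The `globus_scope` helper simply follows the pattern printed on this slide, and `example-service` and `all` are hypothetical names; take real scope strings from docs.globus.org/api/auth rather than from this sketch.

```python
OIDC_SCOPES = ["openid", "email", "profile"]


def globus_scope(service_name, scope_name):
    """Format a scope string using the pattern shown on the slide."""
    return f"https://auth.globus.org/scopes/{service_name}:{scope_name}"


def scope_parameter(scopes):
    """Join scopes into the space-delimited string OAuth 2.0 expects."""
    seen = []
    for scope in scopes:
        if scope not in seen:  # drop duplicates, keep request order
            seen.append(scope)
    return " ".join(seen)


# "example-service" and "all" are placeholders, not real Globus names.
requested = scope_parameter(
    OIDC_SCOPES + [globus_scope("example-service", "all"), "openid"]
)
print(requested)
```

Each scope in the string is something the user is asked to consent to, so requesting only what the application needs keeps the consent prompt short.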
25. Globus account
• Globus Account = Primary identity + Linked Identities
– An identity can be primary on only one account
– Identities can be linked to only one account
• Account does not have own identifier
– An account is uniquely identified using its primary identity
• Effective identity = linked identity from a particular
identity provider required by a client or service
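The account model above can be pictured as a small data structure. This is only an illustrative sketch, not how Globus Auth stores accounts: the `Identity` and `Account` classes, the `provider` field, and all ids and usernames are hypothetical, and searching the primary identity as well as the linked ones when picking an effective identity is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Identity:
    id: str        # UUID string (made-up values below)
    username: str  # user@domain
    provider: str  # identity provider that issued this identity


@dataclass
class Account:
    primary: Identity
    linked: List[Identity] = field(default_factory=list)

    @property
    def account_id(self) -> str:
        # The account has no identifier of its own; the primary identity names it.
        return self.primary.id

    def identities(self) -> List[Identity]:
        return [self.primary] + self.linked

    def effective_identity(self, required_provider: str) -> Optional[Identity]:
        """Pick the identity from the provider a client or service requires."""
        for identity in self.identities():
            if identity.provider == required_provider:
                return identity
        return None


account = Account(
    primary=Identity("00000000-0000-0000-0000-000000000001",
                     "jdoe@idp.example", "idp.example"),
    linked=[Identity("00000000-0000-0000-0000-000000000002",
                     "jdoe@university.example", "university.example")],
)
effective = account.effective_identity("university.example")
print(effective.username if effective else "no matching identity")
```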
26. Identity id vs. username
• Identity id
– Unique among all Globus Auth identities; will never be reused
– UUID
– Always use this to refer to an identity
• Identity username
– Unique at any point in time; may change, may be re-used
– Case-insensitive user@domain
– Can map to/from id, for user experience
• Globus Auth API allows mapping back and forth
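A short sketch of what "always use the id" can mean in application code: store the UUID as the key and treat the username as display data that is compared case-insensitively. The helper names are hypothetical, and the sample UUID is borrowed from the demo notes purely as a well-formed UUID, not as a real identity id.

```python
import uuid


def is_identity_id(value):
    """Return True if value parses as a UUID, the format used for identity ids."""
    try:
        uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return True


def same_username(a, b):
    """Usernames are case-insensitive user@domain strings."""
    return a.casefold() == b.casefold()


# Key records by id; keep the username only for display, since it may change.
users = {"af7bda53-6d04-11e5-ba46-22000b92c6ec": {"username": "User@Example.org"}}

print(is_identity_id("af7bda53-6d04-11e5-ba46-22000b92c6ec"))
print(is_identity_id("user@example.org"))
print(same_username("User@Example.org", "user@example.org"))
```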
27. Auth Example: Authorization Code Grant
Participants: Browser (User); Client (Web Portal, Application, Jupyter); Globus Auth (Authorization Server); Identity Provider; Globus Transfer (Resource Server)
1. Access portal
2. Redirects user
3. User authenticates and consents
4. Authorization code
5. Authenticate using client id and secret, send authorization code
6. Access token(s)
7. Authenticate with access token(s) to give the client the authority to invoke the transfer service
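The client's side of this flow can be sketched with the standard library alone. The snippet below builds the URL used for the redirect in step 2 and the form body posted in step 5; it makes no network calls. The parameter names come from the OAuth 2.0 specification, while the two endpoint URLs, the client id, and the redirect URI are assumptions to be checked against docs.globus.org/api/auth and your own app registration.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed Globus Auth endpoints; confirm in the Auth API documentation.
AUTHORIZE_URL = "https://auth.globus.org/v2/oauth2/authorize"
TOKEN_URL = "https://auth.globus.org/v2/oauth2/token"


def authorization_url(client_id, redirect_uri, scopes, state):
    """Step 2: the URL the client redirects the user's browser to."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,  # echoed back by the server; guards against CSRF
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def token_request_form(code, redirect_uri):
    """Step 5: form body posted to TOKEN_URL, with the request authenticated
    by the client id and secret (for example via HTTP Basic auth)."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
    }


state = secrets.token_urlsafe(16)
url = authorization_url(
    "my-client-id",                          # hypothetical client id
    "https://portal.example.org/callback",   # hypothetical redirect URI
    ["openid", "email", "profile"],
    state,
)
query = parse_qs(urlparse(url).query)
print(url)
```

When the browser returns in step 4, the client should check that the `state` value matches before exchanging the code.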
28. Globus Transfer API
• Globus Web App consumes public Transfer API
• Resource named by URL (standard REST approach)
– Query params allow refinement (e.g., subset of fields)
• Globus APIs use JSON for documents and resource
representations
• Requests authorized via OAuth2 access token
– Authorization: Bearer asdflkqhafsdafeawk
docs.globus.org/api/transfer
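For readers who want to see the raw request shape, the sketch below prepares (but does not send) a GET for an endpoint document with a bearer token and a field filter. The base URL and the `fields` query parameter are assumptions to verify against docs.globus.org/api/transfer; the `transfer_get` helper is hypothetical, the token is the placeholder from the slide, and the endpoint UUID is the demo value from the editor's notes.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed Transfer API base URL; confirm in the Transfer API documentation.
BASE_URL = "https://transfer.api.globus.org/v0.10"


def transfer_get(path, access_token, fields=None):
    """Prepare a GET request for a URL-named Transfer resource."""
    url = BASE_URL + path
    if fields:
        # Query parameters refine the response, e.g. a subset of fields.
        url += "?" + urlencode({"fields": ",".join(fields)})
    return Request(url, method="GET", headers={
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/json",
    })


req = transfer_get(
    "/endpoint/af7bda53-6d04-11e5-ba46-22000b92c6ec",
    "asdflkqhafsdafeawk",  # placeholder token from the slide
    fields=["display_name", "id"],
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing `req` to `urllib.request.urlopen` would send it; with a valid access token the response body is a JSON document.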
29. Globus Python SDK
• Python client library for the Globus Auth and Transfer
REST APIs
• globus_sdk.TransferClient class handles
connection management, security, framing,
marshaling
from globus_sdk import TransferClient
tc = TransferClient()
globus-sdk-python.readthedocs.io
30. Experimenting with the API using Jupyter Hub
• jupyter.demo.globus.org
– Sign in with Globus and verify consents
– Go to folder: globus-jupyter-notebooks/GlobusWorldTour
– Open: Platform_Introduction_JupyterHub_Auth.ipynb
• If you mess it up and want to “go back to the beginning”
– Navigate back to your Jupyter server’s root folder
– Run NotebookPuller.ipynb
• To use the notebook outside of our hub…
– github.com/globus/globus-jupyter-notebooks
– Authentication: copy-paste auth code, exchange for access token
31. Support resources
• Globus documentation: docs.globus.org
• Sample code: github.com/globus
• Helpdesk and issue escalation: support@globus.org
• Customer engagement team
• Globus professional services team
– Assist with portal/gateway/app architecture and design
– Develop custom applications that leverage the Globus platform
– Advise on customized deployment and integration scenarios
32. Join the Globus community
• Access the service: globus.org/login
• Create a personal endpoint: globus.org/app/endpoints/create-gcp
• Documentation: docs.globus.org
• Engage: globus.org/mailing-lists
• Subscribe: globus.org/subscriptions
• Need help? support@globus.org
• Follow us: @globusonline
Editor's Notes
Not just file transfer
Sustainable = thriving, not just surviving
The Globus service is a controller
No data passes through the Globus Service
Fire and forget control – The Service GUI (web page) can go away
Globus abstracts storage systems into a unit called an “endpoint”
Storage system complexities are masked or abstracted
Transfers between disparate storage systems are natural
This is a simple transfer case – a single user has permissions on both source and destination filesystems.
Endpoint definition
Endpoints you can use right now
GCP – Your very own endpoint, no DTN running Globus Connect Server needed
We will demo this in a minute
DEMO:
Login
Transfer: Midway to NCAR
Sharing on Midway
Endpoints
Some teams use the Globus CLI to script file transfers and other data management tasks, and… > NEXT SLIDE
DEMONSTRATION
export EP=af7bda53-6d04-11e5-ba46-22000b92c6ec
globus ls $EP
globus endpoint my-shared-endpoint-list $EP
globus transfer -r $EP:/~/abinitio 924a32b0-6a2a-11e6-83a8-22000b97daec:/globus/perftest/uchicago
More demos on the next four slides – not used due to time constraints
WHAT
We can accommodate… Globus was built by researchers for researchers. “There is always a better way to do things” is the very mantra that drives research. How can we let you use our foundational services to support your own applications and workflows? This is largely the theme of today.
Auth underpins everything, providing fine grained access control to all data and other resources on the platform
We will focus on Auth, Groups and Transfer in the third session
Lee will describe Search and Identifiers in a bit more detail in session 4
Mention Manifest service in development?
OAuth2 – OpenID Connect (Web World)
OpenID Connect – Authentication Layer (RESTful / JSON)
RA: some concepts to follow, and then present use cases for integration with Auth with specific solutions on using our SDK for that.
RA: FIXME: this slide is incorrect per today's implementation. Update: service-name + scope-name is unique
Collections of identities. Now we’re just part of the web.
Primary Identity. If account compromised can just unlink primary identity.
Guarantee that the ID will not change AND that the ID will only be bound to a single identity
Native App Grant
User attempts to access the portal (or, in the case of the Jupyter Notebook, has the application access the services)
Browser redirect
Local site Auth Server prompts for user name and password (if they haven’t already authenticated to Globus) and prompts for consents (the specific things it’s going to use your Globus account for) - “By clicking "Allow", you allow Insert Application Name Here, in accordance with its terms of service and privacy policy, to use the above listed information and services.”
Return to the application with an authorization code
Exchange the authorization code for access token(s)
Use the access token(s) to (in the case of the Jupyter Notebook) create a transfer client object
End result: All calls to the transfer service need to have the authorization header with the transfer token.
These are the same APIs that the Globus Web App uses.
All of the Globus Services expose REST APIs
All returns are in JSON format.
URL named resources – what are resources in the context of transfer
SOMEWHERE YOU TRANSFER FROM OR TO: /endpoint/endpoint-uuid
SOMETHING THAT IS HAPPENING: /task/task-uuid
Pretty standard REST approach
Globus remote operations on a resource have rough “HTTP Verb” equivalents.
Uses “patchy” PUTs – essentially a list of modifications to the resource as opposed to a complete replacement of the resource. For example, you can update only certain fields in an endpoint document by specifying only those fields.
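The effect of a “patchy” PUT can be modeled locally with a dictionary merge. This is only a sketch of the semantics described in the note, not of the service's implementation: the `apply_partial_update` helper and the field names and values in the sample endpoint document are hypothetical.

```python
import copy


def apply_partial_update(document, changes):
    """Return a copy of document with only the fields named in changes replaced."""
    updated = copy.deepcopy(document)
    updated.update(changes)
    return updated


# Hypothetical endpoint document; real documents carry many more fields.
endpoint_doc = {
    "display_name": "Lab server",
    "description": "Old description",
    "public": False,
}

# The update body names only the field being changed.
patched = apply_partial_update(endpoint_doc, {"description": "Instrument data"})
print(patched)
```

Fields left out of the update body keep their previous values, which is the difference from a PUT that replaces the whole resource.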
All calls to the transfer service need to have the authorization header with the transfer token. Talked about this in the previous slides.
Won’t go into this in too much depth as Globus Auth will be covered by Steve later, but it needs to be present in order for Transfer to work.
And you will see an instance of this when we exercise the APIs in the Jupyter Notebook.
Show the docs.globus.org site
Hierarchy broken out by functionality.
GO TO “Task Management”
Show “GET Task by ID”
URL Named Resource
Method
Response Format – Task Document
If you want to write your own clients that’s fine, but we also have an open source Python SDK in our github for both the Auth and Transfer APIs.
The Python SDK for the Transfer APIs is what we’ll concentrate on in this discussion, with some peeks back at the low-level API functionality.
Basic Transfer Client Class - You’ll see this all through the SDK and examples.
Handles all of the connection management
Deals with tokens that come back from authentication
Everything required to assemble JSON documents
So when you see “tc” in the examples that’s what that is.
Go to URL
It’s open source
Show it in Github repo
Fire up a notebook
Show people how to run commands and live edit code
Run the initial configuration – everything up to endpoint search
Configuration
Authentication steps
Help
Using the transfer client
As we’ve already said, the transfer client makes REST resources available via easy to use methods.
And the response is nice clean JSON
get_endpoint method gives us a wealth of information about the endpoint just like the help said it would.
Helper methods for APIs that return lists have iterable responses, and automatically take care of paging where required:
endpoint_search(filter_scope="recently-used")
An example of a low level implementation
Can change
r["DATA"][3]["display_name"]
limit=4
Handling errors, again we make it easy for you… example
Bogus endpoint
Standard 4xx / 5xx HTTP errors
Classes of errors spit out by ex.code
BACK TO SLIDES
One last thing we’ve done to make life easier as you build your web apps.
LEE covered this in the MRDP demo during session 2
Just a reminder of the resources we’ve made available to you and your developers.