
Globus Automation





  1. 1. Globus Automation Rachana Ananthakrishnan ranantha@uchicago.edu September 8, 2022 Sponsored by
  2. 2. Globus Automation Capabilities • Timer Service: scheduled and recurring transfers (a.k.a. Globus cron) • Command Line Interface: ad hoc scripting and integration • Globus Flows service: comprehensive task (data and compute) orchestration with human-in-the-loop interactions
  3. 3. “Simple” Automation Use Cases • Data backup – as user, as system • Stage data in or out as part of a compute job • Portal/science gateway submits a transfer of compute results as the user • Portal/science gateway monitors a user's transfer and initiates processing or backup of the data [Figure: recurring transfers with sync option; copy /ingest daily @ 3:30am]
  4. 4. Globus Timer Service
  5. 5. Scripting with the Globus Timer service $ globus-timer session {login, logout, whoami} $ globus-timer job transfer --name example-job --label "Timer Transfer Job" --interval 28800 --start '2020-01-01T12:34:56' --source-endpoint ddb59aef-6d04-11e5-ba46-22000b92c6ec --dest-endpoint ddb59af0-6d04-11e5-ba46-22000b92c6ec --item ~/file1.txt ~/new_file1.txt false --item ~/file2.txt ~/new_file2.txt false
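The Timer command on the slide above can also be assembled from a script. The following is a minimal sketch that uses only the flags shown on the slide; the helper name `build_timer_command` is hypothetical, the third field of `--item` is assumed to be a recursive flag, and nothing is executed, the command line is only built and printed.

```python
import shlex
from datetime import timedelta


def build_timer_command(name, label, interval, start, source, dest, items):
    """Build the argv list for `globus-timer job transfer` (flags as on the slide)."""
    argv = [
        "globus-timer", "job", "transfer",
        "--name", name,
        "--label", label,
        # The slide passes the interval in seconds (28800 s is 8 hours).
        "--interval", str(int(interval.total_seconds())),
        "--start", start,
        "--source-endpoint", source,
        "--dest-endpoint", dest,
    ]
    for src_path, dst_path, recursive in items:
        # Assumption: the third --item field is the recursive flag.
        argv += ["--item", src_path, dst_path, "true" if recursive else "false"]
    return argv


argv = build_timer_command(
    name="example-job",
    label="Timer Transfer Job",
    interval=timedelta(hours=8),
    start="2020-01-01T12:34:56",
    source="ddb59aef-6d04-11e5-ba46-22000b92c6ec",
    dest="ddb59af0-6d04-11e5-ba46-22000b92c6ec",
    items=[
        ("~/file1.txt", "~/new_file1.txt", False),
        ("~/file2.txt", "~/new_file2.txt", False),
    ],
)

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(argv))
```

Keeping the arguments as a list, rather than a single string, means the same value could be handed to `subprocess.run` without any shell quoting concerns.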
  6. 6. Globus Command Line Interface (CLI)
  7. 7. Globus Command Line Interface Automation of simple data management tasks Integration with existing scripts (job submission …) Open source, uses the Python SDK
  8. 8. Commands refer to resources by UUID • UUIDs for endpoint, task, user identity, groups… • Use search/list options • get-identities for identity username to UUID $ globus endpoint search 'Tutorial Endpoint 1' $ globus task list $ globus get-identities vas@globusid.org bfc122a3-af43-43e1-8a41-d36f28a2bc0a
  9. 9. Parsing CLI output • Default output is text; for JSON output use --format json $ globus endpoint search --filter-scope my-endpoints $ globus endpoint search --filter-scope my-endpoints --format json • Extract specific attributes using --jmespath <expression> $ globus endpoint search --filter-scope my-endpoints --jmespath 'DATA[].[id, display_name]'
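When the JSON output is consumed from a script rather than filtered with `--jmespath`, the same extraction can be done in a few lines. This is a sketch only: the sample document below is hypothetical (made-up display names, and a response shape inferred from the `DATA[].[id, display_name]` expression on the slide), not captured CLI output.

```python
import json

# Hypothetical sample, shaped the way the JMESPath expression on the slide
# implies: a top-level "DATA" list of endpoint records.
sample_output = """
{
  "DATA": [
    {"id": "ddb59aef-6d04-11e5-ba46-22000b92c6ec", "display_name": "Example Endpoint A"},
    {"id": "ddb59af0-6d04-11e5-ba46-22000b92c6ec", "display_name": "Example Endpoint B"}
  ]
}
"""


def id_and_name(cli_json):
    """Mirror the JMESPath expression DATA[].[id, display_name] in plain Python."""
    doc = json.loads(cli_json)
    # Missing keys become None, just as JMESPath would yield null.
    return [[ep.get("id"), ep.get("display_name")] for ep in doc.get("DATA", [])]


pairs = id_and_name(sample_output)
for endpoint_id, display_name in pairs:
    print(endpoint_id, display_name)
```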
  10. 10. Using CLI https://docs.globus.org/cli/examples/
  11. 11. A simple, yet very common use case • Step 1, Transfer: transfer data • Step 2, Share: set access controls for sharing data • Analyze raw data from an instrument • Distribute results from computation
  12. 12. Key Globus capabilities for automation • Applications are first class entities – Register application at developers.globus.org – <client_id>@clients.auth.globus.org • Guest collections – No human in the loop for data access – Creation of guest collection requires user authentication
  13. 13. Key Globus capabilities for automation • Permissions management can be delegated – Applications can be access managers • Applications can renew tokens – Refresh tokens are issued along with access tokens – Refresh tokens can be used to get new access tokens – A refresh token is good for 6 months after last use – Rescinding consent revokes the refresh token
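The renewal pattern described on the slide above (use the access token while it is valid, and exchange the refresh token for a new one once it expires) can be sketched without any service calls. Everything here is illustrative: `TokenStore`, `fake_refresh`, and the one-hour lifetime are hypothetical stand-ins, and a real application would perform the exchange through the auth service or an SDK.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class TokenStore:
    access_token: str
    expires_at: float  # epoch seconds at which the access token expires
    refresh_token: str

    def get_access_token(
        self,
        refresh: Callable[[str], Tuple[str, float]],
        now: Optional[float] = None,
        leeway: float = 60.0,
    ) -> str:
        """Return a usable access token, renewing it when it is about to expire."""
        now = time.time() if now is None else now
        if now >= self.expires_at - leeway:
            # Exchange the long-lived refresh token for a fresh access token.
            self.access_token, self.expires_at = refresh(self.refresh_token)
        return self.access_token


def fake_refresh(refresh_token: str) -> Tuple[str, float]:
    """Hypothetical stand-in for the real token exchange."""
    return ("new-access-token", time.time() + 3600)


store = TokenStore("old-access-token", expires_at=1000.0, refresh_token="refresh-abc")
print(store.get_access_token(fake_refresh, now=500.0))   # still valid, reused
print(store.get_access_token(fake_refresh, now=2000.0))  # expired, renewed
```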
  14. 14. Examples: automation using CLI github.com/globus/automation-examples • ./share_data.sh – Transfer a folder and set permissions for a user • ./cli-sync.sh – Sync one folder with another • See README for installation • Python scripts that use the SDK
  15. 15. Globus Flows Service
  16. 16. CityCOVID • Integrated COVID-19 pandemic monitoring, modeling, and analysis capability • CityCOVID is a city-scale agent-based model • Automate flow – Scrape daily Chicago reports – Perform simulations at ALCF – Postprocess data at LCRC Jonathan Ozik, Nick Collier, and Charles Macal
  17. 17. Enabling serial crystallography at scale • Serially image chips with thousands of embedded crystals • Quality control first 1,000 to report failures • Analyze batches of images as they are collected • Report statistics and images during experiment • Return crystal structure to scientist Darren Sherrell, Gyorgy Babnigg, Andrzej Joachimiak
  18. 18. Automation using the Globus platform Managed, secure, reliable task orchestration across heterogeneous resources, using a declarative language for composition and an event-driven execution model, extensible via custom actions, for automation at scale
  19. 19. The Globus Flows service • Flows: A platform service for defining, applying, and sharing distributed research automation flows • Flows comprise Actions • Action Providers: Called by Flows to perform tasks • Triggers*: Start flows based on events * Coming soon
  20. 20. Create and deploy flows • Define the flow and deploy to Flows service • Uses declarative language (JSON or YAML) • Set policy: visibility, runnable by [Diagrams: a linear flow of Actions 1 to 4, and a flow of Actions 1 to 5 with a Choice state]
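To make the declarative format concrete, here is a minimal sketch of a flow definition as a Python dictionary, plus a small check that every transition points at a defined state. The `StartAt`/`States` layout and the state names (`ToProject`, `SetPermission`, `ProcessData`) follow the JSON fragment shown on the "Automation services ecosystem" slide; the `ActionUrl` values and `Parameters` keys are placeholders, not real provider URLs or schemas, and the `Choices`/`Default` handling assumes an Amazon States Language style of branching.

```python
import json

flow_definition = {
    "StartAt": "ToProject",
    "States": {
        "ToProject": {
            "Type": "Action",
            "ActionUrl": "https://example.org/transfer",      # placeholder URL
            "Parameters": {"source_path.$": "$.source_path"},  # hypothetical keys
            "ResultPath": "$.ToProjectResult",
            "Next": "SetPermission",
        },
        "SetPermission": {
            "Type": "Action",
            "ActionUrl": "https://example.org/set-permission",  # placeholder URL
            "Parameters": {"principal.$": "$.principal"},        # hypothetical keys
            "ResultPath": "$.SetPermissionResult",
            "Next": "ProcessData",
        },
        "ProcessData": {
            "Type": "Action",
            "ActionUrl": "https://example.org/process",  # placeholder URL
            "Parameters": {"path.$": "$.source_path"},   # hypothetical keys
            "ResultPath": "$.ProcessDataResult",
            "End": True,
        },
    },
}


def reachable_states(definition):
    """Walk the flow from StartAt and return state names in visiting order.

    Raises ValueError if a transition names a state that is not defined.
    """
    states = definition["States"]
    seen = []
    stack = [definition["StartAt"]]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        if name not in states:
            raise ValueError(f"undefined state: {name}")
        seen.append(name)
        state = states[name]
        targets = [state["Next"]] if "Next" in state else []
        # Branching states (assumed ASL-style) list their targets under Choices/Default.
        targets += [choice["Next"] for choice in state.get("Choices", [])]
        if "Default" in state:
            targets.append(state["Default"])
        stack.extend(targets)
    return seen


print(reachable_states(flow_definition))
print(json.dumps(flow_definition, indent=2)[:60], "...")
```

Running a check like this before deploying catches misspelled state names early, which is cheaper than discovering them when a run fails.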
  21. 21. Start and manage runs • A run is an instance of Flow execution – Provide input parameters – Check status – Cancel • Set policy: monitor, manager • Triggers to start flows
  22. 22. Start and manage runs from webapp
  23. 23. Create and deploy new flows
  24. 24. Flow 1: transfer and set permissions • Notebook at jupyter.demo.globus.org • Choose “Automation Using Globus Flows” • Define and deploy flow using notebook (Section A and B) • Use Globus webapp to run the flow and manage the run
  25. 25. Programmatic start of flows • API to start and manage runs • Globus Automate CLI and SDK • Event-driven start of flows: Triggers – When a file of a specific type is created – Every 12 hours
  26. 26. Trigger: start flow when file is created • SSH to the tutorial machine • Set up GCP (if not done) • Edit simple_sync.py – Set it to run the flow created using the notebook • Run simple_sync.py • Monitor runs on the webapp bit.ly/gw-tut
  27. 27. End-to-end instrument data management • Trigger: – Watch for file of specific type – Start a flow with folder path and metadata about folder • Flow – Transfer data – Set permissions – Ingest public metadata to index – Ingest restricted metadata to index
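The trigger half of this pattern (watch a folder for files of a given type, then start a flow with the folder path and some metadata) can be sketched with the standard library alone. This is not the tutorial's trigger script: it polls instead of subscribing to filesystem events, `start_flow` is a hypothetical stand-in that only prints, and the `.done` suffix and the input keys are illustrative.

```python
import tempfile
from pathlib import Path


def find_new_files(folder, suffix, seen):
    """Return files in `folder` ending in `suffix` that were not seen before."""
    new = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix == suffix and p not in seen
    )
    seen.update(new)
    return new


def start_flow(flow_input):
    """Hypothetical stand-in for starting a flow run with this input."""
    print("would start flow with", flow_input)


def watch_once(folder, suffix, seen, on_new=start_flow):
    """One polling pass: build a flow input for each new file and hand it off."""
    started = []
    for path in find_new_files(folder, suffix, seen):
        flow_input = {
            "folder": str(path.parent),         # folder path passed to the flow
            "file": path.name,
            "size_bytes": path.stat().st_size,  # example metadata about the data
        }
        on_new(flow_input)
        started.append(flow_input)
    return started


# Demo in a throwaway directory: only the ".done" marker file fires the trigger.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "scan_001.done").write_text("ready")
    (folder / "notes.txt").write_text("ignored")
    seen = set()
    first_pass = watch_once(folder, ".done", seen)
    second_pass = watch_once(folder, ".done", seen)  # nothing new the second time
```

In a long-running trigger, `watch_once` would sit inside a loop with a sleep between passes; remembering `seen` is what keeps a file from starting the flow twice.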
  28. 28. Flow 2: transfer, set permissions & ingest • Notebook at jupyter.demo.globus.org • Choose “Automation Using Flows with Search” • Define and deploy flow using notebook (Section A and B)
  29. 29. Trigger: start flow when file is created • SSH to the tutorial machine • cd globus-flows-trigger-examples/ • Set up GCP (if not done) • Edit trigger_transfer_share_flow.py – Set it to run the flow created using the notebooks • Edit and run trigger_transfer_publish_flow.py • Monitor runs on the webapp bit.ly/gw-tut
  30. 30. Automation services ecosystem • Create Action Providers: GET /provider_url/, POST /provider_url/run, GET /provider_url/action_id/status, GET /provider_url/action_id/cancel • Define and deploy flows: { "StartAt": "ToProject", "States": { "ToProject": { … }, "SetPermission": { … }, "ProcessData": { … } … }} • Run flows
  31. 31. Build action providers • Action Provider is a service endpoint – Run – Status – Cancel – Release – Resume • Action Provider Toolkit action-provider-tools.readthedocs.io/en/latest [Diagram: Globus-provided and custom-built action providers, including Search, Transfer, Notification, ACLs, Identifier, Delete, Ingest, User Form, Describe, Xtract, funcX, Web Form]
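The operations listed on the slide above can be illustrated with a toy, in-memory provider. This is a sketch of the lifecycle only, not the Action Provider Toolkit: there is no HTTP layer or authentication, the class and method names are hypothetical, the status strings (`ACTIVE`, `SUCCEEDED`, `FAILED`) are an assumption about what a provider reports, and Resume is left out.

```python
import uuid


class InMemoryActionProvider:
    """Toy provider that tracks actions in a dict (illustration only)."""

    def __init__(self):
        self._actions = {}

    def run(self, body):
        """Start an action and return its initial status document."""
        action_id = str(uuid.uuid4())
        self._actions[action_id] = {
            "action_id": action_id,
            "status": "ACTIVE",  # assumed status value
            "details": {"request": body},
        }
        return dict(self._actions[action_id])

    def status(self, action_id):
        """Report the current state of an action."""
        return dict(self._actions[action_id])

    def complete(self, action_id, result):
        """Simulate the underlying work finishing successfully."""
        action = self._actions[action_id]
        action["status"] = "SUCCEEDED"
        action["details"] = {"result": result}
        return dict(action)

    def cancel(self, action_id):
        """Stop an action that is still running; finished actions are unchanged."""
        action = self._actions[action_id]
        if action["status"] == "ACTIVE":
            action["status"] = "FAILED"
            action["details"] = {"reason": "cancelled by caller"}
        return dict(action)

    def release(self, action_id):
        """Forget a finished action; refuse while it is still running."""
        if self._actions[action_id]["status"] == "ACTIVE":
            raise RuntimeError("cannot release an action that is still active")
        return self._actions.pop(action_id)


provider = InMemoryActionProvider()
action = provider.run({"path": "/data/run42"})  # hypothetical request body
provider.complete(action["action_id"], {"files_processed": 3})
print(provider.status(action["action_id"])["status"])
provider.release(action["action_id"])
```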
  32. 32. Automating computation with funcX* Managed, federated Functions-as-a-Service for reliably, scalably and securely executing functions on remote endpoints, from laptops to supercomputers * funcX is currently under development and in limited production use
  33. 33. CityCOVID [Flow diagram. Actions shown: Auth (get credentials), funcX (scrape), funcX (simulate), Transfer (transfer data), funcX (analyze), Transfer (publish)]
  34. 34. SSX Automation [Flow diagram. Stages shown: data capture, image processing, generate crystal map, high quality FAIR data. Actions shown: Transfer (transfer data, return results), funcX (analyze, QA, visualize), Publish (publish results), Index, Catalog, and a "Stop?" decision on a threshold]
  35. 35. Thank you, funders... U.S. Department of Energy
  36. 36. Support resources • Globus documentation: docs.globus.org • YouTube channel: youtube.com/user/GlobusOnline • Helpdesk and issue escalation: support@globus.org • Mailing Lists – globus.org/mailing-lists • Customer engagement team – Office Hours • Professional services team