1. OGCE Workflow Toolkit for Multi-Scale Science Applications Suresh Marru Pervasive Technology Institute Indiana University ODI Gateway/Pipeline Evaluation NOAO Jan 21st 2010
13. Gateways Advantages Increase access To instruments Increase capabilities To analyze data Improve workforce development For underserved populations Increase outreach Increase public awareness Public sees value in investments in large facilities
15. Example Gateway Accomplishments LEAD - access to radar data NVO – access to sky surveys OOI – access to sensor data PolarGrid – access to polar ice sheet data SIDGrid – analysis tools GridChem – developing multiscale coupling How would this have been done before gateways? How many details do we want each individual scientist to need to know?
16. TeraGrid Advantages and Challenges What’s different when the resource doesn’t belong just to me? Resource discovery Accounting Security Proposal-based requests for resources (peer-reviewed access) Code scaling and performance numbers Justification of resources Gateway citations Tremendous benefits at the high end, but even more work for the developers Potential impact on science is huge Small number of developers can impact thousands of scientists But need a way to train and fund those developers and provide them with appropriate tools
18. Weather is Local, High-Impact, Heterogeneous and Rapidly Evolving…Yet Our Technologies and Thinking are Static Rain and Snow Fog Rain and Snow Snow and Freezing Rain Intense Turbulence Severe Thunderstorms
19. LEAD Dynamic Adaptive Infrastructure Storms Forming Forecast Model Streaming Observations Data Mining Instrument Steering Refine forecast grid
20. NSF Engineering Research Center for Collaborative Adaptive Sensing of the Atmosphere (CASA) UMass/Amherst, OU, CSU, UPRM Concept: inexpensive, phased array Doppler radars on cell towers and buildings Dynamically adaptive sensing of multiple targets while simultaneously meeting multiple end-user needs
21. LEAD Workflow Requirements Run jobs on demand on TeraGrid. Support deadline-driven workflows (severe weather tracking). Serve users ranging from 8th-grade students to seasoned researchers. Run jobs on multiple TeraGrid resources to decrease turnaround time. Integrate with the portal through a very user-friendly web interface.
23. High Level LEAD Architecture Workflow graph Application services Compute Engine User Portal Workflow Engine Fault Tolerance & scheduler Event Notification Bus Portal server MyLEAD Agent service Data Management Service Data Catalog service Provenance Collection service MyLEAD User Metadata catalog Data Storage
25. LEAD Scientists and Educational Interactions Lowering the barrier for using complex end-to-end weather technologies Democratize Empower Facilitate End Users Developers Researchers
29. Flexible Layered Service Oriented Architecture User Interactions Other Clients XBaya GUI Web Portal XBaya Core Event Bus Middleware Services GFac Services Workflow Engine (ODE) XRegistry XMCCat Metadata Catalog Compute & Data Resources Computational Cloud Local Lab Resources Computational Grids
30. OGCE Workflow Suite. Generic Service Toolkit: tool to wrap command-line applications as web services; handles file staging, job submission and monitoring; extensible runtime for security, resource brokering & urgent computing; Generic Factory service for on-demand creation of application services. XRegistry: information repository for the OGCE workflow suite; register, search, retrieve & share XML documents; user & hierarchical group based authorization. XBaya: GUI-based tool to compose & monitor workflows; extensible support for compiler plug-ins like BPEL, Jython, SCUFL; dynamic workflow execution support to start, pause, resume and rewind workflow executions. Apache ODE Scientific Workflow Extensions: XBaya GUI integration for BPEL generation; asynchronous support for long-running workflows; instrumented with fine-grained monitoring. Eventing System: supports both the WS-Eventing and WS-Notification standards; very scalable; persistent Message Box for clients behind firewalls or with intermittent network glitches.
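The idea of wrapping a command-line application as a callable service can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Generic Service Toolkit's actual API: `AppDescription` and `invoke` are hypothetical names, the "service" is a local function rather than a web service, and file staging, security and remote job submission are left out.

```python
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class AppDescription:
    """Hypothetical description of a wrapped command-line application."""
    name: str
    executable: list  # program plus any fixed leading arguments
    events: list = field(default_factory=list)  # simple lifecycle log


def invoke(app, args):
    """Run the wrapped program and record STARTED/FINISHED/FAILED events."""
    app.events.append((app.name, "STARTED"))
    result = subprocess.run(
        app.executable + list(args), capture_output=True, text=True
    )
    status = "FINISHED" if result.returncode == 0 else "FAILED"
    app.events.append((app.name, status))
    return {"status": status, "stdout": result.stdout.strip()}


# Wrap the Python interpreter itself so the sketch runs on any platform.
echo = AppDescription(
    "echo",
    [sys.executable, "-c", "import sys; print(' '.join(sys.argv[1:]))"],
)
response = invoke(echo, ["hello", "workflow"])
```

A real wrapper would also stage input and output files and submit the job to a remote resource; the sketch keeps only the request, run and report cycle.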
35. Each service generates a stream of notifications that log the service's actions back to the XMCCat Metadata Catalog, user monitoring, and provenance tracking tools. App Service: run program & publish events.
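A minimal in-process sketch of this pattern, assuming a simple topic-based publish/subscribe bus. The `EventBus` class, the topic name and the event fields are all hypothetical; the actual eventing system uses WS-Eventing and WS-Notification messages rather than Python callbacks.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical in-memory stand-in for the notification bus."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        # Deliver the event to every subscriber of the topic.
        for callback in self._subscribers[topic]:
            callback(event)


# Three independent consumers of the same notification stream.
catalog, monitor, provenance = [], [], []
bus = EventBus()
for sink in (catalog, monitor, provenance):
    bus.subscribe("workflow-42", sink.append)


def run_app_service(bus, topic, service_name, program):
    """Run a program and publish its lifecycle events to the bus."""
    bus.publish(topic, {"service": service_name, "state": "STARTED"})
    try:
        output = program()
        bus.publish(topic, {"service": service_name, "state": "FINISHED",
                            "output": output})
    except Exception as exc:
        bus.publish(topic, {"service": service_name, "state": "FAILED",
                            "error": str(exc)})


run_app_service(bus, "workflow-42", "forecast", lambda: "forecast.nc")
```

Because the service only publishes to a topic, the catalog, the monitoring view and the provenance tools can each be added or removed without changing the service.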
47. Interoperable XBaya Workflow Architecture BPEL 1.1 BPEL 2.0 SCUFL Abstract DAG Model Composition and Monitoring Python Dynamic Enactor/Interpreter Jython Based Enactor GPEL Engine Apache ODE Engine Taverna Python Runtime Message Bus
48. WS-BPEL Business Process Execution Language for Web Services (WS-BPEL): de facto standard for specifying web-service-based business processes and service compositions. Basic activities: Invoke, Receive, Assign, ... Structured activities: Sequence, Flow, ForEach, ...
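The difference between the structured activities can be illustrated with a small Python analogy. This is not BPEL syntax or any engine's API: `sequence`, `flow` and `invoke` are hypothetical helpers that mimic how Sequence runs activities one after another while Flow runs its branches concurrently and waits for all of them.

```python
from concurrent.futures import ThreadPoolExecutor


def sequence(*activities):
    """Run activities in order, feeding each result to the next one."""
    def run(message):
        for activity in activities:
            message = activity(message)
        return message
    return run


def flow(*activities):
    """Run activities concurrently on the same input; wait for all of them."""
    def run(message):
        with ThreadPoolExecutor(max_workers=len(activities)) as pool:
            futures = [pool.submit(activity, message) for activity in activities]
            # Results are collected in declaration order, not completion order.
            return [future.result() for future in futures]
    return run


def invoke(name):
    """Stand-in for an Invoke activity: tags the message with a service name."""
    def run(message):
        return f"{message}>{name}"
    return run


process = sequence(
    invoke("ingest"),
    flow(invoke("modelA"), invoke("modelB")),
    lambda branches: "|".join(branches),  # Assign-like step merging branches
)
result = process("obs")
```

Here `result` records that `ingest` ran first and that both model branches received its output.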
49. Workflow Composition, Execution & Monitoring XBaya enables users to construct, share, execute and monitor sequences of tasks that run on anything from their local workstations to high-end compute resources.
50. GPEL Grid Process Execution Language: BPEL4WS-based, home-grown research workflow engine. Supports a subset of BPEL4WS 1.1. One of the very early efforts to adapt BPEL. Specifically designed for eScience usage. Long-running workflow support. Decoupled client.
53. Simple Recovery Architecture Portal BPEL Workflow Engine Application Performance Models Fault Tolerance/ Recovery Service Resource Reliability Models Application Service OVP/ RST/ MIG NWS, MDS BQP Notification Service Deadline & Success Probability
54. OGCE Workflow Usage Flow The scientist or application provider registers an application description with the Registry Service. The workflow author constructs the workflow from multiple wrapped application services. The workflow is compiled and deployed to the ODE workflow engine. Workflow inputs are captured by XBaya and the workflow is launched on ODE. The workflow system, and possibly some services, publish notifications to the message bus reporting the progress of the workflow. The XBaya monitoring system listens to the notifications and colors the workflow components to show workflow progress.
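The usage flow above can be sketched end to end in Python. Everything here is a simplified, hypothetical stand-in: the `registry` dictionary replaces the Registry Service, a topological walk over a small graph replaces compilation and deployment to ODE, and a callback replaces the message bus. The service names and color choices are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

registry = {}  # hypothetical stand-in for the Registry Service


def register(name, func):
    registry[name] = func


# Application providers register their wrapped services.
register("decode", lambda inputs: inputs["raw"].upper())
register("forecast", lambda inputs: inputs["decode"] + "!")

# The workflow author wires services together (node -> upstream nodes).
workflow = {"decode": set(), "forecast": {"decode"}}

# The monitor maps notification states onto node colors (assumed palette).
COLORS = {"STARTED": "yellow", "FINISHED": "green", "FAILED": "red"}
node_colors = {name: "grey" for name in workflow}
history = []


def on_notification(node, state):
    """Monitoring callback: recolor the node and keep a progress log."""
    node_colors[node] = COLORS[state]
    history.append((node, state))


def launch(workflow, inputs):
    """Run nodes in dependency order, publishing a notification per step."""
    results = dict(inputs)
    for node in TopologicalSorter(workflow).static_order():
        on_notification(node, "STARTED")
        try:
            results[node] = registry[node](results)
            on_notification(node, "FINISHED")
        except Exception:
            on_notification(node, "FAILED")
            break
    return results


outputs = launch(workflow, {"raw": "radar"})
```

After the run, `node_colors` holds the final state of each workflow component, which is the information a monitoring view would display.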
56. ODI Overview – Image Acquisition WIYN Buffer ODI Data Capacitor Integration Server Data Capacitor Science Gateway End Users MDSS Archive TeraGrid Compute Resources
57. Overview – Image Copy WIYN Buffer ODI Data Capacitor Integration Server Data Capacitor Science Gateway End Users MDSS Archive TeraGrid Compute Resources
58. Overview – Image Transfer to Data Capacitor WIYN Buffer ODI Data Capacitor Integration Server Data Capacitor Science Gateway End Users MDSS Archive TeraGrid Compute Resources
59. Overview – Image Ingestion into the Archive WIYN Buffer ODI Data Capacitor Integration Server Data Capacitor Science Gateway End Users MDSS Archive TeraGrid Compute Resources
60. Overview – Clean Up WIYN Buffer ODI Data Capacitor Integration Server Data Capacitor Science Gateway End Users MDSS Archive TeraGrid Compute Resources
61. Overview – Automated Tier 1 Processing WIYN Buffer ODI Data Capacitor Integration Server Data Capacitor Science Gateway End Users MDSS Archive TeraGrid Compute Resources
62. Overview – User Driven Tier 2 Processing WIYN Buffer ODI Data Capacitor Integration Server Data Capacitor Science Gateway End Users MDSS Archive TeraGrid Compute Resources
65. Some apps have rich client GUIs, which are a challenge to combine with asynchronous, long-running workflows.