Cyberinfrastructure: Helping Push Research Boundaries 



                       Shantenu Jha*

               Asst. Res. Professor (CS)  

             Sr. Research Scientist (CCT)
     *also affiliated with National e-Science Centre (UK) & UCL
CI: Helping Push Research Boundaries

• Developing CI: a one-step process?
    – "If we build it, will they come?" "Will it be usable?"
• The interplay between (sustainable, long-term, and broadly usable) CI and research is more complex:
    – Research & application requirements inform the development of CI
    – In response, the developed CI "roughly" sets the boundaries of applications and their usage modes
    – Novel applications and usage modes that can exploit CI will push the boundaries of research...


Outline
• Scientific Grid Applications
      • Computing Free Energies in Biological Systems
           – STIMD (2003-04), SPICE (2005-06)
• Challenges of Distributed Environments
      • HARC: A tool for co-allocating resources
           – GENIUS: Grid-Enabled Neurosurgical Imaging
             Using Simulations (2007-08)
      • Simple API for Grid Applications (SAGA)
• Regional CI Example - Louisiana
      • Software: Cactus, HARC, Petashare, SAGA...
      • People: LONI Institute and NSF Cybertools
      • Novel e-Science Applications
                             
    Source: NSF report on Cyberinfrastructure for Biology
Computing Free Energies: Motivation

[Diagram: SH2-domain signalling — cellular messengers (e.g., growth factors, cytokines); transmembrane receptor; recognition of the activated receptor by the SH2 domain; gene switched on]

The free energy of binding is the thermodynamic quantity of maximum significance.

Characterizes the binding accurately:
   - Inhibit specific protein domains
   - Cell signalling events
   - Intelligent drug design...

Rapid & accurate determination is critical:
   - the FE difference may be just one part of the overall "system"
   - library of ligands to explore
Scientific Grid Computing: An Exemplar


• Computing FE is computationally very expensive: a balance between accuracy and rapid determination. Some experimental time-scales are 2-3 days.
• Algorithmic advances (e.g., O(N log N)) have helped, but more than just algorithmic advances are required.
• Computational 'Grid' Science: Which approaches can be adapted to exploit grid-based computation? Interplay of physical algorithm(s) and grid architecture(s).



                                 
                                                  
Computing a Free Energy Difference Using Thermodynamic Integration (TI)

[Diagram: thermodynamic cycle for the Src SH2 domain + ligand, with alchemical legs ∆G1 (λ=0 → λ=1) and ∆G2, and binding legs ∆GA and ∆GB]

The free energy of binding of the ligand to the larger protein characterises the strength of the binding.

TI provides a formalism to compute the difference in free energy of binding (∆∆GAB) between two somewhat similar, yet different, peptides. The key concept in TI is that of a thermodynamic cycle – varying the value of λ from 0 (peptide A) to 1 (peptide B).

                ∆∆GAB = ∆GB − ∆GA = ∆G1 − ∆G2
                                                             
TI Calculation: Modified Workflow

Launch the initial job and use real-time analysis to determine when to spawn the next simulation at a new λ value. Spawning of simulations continues until sufficient data are collected. Need to control several jobs.

[Diagram: starting conformation; check for convergence over time; simulations spawned at λ = 0.10, 0.25, 0.33, ..., 0.9]

In general, use real-time analysis to dynamically determine the best next value of λ.

Combine the data from all runs and compute the integral to get ∆∆GAB.
Infrastructure (Software) Developed for Application
Steering Library: correct functional abstraction from the app's perspective:
     steering_control(), register_params(), emit_data()
The library determines the destination and transport mechanism (fopen, fwrite).
[Figure: architecture of a steered application]
Details of infrastructure & middleware (layers) are hidden from the app.




                                         
Extensions of the Distributed TI Concept: Computational Techniques

- Replica Exchange Methods
  Need for "intelligent" infrastructure to be coupled with the analysis method

- Ensemble MD
   – Simulate each system many times from the same starting position
   – Allows conformational sampling. Can't say how much a priori.

[Diagram: start conformation Cx passed through equilibration protocols eq1-eq8 and a series of runs, yielding end conformations C1-C4]
RNA Translocation Through Protein Pores

Molecular Biology: critical and ubiquitous process. A model for:
- Gene expression in eukaryotic cells
- Viral infections, which rely on import of the viral genome via the nuclear pore
- Translocation across the bacterial membrane during phage infection

Technical Applications: artificial pores (similar to natural pores) for high-throughput DNA screening

Theoretical Physics: long-standing problem of semi-flexible polymer motion in a confined geometry
                                      
Simulated Pore Interactive Computing Environment

Size, complexity & timescale: computations are expensive. Millions of CPU hours using "vanilla" MD. Not good enough.

Free Energy Profile: extremely challenging, but yields maximal insight into and understanding of the translocation process.

Novel Algorithm: Steered Molecular Dynamics to "pull the DNA through the pore". Jarzynski's equation to compute the equilibrium free-energy profile from such non-equilibrium pulling:

                 e^(−β∆F) = ⟨ e^(−βW) ⟩
                                 
Grid Computing Using Novel Algorithms 
 SMD+JE: need to determine "optimal" parameters before running simulations at those optimal values.
 Requires: interactive simulations, and distributing many large, parallel simulations.
 Interactive "live coupling": use visualization to steer the simulation.

 Reduces the computational cost by a factor of ca. 100.
 Solve a computationally "intractable" problem using a novel algorithm. Our solution not only exploits grid infrastructure, but requires it.
                                      
SPICE: Computing the Free Energy Profile (FEP)

Replace a single long-running 'vanilla' MD simulation with the following scheme:

Step I: Understand structural features using static visualization

Step II: Interactive simulations for dynamic and energetic features
                  - Steered simulations: bidirectional communication.
                    Qualitative + Quantitative (SMD+JE)
                  - Haptic interaction: use a haptic device to feel feedback forces

Step III: Simulations to compute "optimal" parameter values:
               e.g., 75 simulations on 128/256 processors each.

Step IV: Use the computed "optimal" values to calculate the full FEP
             along the cylindrical axis of the pore.
      


                                       
Grid Computing, Interactivity and Analysis

• Interactive simulations used to determine: the optimal value of the force constant & pulling velocity, and the choice of sub-trajectory length and location for the optimal-value simulations

• Use visualization to provide input to the running simulation. Requires 256 processors (or more) of HPC for interactivity. Steady-state data stream (up & down)

Interactive simulations perform better when using optical lightpaths between simulation and visualization, due to network characteristics. A typical "legacy" app (NAMD) is not written for network I/O; "unreliable" transfer can stall simulations.
                                            
"Global" Grid Infrastructure

[Map: US TeraGrid (SDSC, NCSA, PSC) and UK NGS (HPCx, Leeds, Manchester, Oxford, RAL) plus DEISA, linked via Starlight (Chicago), Netherlight (Amsterdam) and UKLight; all sites connected by the production network. Legend: Computation, Network PoP, Visualization]
Recap: FE Exemplars

    Both FE algorithms are good candidates for distributed resource utilization
     – i.e., "pleasantly" distributable

    Similar infrastructure
     – Software (ReG Steering Services), middleware...
     – Federated Grids

    SPICE more complex than STIMD:
     – Complexity of tasks different
     – Needs co-scheduling of heterogeneous resources
     – Number of components/degrees-of-freedom different
VORTONICS: Vortex Dynamics on Transatlantic Federated Grids

    US-UK TG-NGS Joint Projects supported by NSF, EPSRC, and TeraGrid

Computational challenge: enormous problem sizes, memory requirements, and long run times. The largest runs require geographically distributed domain decomposition (GD3).


                                    
Run Sizes to Date / Performance

•   Using an early version of MPICH-G2, 3D lattice sizes up to 645^3 across six sites on TG/NGS
     • NCSA, SDSC, ANL, TACC, PSC, CSAR (UK)
     • Amount of data injected into the network: strongly bandwidth limited.

• Effective SUPS/processor
     • Reduced by a factor approximately equal to the number of sites
     • Therefore total SUPS approximately constant as the problem grows in size
     – If too large to fit onto one machine, GD3 over N resources simultaneously is no worse than N sequential runs

       sites   kSUPS/Proc
         1        600
         2        300
         4        149
         6         75


                                             
Outline
• Scientific Grid Applications
      • Computing Free Energies in Biological Systems
           – STIMD (2003-04), SPICE (2005-06)
• Challenges of Distributed Environments
      • HARC: A tool for co-allocating resources
           – GENIUS: Grid-Enabled Neurosurgical Imaging
             Using Simulations (2007-08)
      • Simple API for Grid Applications (SAGA)
• Regional CI Example
      • Software: Cactus, HARC, Petashare, SAGA...
      • People: LONI Institute and NSF Cybertools
      • Novel e-Science Applications
Challenges of Distributed Environments
                   Lessons learnt from Pilot Projects
• Hiding the heterogeneity; providing uniformity
We interface application code to grid middleware through well-defined user-level APIs. No code refactoring required.
Hides heterogeneity of the software stack and site-specific details:
       Vortonics: MPICH-G2 hides low-level details (communication, network topology, resource allocation and management)
       SPICE: RealityGrid steering library

• Need for usable, stable and extensible infrastructure
Infrastructure is relatively "easy" for demo(s); difficult for routine use!
Science requires stable & persistent infrastructure

            Motivation for SAGA efforts at OGF
                                        
Challenges of Distributed Environments

• Machine Configuration Issues:
- Variants of the same problem faced, e.g., the hidden-IP issue for MPICH-G2 and RealityGrid steering

- Same problem on different resources, e.g., PSC and HPCx:
   PSC: qsocket + Access Gateway Node;
        performance issues remain due to protocol constraints
   HPCx: the same solution does not work for ReG Steering;
        port-forwarding being tested



                                  
Challenges of Distributed Environments
                             Federated Grids
• The current barrier to utilising federated grids is still high:
        Many degrees of freedom need coordination
        Collective inter-grid debugging required

• Federated grids must be interoperable in practice:
Stress-test using real applications. Requires additional "user-level middleware" (MPICH-G2, ReG steering infrastructure) to work across grids.
• Paper on the theory, implementation and experiences of the three joint projects (CLADE 2006, Sp. Issue Cluster Comp):
       http://www.realitygrid.org/publications/triprojects_clade_final.pdf
• Application-level interoperability; influenced the creation of GIN

                                        
Challenges of Distributed Environments
New policies for resource coordination are required.
A common requirement of SPICE and VORTONICS: co-scheduling of resources (computer, visualization, network)!
Three levels of scheduling complexity:
    – Advance single-resource reservation
    – Advance, coordinated multiple reservations across a grid
    – Advance, coordinated reservations across distinct grids!
The first breaks the standard HPC usage model; the third, cross-grid co-scheduling, is very hard today.

Current levels of human intervention are too high: need automation

                      Motivation for HARC

                                   
HARC: Highly­Available Resource Co­allocator

• What is co-allocation?
• The process of reserving multiple resources for use by a single application or "thing" – but in a single step...
• Can reserve the resources:
   – For the same time:
      • Metacomputing, large MPIg/MPICH-G2 jobs
      • Distributed visualization
   – Or for some coordinated set of times:
      • Computational workflows

          HARC is primarily developed by Jon Maclaren @ CCT
                                
How does HARC Work?
                   • Client makes request, from 
                     command line, or other tool via 
                     Client API

                   • Request goes to the HARC Acceptors, which manage the co-allocation

                   • The Acceptors talk to individual 
                     Resource Managers which make 
                     the individual reservations by 
                     talking to the local schedulers




                
HARC is Extensible
       (Community Model)
• Modular Design throughout
    – Not just compute resources.  New resource types can be 
      added, then co­allocated with all other types of resource
    – No modification to Acceptors is needed.  Just provide 
      Resource Manager code to schedule the resource
    – And extend the Client API with new classes (again, no mods to 
      existing code)
    – Even works from the command line

• Example: Network Resource Manager
       $ harc-reserve -n EnLIGHTened/RA1-BTH \
             -c bluedawg.loni.org/8 -s 12:00 -d 1:00
    – Co-allocates a lightpath between LSU & MCNC, with 8 processors on bluedawg...
    – Was used to schedule lightpaths in the EnLIGHTened testbed for Thomas Sterling's HPC class, broadcast in high-def video
                                 
GENIUS: Overview
                               PI: Coveney, UCL

Goals:
    – Provide a better understanding of cerebral fluid flow
    – Inform clinicians of the best surgical approaches

Approach:
    – Model large-scale, patient-specific cerebral blood flow within clinically relevant time frames

Provides:
    – Reliable & effective patient-specific image-based models
    – Efficient LB blood-flow simulation
    – Real-time blood-flow volume rendering – visualisation

                                             
HemeLB fluid flow solver

        A fast fluid-flow simulation of a very large system requires an efficient parallel fluid solver running on several processors:
•       Lattice-Boltzmann method; parallel MPI code,
•       Efficient algorithms for sparse geometries,
•       Topology-aware graph-growing partitioning technique,
•       Optimized inter- and intra-machine communication patterns,
•       Full checkpoint capabilities...



                                   
HemeLB fluid solver performance
MPI-g: pre-release grid-enabled MPI implementation, optimised for the overlap of communication and computation.

[Plot: performance of HemeLB's fluid solver (LB time steps per second) on a patient-specific system using LONI machines (IBM Power5, 1.9 GHz), including a cross-site run over 2 LONI IBM machines]
                                      
Advance resources reservation: HARC


Efficient code and MPI-g exist, but how to run over several distributed machines?

Use HARC – the Highly-Available Resource Co-allocator (developed by Jon Maclaren at LSU).

Why HemeLB + MPI-g + HARC?

Heterogeneous and sparse resources are more likely to be available and to give us prompt results:

            Clinically relevant and timely results
                                
US/UK Grid Infrastructure

[Map: UK NGS (HPCx, Leeds, Manchester, Oxford, RAL), the US TeraGrid, and LONI sites across Louisiana (LSU, LSU HSC, La Tech, ULM, NSU, Alex, SLU, SU, ULL, UNO, McNeese, Tulane...)]

The GENIUS project makes use of infrastructure provided by LONI, TeraGrid and NGS, connected by dedicated switched optical lightpaths.
                                          
Using HARC...
• Our aim is to get HARC available to users as part of the basic Grid infrastructure
• Current deployments:
    – LONI (Louisiana Optical Network Initiative)
       • production mode
    – UK NGS: Manchester, Leeds and Oxford NGS2
    – TeraGrid co-scheduling testbed machines (SDSC/NCSA IA-64)
    – NW-Grid (Lancaster, Manchester)

• Everything is open source too
• See:
    – http://www.cct.lsu.edu/~maclaren/HARC/
                             
Rough Taxonomy of Applications




• Some applications are Grid-unaware and want to remain so
   – Use tools/environments (e.g., NanoHub, GridChem)
   – May run on Grid-aware/Grid-enabled environments (e.g., Condor) or programming environments (e.g., MPICH-G2)
• Some applications are explicitly Grid-aware
   – Control, interact with & exploit distributed systems at the application level
SAGA: In a Nutshell
• A lack of:
   • A programming interface that provides common grid functionality at the correct level of abstraction
   • The ability to hide underlying complexities, varying semantics, heterogeneities and changes from the application program(mer)
• Simple, integrated, stable, uniform, high-level interface
• Simplicity: restricted in scope, 80/20
• Measure(s) of success:
   • Does SAGA enable quick development of "new" distributed applications?
   • Does it enable greater functionality using less code?
Copy a File: Globus GASS
int copy_file (char const* source_URL, char const* target)
{
  globus_url_t                         source_url;
  globus_io_handle_t                   dest_io_handle;
  globus_ftp_client_operationattr_t    source_ftp_attr;
  globus_result_t                      result;
  globus_gass_transfer_requestattr_t   source_gass_attr;
  globus_gass_copy_attr_t              source_gass_copy_attr;
  globus_gass_copy_handle_t            gass_copy_handle;
  globus_gass_copy_handleattr_t        gass_copy_handleattr;
  globus_ftp_client_handleattr_t       ftp_handleattr;
  globus_io_attr_t                     io_attr;
  int                                  output_file = -1;

  if ( globus_url_parse (source_URL, &source_url) != GLOBUS_SUCCESS ) {
    printf ("can not parse source_URL \"%s\"\n", source_URL);
    return (-1);
  }

  if ( source_url.scheme_type != GLOBUS_URL_SCHEME_GSIFTP &&
       source_url.scheme_type != GLOBUS_URL_SCHEME_FTP    &&
       source_url.scheme_type != GLOBUS_URL_SCHEME_HTTP   &&
       source_url.scheme_type != GLOBUS_URL_SCHEME_HTTPS ) {
    printf ("can not copy from %s - wrong prot\n", source_URL);
    return (-1);
  }

  globus_gass_copy_handleattr_init  (&gass_copy_handleattr);
  globus_gass_copy_attr_init        (&source_gass_copy_attr);
  globus_ftp_client_handleattr_init (&ftp_handleattr);
  globus_io_fileattr_init           (&io_attr);

  globus_gass_copy_attr_set_io (&source_gass_copy_attr, &io_attr);
  globus_gass_copy_handleattr_set_ftp_attr (&gass_copy_handleattr,
                                            &ftp_handleattr);
  globus_gass_copy_handle_init (&gass_copy_handle, &gass_copy_handleattr);

  if (source_url.scheme_type == GLOBUS_URL_SCHEME_GSIFTP ||
      source_url.scheme_type == GLOBUS_URL_SCHEME_FTP    ) {
    globus_ftp_client_operationattr_init (&source_ftp_attr);
    globus_gass_copy_attr_set_ftp (&source_gass_copy_attr,
                                   &source_ftp_attr);
  }
  else {
    globus_gass_transfer_requestattr_init (&source_gass_attr,
                                           source_url.scheme);
    globus_gass_copy_attr_set_gass (&source_gass_copy_attr,
                                    &source_gass_attr);
  }

  output_file = globus_libc_open ((char*) target,
                                  O_WRONLY | O_TRUNC | O_CREAT,
                                  S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP);
  if ( output_file == -1 ) {
    printf ("could not open the file \"%s\"\n", target);
    return (-1);
  }

  /* convert the file descriptor to a globus_io_handle */
  if ( globus_io_file_posix_convert (output_file, 0, &dest_io_handle)
       != GLOBUS_SUCCESS ) {
    printf ("Error converting the file handle\n");
    return (-1);
  }

  result = globus_gass_copy_register_url_to_handle (
             &gass_copy_handle, (char*) source_URL,
             &source_gass_copy_attr, &dest_io_handle,
             my_callback, NULL);
  if ( result != GLOBUS_SUCCESS ) {
    printf ("error: %s\n", globus_object_printable_to_string
                             (globus_error_get (result)));
    return (-1);
  }

  globus_url_destroy (&source_url);
  return (0);
}

10/22/2006                    LCSD'06
SAGA Example: Copy a File
                   High­level, uniform
#include <string>
#include <saga/saga.hpp>

void copy_file(std::string source_url, std::string target_url)
{
  try {
    saga::file f(source_url);
    f.copy(target_url);
  }
  catch (saga::exception const &e) {
    std::cerr << e.what() << std::endl;
  }
}

• Provides the high-level abstraction that application programmers need; will work across different systems
• Shields the gory details of the lower-level m/w system
• Like MapReduce – leaves details of distribution etc. out
SAGA: Scope
• Is:
    – Simple API for Grid-Aware Applications
        • Deals with distributed infrastructure explicitly
    – High-level (= application-level) abstraction
    – A uniform interface to different middleware(s)
    – Client-side software
• Is NOT:
    – Middleware
    – A service management interface!
    – Something that hides the resources – remote files and jobs stay visible (only the details are hidden)

                             
SAGA API: Towards a Standard
• The need for a standard programming interface
    – "Go it alone" versus "community" model
    – Reinventing the wheel again, yet again, and again
    – MPI as a useful analogy of a community standard
    – OGF the natural choice; establish SAGA-RG
• "Tedium" of the standardisation process?
    – Not all technology needs to be standardised upfront
    – Standardisation is not a guarantee of success
• Requirements Document
    – Quick skim through the Requirements document
    – Design and requirements derived from 23 use cases
    – Different projects, applications and functionality
                            
The SAGA Landscape

[Figure: the SAGA landscape]
SAGA C++ (LSU) Implementation – Requirements
• Non-trivial set of requirements:
    – Allow heterogeneous middleware to co-exist
    – Cope with evolving grid environments; dynamic resources
    – Future SAGA API extensions
    – Portable, syntactically and semantically platform independent; permit latency-hiding mechanisms
    – Ease of deployment, configuration, multiple-language support, documentation, etc.
    – Provide synchronous, asynchronous & task versions
    Portability, modularity, flexibility, adaptability, extensibility
Job Submission API


    // Submitting a simple job and waiting for completion
    //
    saga::job_description jobdef;
    jobdef.set_attribute ("Executable", "job.sh");

    saga::job_service js;
    saga::job job = js.create_job ("remote.host.net", jobdef);

    job.run();

    while( job.get_state() == saga::job::Running )
    {
      std::cout << "Job running with ID: "
                << job.get_attribute("JobID") << std::endl;
      sleep(1);
    }




                                         
SAGA Landscape

[Figure: the SAGA landscape]
 GridSAT
           First Principles Grid Application

• Grid implementation of the satisfiability problem: to determine whether the variables of a given Boolean formula can be assigned so as to make it TRUE.
• Adaptive: the computation-to-communication ratio needs to be / can be adjusted (!)
• Allows new domain science
    – beats zChaff (time taken and problem)
                                       Adapted from slides by Wolski & Chakrab
                                    
GridSAT Characteristics
• Parallel, distributed SAT solver
   – Both CPU- and memory-intensive
   – Splitting leads to better performance
   – Allows sharing: clauses learned in one solver are shared
• Grid-aware application:
   – Heterogeneous (single machines, clusters & supercomputers)
   – Dynamic resource usage
       • Unpredictable runtime behaviour
           – How much time? How many resources? When to split? Which process splits first?
   – Problems vary: easy to hard, short to long
           – Need to be adaptive: "add resources as you go"
                              
GridSAT: Programming Requirements




• RPC, dynamic resource & job management; error handling, scheduling and checkpointing

 SAGA provides the required programming functionality at the correct level of abstraction, and thus makes it easier to manage, deploy and extend (with new functionality) GridSAT
                                       
Legacy Application: Replica Exchange

• "Class of algorithm" used for bio-molecular simulations
        • e.g., protein (mis-)folding
• Primarily used for
       • Enhanced sampling
       • Determining transition rates
• Task-level parallelism
    – Embarrassingly distributable!




                                           
Replica Exchange Algorithm

• Create replicas of the initial configuration
• Spawn N replicas over different machines
• Run for time t
• Attempt a configuration swap of Ri <-> Rj
• Run for a further time t
• ...
• Repeat till termination

[Figure: replicas R1 ... RN arranged on a temperature ladder from 300 K up to “hot”; exchange attempts are made every interval t]
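The exchange attempt can be made concrete. The function below is the standard Metropolis acceptance test for swapping two replicas, written as a generic sketch (not taken from any particular RE code); `k_B` defaults to 1, i.e. reduced units.

```python
import math
import random

def accept_swap(E_i, E_j, T_i, T_j, k_B=1.0):
    """Metropolis criterion for exchanging replicas i and j.

    Replicas at temperatures T_i and T_j, with potential energies
    E_i and E_j, swap configurations with probability
    min(1, exp(delta)), where
    delta = (1/(k_B*T_i) - 1/(k_B*T_j)) * (E_i - E_j).
    """
    delta = (1.0 / (k_B * T_i) - 1.0 / (k_B * T_j)) * (E_i - E_j)
    return delta >= 0 or random.random() < math.exp(delta)
```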
RE: Programming Requirements

    RE can be implemented using the following “primitives”:
      • Read job description
         – # of processors, replicas; determine resources
      • Submit jobs
         – Move files, launch jobs
      • Access simulation data & analysis
      • Checkpoint and re-launch simulations
         – Exchange, RPC (to swap or not)

    Implement the above using the “grid primitives” provided by SAGA

    Separating the “distributed” logic from the “simulation” logic:
      • Independent of the underlying code/engine
      • The science kernel is independent of the details of
        distributed resource management
          – A desktop becomes akin to a high-end supercomputer!!
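A minimal sketch of that separation, in plain Python with illustrative names (this is not the SAGA API): the driver below owns the distributed logic (loop, collect energies, swap temperatures), while the simulation kernel is an opaque callable it never looks inside.

```python
import math
import random

def re_driver(temperatures, run_segment, n_rounds, k_B=1.0):
    """Skeleton of the replica-exchange 'distributed logic'.

    `run_segment` stands in for the grid primitives (submit a job,
    run for time t, read back the energy): any callable
    (state, temperature) -> (new_state, energy).  Because the driver
    treats the kernel as opaque, the same loop runs unchanged on a
    desktop or a high-end machine.
    """
    states = [None] * len(temperatures)
    temps = list(temperatures)
    for _ in range(n_rounds):
        energies = []
        for i, T in enumerate(temps):
            states[i], E = run_segment(states[i], T)  # "submit, run for t"
            energies.append(E)
        # Attempt neighbour swaps: exchange temperatures, not jobs.
        for i in range(len(temps) - 1):
            delta = (1.0 / (k_B * temps[i]) - 1.0 / (k_B * temps[i + 1])) \
                    * (energies[i] - energies[i + 1])
            if delta >= 0 or random.random() < math.exp(delta):
                temps[i], temps[i + 1] = temps[i + 1], temps[i]
    return temps
```

A real implementation would alternate even/odd neighbour pairs and carry state between rounds; those details are omitted here.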
                                       
Programming Distributed Applications
                Parallel Programming Analogy
• The status of distributed programming today is (somewhat) similar
  to parallel programming in the pre-MPI days
• MPI was a “success” in that it enabled many new applications
     – MPI was simple
     – MPI was a standard (stable and portable code!)
• SAGA is to the grid application developer what MPI is to the
  parallel program developer (“grid primitives”)
• SAGA's conception & trajectory are similar to MPI's
     – SAGA is simple to use
     – OGF specification; on the path to becoming a standard
• Therefore, SAGA's measure(s) of success:
     • Does SAGA enable “new” grid applications?
                                
Outline
• Scientific Grid Applications 
      • Computing Free Energies in Biological Systems
         –  STIMD (2003­04), SPICE (2005­06)
• Challenges of Distributed Environments 
      • HARC: A tool for co­allocating resources
         – GENIUS: Grid­Enabled Neurosurgical Imaging Using 
           Simulations (2007­08)
      • Simple API for Grid Applications (SAGA)
• Regional CI Example: LONI (Now part of TeraGrid!)
      • Hardware: Compute + Network 
      • Software: Cactus, HARC, Petashare, SAGA... 
      • People: LONI Institute and NSF Cybertools
      • Novel e­Science Applications                           52
3 Axes:
   – LONI
   – CyberTools
   – LONI Institute

[Map: LONI member institutions (LA Tech, LSU, SUBR, UL-L, Tulane, UNO) linked by ~100 TF IBM and Dell supercomputers and the National Lambda Rail]
Cybertools: Providing Application Software 




                      
Cybertools (2)




    WP4: Core Package!
                      
Integrating Applications into a Cyberenvironment




                       
57
• Goal: Enable the underlying infrastructure to manage the
  low-level data handling issues.
• Novel approach: treat data storage resources and the tasks
  related to data access as first-class entities, just like
  computational resources and compute tasks.
• Key technologies being developed: data-aware storage systems,
  data-aware schedulers (i.e., Stork), and a cross-domain
  metadata scheme.
• PetaShare exploits 40 Gb/sec LONI connections between
  5 LA universities: LSU, LaTech, Tulane, ULL & UNO
      Cybertools: Not just compute!
      PI: Tevfik Kosar (CCT/LSU)
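As a toy illustration of what “data-aware” means here (this is not Stork's actual algorithm, and every name below is hypothetical): a scheduler can rank compute resources by how much input data each would have to pull in, so data placement constrains placement of the task itself.

```python
def schedule_data_aware(task, resources):
    """Pick the compute resource that minimises data movement.

    `task["inputs"]` maps dataset name -> size; each resource lists
    the datasets it already hosts.  Data placement is treated as a
    first-class scheduling constraint, in the spirit described above.
    """
    def transfer_cost(resource):
        return sum(size for name, size in task["inputs"].items()
                   if name not in resource["hosted_datasets"])
    return min(resources, key=transfer_cost)
```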
                                
[Figure] Participating institutions in the PetaShare project, connected
through LONI. Sample research of the participating researchers pictured
(i.e., biomechanics by Kodiyalam & Wischusen, tangible interaction by
Ullmer, coastal studies by Walker, and molecular biology by Bishop).
   – LaTech: High Energy Physics; Biomedical Data Mining
   – LSU: Coastal Modeling; Petroleum Engineering; Computational
     Fluid Dynamics; Synchrotron X-ray Microtomography
   – UNO: Biophysics
   – ULL: Geology; Petroleum Engineering
   – Tulane: Molecular Biology; Computational Cardiac Electrophysiology
LONI Institute 
• Build on LONI infrastructure, create bold new inter­
  university superstructure
   –   New faculty, staff, students;  train others.  Focus on CS, Bio, 
       Materials, but all disciplines impacted
   –   Promote collaborative research at interfaces for innovation
   –   Much stronger recruiting opportunities for all institutions
• Two new faculty at each institution (12 total)
   –   Six each in CS, Comp. Bio/Materials with half PKSFI matching;  
       fully covered after five years
• Six Computational Scientists
   –   Support 70­90 projects over five years; lead to external funding
• Graduate students
   –   36 new students funded, trained; two years each
Applications: Where it all comes together...




                                         61
Resource Performance Monitoring Application

• NWS, BQP: only 1 resource at a time!!
• How to choose M resources out of N?
    e.g., for an MPICH-G2 application, which M?
• Cactus + SAGA + LONI (Lightpaths)
                                                                        62
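One naive answer to the “which M of N?” question, sketched under the assumption that per-resource predictors (NWS/BQP-style) yield independent availability probabilities: maximising the joint probability of all M being available reduces to picking the M individually most reliable resources.

```python
def choose_resources(predictions, m):
    """Select M of N resources by predicted availability.

    `predictions` maps resource name -> probability of availability
    (assumed independent).  The joint probability of a set is the
    product of its members' probabilities, so the best set of size M
    is simply the M highest-probability resources.
    """
    ranked = sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:m]]
```

Co-allocation for a coupled (e.g. MPICH-G2) run would also need the resources to be available at the same time, which is exactly what a co-allocator such as HARC addresses.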
   CENTER FOR COMPUTATION & TECHNOLOGY AT LOUISIANA STATE UNIVERSITY
63
64
Acknowledgements: The SAGA Team

• Hartmut Kaiser            • Thilo Kielmann
• Andre Merzky              • Ceriel Jacobs
• Ole Weidner               • Kees Verstoep
Acknowledgements


• HARC: Jon Maclaren, LSU 

• GENIUS: Peter Coveney and Steve Manos, UCL

• PetaShare: Tevfik Kosar 

• Students & Research Staff @ CCT 

• LONI Staff

• Funding Agencies: NSF, EPSRC (UK), LA BoR                           66
Conference Announcement

• MardiGras Conference 2008: “Novel Distributed
  Computing Applications and Technology”
   – http://mardigrasconference.org
• Dan Katz (Chair) & Shantenu Jha (co-Chair)
• Craig Lee (PC Chair), Geoffrey Fox (Vice-Chair, Emerging
  Technologies), Bill St. Arnaud (Vice-Chair, Network-Intensive
  Applications), Matei Ripeanu (UBC, Publicity)
• Oct 31: paper submission deadline
• Peer-reviewed proceedings to be published in the
  ACM library (ISBN 978-1-59593-835-0)          67

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 

Cyberinfrastructure: Helping Push Research Boundaries

  • 1. Cyberinfrastructure: Helping Push Research Boundaries  Shantenu Jha* Asst. Res. Professor (CS)   Sr. Research Scientist (CCT) *also affiliated with National e­Science Centre (UK) & UCL
  • 2. CI: Helping Push Research Boundaries • Developing CI: A one step process?  – “If we build it, will they come?” “Will it be usable?” • Interplay of (sustainable, long­term, and broadly  usable) CI and Research more complex  – Research & Applications requirements  inform the  development of CI  – In response, developed CI “roughly” sets the  boundaries of applications and their usage mode – Novel applications and usage modes that can  exploit CI will push the boundaries of research... 2
  • 3. Outline • Scientific Grid Applications  • Computing Free Energies in Biological Systems –  STIMD (2003­04), SPICE (2005­06) • Challenges of Distributed Environments  • HARC: A tool for co­allocating resources – GENIUS: Grid­Enabled Neurosurgical Imaging  Using Simulations (2007­08) • Simple API for Grid Applications (SAGA) • Regional CI Example ­ Louisiana • Software: Cactus, HARC, Petashare, SAGA...  • People: LONI Institute and NSF Cybertools • Novel e­Science Applications 3
  • 4.     Source: NSF report on Cyberinfrastructure for Biology
  • 5. Computing Free Energies: Motivation. Cellular messengers (e.g. growth factors, cytokines) bind a transmembrane receptor; recognition of the activated receptor by the SH2 domain triggers cell signalling events and switches genes on. The free energy of binding is the thermodynamic quantity of maximum significance: it characterizes the binding accurately, informing how to inhibit specific protein domains and enabling intelligent drug design. Rapid & accurate determination is critical: the FE difference may be just one part of the overall “system”, and there is a library of ligands to explore.
  • 6. Scientific Grid Computing: An Exemplar • Computing FE is computationally very expensive. Balance  between  accuracy  and  rapid  determination.  Some  experimental time­scales are 2­3 days.  • Algorithmic  advances  (e.g.,  O(N  logN))  have  helped;  but  more than just algorithmic advances are required.  • Computational  ‘Grid’  Science:  Which  approaches  can  be  adapted  to  exploit  grid­based  computation?  Interplay  of  physical algorithm(s) and grid  architecture(s).      
  • 7. Computing a Free Energy Difference Using Thermodynamic Integration (TI). [Figure: thermodynamic cycle for the Src SH2 domain and ligand, λ=0 to λ=1, with legs ∆G1, ∆G2 and binding free energies ∆GA, ∆GB.] Free energy of binding of the ligand to the larger protein characterises the strength of the binding. TI provides a formalism to compute the difference of free energy of binding (∆∆GAB) between two somewhat similar, yet different peptides. The key concept in TI is that of a thermodynamic cycle – varying the value of lambda from 0 (peptide A) to 1 (peptide B): ∆∆GAB = ∆GB − ∆GA = ∆G1 − ∆G2
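The cycle above reduces to numerically integrating ⟨∂H/∂λ⟩ over λ for each leg, then taking the difference. A minimal sketch (illustrative Python, not the STIMD code; the λ grid and integrand values are made up):

```python
# Thermodynamic integration sketch: dG = integral_0^1 <dH/dlambda> dlambda,
# and the double difference ddG_AB = dG_1 - dG_2 closes the cycle.

def ti_free_energy(lambdas, dH_dlambda_means):
    """Trapezoidal integration of <dH/dlambda> samples over lambda in [0, 1]."""
    dG = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        dG += 0.5 * (dH_dlambda_means[i] + dH_dlambda_means[i + 1]) * width
    return dG

# Toy numbers (made up): <dH/dlambda> estimates from the two legs of the cycle.
lam  = [0.0, 0.25, 0.5, 0.75, 1.0]
leg1 = [4.0, 3.1, 2.5, 2.0, 1.8]   # mutation with ligand bound to the protein
leg2 = [3.5, 3.0, 2.6, 2.3, 2.1]   # mutation with ligand free in solvent
ddG_AB = ti_free_energy(lam, leg1) - ti_free_energy(lam, leg2)
```

In practice each `dH_dlambda_means[i]` is itself a converged ensemble average from one spawned simulation, which is what makes the calculation pleasantly distributable.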
  • 8. TI Calculation: Modified Workflow. Launch the initial job from the starting conformation; use real-time analysis of H(λ) to determine when to spawn the next simulation at a new λ value (e.g. λ=0.10, 0.25, 0.33, …, 0.9). Check for convergence; spawning continues until sufficient data are collected. Need to control several jobs. In general, use real-time analysis to dynamically determine the best next value of lambda. Combine and calculate data from all runs to compute the integral to get ∆∆GAB.
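The "best next λ" decision can be sketched as a crude adaptive-refinement rule: spawn the next simulation in the interval where the integrand changes fastest. This heuristic is hypothetical, for illustration only, and not the actual STIMD spawning logic:

```python
# Sketch: given sorted lambda points and their current <dH/dlambda>
# estimates, bisect the interval contributing most to the quadrature
# error (crude proxy: interval width times change in the integrand).

def next_lambda(lambdas, means):
    best_i, best_score = 0, -1.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        jump = abs(means[i + 1] - means[i])
        score = width * jump          # crude local-error proxy
        if score > best_score:
            best_i, best_score = i, score
    return 0.5 * (lambdas[best_i] + lambdas[best_i + 1])
```

For example, with points at λ = 0.0, 0.5, 1.0 and estimates 4.0, 2.5, 2.0, the first interval dominates and the rule proposes λ = 0.25 for the next spawned job.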
  • 9. Infrastructure (Software) Developed for Application Steering  Library:  Correct  functional    abstraction  from  app.  perspective.  steering_control(), register_params(), emit_data()   Library  determines  destination,  transport  mechanism,  fopen,  fwrite Architecture of a steered application. Details of infrastructure & middleware (layers) hidden from app    
  • 10. Extensions of the Distributed TI Concept. Computational Techniques: – Replica Exchange Methods: need for “intelligent” infrastructure to be coupled with the analysis method. – Ensemble MD: simulate each system many times from the same starting position; allows conformational sampling – can’t say how much a priori. [Figure: start conformation feeding a series of runs through equilibration protocols eq1–eq8, yielding end conformations C1, C2, C3, C4, … Cx.]
  • 11. RNA Translocation Through Protein Pores Molecular  Biology:  Critical  and  ubiquitous  process. A model for: ­ Gene expression in eukaryotic cells ­ Viral infections rely on import of viral  genome via nuclear pore ­Translocation  across  bacterial  membrane during  phage infection                       Technical Applications: Artificial pores  (similar  to  natural  pore)  for  high­ throughput DNA screening Theoretical  Physics:  Long  standing  problem  of  a  semi­flexible  polymer  motion in a confined geometry    
  • 12. Simulated Pore Interactive Computing Environment. Size, complexity & timescale: computations expensive – millions of CPU hours using “vanilla” MD. Not good enough. Free Energy Profile: extremely challenging but yields maximal insight and understanding of the translocation process. Novel Algorithm: Steered Molecular Dynamics to “pull DNA through the pore”; Jarzynski's Equation to compute the equilibrium free energy profile from such non-equilibrium pulling: e^(−β∆F) = ⟨ e^(−βW) ⟩
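Jarzynski's equality can be sketched numerically: draw non-equilibrium work values W and recover the equilibrium ∆F through the exponential average. The sketch below (not SPICE code) uses a toy Gaussian work distribution, for which ∆F = ⟨W⟩ − βσ²/2 holds exactly, so the estimate can be checked:

```python
import math
import random

# Jarzynski sketch: exp(-beta*dF) = <exp(-beta*W)> over many pulls.
def jarzynski_dF(works, beta):
    avg = sum(math.exp(-beta * w) for w in works) / len(works)
    return -math.log(avg) / beta

random.seed(1)
beta, mean_W, sigma = 1.0, 5.0, 1.0        # toy parameters, kT units
works = [random.gauss(mean_W, sigma) for _ in range(200000)]
dF_est = jarzynski_dF(works, beta)          # should approach 5.0 - 0.5 = 4.5
```

The exponential average is dominated by rare low-work pulls, which is why the pulling parameters (force constant, velocity) must be tuned interactively before committing to production runs, as the following slides describe.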
  • 13. Grid Computing Using Novel Algorithms  SMD+JE:  Need  to  determine  “optimal”  parameters  before  simulations at the optimal values.   Requires:  Interactive  simulations  and  distributing  many  large,  parallel simulations  Interactive  “Live  coupling”:  use  visualization to steer simulation Reduces computational cost by a factor of ca. 100.  Solve  a  computationally  “intractable”  problem  using  novel  algorithm.  Our  solution  not  only  exploits  grid  infrastructure,  but  requires it.     
  • 14. SPICE: Computing the Free Energy Profile (FEP) Replace single long running ‘vanilla’ MD simulation with following scheme: Step I: Understand structural features using static visualization  Step II:  Interactive simulations for dynamic and energetic features     ­  Steered simulations: bidirectional communication.  Qualitative + Quantitative (SMD+JE)                   ­  Haptic interaction: Use haptic  to feel feedback forces   Step III:  Simulations  to compute “optimal” parameter values:                e.g., 75 simulations on 128/256 processors each.        Step IV: Use computed “optimal” values to calculate full FEP   along the cylindrical axis of the pore.                
  • 15. Grid Computing, Interactivity and Analysis • Interactive simulations used to determine: optimal value of force-constant & pulling velocity, choice of sub-trajectory length and location for optimal-value simulations • Use visualization to provide input to the running simulation. Require 256 processors (or more) of HPC for interactivity. Steady-state data stream (up & down) • Interactive simulations perform better when using optical lightpaths between simulation and visualization, due to network characteristics. A typical “legacy” app (NAMD) is not written for network I/O; “unreliable” transfer can stall simulations.
  • 16. “Global” Grid Infrastructure. [Map: UK NGS sites (HPCx, Leeds, Manchester, Oxford, RAL) and US TeraGrid sites (SDSC, NCSA, PSC), linked via UKLight, Starlight (Chicago) and Netherlight (Amsterdam); DEISA. All sites connected by production network. Legend: Computation, Network PoP, Visualization, App Sci.]
  • 17. Recap: FE Exemplars  Both FE Algorithms are good candidates for distributed  resource utilization –  i.e., “pleasantly” distributable  Similar Infrastructure  –  Software (ReG Steering Services), middleware..  –  Federated Grids  SPICE more complex than STIMD: – Complexity of tasks different –  Needs co­scheduling of heterogenous resources – number of components/degree­of­freedom different
  • 18. VORTONICS:  Vortex Dynamics on Transatlantic Federated Grids US­UK TG­NGS Joint Projects Supported by NSF,  EPSRC, and TeraGrid Computational challenge: Enormous problem sizes, memory  requirements, and long run times: Largest runs require  geographically distributed domain decomposition (GD3)    
  • 19. Run Sizes to Date / Performance • Using an early version of MPICH-G2, 3D lattice sizes up to 645³ across six sites on TG/NGS • NCSA, SDSC, ANL, TACC, PSC, CSAR (UK) • Amount of data injected into network: strongly bandwidth limited. • Effective SUPS/processor reduced by a factor approximately equal to the number of sites • Therefore total SUPS approximately constant as the problem grows in size – if too large to fit onto one machine, GD3 over N resources simultaneously is no worse than N sequential runs
        sites   kSUPS/Proc
          1        600
          2        300
          4        149
          6         75
  • 20. Outline • Scientific Grid Applications  • Computing Free Energies in Biological Systems –  STIMD (2003­04), SPICE (2005­06) • Challenges of Distributed Environments  • HARC: A tool for co­allocating resources – GENIUS: Grid­Enabled Neurosurgical Imaging  Using Simulations (2007­08) • Simple API for Grid Applications (SAGA) • Regional CI Example • Software: Cactus, HARC, Petashare, SAGA...  • People: LONI Institute and NSF Cybertools • Novel e­Science Applications 20
  • 21. Challenges of Distributed Environments Lessons learnt from Pilot Projects • Hiding the Heterogeneity; providing uniformity We interface application code to grid middleware through well  defined user­level APIs. No code refactoring required. Hides heterogeneity of software stack and site­specific details: Vortonics: MPICH­G2 hides low­level details   (communication,  network­topology,  resource­allocation and management) SPICE ­­ RealityGrid steering library  • Need for usable, stable and  extensible infrastructure Infrastructure relatively “easy” for demo(s); difficult for routine use!  Science requires stable & persistent infrastructure Motivation for SAGA Efforts at OGF    
  • 22. Challenges of Distributed Environments • Machine Configuration Issues: ­ Variants of the same problem faced, e.g., hidden IP issue         for MPICH­G2 and RealityGrid steering   ­ Same problem on different resources, e.g., PSC and HPCx.  PSC  = qsocket + Access Gateway Node         performance issues remain due to protocol constraints HPCx= Same solution does not work for ReG Steering;                   port­forwarding being tested    
  • 23. Challenges of Distributed Environments Federated Grids • Current barrier to utilise federated grids still high: Many degrees­of­freedom need coordination Collective Inter­Grid Debugging required • Federated Grids must be interoperable in practice: Stress test using real applications Requires additional “user level  middleware” (MPICH­G2,   ReG steering infrastructure), to work  across grids • Paper on the theory, implementation and experiences of the three  joint projects: (CLADE 2006, Sp. Issue Cluster Comp)        http://www.realitygrid.org/publications/triprojects_clade_final.pdf •Application level Interoperability; Influenced the creation of GIN    
  • 24. Challenges of Distributed Environments New policies for resource co­ordination are required  A common requirement of SPICE and VORTONICS: co­scheduling  of resources (computer, visualization, network)! Three levels of scheduling complexity: –  Advance single resource reservation –  Advanced, coordinated multiple reservations across a grid  –  Advance coordinated reservations across distinct grids! First breaks standard HPC usage model; Third  Cross­Grid Co­ scheduling is very hard today. Current levels of human intervention too high: Need  Automation              Motivation for HARC    
  • 25. HARC: Highly­Available Resource Co­allocator • What is Co­allocation? • Process of reserving multiple resources for use by a  single application or “thing” – but in a single step... • Can reserve the resources: – For the same time: • Metacomputing, large MPIg/MPICH­G2 jobs • Distributed visualization  – Or some coordinated set of times • Computational workflows   HARC is primarily  developed by Jon  Maclaren@ CCT  
  • 26. How does HARC Work? • Client makes request, from  command line, or other tool via  Client API • Request goes to the HARC  Acceptors, which manage the co­ allocation • The Acceptors talk to individual  Resource Managers which make  the individual reservations by  talking to the local schedulers    
  • 27. HARC is Extensible (Community Model) • Modular design throughout – Not just compute resources. New resource types can be added, then co-allocated with all other types of resource – No modification to Acceptors is needed. Just provide Resource Manager code to schedule the resource – And extend the Client API with new classes (again, no mods to existing code) – Even works from the command line • Example: Network Resource Manager $ harc-reserve -n EnLIGHTened/RA1-BTH -c bluedawg.loni.org/8 -s 12:00 -d 1:00 – Co-allocates a lightpath between LSU & MCNC, with 8 processors on bluedawg... – Was used to schedule lightpaths in the EnLIGHTened testbed for Thomas Sterling’s HPC class, broadcast in high-def video
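The all-or-nothing semantics of co-allocation can be sketched as a "prepare, then commit" exchange with every resource manager. This is a deliberate simplification: HARC's multiple Acceptors replace the single coordinator below with a fault-tolerant commit protocol, and the class and method names here are hypothetical, not HARC's API:

```python
# Co-allocation sketch: tentatively hold every resource (phase 1);
# commit only if all holds succeed, otherwise release them all.

class ResourceManager:
    def __init__(self, name, free_slots):
        self.name, self.free_slots, self.held = name, free_slots, 0
    def prepare(self, slots):            # tentative hold
        if self.free_slots - self.held >= slots:
            self.held += slots
            return True
        return False
    def commit(self):                    # make the hold permanent
        self.free_slots -= self.held
        self.held = 0
    def abort(self):                     # release the tentative hold
        self.held = 0

def co_allocate(managers, request):
    prepared = []
    for rm, slots in zip(managers, request):
        if rm.prepare(slots):
            prepared.append(rm)
        else:                            # any refusal aborts the whole request
            for p in prepared:
                p.abort()
            return False
    for rm in managers:
        rm.commit()
    return True

rms = [ResourceManager("bluedawg", 8), ResourceManager("lightpath", 1)]
granted = co_allocate(rms, [8, 1])       # both reserved together
denied = co_allocate(rms, [1, 1])        # bluedawg now full: whole request refused
```

The key property is that a user never ends up holding the processors but not the lightpath, or vice versa: partial failures roll back completely.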
  • 28. GENIUS: Overview PI: Coveney, UCL Goals:    Provide a better understanding of      cerebral fluid flow,   Inform clinicians of best surgical approaches. Approach:   Model large scale patient specific cerebral blood flow within clinically  relevant time frames Provides:    Reliable & effective patient­specific image­based models   Efficient LB blood flow simulation  Real time blood flow volume rendering – Visualisation    
  • 29. HemeLB fluid flow solver. A fast fluid flow simulation of a very large system requires the use of an efficient parallel fluid solver running on several processors: • Lattice-Boltzmann method; parallel MPI code, • Efficient algorithms for sparse geometries, • Topology-aware graph-growing partitioning technique, • Optimized inter- and intra-machine communication patterns, • Full checkpoint capabilities...
  • 32. US/UK Grid Infrastructure. The GENIUS project makes use of infrastructure provided by LONI, TeraGrid and NGS, connected by dedicated switched optical light paths. [Map: UK NGS sites HPCx, Leeds, Manchester, Oxford, RAL; US TeraGrid; LONI sites LSU, LSU HSC, La Tech, ULM, NSU, Alex, SLU, SU, ULL, UNO, McNeese, Tulane.]
  • 33. Using HARC... • Our aim is to get HARC available to users as part of  the basic Grid infrastructure • Current deployments – LONI (Louisiana Optical Network Initiative) • production mode – UK NGS, Manchester, Leeds and Oxford NGS2  – TeraGrid Co­scheduling testbed machines  (SDSC/NCSA IA­64) – NW­Grid (Lancaster, Manchester) • Everything is open source too • See:   – http://www.cct.lsu.edu/~maclaren/HARC/  
  • 34. Rough Taxonomy of Applications • Some applications are Grid­unaware and want to remain so – Use tools/environments (e.g, NanoHub, GridChem) – May run on Grid­aware/Grid­enabled environments (e.g.   Condor) or programming environment (e.g, MPICH­G2) • Some applications are explicitly Grid­aware – Control, Interact & Exploit distributed systems at the  application level 34
  • 35. SAGA: In a Nutshell • A lack of: • a programming interface that provides common grid functionality at the correct level of abstraction • the ability to hide underlying complexities, varying semantics, heterogeneities and changes from the application programmer • Simple, integrated, stable, uniform, high-level interface • Simplicity: restricted in scope, 80/20 • Measure(s) of success: • Does SAGA enable quick development of “new” distributed applications? • Does it enable greater functionality using less code?
  • 36. Copy a File: Globus GASS
        int copy_file (char const* source, char const* target)
        {
          globus_url_t                        source_url;
          globus_io_handle_t                  dest_io_handle;
          globus_ftp_client_operationattr_t   source_ftp_attr;
          globus_result_t                     result;
          globus_gass_transfer_requestattr_t  source_gass_attr;
          globus_gass_copy_attr_t             source_gass_copy_attr;
          globus_gass_copy_handle_t           gass_copy_handle;
          globus_gass_copy_handleattr_t       gass_copy_handleattr;
          globus_ftp_client_handleattr_t      ftp_handleattr;
          globus_io_attr_t                    io_attr;
          int                                 output_file = -1;

          if ( globus_url_parse (source_URL, &source_url) != GLOBUS_SUCCESS ) {
            printf ("can not parse source_URL \"%s\"\n", source_URL);
            return (-1);
          }

          if ( source_url.scheme_type != GLOBUS_URL_SCHEME_GSIFTP &&
               source_url.scheme_type != GLOBUS_URL_SCHEME_FTP    &&
               source_url.scheme_type != GLOBUS_URL_SCHEME_HTTP   &&
               source_url.scheme_type != GLOBUS_URL_SCHEME_HTTPS ) {
            printf ("can not copy from %s - wrong prot\n", source_URL);
            return (-1);
          }

          globus_gass_copy_handleattr_init (&gass_copy_handleattr);
          globus_gass_copy_attr_init       (&source_gass_copy_attr);
          globus_ftp_client_handleattr_init (&ftp_handleattr);
          globus_io_fileattr_init (&io_attr);
          globus_gass_copy_attr_set_io (&source_gass_copy_attr, &io_attr);
          globus_gass_copy_handleattr_set_ftp_attr (&gass_copy_handleattr,
                                                    &ftp_handleattr);
          globus_gass_copy_handle_init (&gass_copy_handle, &gass_copy_handleattr);

          if (source_url.scheme_type == GLOBUS_URL_SCHEME_GSIFTP ||
              source_url.scheme_type == GLOBUS_URL_SCHEME_FTP ) {
            globus_ftp_client_operationattr_init (&source_ftp_attr);
            globus_gass_copy_attr_set_ftp (&source_gass_copy_attr,
                                           &source_ftp_attr);
          }
          else {
            globus_gass_transfer_requestattr_init (&source_gass_attr,
                                                   source_url.scheme);
            globus_gass_copy_attr_set_gass (&source_gass_copy_attr,
                                            &source_gass_attr);
          }

          output_file = globus_libc_open ((char*) target,
                                          O_WRONLY | O_TRUNC | O_CREAT,
                                          S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP);
          if ( output_file == -1 ) {
            printf ("could not open the file \"%s\"\n", target);
            return (-1);
          }

          /* convert stdout to be a globus_io_handle */
          if ( globus_io_file_posix_convert (output_file, 0, &dest_io_handle)
               != GLOBUS_SUCCESS) {
            printf ("Error converting the file handle\n");
            return (-1);
          }

          result = globus_gass_copy_register_url_to_handle (
                     &gass_copy_handle, (char*) source_URL,
                     &source_gass_copy_attr, &dest_io_handle,
                     my_callback, NULL);
          if ( result != GLOBUS_SUCCESS ) {
            printf ("error: %s\n", globus_object_printable_to_string
                                   (globus_error_get (result)));
            return (-1);
          }

          globus_url_destroy (&source_url);
          return (0);
        }
        10/22/2006 LCSD'06 36
  • 37. SAGA Example: Copy a File – High-level, uniform
        #include <string>
        #include <iostream>
        #include <saga/saga.hpp>

        void copy_file (std::string source_url, std::string target_url)
        {
          try {
            saga::file f (source_url);
            f.copy (target_url);
          }
          catch (saga::exception const &e) {
            std::cerr << e.what () << std::endl;
          }
        }
        • Provides the high-level abstraction that application programmers need; will work across different systems • Shields the gory details of the lower-level middleware system • Like MapReduce – leave details of distribution etc. out 10/22/2006 LCSD'06 37
  • 38. SAGA: Scope • Is: – Simple API for Grid-Aware Applications • Deal with distributed infrastructure explicitly – High-level (= application-level) abstraction – A uniform interface to different middleware(s) – Client-side software • Is NOT: – Middleware – A service management interface! – Does not hide the resources – remote files, jobs – only the details
  • 39. SAGA API: Towards a Standard • The need for a standard programming interface – “Go it alone” versus “Community” model  –  Reinventing the wheel again, yet again, and again –  MPI as a useful analogy of community standard – OGF the natural choice; establish SAGA­RG •  “Tedium” of the standardisation process? – Not all technology needs to be standardised upfront  – Standardisation not a guarantee to success • Requirements Document  –  Quick skim through the Requirements document re –  Design and requirements derived from 23 Use Cases –  Different projects, applications and functionality    
  • 42. Implementation – Requirements • Non-trivial set of requirements: – Allow heterogeneous middleware to co-exist – Cope with evolving grid environments; dynamic resources – Future SAGA API extensions – Portable, syntactically and semantically platform independent; permit latency-hiding mechanisms – Ease of deployment, configuration, multiple-language support, documentation etc. – Provide synchronous, asynchronous & task versions. Portability, modularity, flexibility, adaptability, extensibility 10/22/2006 LCSD'06 42
  • 43. Job Submission API
        // Submit a simple job and wait for completion
        saga::job_description jobdef;
        jobdef.set_attribute ("Executable", "job.sh");

        saga::job_service js;
        saga::job job = js.create_job ("remote.host.net", jobdef);

        job.run ();

        while ( job.get_state () == saga::job::Running )
        {
          std::cout << "Job running with ID: "
                    << job.get_attribute ("JobID") << std::endl;
          sleep (1);
        }
  • 45. GridSAT – First Principles Grid Application • Grid implementation of the satisfiability problem: to determine whether the variables of a given Boolean formula can be assigned so as to make it TRUE. • Adaptive: the computation-to-communication ratio needs to be, and can be, adjusted (!) • Allows new domain science – beats zChaff (in time taken and problems solved) Adapted from slides by Wolski & Chakrab
  • 46. GridSAT Characteristics • Parallel, distributed SAT solver – Both CPU and Memory Intensive – Splitting leads to better performance – Allows sharing: clause learned in solver shared • Grid Aware Application: – Heterogenous (single, clusters & supercomputers) – Dynamical Resource Usage • Unpredictable runtime behaviour –  How much time? How many resources? When  to split? Which process splits first? – Problems vary: easy to hard, short to long – Need to be adaptive, “add resources as you go”    
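The splitting at the heart of GridSAT can be sketched with a toy DPLL-style solver: each variable split yields two independent subproblems that could be farmed out to separate resources. This is illustrative only; GridSAT additionally shares learned clauses between solvers and adds resources dynamically:

```python
# Toy DPLL sketch. Clauses are lists of non-zero ints (DIMACS style):
# positive literal = variable true, negative = variable false.

def solve(clauses, assignment=None):
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue                      # clause already satisfied
        rest = [l for l in clause if abs(l) not in assignment]
        if not rest:
            return None                   # clause falsified: backtrack
        simplified.append(rest)
    if not simplified:
        return assignment                 # all clauses satisfied
    var = abs(simplified[0][0])           # naive split-variable choice
    for value in (True, False):           # two independent subproblems
        result = solve(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None

sat = solve([[1, 2], [-1, 2], [-2, 3]])   # satisfiable formula
unsat = solve([[1], [-1]])                # x AND NOT x
```

In the distributed setting the two branches of the `for value in …` loop are exactly the subproblems GridSAT ships to different machines; how and when to split is the unpredictable, adaptive part the slide describes.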
  • 47. GridSAT: Programming Requirements • RPC • Dynamic resource scheduling and management • Job management • Error handling and checkpointing. SAGA provides the required programming functionality, at the correct level of abstraction, and thus makes it easier to manage, deploy and extend (for new functionality) GridSAT
  • 48. Legacy Application: Replica Exchange • “Class of algorithm” used for bio­ molecular simulations • e.g.,  Protein (mis­) folding • Primarily used for  •  Enhanced sampling •  Determine transition rates • Task Level Parallelism  – Embarrassingly distributable!    
  • 49. Replica Exchange Algorithm • Create replicas of the initial configuration • Spawn 'N' replicas (R1, R2, R3, … RN) over different machines • Run for time t • Attempt a configuration swap Ri <-> Rj • Run for a further time t • … • Repeat till termination. [Figure: replicas laddered in temperature from 300 K up to hot, with exchange attempts every interval t.]
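The exchange step above is the standard parallel-tempering Metropolis criterion; a generic sketch (not the code of any particular MD engine) is:

```python
import math
import random

# Swap between replicas at inverse temperatures beta_i, beta_j with
# current energies E_i, E_j: accepted with probability
#   min(1, exp((beta_i - beta_j) * (E_i - E_j))),
# which preserves the joint Boltzmann distribution over the ladder.

def attempt_swap(beta_i, E_i, beta_j, E_j, rng=random.random):
    delta = (beta_i - beta_j) * (E_i - E_j)
    return delta >= 0 or rng() < math.exp(delta)
```

For example, when the colder replica (larger β) holds the higher energy, δ ≥ 0 and the swap is always accepted, pushing high-energy configurations up the temperature ladder where they relax more easily.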
  • 50. RE: Programming Requirements  RE can be implemented using following “primitives” • Read job description – # of processors, replicas, determine resources • Submit jobs – Move files, job launch • Access simulation data & analysis • Checkpoint and re­launch simulations – Exchange, RPC (to swap or not)  Implement above using “grid primitives” provided by SAGA   Separated “distributed” logic from “simulation” logic  Independent of underlying code/engine  Science kernel is independent of details of distributed  resource management    Desktop akin to High­end supercomputer!!    
  • 51. Programming Distributed  Applications    Parallel Programming Analogy  Status of distributed programming today, (somewhat) similar  to parallel programming pre­MPI days  MPI was a “success” in that it helped many new applications –  MPI was simple –  MPI was a standard (stable and portable code!) • SAGA is to the grid application developer, what MPI is to the  parallel program developer  (“grid­primitives”) • SAGA conception & trajectory similar to MPI –  SAGA is simple to use – OGF specification; on path to becoming a standard   •Therefore, SAGA's Measure(s) of success:    •  Does SAGA enable “new” grid applications?   
  • 52. Outline • Scientific Grid Applications  • Computing Free Energies in Biological Systems –  STIMD (2003­04), SPICE (2005­06) • Challenges of Distributed Environments  • HARC: A tool for co­allocating resources – GENIUS: Grid­Enabled Neurosurgical Imaging Using  Simulations (2007­08) • Simple API for Grid Applications (SAGA) • Regional CI Example: LONI (Now part of TeraGrid!) • Hardware: Compute + Network  • Software: Cactus, HARC, Petashare, SAGA...  • People: LONI Institute and NSF Cybertools • Novel e­Science Applications 52
  • 53. 3 Axes: – LONI – CyberTools – LONI Institute. [Map: ~100 TF of IBM and Dell supercomputers at LA Tech, LSU, SUBR, UNO, UL-L and Tulane, connected via the National Lambda Rail.]
  • 55. Cybertools (2) WP4: Core Package!    
  • 57. 57
  • 58. • Goal: Enable underlying infrastructure to manage the  low­level data handling issues.  • Novel approach: treat data storage resources and the  tasks related to data access as first class entities just like  computational resources and compute tasks. • Key technologies being developed: data­aware storage  systems, data­aware schedulers (i.e. Stork), and cross­ domain meta­data scheme.  • PetaShare exploits 40 Gb/sec LONI connections  between 5 LA Universities : LSU, LaTech, Tulane, ULL &  UNO Cybertools: Not just compute! PI: Tevfik Kosar (CCT/LSU)    
  • 59. Participating institutions in the PetaShare project, connected through LONI. Sample research of the participating researchers pictured (i.e. biomechanics by Kodiyalam & Wischusen, tangible interaction by Ullmer, coastal studies by Walker, and molecular biology by Bishop). [Map: LaTech, LSU, UNO, ULL and Tulane; research areas include high energy physics, biomedical data mining, coastal modeling, petroleum engineering, computational fluid dynamics, synchrotron X-ray microtomography, biophysics, geology, molecular biology, and computational cardiac electrophysiology.]
  • 60. LONI Institute  • Build on LONI infrastructure, create bold new inter­ university superstructure – New faculty, staff, students;  train others.  Focus on CS, Bio,  Materials, but all disciplines impacted – Promote collaborative research at interfaces for innovation – Much stronger recruiting opportunities for all institutions • Two new faculty at each institution (12 total) – Six each in CS, Comp. Bio/Materials with half PKSFI matching;   fully covered after five years • Six Computational Scientists – Support 70­90 projects over five years; lead to external funding • Graduate students – 36 new students funded, trained; two years each
  • 62. Resource Performance Monitoring Application • NWS, BQP – only 1 resource at a time!! • How to choose M resources out of N? e.g. an MPICH-G2 application: which M? • Cactus + SAGA + LONI (Lightpaths) 62 CENTER FOR COMPUTATION & TECHNOLOGY AT LOUISIANA STATE UNIVERSITY
  • 63. 63
  • 65. Acknowledgements: The SAGA Team – Hartmut Kaiser, Thilo Kielmann, Andre Merzky, Ceriel Jacobs, Ole Weidner, Kees Verstop
  • 66. Acknowledgements • HARC: Jon Maclaren, LSU  • GENIUS: Peter Coveney and Steve Manos, UCL • PetaShare: Tevfik Kosar  • Students & Research Staff @ CCT  • LONI Staff • Funding Agencies: NSF, EPSRC (UK), LA BoR 66 CENTER FOR COMPUTATION & TECHNOLOGY AT LOUISIANA STATE UNIVERSITY
  • 67. Conference Announcement • MardiGras Conference 2008: “Novel Distributed Computing Applications and Technology” – http://mardigrasconference.org • Dan Katz (Chair) & Shantenu Jha (co-Chair) • Craig Lee (PC Chair), Geoffrey Fox (Vice-Chair, Emerging Technologies), Bill St. Arnaud (Vice-Chair, Network-Intensive Applications), Matei Ripeanu (UBC, Publicity) • Oct 31 Paper Submission Deadline • Peer-reviewed proceedings to be published in the ACM library (ISBN 978-1-59593-835-0) 67