2. Agenda
q Principles of Service Oriented Architecture
q The Globus Toolkit
q Web Services Basics
q Grid Services
q What people punt on?
x Intro to Globus Security, Service Registries
q Workflows we created
q Lessons learned
3. Principles of Service Oriented
Architecture
q Guiding principles define the ground rules for
development, maintenance, and usage of the
SOA
x Reuse, granularity, modularity, composability,
componentization and interoperability
x Standards compliance (both common and
industry-specific)
x Services identification and categorization,
provisioning and delivery, and monitoring and
tracking
4. Architectural Principles
q Service encapsulation – Many web services are
consolidated to be used under the SOA.
q Service loose coupling – Services maintain a
relationship that minimizes dependencies and only
requires that they maintain an awareness of each
other
q Service contract – Services adhere to a
communications agreement, as defined collectively
by one or more service description documents
q Service abstraction – Beyond what is described in
the service contract, services hide logic from the
outside world
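A minimal Python sketch of the contract, loose coupling, and abstraction principles above. The names `LookupService`, `TableLookupService`, and `report` are hypothetical illustrations and do not come from any toolkit named in these slides.

```python
from typing import Protocol


class LookupService(Protocol):
    """Service contract: the only thing a client is allowed to rely on."""

    def lookup(self, key: str) -> float: ...


class TableLookupService:
    """One possible implementation; its table is hidden behind the contract."""

    def __init__(self) -> None:
        # Internal logic the outside world never sees (abstraction).
        self._table = {"ABC": 10.0, "XYZ": 2.5}

    def lookup(self, key: str) -> float:
        return self._table[key]


def report(service: LookupService, key: str) -> str:
    """Client code depends on the contract, not on a concrete class."""
    return f"{key}: {service.lookup(key):.2f}"


line = report(TableLookupService(), "ABC")
```

Any other class with a matching `lookup` method could be passed to `report` without changing the client, which is the loose coupling the slide describes.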
5. Architectural Principles
q Service reusability – Logic is divided into
services with the intention of promoting
reuse
q Service composability – Collections of
services can be coordinated and assembled
to form composite services
q Service autonomy – Services have control
over the logic they encapsulate
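Composability can be sketched in a few lines of Python, assuming (purely for illustration) that a service is a function from a request string to a response string. `compose`, `normalize`, and `tag` are hypothetical names.

```python
from typing import Callable

Service = Callable[[str], str]  # hypothetical: request in, response out


def compose(*services: Service) -> Service:
    """Assemble a composite service that calls each service in order."""

    def composite(request: str) -> str:
        for service in services:
            request = service(request)
        return request

    return composite


def normalize(text: str) -> str:
    """A small reusable service: trim and lower-case the input."""
    return text.strip().lower()


def tag(text: str) -> str:
    """Another reusable service: wrap the input in a marker element."""
    return f"<seq>{text}</seq>"


normalize_and_tag = compose(normalize, tag)
result = normalize_and_tag("  ACGT ")
```

Each function keeps control over its own logic (autonomy), while the composite is itself a service that can be reused or composed again.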
6. Architectural Principles
q Service optimization – All else equal, high-
quality services are generally considered
preferable to low-quality ones
q Service discoverability – Services are
designed to be outwardly descriptive so
that they can be found and assessed via
available discovery mechanisms
q Service Relevance – Functionality is
presented at a granularity recognized by
the user as a meaningful service
7. Globus Software: dev.globus.org
[Figure: Globus Projects. Projects shown: GT4, OGSA-DAI, MPICH-G2, GridWay, GRAM, GridFTP, Reliable File Transfer, Data Rep, Replica Location, MDS4, CAS, Delegation, MyProxy, GSI-OpenSSH, C Sec, Java Runtime, C Runtime, Python Runtime, Incubator Mgmt, GT4 Docs. Column headings: Common Runtime, Security, Execution Mgmt, Data Mgmt, Info Services, Other]
8. Web Service Basics
q Web Services are basic distributed
computing technology that let us construct
client-server interactions
Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html
9. Web Service Basics 2
q Web services are platform independent
and language independent
x Client and server programs can be written in
different languages, run in different
environments, and still interact
q Web services describe themselves
x Once located, a service can be asked how to use it
q Web services are ideal for loosely coupled
systems
x Unlike CORBA, EJB, etc.
10. WSDL: Web Services
Description Language
q Define the expected messages for a service and their input or output parameters
q An interface groups together a number of messages (operations)
q Bind an interface, via a definition, to a specific transport (e.g. HTTP) and messaging (e.g. SOAP) protocol
q The network location where the service is implemented, e.g. http://localhost:8080
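To make the pieces of a WSDL description concrete, here is a hedged Python sketch that reads a trimmed WSDL 1.1 document with the standard library. The `MathService` document, its `http://example.org/math` namespace, and the `/math` path are invented for illustration. WSDL 1.1 calls the interface a `portType`. A complete binding would also list its operations.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed WSDL 1.1 document for an imaginary MathService.
WSDL = """<definitions name="MathService"
    targetNamespace="http://example.org/math"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="http://example.org/math">
  <message name="AddRequest"><part name="value" type="xsd:int"/></message>
  <message name="AddResponse"><part name="total" type="xsd:int"/></message>
  <portType name="MathPortType">
    <operation name="add">
      <input message="tns:AddRequest"/>
      <output message="tns:AddResponse"/>
    </operation>
  </portType>
  <binding name="MathSoapBinding" type="tns:MathPortType">
    <soap:binding style="document"
        transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="MathService">
    <port name="MathPort" binding="tns:MathSoapBinding">
      <soap:address location="http://localhost:8080/math"/>
    </port>
  </service>
</definitions>"""

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

root = ET.fromstring(WSDL)
# Messages: what goes in and out of the service.
messages = [m.get("name") for m in root.findall("wsdl:message", NS)]
# Interface (portType): the operations that group those messages.
operations = [op.get("name") for op in root.findall("wsdl:portType/wsdl:operation", NS)]
# Binding: the transport chosen for the interface (SOAP over HTTP here).
transport = root.find("wsdl:binding/soap:binding", NS).get("transport")
# Endpoint: the network location where the service is implemented.
location = root.find("wsdl:service/wsdl:port/soap:address", NS).get("location")
```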
11. Real Web Service Invocation
Discover
Describe
Invoke
Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html
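The invoke step amounts to exchanging SOAP messages. The sketch below is a rough illustration using only the Python standard library. It builds a SOAP 1.1 request and then recovers the operation and arguments, much as a server-side SOAP engine would. The `add` operation and the `http://example.org/math` namespace are hypothetical, and no network call is made.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/math"  # hypothetical service namespace


def build_request(operation: str, params: Dict[str, object]) -> bytes:
    """Client side: serialize one operation call as a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(payload: bytes) -> Tuple[str, Dict[str, str]]:
    """Server side: recover the operation name and its arguments."""
    body = ET.fromstring(payload).find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


payload = build_request("add", {"value": 5})
operation, params = parse_request(payload)
```

Arguments arrive as text. Converting them to typed values is the job of the schema named in the service description.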
12. Web Services Server Applications
q Web service – software that
exposes a set of operations
q SOAP engine – handles SOAP
requests and responses
(Apache Axis)
q Application server – provides
“living space” for applications
that must be accessed by
different clients (Tomcat)
q HTTP server – also called a
Web server, handles HTTP
messages
[Figure: the server-side stack, labeled “Container”]
Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html
13. Let’s talk about state
q Plain Web services are stateless
Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html
14. However, Many Grid
Applications Require State
Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html
15. Keep the Web Service
and the State Separate
q Instead of putting state in a Web
service, we keep it in a resource
q Each resource has a unique key
Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html
16. Resources Can Be Anything Stored
Web Service
+
Resource
=
WS-Resource
The address of a WS-Resource is called
an endpoint reference (EPR)
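A small Python sketch of the pattern from the last two slides, under simplifying assumptions: the service keeps no per-client state, state lives in a separate store under a unique key, and an endpoint reference pairs the service address with that key. `EndpointReference`, `CounterService`, and the `/counter` address are hypothetical. A real EPR is an XML structure, not a Python object.

```python
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class EndpointReference:
    """Hypothetical EPR: service address plus the key of one resource."""

    address: str
    resource_key: str


class CounterService:
    """Stateless service: each call names the resource it operates on."""

    def __init__(self, address: str, store: Dict[str, int]) -> None:
        self.address = address
        self._store = store  # resources live outside the service logic

    def create(self) -> EndpointReference:
        key = uuid.uuid4().hex  # unique key for the new resource
        self._store[key] = 0
        return EndpointReference(self.address, key)

    def add(self, epr: EndpointReference, amount: int) -> int:
        self._store[epr.resource_key] += amount
        return self._store[epr.resource_key]


store: Dict[str, int] = {}
service = CounterService("http://localhost:8080/counter", store)
first, second = service.create(), service.create()
service.add(first, 5)
total_first = service.add(first, 3)
total_second = service.add(second, 1)
```

Two clients holding different EPRs talk to the same service yet see independent state.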
17. Web Services So Far
q Basic client-server interactions
q Stateless, but with associated resources
q Self describing using WSDL
q But what we’d really like is a common way to
x Name and do bindings
x Start and end services
x Query, subscription, and notification
x Share error messages
18. Standard Interfaces
q Service information
q State representation
x Resource
x Resource Property
q State identification
x Endpoint Reference
q State Interfaces
x GetRP, QueryRPs,
GetMultipleRPs, SetRP
q Lifetime Interfaces
x SetTerminationTime
x ImmediateDestruction
q Notification Interfaces
x Subscribe
x Notify
q ServiceGroups
[Figure: a Client calling GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy on a Web Service]
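The state and lifetime interfaces listed above can be mimicked by an in-memory Python class. This is only a sketch: `ResourceSketch` and its method names are hypothetical, a Python predicate stands in for a real query expression, and times are plain numbers passed in by the caller.

```python
from typing import Any, Callable, Dict, List, Optional


class ResourceSketch:
    """Hypothetical in-memory stand-in for a WS-Resource."""

    def __init__(self, properties: Dict[str, Any]) -> None:
        self._properties = dict(properties)
        self._termination_time: Optional[float] = None  # None: no scheduled end
        self._destroyed = False

    # State interfaces
    def get_rp(self, name: str) -> Any:
        return self._properties[name]

    def get_multiple_rps(self, names: List[str]) -> Dict[str, Any]:
        return {name: self._properties[name] for name in names}

    def set_rp(self, name: str, value: Any) -> None:
        self._properties[name] = value

    def query_rps(self, predicate: Callable[[str, Any], bool]) -> Dict[str, Any]:
        # The predicate replaces the query language a real service would accept.
        return {n: v for n, v in self._properties.items() if predicate(n, v)}

    # Lifetime interfaces
    def set_termination_time(self, when: float) -> None:
        self._termination_time = when  # scheduled destruction

    def destroy(self) -> None:
        self._destroyed = True  # immediate destruction

    def is_alive(self, now: float) -> bool:
        if self._destroyed:
            return False
        return self._termination_time is None or now < self._termination_time


job = ResourceSketch({"status": "Active", "cpus": 4, "owner": "alice"})
job.set_rp("cpus", 8)
numeric = job.query_rps(lambda name, value: isinstance(value, int))
job.set_termination_time(100.0)

scratch = ResourceSketch({})
scratch.destroy()
```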
19. WSRF & WS-Notification
q Naming and bindings (basis for virtualization)
x Every resource can be uniquely referenced, and has one or
more associated services for interacting with it
q Lifecycle (basis for fault resilient state management)
x Resources created by services following factory pattern
x Resources destroyed immediately or scheduled
q Information model (basis for monitoring & discovery)
x Resource properties associated with resources
x Operations for querying and setting this info
x Asynchronous notification of changes to properties
q Service Groups (basis for registries & collective svcs)
x Group membership rules & membership management
q Base Fault type
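The lifecycle and notification ideas above (factory-created resources, asynchronous notification of property changes) can be illustrated with a short Python sketch. `JobFactory` and `JobResource` are hypothetical, and the notification is an ordinary in-process callback, not a WS-Notification message.

```python
from typing import Callable, Dict, List, Tuple

Listener = Callable[[str, object, object], None]  # (property, old value, new value)


class JobResource:
    """Hypothetical resource with observable properties."""

    def __init__(self, key: str) -> None:
        self.key = key
        self._properties: Dict[str, object] = {"status": "Pending"}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def set_property(self, name: str, value: object) -> None:
        old = self._properties.get(name)
        self._properties[name] = value
        if old != value:  # notify only on an actual change
            for listener in self._listeners:
                listener(name, old, value)


class JobFactory:
    """Factory pattern: the service creates resources and tracks them by key."""

    def __init__(self) -> None:
        self.resources: Dict[str, JobResource] = {}
        self._counter = 0

    def create(self) -> JobResource:
        self._counter += 1
        key = f"job-{self._counter}"
        self.resources[key] = JobResource(key)
        return self.resources[key]

    def destroy(self, key: str) -> None:
        del self.resources[key]


factory = JobFactory()
job = factory.create()
events: List[Tuple[str, object, object]] = []
job.subscribe(lambda name, old, new: events.append((name, old, new)))
job.set_property("status", "Active")
job.set_property("status", "Active")  # no change, so no notification
job.set_property("status", "Done")
factory.destroy(job.key)
```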
20. WSRF vs XML/SOAP
q The definition of WSRF means that the
Grid and Web services communities can
move forward on a common base
q Why Not Just Use XML/SOAP?
x WSRF and WS-N are just XML and SOAP
x WSRF and WS-N are just Web services
q Benefits of following the specs:
x These patterns represent best practices that
have been learned in many Grid
applications
x There is a community behind them
x Why reinvent the wheel?
x Standards facilitate interoperability
21. WS Core Enables Frameworks:
E.g., Resource Management
Applications of the framework
(Compute, network, storage provisioning,
job reservation & submission, data management,
application service QoS, …)
WS-Agreement WS Distributed Management
(Agreement negotiation) (Lifecycle, monitoring, …)
WS-Resource Framework & WS-Notification (*)
(Resource identity, lifetime, inspection, subscription, …)
Web services
(WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)
* An evolution of Open Grid Services Infrastructure (OGSI)
22. Globus and Web Services
[Figure: layered stack, top to bottom]
User Applications
Globus WSRF Web Services; Registry and Admin
Globus Container (e.g., Apache Axis)
WS-A, WSRF, WS-Notification
WSDL, SOAP, WS-Security
Globus Core: Java, C (fast, small footprint), Python
23. Globus and Web Services
[Figure: the same layered stack with custom services added, top to bottom]
User Applications
Custom Web Services; Custom WSRF Services; Globus WSRF Web Services; Registry and Admin
Globus Container (e.g., Apache Axis)
WS-A, WSRF, WS-Notification
WSDL, SOAP, WS-Security
Globus Core: Java, C (fast, small footprint), Python
24. Globus Security
q Extensible authorization framework
based on Web services standards
x SAML-based authorization callout
q Security Assertion Markup Language, OASIS
standard
q Often used for Web browser authentication
q Very short-lived bearer credentials
x Integrated policy decision engine
q XACML (eXtensible Access Control Markup
Language) policy language, per-operation
policies, pluggable
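A toy Python sketch of a per-operation, pluggable policy decision engine. It borrows the Permit/Deny/NotApplicable vocabulary and a deny-overrides style of combining from XACML, but it is not XACML and not the Globus authorization framework. All names are hypothetical, and treating "no applicable rule" as Deny is a simplifying assumption.

```python
from typing import Callable, Dict, List

Rule = Callable[[str], str]  # subject -> "Permit", "Deny", or "NotApplicable"


class PolicyDecisionPoint:
    """Hypothetical decision engine with one rule list per operation."""

    def __init__(self) -> None:
        self._rules: Dict[str, List[Rule]] = {}

    def add_rule(self, operation: str, rule: Rule) -> None:
        """Plug an additional rule into the policy for one operation."""
        self._rules.setdefault(operation, []).append(rule)

    def decide(self, subject: str, operation: str) -> str:
        """Any Deny wins; otherwise Permit if at least one rule permits."""
        decision = "Deny"  # assumption: default-deny when nothing applies
        for rule in self._rules.get(operation, []):
            outcome = rule(subject)
            if outcome == "Deny":
                return "Deny"
            if outcome == "Permit":
                decision = "Permit"
        return decision


ADMINS = {"alice"}
BLOCKED = {"mallory"}

pdp = PolicyDecisionPoint()
pdp.add_rule("getStatus", lambda subject: "Permit")
pdp.add_rule("getStatus", lambda s: "Deny" if s in BLOCKED else "NotApplicable")
pdp.add_rule("destroy", lambda s: "Permit" if s in ADMINS else "NotApplicable")
```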
25. Delegation Service
q Higher level service
q Authentication protocol independent
q Refresh interface
q Delegate once, share across services and invocations
[Figure: Client and Hosting Environment (Service1, Service2, Service3, Delegation Service, Resources); arrows labeled Delegate, Refresh, EPR]
Rachana Ananthakrishnan
26. Delegation
q Secure Conversation
x Can delegate as part of protocol
x Extra round trip with delegation
x Types: Full or Limited delegation
x Delegation Service is preferred way of
delegating
q Secure Message and Secure Transport
x Cannot delegate as part of protocol
Rachana Ananthakrishnan
27. Globus’s Use of
Security Standards
[Figure: three options compared, annotated “Supported, but slow”, “Supported, but insecure”, and “Fastest, so default”]
28. Monitoring and Discovery System
(MDS4)
q Grid-level monitoring system
x Aid user/agent to identify host(s) on which to
run an application
x Warn on errors
q Uses standard interfaces to provide publishing
of data, discovery, and data access, including
subscription/notification
x WS-ResourceProperties, WS-BaseNotification,
WS-ServiceGroup
q Functions as an hourglass to provide a
common interface to lower-level monitoring
tools
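The hourglass idea can be sketched in Python: adapters translate each lower-level tool's native format into one common record, and a single query interface sits on top. The two data sources, the record schema, and `IndexSketch` are invented for illustration. This is not the MDS4 API.

```python
from typing import Callable, Dict, List, Tuple, Union

HostRecord = Dict[str, Union[str, int]]  # common schema: "host", "free_cpus"


def cluster_tool_report() -> List[Tuple[str, int]]:
    """Hypothetical monitoring tool that reports (host, free CPUs) tuples."""
    return [("node1", 4), ("node2", 0)]


def queue_tool_report() -> List[Dict[str, Union[str, int]]]:
    """Hypothetical monitoring tool with a different native format."""
    return [{"hostname": "node3", "idle": 16}]


def from_cluster_tool() -> List[HostRecord]:
    return [{"host": h, "free_cpus": n} for h, n in cluster_tool_report()]


def from_queue_tool() -> List[HostRecord]:
    return [{"host": r["hostname"], "free_cpus": r["idle"]} for r in queue_tool_report()]


class IndexSketch:
    """Narrow waist of the hourglass: one query interface over many tools."""

    def __init__(self, providers: List[Callable[[], List[HostRecord]]]) -> None:
        self._providers = list(providers)

    def query(self, min_free_cpus: int) -> List[str]:
        """Return hosts with enough free CPUs, most free first."""
        records = [rec for provider in self._providers for rec in provider()]
        eligible = [r for r in records if r["free_cpus"] >= min_free_cpus]
        eligible.sort(key=lambda r: r["free_cpus"], reverse=True)
        return [str(r["host"]) for r in eligible]


index = IndexSketch([from_cluster_tool, from_queue_tool])
candidates = index.query(min_free_cpus=1)
```

A user or agent choosing where to run an application only needs the `query` call, regardless of how many monitoring tools sit underneath.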
29. Taverna
q A sample caGrid workflow
q caGrid Scavenger with semantic/metadata-based
caGrid service query
30. Sample Workflow with caDSR
q Scientific value
x To find all the UML packages
related to a given context
(‘caCore’).
x Not a real scientific
experiment.
q Simple.
q Important in caGrid.
q Steps
x Querying Project object.
x Do data transformation.
x Querying Packages object
and get the result.
[Figure legend: Workflow input; caGrid services; “Shim” services; Workflow output]
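The three steps above follow a query, transform, query shape. The Python sketch below mirrors that shape with in-memory stand-ins. The data, function names, and result formats are all hypothetical. It does not call caDSR or Taverna.

```python
from typing import Dict, List

# Hypothetical in-memory stand-ins for the two data sets being queried.
PROJECTS: Dict[str, List[str]] = {"caCore": ["project-1", "project-2"]}
PACKAGES: Dict[str, List[str]] = {
    "project-1": ["pkg.a", "pkg.b"],
    "project-2": ["pkg.c"],
}


def query_projects(context: str) -> Dict[str, List[str]]:
    """Step 1: query Project objects for a context."""
    return {"projects": PROJECTS.get(context, [])}


def shim_extract_ids(project_result: Dict[str, List[str]]) -> List[str]:
    """Step 2 ("shim"): reshape one service's output into the next one's input."""
    return list(project_result["projects"])


def query_packages(project_ids: List[str]) -> List[str]:
    """Step 3: query Package objects for each project."""
    return [pkg for pid in project_ids for pkg in PACKAGES.get(pid, [])]


def run_workflow(context: str) -> List[str]:
    """Workflow input -> service -> shim -> service -> workflow output."""
    return query_packages(shim_extract_ids(query_projects(context)))


packages = run_workflow("caCore")
```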
31. Protein sequence information query
q Scientific value
x To query protein sequence
information out of 3 caGrid
data services: caBIO, CPAS and
GridPIR.
x To analyze a protein sequence
from different data sources.
q Steps
x Querying CPAS to get the id,
name, and value of the sequence.
x Querying caBIO and GridPIR
using the id or name obtained
from CPAS.
32. Microarray clustering*
q Scientific value
x A common routine to group
genes or experiments into
clusters with similar profiles.
x To identify functional groups of
genes.
q Steps
x Querying and retrieving the
microarray data of interest from
a caArrayScrub data service at
Columbia University
x Preprocessing, or normalizing, the
microarray data using the
GenePattern analytical service
at the Broad Institute at MIT
x Running hierarchical clustering
using the geWorkbench
analytical service at Columbia
University
[Figure legend: Workflow in/output; caGrid services; “Shim” services; others]
*Wei Tan, Ravi Madduri, Kiran Keshav, Baris E. Suzek, Scott
Oster, Ian Foster. Orchestrating caGrid Services in Taverna.
ICWS 08.
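For readers unfamiliar with the normalize-then-cluster routine, here is a small pure-Python sketch. It is a local stand-in only: z-score normalization and average-linkage agglomerative clustering are assumed choices, the gene names and values are made up, and none of this reflects how the GenePattern or geWorkbench services are implemented.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev
from typing import Dict, List


def normalize(profile: List[float]) -> List[float]:
    """Z-score one expression profile (one possible preprocessing step)."""
    mu, sigma = mean(profile), pstdev(profile)
    return [(x - mu) / sigma for x in profile]


def distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hierarchical_clusters(profiles: Dict[str, List[float]], k: int) -> List[List[str]]:
    """Average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[name] for name in profiles]

    def linkage(c1: List[str], c2: List[str]) -> float:
        return mean(distance(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > k:
        # Merge the closest pair of clusters; i < j, so deleting j is safe.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Made-up expression values: two rising profiles and two falling ones.
raw = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [2.0, 4.0, 6.0],
    "geneC": [3.0, 2.0, 1.0],
    "geneD": [9.0, 5.0, 1.0],
}
normalized = {name: normalize(values) for name, values in raw.items()}
clusters = hierarchical_clusters(normalized, k=2)
```

After normalization, profiles with the same shape but different magnitudes fall into the same cluster, which is the point of preprocessing before clustering.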
33. Execution
[Figure: execution trace and execution result as XML; 1936 gene expressions]
34. Lymphoma type prediction
q Scientific value *
x Using gene-expression patterns
associated with DLBCL and FL to
predict the lymphoma type of an
unknown sample.
x Using SVM (Support Vector
Machine) to classify data, and
predicting the tumor types of
unknown examples.
q (Major) steps
x Querying training data from
experiments stored in caArray.
x Preprocessing, or normalizing, the
microarray data.
x Adding training and testing data
into SVM service to get
classification result.
*Fig. from MA Shipp. Diffuse large B-cell lymphoma outcome prediction by
gene-expression profiling and supervised machine learning. Nature medicine,
36. Lymphoma type prediction
q Result snippet
*Classification errors are highlighted.
Acknowledgement:
Juli Klemm, Xiaopeng Bian, Rashmi Srinivasa (NCI)
Jared Nedzel (MIT)
37. Lessons Learned
q Service abstraction not applicable to
everything
q Virtual Organization concepts still good
q Web services are one way to create service-
oriented architectures, but not always the
best way
q Keep the implementation agnostic of the
underlying tools
q True value in ability to create workflows
38. Service-Oriented Science
q People create services (data or functions) …
q which I discover (& decide whether to use) …
q & compose to create a new function ...
q & then publish as a new service.
q I find “someone else” to host services,
so I don’t have to become an expert in operating
services & computers!
q I hope that this “someone else” can
manage security, reliability, scalability, …
“Service-Oriented Science”, Science, 2005