The SBGrid Science Portal provides multi-modal access to computational infrastructure, data storage, and data analysis tools for the structural biology community. It incorporates features not previously seen in cyberinfrastructure science gateways. It enables researchers to securely share a computational study area, including large volumes of data and active computational workflows. A rich identity management system has been developed that simplifies federated access to US national cyberinfrastructure, distributed data storage, and high performance file transfer tools. It integrates components from the Virtual Data Toolkit, Condor, glideinWMS, the Globus Toolkit and Globus Online, the FreeIPA identity management system, Apache web server, and the Django web framework.
SBGrid Science Portal - eScience 2012
1. The SBGrid Science Portal: An integrated environment for protein structure studies
Ian Stokes-Rees, Harvard Medical School
eScience 2012, Chicago, October 2012
j.mp/esci12-sbgrid ijstokes@seas.harvard.edu
7. What’s interesting about another Science Portal?
✦ Interface modalities
• Web forms, RESTful interfaces, command line
✦ Access model
• Browser SSO, X.509, LDAP, .htaccess, GACL
✦ Identity management
• Streamlined grid account creation
✦ Computational capability
• local, cluster, and grid computing
✦ Data management
• Web (HTTP), scp, GridFTP, Globus Online
• Tiered staging of data
10. I’m still skeptical. What about Taverna, GridSphere, Galaxy, or HubZero?
✦ All great if
• the portal or application plugin already exists; and
• the application workflows closely match your requirements
✦ Not-so-great if
• you have to implement a new portal on top of one of those frameworks
• you want to adapt the workflow
• your data model changes
• you want to add a new application
• you want to explore the data in an unanticipated way
• command-line access is also important to you
• you are working with others
12. Outline
✦ Community
• Who the SBGrid Science Portal is meant to serve
✦ Objectives
• What was the vision for the Science Portal
✦ Implementation
• Software and service architectures
✦ Security, Collaboration, and IdM
• ... or “How I learned to stop worrying and love X.509”
✦ Data
• Tiered data distribution model
13. Community
[Figure: map of SBGrid member laboratories, labeled by principal investigator, at the following institutions: Washington U. School of Med., Cornell U., NE-CAT, Rosalind Franklin, NIH, UMass Medical, U. Washington, U. Maryland, Brandeis U., UC Davis, Tufts U., UCSF, Columbia U., Rockefeller U., Stanford, Yale U., CalTech, Harvard and Affiliates, Rice University, Vanderbilt Center for Structural Biology, WesternU, UCSD, Thomas Jefferson]
Not pictured: University of Toronto: L. Howell, E. Pai, F. Sicheri; NHRI (Taiwan): G. Liou; Trinity College, Dublin: Amir Khan
15. Structural Biology: Study of Protein Structure and Function
[Figure: scale labels 400m, 1mm, 10nm]
• Shared scientific data collection facility
• Data intensive (10-100 GB/day)
16. Consortium By The Numbers
✦ ~200 member labs
• representing about 1500 users
✦ ~200 software packages
• multi-platform (Linux, OS X)
• multi-version
✦ 4 FTE staff
✦ Automated software distribution
• 80 GB for full package
• rsync+ssh for updates
✦ Everything “Just Works”
• So labs are happy to renew membership and refer friends
17. Boston Life Sciences Hub
• Biomedical researchers
• Government agencies
• Life sciences
• Universities
• Hospitals
[Figure: map of the Boston area, including Tufts University School of Medicine]
29. Hug a Life Scientist!
✦ Let them know you care ...
• ... because the software we give them doesn’t
• ... and neither do the systems we subject them to
• ... but to be fair, a lot of the pain is self-inflicted
✦ SBGrid came into existence to fill the tech void/pain experienced by structural biologists
✦ Started with providing reliable compiled software
✦ Expanded into
• training events and workshops
• best practice guides
• shared computational infrastructure (clusters! OSG! Globus Online!)
• web-based collaborative computational and data services
30. Objectives
A. Extensible infrastructure to facilitate development and deployment of novel computational workflows
B. Web-accessible environment for collaborative, compute and data intensive science
39. Objectives (explained)
✦ Pareto Principle
• 80% of the time users are happy with a basic web form interface to a standard application workflow and canned result analysis
• 20% of the effort to address these routine cases
• Science Portals are a big win over cumbersome and complex Fortran code
✦ Corollary to Pareto Principle
• 20% of the time users want or need customized application workflow and/or result analysis
• 80% of the effort to make this possible
• But it is rare that anyone knows in advance whether they are on the 80 or the 20 side
50. My Experience and Perspective (Audience Participation)
✦ The really interesting stuff happens
• in the unpredictable 20%
✦ Innovative analytical strategies require
• an ability to rapidly adjust workflow and data analysis
✦ You’re stuffed
• if workflow and data are tightly coupled to portal framework
✦ Collaboration is critical:
• you need to be able to share your work (securely)
• the web is the obvious (only!) way anyone wants to do this
52. Front End Interface
✦ Django (Python) web framework
✦ Apache web server
✦ Per-user protected jobs and data
✦ WebDAV to data
✦ ssh access possible
✦ Richer access control in development
54. NoSQL hierarchical document store
✦ The SBGrid Portal’s leading workflow:
• 100,000 jobs
• 300,000 output files
• 20-100k CPU-hours
✦ Need a good way to store data
• flexible data format
• flexible analysis output
• fine grained, user-driven access control
• parallel access
• remote access
✦ high capacity non-relational hierarchical storage
• ????
57. Operating Systems are Pretty Good
✦ File systems work well
• organize data carefully (hierarchically)
• include meta-data (mod_cern_meta, file system)
• serve intelligently via multiple protocols (http, gridftp)
• leverage POSIX ownerships (user, group, other, r/w)
• leverage user, group, and volume quotas
• storage management and backups are easier
✦ Process management works well
• execute as the actual user, where possible
• setuid, su, ssh, suexec, and gsexec can all help with this
• process accounting is your friend! (pacct)
• leverage ulimit for process resource limits
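The file-system and process points above map onto a handful of standard-library calls. What follows is a minimal sketch, assuming a POSIX host and Python; the function names and the example directory are illustrative and are not taken from the portal's code.

```python
import os
import resource
import stat
import tempfile
from typing import Tuple


def make_group_shared_dir(path: str) -> int:
    """Create a study directory usable by owner and group, closed to others."""
    os.makedirs(path, exist_ok=True)
    # rwx for user and group, nothing for other; setgid asks that new
    # entries inherit the directory's group.
    os.chmod(path, stat.S_IRWXU | stat.S_IRWXG | stat.S_ISGID)
    return stat.S_IMODE(os.stat(path).st_mode)


def cap_cpu_seconds(seconds: int) -> Tuple[int, int]:
    """Set the soft CPU-time limit for this process, like `ulimit -t`."""
    _, hard = resource.getrlimit(resource.RLIMIT_CPU)
    # The soft limit may never exceed the hard limit.
    if hard == resource.RLIM_INFINITY:
        new_soft = seconds
    else:
        new_soft = min(seconds, hard)
    resource.setrlimit(resource.RLIMIT_CPU, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_CPU)


if __name__ == "__main__":
    # Hypothetical study directory under a scratch location.
    study = os.path.join(tempfile.mkdtemp(), "study")
    print(oct(make_group_shared_dir(study)))
    print(cap_cpu_seconds(3600))
```

Running a job as the actual user (the setuid, su, and suexec bullet) is deliberately left out of the sketch, since it needs privileges that a self-contained example should not assume.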
60. Open Science Grid
http://opensciencegrid.org
✦ US National Cyberinfrastructure
✦ Primarily used for high energy physics computing
✦ 80 sites
✦ O(1e5) job slots
✦ O(1e6) core-hours per day
✦ PB scale aggregate storage
[Figure annotations: 5,073,293 hours; ~570 years]
61. Service Architecture
[Figure: service architecture diagram]
SBGrid Science Portal @ Harvard Medical School
• monitoring: Ganglia, Nagios, RSV, pacct
• interfaces: Apache, GridSite, Django, Sage Math, R-Studio, shell CLI
• data: scp, GridFTP, SRM, WebDAV
• computation: Condor, Cycle Server, VDT, Globus, glideinWMS
• ID mgmt: FreeIPA, LDAP, VOMS, GUMS, GACL
• infrastructure: file server, SQL DB, cluster
External services
• Globus Online @Argonne
• UC San Diego: GUMS, GridFTP + Hadoop, glideinWMS factory
• MyProxy @NCSA, UIUC
• DOEGrids CA @Lawrence Berkeley Labs
• Gratia Accounting @FermiLab
• Monitoring @Indiana
• Open Science Grid: data, computations
68. SBGrid Portal: Current Status
✦ 262 users (lifetime), 72 active in past quarter
✦ 2.4 million hours on OSG last 12 months
✦ Seamless data sharing from web to ssh?
• requires NFSv4 to allow >12 POSIX groups/user
• suexec or gsexec possibility
✦ Account integration
• PAM (ssh/command line) + web through FreeIPA LDAP
• prototype of X.509 + VOMS + MyProxy (next section!)
✦ Collaboration
• shared secret (password)
• manual .htaccess or .gacl
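The "manual .htaccess" step is the kind of thing a small helper could generate. The sketch below assumes Apache Basic authentication against a password file, which fits the "shared secret (password)" bullet but is an assumption about the setup; the function name, realm, file path, and user names are all hypothetical.

```python
from typing import Iterable


def render_htaccess(realm: str, user_file: str, users: Iterable[str]) -> str:
    """Return .htaccess text restricting a directory to the named users."""
    names = sorted(set(users))  # de-duplicate and keep the output stable
    if not names:
        raise ValueError("at least one user is required")
    lines = [
        "AuthType Basic",
        f'AuthName "{realm}"',
        f"AuthUserFile {user_file}",
        "Require user " + " ".join(names),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical realm, password file, and collaborators.
    print(render_htaccess("Shared study", "/etc/portal/htpasswd",
                          ["rsmith", "alice"]))
```

Writing the returned text into the shared study directory would replace the hand-edited file; a .gacl equivalent is not sketched here.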
73. Big Picture
✦ Federated environment requires
• federated identity management
• trusted identity providers (“roots of trust”)
✦ Collaboration requires
• user-driven capacity to form cross-organization user groups (aka “Virtual Organizations”)
• roles (or at least privilege levels) within VO
✦ State of Play
• InCommon will get us part way there (waiting on adoption!)
• OpenID nice for users, but no trust or delegated perms
• X.509 process and details still tough for end user
• SSH keys lack standard root of trust and roles
74. X.509 Digital Certificates
✦ Analogy to a passport:
• Application form
• Sponsor’s attestation
• Consular services
• verification of application, sponsor, and accompanying identification and eligibility documents
• Passport issuing office
✦ Portable, digital passport
• fixed and secure user identifiers
• name, email, home institution
• signed by widely trusted issuer
• time limited
• ISO standard
75. X.509 Challenges
✦ Lots of “humans in the loop” to get usable cert
• Registration Agent, Sponsor, VO Manager, User
✦ Awkward working with X.509 certs
• multiple formats
• proxy certs and VOMS ACs
• proxy servers (MyProxy)
• expiry (of proxy, of base cert, of VO membership)
• browser integration and import process
• CA cert chain
• digital token needs to be available on all devices
• particularly challenging for phones and tablets
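The expiry bullet hides a small piece of logic: a credential is only usable until the earliest of its three lifetimes runs out. A minimal sketch of that rule follows; the class and field names are illustrative, and the dates in the usage example are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)
class CredentialExpiry:
    """The three independent lifetimes listed on the slide."""
    proxy: datetime
    base_cert: datetime
    vo_membership: datetime

    def limiting(self) -> Tuple[str, datetime]:
        """Return (name, expiry) of whichever lifetime ends first."""
        candidates = {
            "proxy": self.proxy,
            "base_cert": self.base_cert,
            "vo_membership": self.vo_membership,
        }
        name = min(candidates, key=candidates.get)
        return name, candidates[name]

    def is_usable(self, now: datetime) -> bool:
        """True while none of the three lifetimes has ended."""
        return now < self.limiting()[1]


if __name__ == "__main__":
    # Made-up dates for illustration only.
    cred = CredentialExpiry(
        proxy=datetime(2012, 10, 9, 12, tzinfo=timezone.utc),
        base_cert=datetime(2013, 6, 1, tzinfo=timezone.utc),
        vo_membership=datetime(2012, 12, 31, tzinfo=timezone.utc),
    )
    print(cred.limiting())
```

Reporting which lifetime is the limiting one matters in practice, because each is renewed through a different process.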
81. X.509 Nirvana (ours at least)
✦ User never sees X.509 anything
• unless they want to
✦ X.509 request + VO membership + account creation completed in one step by one person
• single step for user
• single step for one administrator
✦ Goodbye passphrases (and forgotten passphrases)
• hold private key in LDAP and use LDAP authentication to access
✦ Automate everything
• login (web or command line) triggers X.509 proxy request with (default) VOMS AC, and loading to MyProxy server
✦ VO Management System run by users
• Users need to be able to self-manage their (sub-) VOs
99. Data Tiers - Scoping
• VO-wide: all sites, admin managed, very stable
• User archive: single site, user managed, very stable, 10+ GB
• User project: all sites, user managed, 1-10 weeks, 1-3 GB
• User static: all sites, user managed, indefinite, 10 MB
• Job set: all sites, infrastructure managed, 1-10 days, 0.1-1 GB
• Job: direct to worker node, infrastructure managed, 1 day, <10 MB
• Job indirect: to worker node via UCSD, infrastructure managed, 1 day, <10 GB
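The tier list above is effectively a lookup table, and writing it down as data makes the scoping rules easy to query. The sketch below is one possible encoding, not the portal's implementation: the class, the field names, and the decision to model only the upper size bound (with 10 MB taken as 0.01 GB) are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataTier:
    name: str
    scope: str
    managed_by: str
    lifetime: str
    max_gb: Optional[float]  # None where the slide gives no upper bound


# Values transcribed from the slide; lower bounds (e.g. "10+ GB") are not modeled.
TIERS = [
    DataTier("VO-wide", "all sites", "admin", "very stable", None),
    DataTier("User archive", "single site", "user", "very stable", None),
    DataTier("User project", "all sites", "user", "1-10 weeks", 3.0),
    DataTier("User static", "all sites", "user", "indefinite", 0.01),
    DataTier("Job set", "all sites", "infrastructure", "1-10 days", 1.0),
    DataTier("Job", "direct to worker node", "infrastructure", "1 day", 0.01),
    DataTier("Job indirect", "to worker node via UCSD", "infrastructure",
             "1 day", 10.0),
]


def tiers_that_fit(size_gb: float, managed_by: str) -> List[str]:
    """Names of tiers with the given manager whose size cap admits size_gb."""
    return [
        t.name
        for t in TIERS
        if t.managed_by == managed_by
        and (t.max_gb is None or size_gb <= t.max_gb)
    ]


if __name__ == "__main__":
    print(tiers_that_fit(5.0, "infrastructure"))
```

A 5 GB job input, for example, only fits the indirect tier staged via UCSD, which matches the slide's split between direct (<10 MB) and indirect (<10 GB) job data.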
100. Data Movement
About 2PB with 40 front end servers for high bandwidth parallel file transfer
• scp (users)
• rsync (VO-wide)
• grid-ftp (UCSD)
• curl (WNs)
• cp (NFS)
• htcp (secure web)
• http(s) (web)
101. Globus Online: High Performance Reliable 3rd Party File Transfer
[Figure: Globus Online file transfer service connecting the portal cluster, data collection facility, lab file server, desktop, and laptop]
• GUMS: DN to user mapping
• Certificate Authority: root of trust
• VOMS: VO membership
• Globus Online: file transfer service
106. Ryan, a postdoc in the Frank Lab at Columbia
Access NRAMM facilities securely and transfer data back to home institute
[Figure: facility file server (/data/columbia/frank), SBGrid Science Portal, lab file server (/nfs/data/rsmith), desktop, laptop (/Users/Ryan)]
• Ryan applies for an account at the SBGrid Science Portal
• automated X.509 application
• verification of lab membership
• automated Globus Online application/X.509 linking (wish list!)
110. Ryan, a postdoc in the Frank Lab at Columbia
Access NRAMM facilities securely and transfer data back to home institute
[Figure: facility file server (/data/columbia/frank), SBGrid Science Portal, lab file server (/nfs/data/rsmith), desktop, laptop (/Users/Ryan)]
• request access to NRAMM facility using credential held by SBGrid
• check SBGrid for Ryan’s group membership in Frank Lab, so grant access to files
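The last step of the scenario, where the facility checks SBGrid for group membership before granting access to files, can be sketched as a simple authorization lookup. Everything below is hypothetical apart from the facility path shown on the slide: the user name `rsmith` is only inferred from the lab file server path, the group name is invented, and a real deployment would query the portal's identity service rather than an in-memory dictionary.

```python
from typing import Dict, Set

# Hypothetical snapshot of group membership held by the portal.
VO_GROUPS: Dict[str, Set[str]] = {"rsmith": {"columbia-frank"}}

# Hypothetical facility policy: directory prefix -> group allowed to read it.
PATH_GROUPS: Dict[str, str] = {"/data/columbia/frank": "columbia-frank"}


def may_read(user: str, path: str,
             vo_groups: Dict[str, Set[str]] = VO_GROUPS,
             path_groups: Dict[str, str] = PATH_GROUPS) -> bool:
    """Grant access only if the user is in the group that owns the path."""
    for prefix, group in path_groups.items():
        # Match the directory itself or anything beneath it, but not
        # sibling directories that merely share a name prefix.
        if path == prefix or path.startswith(prefix + "/"):
            return group in vo_groups.get(user, set())
    return False  # paths with no policy entry are denied by default


if __name__ == "__main__":
    print(may_read("rsmith", "/data/columbia/frank/session1"))
```

Denying by default for unknown paths and unknown users keeps the sketch on the safe side when the membership data is incomplete.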