This document provides an overview of distributed data management for the Large Hadron Collider (LHC) experiments at CERN. It discusses the worldwide computing grid that is used to store, process, and analyze the immense volumes of data produced by the LHC experiments each year. The grid consists of Tier 0, 1, and 2 computing centers around the world. It has enabled scientists from many collaborating institutions to work together on data from the LHC experiments.
3. CERN: 20 Member States
Founded in 1954: “Science for Peace”
Member States: Austria, Belgium, Bulgaria, the Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Italy, the Netherlands, Norway, Poland, Portugal, Slovakia, Spain, Sweden, Switzerland and the United Kingdom
Candidate for Accession: Romania
Associate Members in the Pre-Stage to Membership: Israel, Serbia
Applicant States: Cyprus, Slovenia, Turkey
Observers to Council: India, Japan, the Russian Federation, the United States of America, Turkey, the European Commission and UNESCO
~2300 staff, ~1050 other paid personnel, ~11000 users
Budget (2012): ~1000 MCHF
8. The Large Hadron Collider
• 27-kilometre ring
• proton collisions at 7+7 TeV
• 10,000 magnets
• 8,000 km of superconducting cable
• 120 t of liquid helium
The largest superconducting installation in the world
9. Precision! The 27 km long ring is sensitive to <1 mm changes:
• Tides
• Stray currents
• Rainfall
10. The ATLAS Cavern
• 140,000 m3 of rock removed
• 53,000 m3 of concrete
• 6,000 tons of steel reinforcement
• 55 metres long, 30 metres wide, 53 metres high (a 10-storey building)
11. A collision at the LHC, 26 June 2009
12.
13. The Data Acquisition for one Experiment
15. The LHC Computing Challenge
• Signal/Noise: 10^-13 (10^-9 offline)
• Data volume
  – High rate × large number of channels × 4 experiments
  → ~15 PetaBytes of new data each year (→ ~30 PB in 2012)
• Compute power
  – Event complexity × number of events × thousands of users
  → 200 k CPUs (→ 300 k CPUs) and 45 PB of disk storage (→ 170 PB)
• Worldwide analysis & funding
  – Computing funding locally in major regions & countries
  – Efficient analysis everywhere
  → GRID technology
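As a rough back-of-the-envelope cross-check of the “~15 PetaBytes per year” figure, the sketch below multiplies an assumed event rate by an assumed event size. The rate, event size and live fraction are illustrative guesses, not official experiment parameters.

```python
# Back-of-envelope estimate of annual LHC data volume.
# Event rate, event size and live fraction are illustrative
# assumptions, not official experiment parameters.

SECONDS_PER_YEAR = 3.15e7     # wall-clock seconds in a year
LIVE_FRACTION = 0.3           # assume data is taken ~30% of the time

events_per_second = 300       # assumed events written offline, per experiment
event_size_mb = 1.5           # assumed size of one recorded event, in MB
experiments = 4               # ALICE, ATLAS, CMS, LHCb

bytes_per_year = (events_per_second * event_size_mb * 1e6
                  * SECONDS_PER_YEAR * LIVE_FRACTION * experiments)
print(f"~{bytes_per_year / 1e15:.0f} PB of new data per year")
# With these assumptions the estimate lands close to the ~15 PB/year
# quoted on the slide.
```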
16. CERN Computer Centre
The CERN computer centre:
• Built in the 70s on the CERN site
• ~3000 m2 (across three machine rooms)
• 3.5 MW for equipment
A recent extension:
• Located at Wigner (Budapest, Hungary)
• ~1000 m2
• 2.7 MW for equipment
• Connected to CERN with 2×100 Gb links
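To put the 2×100 Gb links in perspective, here is a rough transfer-time estimate; it assumes the full nominal bandwidth is available for data, ignoring protocol overhead and competing traffic.

```python
# Idealised transfer-time estimate for the CERN-Wigner links above.
# Assumes the full nominal 2 x 100 Gb/s is usable, with no overhead.

link_gbps = 2 * 100                 # aggregate nominal bandwidth in Gb/s
petabyte_bits = 1e15 * 8            # one petabyte expressed in bits

hours_per_pb = petabyte_bits / (link_gbps * 1e9) / 3600
print(f"~{hours_per_pb:.0f} hours to move 1 PB")   # ~11 hours at full rate
```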
17. World Wide Grid – what and why?
• A distributed computing infrastructure to provide the production and analysis environments for the LHC experiments
• Managed and operated by a worldwide collaboration between the experiments and the participating computer centres
• The resources are distributed – for funding and sociological reasons
• Our task was to make use of the resources available to us – no matter where they are located
Tier-0 (CERN):
• Data recording
• Initial data reconstruction
• Data distribution
Tier-1 (11 centres):
• Permanent storage
• Re-processing
• Analysis
Tier-2 (~130 centres):
• Simulation
• End-user analysis
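The split of responsibilities across tiers can be summarised in a small lookup table. The toy sketch below is purely illustrative (the data structure and helper are invented for this document, not part of any WLCG software); only the tier names and roles come from the list above.

```python
# Toy model of the WLCG tier roles listed above. Illustrative only.

TIER_ROLES = {
    "Tier-0": {"data recording", "initial reconstruction", "data distribution"},
    "Tier-1": {"permanent storage", "re-processing", "analysis"},
    "Tier-2": {"simulation", "end-user analysis"},
}

def tiers_for(task: str) -> list[str]:
    """Return the tiers whose declared roles include the given task."""
    return [tier for tier, roles in TIER_ROLES.items() if task in roles]

print(tiers_for("simulation"))     # ['Tier-2']
print(tiers_for("re-processing"))  # ['Tier-1']
```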
18. CPU – around the Tiers
• The grid really works
• All sites, large and small, can contribute
  – And their contributions are needed!
[Pie chart: CPU delivered – January 2011, by site: CERN, BNL, CNAF, KIT, NL LHC/Tier-1, RAL, FNAL, CC-IN2P3, ASGC, PIC, NDGF, TRIUMF and the Tier-2s combined]
[Pie chart: Tier-2 CPU delivered by country – January 2011, with contributions from 33 countries, from the USA, UK, France, Germany and Italy down to Estonia, Brazil and Greece]
19. Evolution of capacity: CERN & WLCG
[Charts: WLCG CPU Growth and WLCG Disk Growth, 2008–2013, broken down by CERN, Tier-1 and Tier-2; CERN Computing Capacity, 2005–2013, also marking what we thought was needed at LHC start versus what we actually used at LHC start]
• 2013/14: modest increases to process “parked data”
• 2015 → budget limited?
  – experiments will push trigger rates
  – flat budgets give ~20%/year growth
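The “flat budgets give ~20%/year growth” remark reflects hardware price/performance improvements: a constant budget buys roughly 20% more capacity each year. A small illustration (the starting value and the years are arbitrary example values):

```python
# Compound growth under a flat budget, using the ~20%/year figure from
# the slide. Starting capacity and years are arbitrary example values.

capacity = 100.0           # installed capacity in arbitrary units
growth_per_year = 0.20     # ~20%/year from price/performance gains

for year in range(2013, 2018):
    print(f"{year}: {capacity:.0f}")
    capacity *= 1 + growth_per_year
# At this rate a flat budget roughly doubles capacity every 4 years
# (1.2**4 ~= 2.07).
```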
20. LHC Networking
• Relies on
  – OPN, GEANT, US-LHCNet
  – NRENs & other national & international providers
22. Physics Storage @ CERN: CASTOR and EOS
Storage systems developed at CERN. CASTOR and EOS use the same commodity disk servers:
• RAID-1 for CASTOR
  – 2 copies in the mirror
• JBOD with RAIN for EOS
  – Replicas spread over different disk servers
  – Tunable redundancy
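The difference between the two layouts is essentially where redundancy lives: inside one box (a RAID-1 mirror) versus across boxes (file replicas on different servers). The sketch below is a toy illustration of the replica-placement idea with tunable redundancy; the server names, file name and placement policy are invented, and it is not EOS code.

```python
import random

# Toy illustration of placing file replicas on distinct disk servers,
# as in the EOS description above. Not EOS code; names are invented.

DISK_SERVERS = [f"diskserver{i:02d}" for i in range(1, 11)]

def place_replicas(path: str, n_replicas: int = 2) -> list[str]:
    """Pick n_replicas distinct servers; losing one server still leaves
    n_replicas - 1 copies, and n_replicas is the tunable redundancy."""
    if n_replicas > len(DISK_SERVERS):
        raise ValueError("more replicas requested than servers available")
    return random.sample(DISK_SERVERS, n_replicas)

# Two replicas mimic a RAID-1-like default; hot files could ask for more.
print(place_replicas("/eos/demo/ntuple.root", n_replicas=3))
```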
23. CASTOR – Physics Data Archive
Data:
• ~90 PB of data on tape; 250 M files
• Up to 4.5 PB of new data per month
• Over 10 GB/s (read + write) peaks
Infrastructure:
• ~52,000 tapes (1 TB, 4 TB, 5 TB)
• 9 robotic libraries (IBM and Oracle)
• 80 production + 30 legacy tape drives
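A quick consistency check on these figures (all inputs are taken from the slide; the averages are just their ratios):

```python
# Averages implied by the archive figures above.
total_data_pb = 90        # ~90 PB on tape
tape_count = 52_000       # ~52,000 tapes of 1, 4 and 5 TB
files_millions = 250      # 250 M files

print(f"~{total_data_pb * 1000 / tape_count:.1f} TB per tape on average")            # ~1.7 TB
print(f"~{total_data_pb * 1e9 / (files_millions * 1e6):.0f} MB per file on average")  # ~360 MB
```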
24. EOS Usage at CERN Today
[Infographic: 44.8 PB, 32.1 PB, 136 (279) million files, 20.7 k]
26. CERN openlab in a nutshell
• A science-industry partnership to drive R&D and innovation, with over a decade of success
• Evaluate state-of-the-art technologies in a challenging environment and improve them
• Test in a research environment today what will be used in many business sectors tomorrow
• Train the next generation of engineers/employees
• Disseminate results and outreach to new audiences
27. Ongoing R&D: e.g. Cloud Storage
• CERN openlab joint project since Jan 2012
  – Testing scaling and TCO gains with prototype applications
• Huawei S3 storage appliance (0.8 PB)
  – logical replication
  – fail-in-place
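A minimal sketch of what exercising an S3-compatible appliance could look like from Python, assuming boto3 and an S3-compatible endpoint; the endpoint URL, bucket name and credentials below are placeholders, and the slide does not describe the actual prototype applications used in the openlab tests.

```python
import boto3

# Minimal S3 smoke test against an S3-compatible appliance.
# Endpoint, bucket and credentials are placeholders (assumptions).
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3-appliance.example.org",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

bucket = "openlab-scaling-test"          # hypothetical bucket
s3.create_bucket(Bucket=bucket)

# Write one object and read it back; a real scaling test would loop over
# many objects from many clients and measure aggregate throughput.
s3.put_object(Bucket=bucket, Key="sample/object-0001", Body=b"payload")
obj = s3.get_object(Bucket=bucket, Key="sample/object-0001")
print(obj["Body"].read())
```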
28. Thanks for your attention!
More at http://cern.ch
Accelerating Science and Innovation