Grid Computing at the Large Hadron Collider:
Massive Computing at the Limit of Scale, Space, Power and Budget
Dr Helge Meinhard, CERN, IT Department
SNW Frankfurt, 27 October 2010
CERN (1)
§ Conseil européen pour la recherche nucléaire – aka European Laboratory for Particle Physics
  § Facilities for fundamental research
§ Between Geneva and the Jura mountains, straddling the Swiss-French border
§ Founded in 1954
CERN (2)
§ 20 member states
§ ~3300 staff members, fellows, students, apprentices
§ 10’000 users registered (~7’000 on site)
  § from more than 550 institutes in more than 80 countries
§ 1026 MCHF (~790 MEUR) annual budget
§ http://cern.ch/
Physics at the LHC (1)
[Figure: the particles of the standard model – matter particles are the fundamental building blocks; force particles bind the matter particles]
Physics at the LHC (2)
§ Four known forces: strong force, weak force, electromagnetism, gravitation
§ Standard model unifies three of them
  § Verified to 0.1 percent level
  § Too many free parameters
    § E.g. particle masses
§ Higgs particle
  § Higgs condensate fills vacuum
  § Acts like ‘molasses’, slows other particles down, gives them mass
Physics at the LHC (3)
§ Open questions in particle physics:
  § Why do the parameters have the values we observe?
  § What gives the particles their masses?
  § How can gravity be integrated into a unified theory?
  § Why is there only matter and no anti-matter in the universe?
  § Are there more space-time dimensions than the 4 we know of?
  § What are dark energy and dark matter, which make up 98% of the universe?
§ Finding the Higgs and possible new physics with the LHC will give the answers!
The Large Hadron Collider (1)
§ Accelerator colliding protons with protons – 14 TeV collision energy
  § By far the world’s most powerful accelerator
§ Tunnel of 27 km circumference, 4 m diameter, 50…150 m below ground
§ Detectors at four collision points
The Large Hadron Collider (2)
§ Approved in 1994; first circulating beams on 10 September 2008
§ Protons are bent by superconducting magnets (8 Tesla, operating at 2 K = –271°C) all around the tunnel
§ Each beam: 3000 bunches of 100 billion protons each
§ Up to 40 million bunch collisions per second at the centre of each of the four detectors
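A quick back-of-the-envelope check of that collision rate, using only the round numbers quoted on these slides (the exact LHC parameters differ slightly):

    # Sanity check of the "40 million bunch collisions per second" figure,
    # using only the round numbers quoted on these slides.
    circumference_m = 27e3     # tunnel circumference, ~27 km
    c = 3.0e8                  # protons travel at essentially the speed of light (m/s)
    bunches_per_beam = 3000    # "3000 bunches of 100 billion protons each"

    turns_per_second = c / circumference_m            # ~11,000 revolutions per second
    crossings_per_second = turns_per_second * bunches_per_beam
    print(f"~{crossings_per_second:.1e} bunch crossings per second")  # ~3.3e7

This lands at the right order of magnitude; the design figure of 40 MHz corresponds to the nominal 25 ns bunch spacing.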
LHC Status and Future Plans
Date            Event
10-Sep-2008     First beam in LHC
19-Sep-2008     Leak when magnets ramped to full field for 7 TeV/beam
20-Nov-2009     First circulating beams since Sep-2008
30-Nov-2009     World record: 2 * 1.18 TeV, collisions soon after
19-Mar-2010     Another world record: 2 * 3.5 TeV
30-Mar-2010     First collisions at 2 * 3.5 TeV, special day for the press
26-Jul-2010     Experiments present first results at ICHEP conference
14-Oct-2010     Target luminosity for 2010 reached (10^32)
Until end 2011 Run at 2 * 3.5 TeV to collect 1 fb-1
2012            Shutdown to prepare machine for 2 * 7 TeV
2013 - …(?)     Run at 2 * 7 TeV
LHC Detectors (1)
[Photos of the ATLAS, CMS and LHCb detectors]
LHC Detectors (2)
3’000 physicists (including 1’000 students) from 173 institutes of 37 countries
LHC Data (1)
The accelerator generates 40 million bunch collisions (“events”) every second at the centre of each of the four experiments’ detectors
§ Per bunch collision, typically ~20 proton-proton interactions
§ Particles from the previous bunch collision are only 7.5 m away from the detector centre
LHC Data (2)
Reduced by online computers that filter out a few hundred “good” events per second, which are recorded on disk and magnetic tape at 100…1’000 Megabytes/sec
§ 1 event = a few Megabytes
§ 15 Petabytes per year for the four experiments – 15’000 Terabytes = 3 million DVDs
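The annual volume follows from simple arithmetic. A rough reconstruction, assuming ~10^7 seconds of effective data taking per year (a common round number in HEP planning, not stated on the slide) and illustrative values within the quoted ranges:

    # Rough reconstruction of the "15 Petabytes per year" figure.
    events_per_second = 250    # "a few hundred good events per second" (illustrative)
    event_size_mb = 1.5        # "1 event = a few Megabytes" (illustrative)
    seconds_per_year = 1e7     # assumed effective running time per year
    experiments = 4

    rate_mb_s = events_per_second * event_size_mb        # ~375 MB/s per experiment
    total_pb = rate_mb_s * seconds_per_year * experiments / 1e9
    print(f"~{rate_mb_s:.0f} MB/s per experiment, ~{total_pb:.0f} PB/year for all four")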
LHC Data (3)
Summary of Computing Resource Requirements, all experiments – 2008 (from LCG TDR – June 2005)

                        CERN   All Tier-1s   All Tier-2s   Total
CPU (MSPECint2000s)       25            56            61     142
Disk (Petabytes)           7            31            19      57
Tape (Petabytes)          18            35             –      53

30’000 CPU servers, 110’000 disks: too much for CERN!
[Pie charts of the shares – CPU: CERN 18%, All Tier-1s 39%, All Tier-2s 43%; Disk: CERN 12%, All Tier-1s 55%, All Tier-2s 33%; Tape: CERN 34%, All Tier-1s 66%]
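The pie-chart shares follow directly from the table above; a minimal check:

    # The pie-chart percentages follow directly from the table.
    requirements = {                   # 2008 requirements from the LCG TDR (June 2005)
        "CPU (MSI2k)": {"CERN": 25, "All Tier-1s": 56, "All Tier-2s": 61},
        "Disk (PB)":   {"CERN": 7,  "All Tier-1s": 31, "All Tier-2s": 19},
        "Tape (PB)":   {"CERN": 18, "All Tier-1s": 35},
    }
    for resource, by_tier in requirements.items():
        total = sum(by_tier.values())
        shares = ", ".join(f"{tier}: {value / total:.0%}" for tier, value in by_tier.items())
        print(f"{resource:12} total {total:3} -> {shares}")

This reproduces CPU 18%/39%/43%, Disk 12%/54%/33% and Tape 34%/66% (the original pie chart rounds the Tier-1 disk share up to 55%).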
Worldwide LHC Computing Grid (1)
§ Tier 0: CERN
  § Data acquisition and initial processing
  § Data distribution
  § Long-term curation
§ Tier 1: 11 major centres
  § Managed mass storage
  § Data-heavy analysis
  § Dedicated 10 Gbps lines to CERN
§ Tier 2: More than 200 centres in more than 30 countries
  § Simulation
  § End-user analysis
§ Tier 3: from physicists’ desktops to small workgroup clusters
  § Not covered by MoU
[Diagram: CERN Tier 0 linked to Tier 1 centres (Nordic Countries, UK, France, Spain, Italy, Netherlands, Germany, Taiwan, USA), which in turn serve Tier 2 centres – university and laboratory grids for regional groups or physics studies – and Tier 3 resources such as physics-department clusters and desktops]
Worldwide LHC Computing Grid (2)
§ Grid middleware for “seamless”
 integration of services
  § Aim: Looks like single huge compute facility
  § Projects: EDG/EGEE/EGI, OSG
  § Big step from proof of concept to stable,
    large-scale production
§ Centres are autonomous, but lots of
 commonalities
  § Commodity hardware (e.g. x86 processors)
  § Linux (RedHat Enterprise Linux variant)
CERN Computer Centre
Functions:
  § WLCG: Tier 0,
    some T1/T2
  § Support for smaller
    experiments at
    CERN
  § Infrastructure for
    the laboratory
  § …
Requirements and Boundaries (1)
§ High Energy Physics applications require mostly integer processor performance
§ Large amount of processing power and storage needed for aggregate performance
  § No need for parallelism / low-latency high-speed interconnects
  § Can use large numbers of components with performance below optimum level (“coarse-grain parallelism” – see the sketch after this list)
§ Infrastructure (building, electricity, cooling) is a concern
  § Refurbished two machine rooms (1500 + 1200 m2) for total air-cooled power consumption of 2.5 MW
  § Will run out of power in about 2014…
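A minimal sketch of what “coarse-grain parallelism” means here (illustrative only, not CERN code): events are mutually independent, so throughput comes from many ordinary single-threaded jobs running side by side, with no communication between them.

    # Illustrative coarse-grain parallelism: independent "events" processed
    # by a pool of ordinary single-threaded workers; no inter-job
    # communication or low-latency interconnect is needed.
    from multiprocessing import Pool

    def reconstruct(event):
        # stand-in for a CPU-bound, integer-heavy reconstruction step
        return sum(hit * hit for hit in event) % 997

    if __name__ == "__main__":
        events = [[i, i + 1, i + 2] for i in range(100_000)]   # toy events
        with Pool() as pool:                                   # one worker per core
            results = pool.map(reconstruct, events, chunksize=1000)
        print(len(results), "events processed independently")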
Requirements and Boundaries (2)
§ Major boundary condition: cost
  § Getting maximum resources with fixed budget…
  § … then dealing with cuts to the “fixed” budget
§ Only choice: commodity equipment as far as possible, minimising TCO / performance
  § This is not always the solution with the cheapest investment cost!
[Photo: equipment purchased in 2004, now retired]
The Bulk Resources – Event Data
§ Permanent storage on tape
§ Disk as temporary buffer
§ Data paths: tape ↔ disk, disk ↔ CPU
[Simplified network topology: tape servers, disk servers and CPU servers attached via 10GigE routers to an Ethernet backbone of multiple 10GigE links]
CERN CC currently (September 2010)
§ 8’500 systems, 54’000 processing cores
  § CPU servers, disk servers, infrastructure servers
§ 49’900 TB raw on 58’500 disk drives
§ 25’000 TB used, 50’000 tape cartridges total (70’000 slots), 160 tape drives
§ Tenders in progress or planned (estimates)
  § 800 systems, 11’000 processing cores
  § 16’000 TB raw on 8’500 disk drives
Disk Servers for Bulk Storage (1)
§ Target: temporary event data storage
  § More than 95% of disk storage capacity
§ Best TCO / performance: Integrated PC server
  § One or two x86 processors, 8…16 GB, PCI RAID card(s)
  § 16…24 hot-swappable 7’200 rpm SATA disks in server chassis
  § Gigabit or 10Gig Ethernet
  § Linux (of course)
§ Adjudication based on total usable capacity with constraints
§ Power consumption taken into account
§ Systems procured recently: depending on specs, 5…20 TB usable (see the sketch below)
  § Looking at software RAID, external iSCSI disk enclosures
§ Home-made optimised protocol (rfcp) and HSM software (Castor)
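As an illustration of how such chassis end up at 5…20 TB usable, assuming RAID-6 plus a hot spare and drive sizes typical of the period (assumptions, not tender specifications):

    # Illustrative capacity arithmetic (assumed drive sizes and RAID-6 layout).
    def usable_tb(drives, drive_tb, parity_drives=2, hot_spares=1):
        """Usable capacity after subtracting parity drives and hot spares."""
        return (drives - parity_drives - hot_spares) * drive_tb

    print(usable_tb(16, 0.5))   # ~6.5 TB  (16 x 500 GB drives)
    print(usable_tb(24, 1.0))   # ~21 TB   (24 x 1 TB drives)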
Disk Servers for Bulk Storage (2)
Disk Servers for Bulk Storage (3)
Other Disk-based Storage
§ For dedicated applications (not physics
 bulk data):
  § SAN/FC storage
  § NAS storage
  § iSCSI storage
§ Total represents well below 5% of disk
  capacity
§ Consolidation project ongoing
Procurement Guidelines
§ Qualify companies to participate in calls for
  tender
  § A-brands and their resellers
  § Highly qualified assemblers/integrators
§ Specify performance rather than box counts
  § Some constraints on choices for solution
  § Leave detailed system design to bidder
§ Decide based on TCO
  § Purchase price
  § Box count, network connections
  § Total power consumption
The Power Challenge – why bother?
§ Infrastructure limitations
  § E.g. CERN: 2.5 MW for IT equipment
    § Need to fit maximum capacity into given power envelope
§ Electricity costs money
  § Costs likely to rise (steeply) over the next few years
§ IT is responsible for a significant fraction of world energy consumption
  § Server farms in 2008: 1…2% of the world’s energy consumption (annual growth rate: 16…23%)
    § CERN’s data centre is 0.1 per mille of this…
  § Responsibility towards mankind demands using the energy as efficiently as possible
§ Saving a few percent of energy consumption makes a big difference
CERN’s Approach
§ Don’t look in detail at PSU, fans, CPUs, chipset,
  RAM, disk drives, VRMs, RAID controllers, …
§ Rather: Measure apparent (VA) power consumption
  in primary AC circuit
    § CPU servers: 80% full load, 20% idle
    § Storage and infrastructure servers: 50% full load, 50% idle
§ Add element reflecting power consumption to
  purchase price
§ Adjudicate on the sum of purchase price and power
  adjudication element
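A minimal sketch of that adjudication rule. The load/idle weights are the ones given above; the lifetime and electricity price are illustrative assumptions:

    # Sketch of the power adjudication element (weights from the slide;
    # lifetime and electricity price are assumptions for illustration).
    def adjudicated_price(purchase_eur, p_full_va, p_idle_va, weight_full,
                          years=4.0, eur_per_kwh=0.10):
        weighted_kva = (weight_full * p_full_va + (1 - weight_full) * p_idle_va) / 1000
        power_element = weighted_kva * years * 8766 * eur_per_kwh   # 8766 h per year
        return purchase_eur + power_element

    # CPU server weighted 80% full load / 20% idle:
    print(adjudicated_price(2000.0, 350.0, 200.0, weight_full=0.8))  # ~3123 EUR
    # Storage/infrastructure server weighted 50/50:
    print(adjudicated_price(3000.0, 500.0, 300.0, weight_full=0.5))  # ~4403 EUR

Bids are then compared on this sum, so a cheaper box that burns more power can lose to a slightly more expensive but more efficient one.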
Power Efficiency: Lessons Learned
§ CPU servers: power efficiency increased
  by a factor of 12 in a little over four years
§ Need to benchmark concrete servers
  § Generic statements on platform are void
§ Fostering energy-efficient solutions
  makes a difference
§ Power supplies feeding more than one
  system usually more power-efficient
§ Redundant power supplies are inefficient
Future (1)
§ Is IT growth sustainable?
  § Demands continue to rise exponentially
  § Even if Moore’s law continues to apply, data
    centres will need to grow in number and size
  § IT already consuming 2% of world’s energy –
    where do we go?
  § How to handle growing demands within a
    given data centre?
     § Demands evolve very rapidly, technologies less
      so, infrastructure even at a slower pace – how to
      best match these three?
Future (2)
§ IT: Ecosystem of
  § Hardware
  § OS software and tools
  § Applications
§ Evolving at different paces: hardware
 fastest, applications slowest
  § How to make sure at any given time that
    they match reasonably well?
Future (3)
§ Example: single-core to multi-core to
 many-core
  § Most HEP applications currently single-
    threaded
  § Consider server with two quad-core CPUs as eight independent execution units
    § Model does not scale much further (see the arithmetic below)
  § Need to adapt applications to many-core machines
    § Large, long effort
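The scaling limit is easy to see with illustrative numbers (the per-job memory footprint below is an assumption, not from the slides):

    # Why "one independent single-threaded job per core" stops scaling:
    # memory per box grows linearly with the core count.
    mem_per_job_gb = 2.0    # assumed footprint of one reconstruction job
    for cores in (8, 24, 64):
        print(f"{cores:3d} cores -> {cores * mem_per_job_gb:.0f} GB RAM "
              f"if every core runs its own full process")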
Summary
§ The Large Hadron Collider (LHC) and its experiments form a
  very data- (and compute-) intensive project
§ LHC has triggered or pushed new technologies
   § E.g. Grid middleware, WANs
§ High-end or bleeding-edge technology not needed
  everywhere
   § That’s why we can benefit from the cost advantages of
     commodity hardware
§ Scaling computing to the requirements of LHC is hard
  work
§ IT power consumption/efficiency is a paramount concern
§ We are steadily taking collision data at 2 * 3.5 TeV, and
  have the capacity in place for dealing with this
§ We are on track for further ramp-ups of the computing
  capacity for future requirements
Thank you
Summary of Computing Resource Requirements
All experiments – 2008
From LCG TDR – June 2005
                        CERN   All Tier-1s   All Tier-2s   Total
CPU (MSPECint2000s)       25            56            61     142
Disk (Petabytes)           7            31            19      57
Tape (Petabytes)          18            35             –      53
BACKUP SLIDES
CPU Servers (1)
§ Simple, stripped down, “HPC like” boxes
  § No fast low-latency interconnects
§ EM64T or AMD64 processors (usually 2), 2 or 3 GB/core, 1 disk/processor
§ Open to multiple systems per enclosure
§ Adjudication based on total performance (SPECcpu2006 – all_cpp subset)
§ Power consumption taken into account
§ Linux (of course)
CPU Servers (2)
Tape Infrastructure (1)
§ 15 Petabytes per year
  § … and in 10 or 15 years’ time physicists will
    want to go back to 2010 data!
§ Requirements for permanent storage:
  §   Large capacity
  §   Sufficient bandwidth
  §   Proven for long-term data curation
  §   Cost-effective
§ Solution: High-end tape infrastructure
Tape Infrastructure (2)
Mass Storage System (1)
§ Interoperation challenge locally at CERN
  § 100+ tape drives
  § 1’000+ RAID volumes on disk servers
  § 10’000+ processing slots on worker nodes
§ HSM required
§ Commercial options carefully considered
  and rejected: OSM, HPSS
§ CERN development: CASTOR (CERN
  Advanced Storage Manager)
 http://cern.ch/castor
Mass Storage System (2)
§ Key CASTOR features
  § Database-centric layered architecture
    § Stateless agents; can restart easily on error
    § No direct connection from users to critical services
  § Scheduled access to I/O (illustrated in the sketch below)
    § No overloading of disk servers
      § Per-server limit set according to type of transfer
        § Servers can support many random-access-style reads, but only a few sustained data transfers
    § I/O requests can be scheduled according to priority
      § Fair-share access to I/O, just as for CPU
      § Prioritise requests from privileged users
§ Performance and stability proven at the level required for Tier 0 operation
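A minimal sketch (illustrative only, not CASTOR code) of the per-server scheduling idea: each disk server offers a limited number of slots per transfer type, so a few sustained streams cannot starve the many random-access readers.

    # Illustrative per-disk-server transfer slots, as described above.
    import threading

    class DiskServerSlots:
        def __init__(self, random_slots=20, streaming_slots=2):
            # many random-access slots, few sustained-transfer slots
            self.slots = {"random": threading.Semaphore(random_slots),
                          "streaming": threading.Semaphore(streaming_slots)}

        def run(self, kind, transfer):
            with self.slots[kind]:      # blocks until a slot of this type is free
                transfer()

    server = DiskServerSlots()
    server.run("streaming", lambda: print("tape recall streamed to disk"))
    server.run("random", lambda: print("analysis job reading a few blocks"))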
Box Management (1)
§ Many thousand boxes
  § Hardware management (install, repair, move,
    retire)
  § Software installation
  § Configuration
  § Monitoring and exception handling
  § State management
§ 2001…2002: Review of available packages
  § Commercial: Full Linux support rare, insufficient reduction in staff effort to justify licence fees
  § Open Source: Lacked features considered essential, didn’t scale to the required level
Box Management (2)
§ ELFms (http://cern.ch/ELFms)
  § CERN development in collaboration with
    many HEP sites and in the context of the
    European DataGrid (EDG) project
§ Components:
  § Quattor: installation and configuration
  § Lemon: monitoring and corrective actions
  § Leaf: workflow and state management
Box Management (3): ELFms Overview
[Diagram: the node sits at the centre; Quattor provides configuration and node management, Lemon provides performance and exception monitoring, and Leaf provides logistical management on top]
Box Management (4): Quattor
[Architecture diagram: a configuration server hosts the configuration database CDB with SQL and XML backends; administrators feed it via CLI, GUI and scripts over SOAP, and nodes fetch XML configuration profiles over HTTP. SW server(s) hold a software repository of RPMs/PKGs and the base OS; an install server drives installation over HTTP/PXE. On each managed node, the Node Configuration Manager (NCM) runs configuration components (CompA, CompB, CompC) against local services, and the SW Package Manager (SPMA) installs packages.]
Used by 18 organisations besides CERN, including two distributed implementations with 5 and 18 sites.
Box Management (5): Lemon
[Architecture diagram: on each node, a monitoring agent runs sensors and forwards samples over TCP/UDP to the monitoring repository, which stores them in an SQL backend. Correlation engines query the repository via SOAP. An RRDTool/PHP layer behind apache serves plots over HTTP to web browsers; users can also query the repository with the Lemon CLI from their workstations.]
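A minimal sketch of the sensor → agent → repository flow in the diagram (illustrative only; the hostname, port and wire format here are made up, not Lemon’s real protocol):

    # Illustrative sensor -> agent -> repository push over UDP.
    import json, socket, time

    def read_load_sensor():
        with open("/proc/loadavg") as f:      # one-minute load average (Linux)
            return float(f.read().split()[0])

    def push_sample(repo=("lemon-repo.example.org", 9999)):   # hypothetical endpoint
        sample = {"node": socket.gethostname(), "metric": "loadavg",
                  "value": read_load_sensor(), "ts": time.time()}
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.sendto(json.dumps(sample).encode(), repo)        # fire-and-forget

    # push_sample()  # would send one sample to the repository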
Box Management (6): Lemon
§ Apart from node parameters, non-node
 parameters are monitored as well
  § Power, temperatures, …
  § Higher-level views of Castor, batch queues
    on worker nodes etc.
§ Complemented by user view of service
 availability: Service Level Status
Box Management (7): Leaf
§ HMS (Hardware management system)
  § Track systems through lifecycle
  § Automatic ticket creation
  § GUI to physically find systems by host name
§ SMS (State management system)
  § Automatic handling and tracking of high-level
    configuration steps
     § E.g. reconfigure, drain and reboot all cluster
      nodes for a new kernel version
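A sketch of the kind of high-level operation SMS automates, using the kernel-upgrade example above (the function names are hypothetical, not the Leaf API):

    # Hypothetical sketch of an SMS-style rolling drain/reboot workflow.
    def rolling_kernel_upgrade(nodes, drain, reboot, enable, batch=50):
        for i in range(0, len(nodes), batch):   # never take the whole cluster down
            group = nodes[i:i + batch]
            for node in group:
                drain(node)     # stop scheduling new jobs onto the node
            for node in group:
                reboot(node)    # reboot into the new kernel once jobs have finished
            for node in group:
                enable(node)    # put the node back into production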
Box Management (8): Status
§ Many thousands of boxes managed
  successfully by ELFms, both at CERN and
  elsewhere, despite decreasing staff levels
§ No indication of problems scaling up further
§ Changes being applied wherever necessary
  § E.g. support for virtual machines
§ Large-scale farm operation remains a
  challenge
  § Purchasing, hardware failures, …
