SlideShare uma empresa Scribd logo
1 de 33
Baixar para ler offline
The EventView
Analysis Framework



              Akira Shibata, QMUL
               Amir Farbin, CERN
               Kyle Cranmer, BNL

                                    1
Outline
          1.   Event Data Model and Analysis Model
               • Where we are.
               • Current issues.
          2.   What is EventView?
               • The EventView EDM class.
               • EventView framework.
               • PhysicsView packages.
          3.   Where is EventView?
               • What is the scope of EventView in PAT.
               • The scope of EventView in DA.
               • Current use cases.
          4.   Future developments
               • Future developments in EDM
               • Future developments in PANDA


                                                          2
Our Goal
• Is to do physics analysis to the level that we
  can publish our result.
• Great deal of effort went in. Lots of trial and
  error and it is still evolving very fast.
  (Computing Service Commissioning, CSC is
  one such place.)
• There are thousands of people trying to
  analyse huge amount of data and we need
  the right environment to do this.
• Physics analysis tools and distributed analysis
  are trying to solve the problems.

                                                    3
Acronyms
•   Athena: The Atlas Software (algorithms in C++, steering with Python). Used for Monte Carlo,
    Simulation, Recosntruction, and Analysis. Also used by the High Level Trigger.
•   RAW: Events as output by the Event Filter for reconstruction. The computing model assumes
    it is only at Tier0 site (1.6MB/event at rate of 200Hz).
•   CBNT: “ComBined NTuple” summarizes output of reconstruction. Read into Root, can’t be
    read in Athena. Not part of the computing model.
•   ESD: Event Summary Data, is the output of Reconstruction can be read into Athena. Only to
    be stored at Tier1 sites. (500kB/event).
•   AOD: Analysis Object Data, distributed to every one and meant for analysis, can be readin
    Athena(100kB/event).
•   DPD: Derived Physics Data, included in computing model, something analogous to an ntuple.
•   Tags: Event-level metadata used to quickly select “interesting events” (1kB/event).




This talk focuses mainly on DPD production from AOD.


                                                                                                  4
tier 0                    RAW

                              Production


                                ESD
  tier 1                                                                      EV Code
                                    Production



                                       AOD



                                                        Slim
                                                        AOD
                                           Tag
  tier 2 Athena Code
                              Athena Dump


                                                                 Event View




                           Private Dump                                 Private
               CBNT                               EV Ntuple
                               Ntuple                                Analysis Ntuple
home (DPD)
             impossible     no standard           standardized        personalized
                          waste of resource      common prod.


                                    Less Analysis in ROOT


                                                                                        5
Part 1
Event Data Model and Analysis Model




                                      6
! object usedused by reconstruction algorithm to carry state
         ! object by reconstruction algorithm to carry state
           between various tools
   Event Data Model
      between various tools                        History of our data format and
        Evolution: MigrationCBNT dumper to be dumped into
         Evolution of information toCBNTto POOL dumped into
         ! carry all    Workflow to the the dumper to history of our analysis.
    ! carry all information                             be
    Evolution “flat” of Workflow
          Evolution ntuple
           a of Workflow
      a “flat”Athena.exe
               ntuple                         Root     or paw/hbook
  Ancient History
    DC1
            Athena.exe
     Athena.exe                                                                  Root
                                                                             Athena         Rootpaw/hbook
                                                                                             or     or paw/hbook                                                                                                                  ROOT




                                                                                                                                                                                                                                               Signal significance
                                                                                                                                                                                                                                                                                                                               H 5 !!
                                                                                                                                                                                                                                                                                        $ L dt = 30 fb!1                       ttH (H 5 bb)
                                                                                                                                                                                                                                                                                        (no K!factors)                         H 5 ZZ
                                                                                                                                                                                                                                                                                                                                        (*)
                                                                                                                                                                                                                                                                                                                                              5 4l
                                                                                                                                                                                                                                                                                                                                          (*)
                                                                                                                                                                                                                                                                                        ATLAS                                  H 5 WW          5 l"l"
                                                                                                                                                                                                                                                                         10 2                                                                        (*)



                                                                                                        Event
         DC1


                                                                                                                                                                                                                                                                                                                               qqH 5         qq WW


                               ¯                                                                                                                                   Hadronic Top Mass                                                                                                                                           qqH 5         qq ##
                                                                                                                                                                                                                                                               ttbar (MC NLO)

                               b
 Before we used POOL to save Reconstructionthe input Event reconstruction
                              files,                         to
                                                                                                                                                                                                                                                                                                                                         Total significance




                                                                                                                                                                                                    Signal significance
                                                                                                                                                                                                                                                                                            H 5 !!
                                                                                                                                                                                                                                              $ L dt = 30 fb!1




                                                                                                                      Signal significance



                                                                                                                                                               Entry
                                                                                                                                                                                                  H 5 !!                                                                                    ttH (H 5 bb)
                                                                                                                                                                                     !1
                                                                                                                                                                       $ L dt = 30 fb             ttH (H 5 bb)         (no K!factors)
                                                                                                                                                                                                                           W + Jets (Alpgen)                                                H 5 ZZ
                                                                                                                                                                                                                                                                                                     (*)
                                                                                                                                                                                                                                                                                                           5 4l


                                                            Selection
                                                                                                                                                                                                                                                                                                       (*)
                                                                                                                                                                       (no K!factors)             H 5 ZZ
                                                                                                                                                                                                           (*)
                                                                                                                                                                                                                 5 4 l ATLAS                                                                H 5 WW          5 l"l"
                                                                                                                                                                       300

                                   ¯
                                                                                                                                                                                                  H 5 WW 2 5 l"l"
                                                                                                                                                                                                             (*)                                                                                                      (*)




                       Geant3
                                                                                                                                                                       ATLAS                              10
        DC1


                                                                                                                                                                                                                                                                                            qqH 5        qq WW

                                   q            Event
                                                                                                                                                      10 2                                                                              (*)                                                 qqH 5        qq ##
 DC1




                                                                                                                                                                                                  qqH 5                       qq WW
                                                                                                                                                                                                  qqH 5                       qq ##                                                                  Total significance
                                                                                                                                                                       250                                                                                               10
                                                                                                                                                                                                                          Total significance


              q
 was a ZEBRA file, and the EDM (egamma, etc.) had two purposes:
                                                          Analysis
                                               Selection Selection
                                                                  Reconstruction
                                                        Reconstruction                                                                                                 200




                Geant3 Geant3
                     W+       Reconstruction
                                   l+                                                                                                                                  150                                                  10


         t
                                                                                                                                                      10

              ν                                                                                                                                                        100
                                                                                                                                                                                                                                                                          1
                                                                                                                                                                                                                                                                                 100           120               140            160            180
                                                                                                                                                                                                                                                                                                                                                  mH (GeV/c )
                                                                                                                                                                                                                                                                                                                                                              200
                                                                                                                                                                                                                                                                                                                                                                    2



                                                                                                                                                                        50




        object used by reconstruction algorithm to carry state
                               b
    !
                                                                                                                                                                                                                              1
                                                                                                                                                                                                                                        100      120                              140          160         180                 200
                                                                                                                                                       1                 0 120                                                                                                                                                       2
                                                                                                                                                                100       0    50         140
                                                                                                                                                                                            100   160
                                                                                                                                                                                                  150                           180
                                                                                                                                                                                                                               200        250200 300                            350      400                      mH (GeV/c )
                                                                                                                                                                                                                                                                     2
                                                                                                                                                                                                                                    mH (GeV/c )                                    GeV




        between various tools
                                          root               Zebra                  CBNT                                                                                                                                                                                                                                                               PostSc
    ! carry all root
                  information to the CBNT dumper to be dumped into
                            Zebra    CBNTRoot
                                         root               Zebra
                                                             Zebra                 CBNT
                                                                                      CBNT                                                                                                                                              PostScriptPostScript
                                                                                                                                                                                                                                        PS

     Evolution of Workflow
      a “flat” ntuple
                                            or paw/hbook or PyRoot from athena −i −i
                                                   Athena or PyRoot ROOTathena
                                                     Root             from
          DC2/Rome/CSC
                Athena.py
          Athena.exe                                                               Root
   Much Athena.py current situation is related to
     of our bour
             of                                    Root this historical
Much Athena.py ¯current situation is related to this historical
                                            Root or PyRoot from athena −i
    situation.




                                                                                                                                                                                                                                               Signal significance
                                                                                                                                                                   Hadronic TopHMass




                                                                                                                                Signal significance
                                                                                                                                                                                                                                                               ttbar (MC NLO) $ L dt = 30 fb!1                                 H 5 !!




situation.
                                                                                                                                                                                5 !!                                                                                                                                           ttH (H 5 bb)
                                                                                                                                                                       $ L dt = 30 fb!1           ttH (H 5 bb)                                                                          (no K!factors)                                  (*)
                                                                                                                                                                                                                                                                                                                               H 5 ZZ         5 4l
                                                                                                                                                                       (no K!factors)             H 5 ZZ
                                                                                                                                                                                                           (*)
                                                                                                                                                                                                                 5 4l




                                                                                                                                                               Entry
                                                                                                                                                                                                                                                                                                                                          (*)




                                                                                                                                                                                                    Signal significance
                                                                                                                                                                                                                                                                                        ATLAS                                  H 5 WW          5 ll

                                   ¯
                                                                                                                                                                                                             (*)
                                                                                                                                                                       ATLAS                      H 5 WW          5 ll                           W + Jets (Alpgen) H 5 ! !
                                                                                                                                                                                                                                              $ L dt =10 2fb!1




                                                                                                                      Signal significance
                                                                                                                                                                                                                                                                                                                                                     (*)
                                                                                                                                                                                                                                                       30
                                   q                                                       Event  Tag
                                                                                                                                                      10   2                                                                            (*)                          ttH (H 5 bb)                                              qqH 5         qq WW
  DC1




                                                                                                                                                                                                  H 5 5 ! qq WW
                                                                                                                                                                                                  qqH !
                                                                                                                                                                       300dt = 30 fb!1
                                                                                                                                                                       $L                                                                     (no K!factors)                  (*)                                              qqH 5         qq ##

                                                                                            AOD                     Event
         DC2




                                                                                                                                                                                                  ttH (H5 5 qq ##
                                                                                                                                                                                                  qqH         bb)                                                    H 5 ZZ       5 4l
                                                                                                                                                                                                                                                                                                           (*)
                                                                                                                                                                       (no K!factors)             H 5 ZZ
                                                                                                                                                                                                            (*)
                                                                                                                                                                                                                  5 4 l ATLAS                                                               H 5       WW          5     ll             Total significance
                                                                                                                                                                                                        Total significance
                                                                                                                                                                                                  H 5 WW 2 5 ll
                                                                                                                                                                                                              (*)                                                                                                     (*)
                                                                                                                                                                       ATLAS                               10                                                                               qqH 5        qq WW

                                   q                                                      AOD  Tag Analysis Event
                                                                                                                                                      10   2           250                                                              (*)                                                 qqH 5        qq ##
         DC2




                                                                                                                                                                                                  qqH 5                       qq WW


                                                        Reconstruction          AOD  Tag Selection    Event
                                                                                                                                                                                                  qqH 5                       qq ##                                                                  Total significance
 DC2




                                   l+                               Reconstruction
                                                                     Reconstruction           Builders            Selection
                                                                                                                                                                                                                          Total significance



                                        Geant3 Geant4
                                                                                                                                                                       200
                         +
                     W                                            Reconstruction
                 t
                                                                                                                                                      10                                                                                                                 10

                                   ν                    Reconstruction            Builders Builders Selection Selection
                                                                                                                                                                       150



                                        Geant4 Geant4
                                                                                                                                                                                                                            10




Kyle Cranmer Cranmer (BNL)
        Kyle (BNL)                                  Analysis Model Workshop, October October 26, 2006
                                                             Analysis Model Workshop, 26, 2006
                                                                                                                                                      10



                                                                                                                                                                                                                                                                                                                                                                    4
                                                                                                                                                                       100



                        b                                                                                                                              1
                                                                                                                                                                100
                                                                                                                                                                        50
                                                                                                                                                                              120         140     160                             180          200                        1
                                                                                                                                                                                                                                                                                 100           120               140            160            180            200
                                                                                                                                                                                                                                                                     2
                                                                                                                                                                                                                                    mH (GeV/c )                                                                                                                     2
                                                                                                                                                                         0                                                    1200
                                                                                                                                                                                                                                                                                                                                                  mH (GeV/c )
                                                                                                                                                                          0         50      100    150                                250
                                                                                                                                                                                                                                    100      300
                                                                                                                                                                                                                                             120                                350
                                                                                                                                                                                                                                                                                  140    400 160           180                 200
                                                                                                                                                       1                                                                                                                                                                             2
                                                                                                                                                                100           120         140     160                           180      200                                       GeV                            mH (GeV/c )
                                                                                                                                                                                                                                                                     2
                                                                                                                                                                                                                                    mH (GeV/c )




                                         EVNT                  Hits                    ESD/
                                                   Zebra SimulDigi CBNT                                    Ntuple             PS
                                                                                                                            PostScript
                             root        Pool
                                           pool SimulDigi
                                                             Pool
                                                        SimulDigi ESD              ESD
                                                                                       AOD               AOD RootTuple RootTuple
                                                                                                       AOD           RootTuple              PostSc
                              pool         pool                                    ESD        AOD                            PostScriptPostScript
                                                   pool      pool
                                                            pool                              Tags       Tags
                                                                                                       Tags


Much Athena.py current situation is relatedRoot or PyRoot from athena −i
     of our                                 to this historical                                                                                                                                                                                                                                                                                                          7
•   CBNT
    • Flat ntuple with all information from reconstruction.
    • No common look and feel.
    • No official mechanism to share activity after reconstruction.
•   AOD/ESD is nice!
    • It has common interface to all objects.
    • With it, we can do analysis using Athena.
    • We need to publish and share analysis code for cross check and
      reproduction of results.
•   But why Athena?
    • All of our reconstruction alg is in Athena.
    • We’ll need to be able to re-run clustering, vertexing, track
      extrapolation, calibration etc on AOD.
•   How about Root
    • We always need to use Root for our final analysis.
    • Athena is good for event-level processing but not sample level.
•   Two stage analysis
    • We cannot do everything in one place. Some in Athena, rest in Root.
    • Support flexibility in analysis. But what to do where?
                                                                            8
Where to do Analysis?
                             Root lovers / Athena purists
                           physicists / computing specialists


1.   Pre-physics analysis: reconstruction.


2.   Baseline analysis: preselection  overlap removal, further processing such
     as jet re-calibration etc.


3.   Event level analysis: object reconstruction, event variable calculation etc.


4.   Sample level analysis: plot histogram, fitting templates, toy MC etc.



 Athena Central                   Athena Analysis on            Root Analysis
   Production                      Distributed Ana               Local or DA

                                                                                    9
Two-Stage Analysis                                                               A global view

                                                Feedback / port analysis to Athena

              Athena Analysis
           Development on Lxplus                                                                Root Analysis
                                                                                                  at Home




 AOD                   Athena          Ntuple                                          Ntuple                    Histogram




                                                      DPD Production
                                                         on DA                                                               Publish
                                                                                 Monitor and retrieve
                            DA Submission
           Feed back
                                                                                                Backnavigation
                  Re-Submission
 AOD                                            AOD         Ath          Ntupl
                                                                                                            DA Submission
analysis




                              Scope of this workshop:
                       Athena analysis and distributed analysis
                                                                                                                                       10
•   Athena analysis should offer:
    •  The official place for common analysis.
    •  Should be easy enough for physicists to use!
    •  Simple mechanism for sharing code - modular analysis.
    •  Should be more than an ntuple dumper, need to do
       “analysis”. (e.g. tune PID, re-calibrate)
•   AOD object definition is loose (needs to be):
    •  It has overlapping objects (e.g. electrons and jets).
    •  It has loose object selection (e.g. no isolation cut).
    •  It has multiple algorithms (e.g. Staco/Muid muons).
•   First step in analysis: Define your “view” of events:
    •  Do this by constructing EventView.
    •  Further baseline and event-level analysis using EventView
       tools and modules.
•   Produce Derived Physics Data, DPD (ntuple) for Root analysis:
    •  Private ntuple.
    •  Group standard DPD.
                                                                    11
Part 2
What is EventView?




                     12
What is EventView?
 EventView is a container which holds objects (or links to them) needed in physics analysis.




                                                                                               13
EventView Modular Analysis Framework
                                                   Calibrate               Reconst.
                         AOD                                                                  AOD     Insert
                                                   Electron                W and nu
         Preselect       BJet                                  Calculate                      Truth   Truth
  AOD                            AOD                              Mt
         Remove                        Calibrate
  Elec               Associate   MET                                                  Match
         Overlap                        MET
                       B-Jet                                                          Truth

  AOD
  Muon




Once we have defined our container, now analysis can be divided into pieces.
             Each step of analysis divided into separate tools.
     Each tool is made independent and general as much as possible.
         Analysis information is propagated through EventView.
   Each tool receives EventView and execute something on EventView.
          Gradually build up EventView with more information.

                                                                                                               14
EVToolLooper
                                                        Components of EV
   EVTool (Inserter)                                         Framework
               EV



                                                        •
          EVTool                        EVObjTool
        (ObjLooper)
                                Obj
                                      (ObjCalculator)       EVToolLooper (or just looper)
               EV
                                                            • Schedule all EV components.
Tool                 EVModule
                                                        •   EVTools (or just Tools)
                                                            •
               Obj
       EV              Tool
                                                              Smallest component in EV.
            Tool
                                                            • Written in C++.
               EV
                                                            • Mainly EV tool and EV obj tool.
                                                            •
                                      Module
              Tool              Obj    Tool                   Other types of tools too.
                                                        •
                                       Tool

               EV                                           EVModules (or just Modules)
              Tool                                          • Set of tools and configurations.
                                                            • Written in Python.
              DPD                                           • Can be EV level or obj level.

                                                                                                15
Extending the Framework
•   It is crucial that the framework is easily extendable.
    •   EventView framework is a collection of tools, extendibility is its
        natural feature.
•   Minimize work needed to add components.
    •   Subdivide usefulness of the tools into different types.
    •   Interface and functionality inherited through class hierarchy.
•   Skeleton code generated for typical type of tools.

                                     Generate skeleton




                                                     Just implement this function


                                                                                    16
AlgTool
                                         Object Oriented Design
                                                                                  Minimize effort in development

                                       EventViewBaseTool
           bool MultipleOutput
           bool MultipleInOut
           virtual StatusCode execute(EventView*)
           virtual StatusCode executeMO(EventView* EventViewContainer )
           virtual StatusCode executeMIMO(EventViewContainer , EventViewContainer )




                                          EVUDCalcBase
      EVUDEVCalcBase                                                              EVUDObjCalcBase
                                    bool m_EVUDTool
   executeEV(EventView*,                                              initializeTools()
                                    string m_prefix
   prefix, postfix)                                                     execute(*ev, *UD, *part, prefix, postfix)
                                    string m_postfix
                                                                      executeDummy(*ev, *UD, prefix, postfix)
                                    execute(ev, previx, postfix)



                                                                                                         T
     EVUDObjLooperBase                                                         EVUDObjCalcBaseT
  m_Labels                                                                 executeObj(*ev, *UD, *part,
  m_requireLabels                                                          prefix, postfix)
  m_rejectLabels                              A type of tools which
  copyUDtoEV
                                              calculates object level
                                                     variables
                                                                       EVUDElectronAll
EVUDFinalStateLooper
                                                                                 EVUDMuonAll
           EVUDInferredObjectLooper
                                                                                         EVUDParticleJetAll




                                                                                                                   17
Standard EventView Tool/Module Library
   •   EventViewBuilder (Container package)
       • EventViewBuilderUtils: Base classes of all tools
       • EventViewBuilderAlgs: Tool loopers
       • EventViewConfiguration: High level configuration interface
       • EventViewInserters: Tools for selection and overlap removal
       • EventViewDumpers: Output to screen, ntuple, and atlantis
       • EventViewUserData: Calculators and associators
       • EventViewTrigger: Tools for reading trigger info
       • EventViewPerformance: Performance study modules
       • EventViewCombiners: Combinatorics tools
       • EventViewSelectors: Event selection tools

   List growing thanks to contributions from people
      but we need more people to work on them.
                                                                       18
EVToolLooper
                                                                                               Multiple View (Branch)    An example EV analysis
                                  Reco




                                                                                                             •
                                  Branch Tool


Electron Branch                Tau Branch                   Tau Branch 2                                         One can define multiple
 Electron Inserter                Tau Inserter                Tau Inserter
                                                                                                                 views of an event:
  Muon Inserter               Electron Inserter             Electron Inserter

                                                                                                                 •  e.g. use different jet in
                                                                                                                    each view.
   Tau Inserter                Muon Inserter                 Muon Inserter

                                                                                             EVToolLooper




                                                                                                                 •
   Jet Inserter                   Jet Inserter                Jet Inserter                      Truth


                                                                                                                    e.g. use different object
    (cone04)                       (cone04)                    (cone07)

                                                                              Vtx


                                                                                                                    selection in each view.
                                                                             Filter         Truth Inserter




                                                                                                             •
                                                                                             Truth
                                                                                                                 Same analysis executed on
                                                 Analysis
                  Combinatorics                                                             Analysis


                  Selection
                                                                                                                 each EventView.
                                                                                                             •
                                                                                  NtupleDumper - Truth

                                                                                                                 Each view can talk to each
                                         KinematicCalc
            TruthAssociator



                                                                                                                 other:
                                                                                                                 •  compare views.
              NtupleDumper - Electron Branch
                                                                                Ntuple Output
                                                                     Trees


                                                                                                                 •
                  NtupleDumper - Tau Branch

                                                                                                                    order views.
                                                                                                Truth

                  NtupleDumper - Tau Branch 2



                                                                                                                 •
                                                                      Electron        Tau        Tau2

                                                                                                                    match views.
                              End

                                                                                                             •   Each view saved into separate
                                                                                                                 Root trees in ntuple.
                                                                                                                                                  19
Multiple View (Combinatorics)                  Another example analysis
                                         EVSort
                              Unsorted                                        Sorted
                                                                Comparator
EventView
Container                                                 EV0       EV1          EV0
                                 EV0         EV0
                                                                               (ex EV2)

                Combination                                                      EV1
    EventView                    EV1         EV1   Sort   EV1        EV2
                   Tool                                                        (ex EV1)

                                                                                 EV2
                                 EV2         EV2
                                                                               (ex EV0)




•      Multiple EV produced from combinatorics:
     •   e.g. take all combinations of dijets.
     •   e.g. combine one W and a b-jet to obtain top quark.
•      Manipulation of multiple EV through Multi Input (MI) tools:
     •   e.g. sort EV according to the pt of reconstructed top.
     •   e.g. remove EVs not satisfying condition.
•      Each view saved to separate tree or all views in one tree.

                                                                                           20
DPD Production
 •   EV analysis can easily produce custom ntuple with desired amount
     of information.
 •   It is not just copying information but a lot more needs to be done:
     •   Selection of objects
     •   Slimming contents
     •   Adding derived variables
     •   Calibration / correction
     •   you name it.
 •   Various physics groups started making their own ntuple:
     •   They are based on the same tools but the format is slightly
         different.
     •   Need common package to provide interface for producing DPD.
         HighPtView.
     •   Provide same look and feel but still support difference in
         contents.
     •   Central production of default ntuples. Good starting point and
         reference point.
                                                                           21
•   HighPtView implements a set
     HighPtView                                                                          of configurations to
                                                                                         manipulate tools/modules in
                                                                                         EventViewBuilder.
EventView
EVBuilder                                                                            •   Provides common interface
  EventView           EventView          EventView
                                                                  Performance
                                                                                         for DPD building for all
 Performance         Configuration
                                                                                         PhysicsView packages.
                                         Combiners
                                                                     Study



                                                                                     •
  EventView                              EventView                       Jet
  BuilderUtils
                      Package
                      EventView
                                         UserData
                                                                   Electron/Photon       Default physics contents
                                                                        Muon
                                                                                         (selection cuts) come from
 EventView                               EventView
 BuilderAlgs                             Dumpers
                                                                        Tau
  EventView
Transformation
                      EventView
                      Selectors
                                         EventView
                                          Inserters
                                                                   Flavour Tagging
                                                                                         performance study.
                                      Tools/
                                                      Feedback                       •   Physics groups may have their
                                                                                         own “view” packages (eg
                                    Interface



                     Tools/             HighPtView
                                                                                         TopView) for optimization
                                                                   Feedback
                                                                                         (override configuration) and
                   Interface



                                                Optimization
                                                                                         extra functionality (eg tools
                 GroupView                                                               for analysis).
                 SUSYView            TopView
                                                        HiggsTo
                                                        TauTau
                                                                     HiggsTo
                                                                     GamGam          •   Group view packages will
                                                                                         look similar inheriting
                                                                                         interface from HighPtView.
                                                                                                                         22
Part 3
Where is EventView?




                      23
People Involved
•   Core Developers:                     This is a VERY incomplete list of people.
    • Kyle Cranmer (BNL)               There are lots of other people contributing!

    • Amir Farbin (CERN)
    • Akira Shibata (QMUL)
•   Package Developers:
    • Attila Krasznahorkay (EventViewTrigger)
    • Jamie Boyd (EventViewPerformance)
    • Lorenzo Feligioni (Btagging)
    • Liza Mijovic (Development for release 13)
•   GroupArea management:
    • Akira (Lxplus), Fabien Tarrade (BNL), Sandrine Laplace (Lyon)
•   We have a lot of interaction with
    • physics groups and performance groups
    • distributed analysis and Athena core developers.


                                                                                      24
Where is EventView?
                                                EventViewTools
                               EventView
                                Trigger                                                   HiggsTo            HiggsTo
                                                                                          Gamma              TauTau
                                                        EventViewCore                     Gamma

                      EventView                                                                                        TopView
                       Inserters
                                                                                                    HighPt                                     Physics
                                                            EventView                                View                                      Groups
                                              EventView     BuilderAlgs
Performance
  Groups         EventView                    BuilderAlgs
                Performance                                                                                       SUSY
                                                                      EventView                                   View                          Physics
                                                                     Configuration                                                              Validation

                EventView
              Transformation                           Event                                                                     Distributed
                                                                                                                                  Analysis
                                                       View
                      EventView
                      UserData                                                                                     Central
                                                                                                                 Production
                                                                            EventView
                                                                            Combiners
                                           EventView
                                            Dumpers           EventView
                                                              Selectors
                                                                                                          Athena
                                                                                                           Core
                                                                                                        Developers
                  Analysis
                   EDM


                                                                                    Physics
                                                                                    Analysis
                                                                                     Tools




                                                                                                                                                            25
•   Interaction with various groups is VERY important to maintain a
    functional framework:
    •  Performance groups: Maintenance of relevant tools and input of
       physics contents (object selection, common performance study.)
    •  EDM: Compatibility of the tools to developing EDM. Optimal
       DPD format for the future
    •  Physics analysis tools: tool development and maintenance.
    •  Core developers: support for new feature request and help on
       core development.
    •  Central production: production of HighPtView ntuple.
    •  Distributed analysis: support for EV analysis.
    •  Physics validation: validation on standard DPD and RTT on
       Physics View packages.
    •  Physics groups: development of group tools and modules in
       PhysicsVies packages.
•   These things are just beginning to happen. More input and
    collaboration is required for:
    •  Development and maintenance of PhysicsView packages.
    •  Standardised performance study and selection tools.
                                                                        26
DA and EventView
EventView has a long experience with PANDA distributed framework because
           It is highly integrated with data management tool (dq2).
                    It supports GroupArea and configurables.
      It runs athena jobs seamlessly and no book keeping is necessary.

     1. Local test with sample fetched with DQ2.
      athena TopView_Top_jobOptions.py

     2. Set up grid/DQ2.
     source GridSetupLxplus.sh


     3. Send job (replace “athena” by “pathena”):
     pathena TopView_Top_jobOptions.py 
     --inDS csc11.005200.T1_McAtNlo_Jimmy.recon.AOD.v11004103 
     --outDS user.top_wg.TopView21.T1_McAtNlo_Jimmy.001


                                                                       27
TopView Use Case
              1.   Develop analysis based on EventView tools under TopView
                   runtime package: Write C++ tools, write python modules
  Athena           and write job options.
 Analysis
Developmnt
              2.   Run tests on lxplus with test samples fetched with DQ2
                   end user tool.
              3.   Commit changes to CVS and tag.
                   (iterate 1 and 2 until I’m happy; use HN if I need help)
Athena Ana.   4.   Send PANDA job specifying input/output dataset and a few
Submission         other parameters.
  on DA       5.   Check status on the web.
                   (if job failed, report Savannah bug and iterate 4 and 5)
              6.   Fetch results (AANTuple) using DQ2 end user tool and
Post-Athena        analyze with ROOT locally.
  Analysis
              7.   Go back to the AOD/ESD using the AANT and do further
                   analysis. (in future).
                                                                              28
Two-stage Iterative Analysis Process                                                                                   Again!
 1st stage: Define common optimal
preparation for a physics group. Includes
                                                                            2nd stage:     Interactive analysis in ROOT
                                                                             with quick turnaround. Use TMVA, Roofit
   selection, calibration, b-tagging. Use
                                                                                and other tools outside framework.
    modular analysis framework and
                                                                             Produce final plots to be published. Back-
feedback/tools from performance study.
                                                                                 navigate through DA if necessary.
 Parameterise using common tools and
          python configuration.
                                                 Feedback / port analysis to Athena

               Athena Analysis
            Development on Lxplus                                                                Root Analysis
                                                                                                   at Home




   AOD                  Athena          Ntuple                                          Ntuple                    Histogram




                                                       DPD Production
                                                          on DA                                                               Publish
                                                                                  Monitor and retrieve
                             DA Submission
            Feed back
                                                                                                 Backnavigation
                   Re-Submission
  AOD                                            AOD         Ath          Ntupl
                                                                                                             DA Submission
 analysis




                                     Group-wise central production on DA system.
                                                                                                                                        29
Part 4
Future Developments




                      30
•   Faster AOD access:              EDM in Release 13
    •   Factor of ~10 in I/O speed expected thanks to transient
        persistent separation.
    •   Broadens the use case of Athena analysis.
•   AOD access from root or structured DPD:
    •   Different technology now under test: pAOD, SAN, and new
        Scott’s recipe.
    •   Broadens the use case of Root analysis and lowers the wall
        between Athena and Root: higher portability of analysis code.
    •   We will still need common framework just like before!
        EventView will write out structured format.
•   ESD/AOD merger
    •   ESD / AOD classes are now merged: identical interface for ESD
        and AOD analysis.
    •   Can use ESD based tools on AOD objects: e.g. particle
        identification.
                                                                        31
Future developments in PANDA

•   Root analysis support on PANDA

    •   Using PROOF or otherwise

•   Support for more sites with improvements in operation

    •   BNL has been fully operational for a long time.

    •   Added lcg sites: ANALY_CERN, ANALY_LYON,
        ANALY_RAL, ANALY_FZK, ANALY_CNAF,
        ANALY_PIC, ANALY_SARA




                                                            32
Tutorial?

Let’s see how this all works.




                                33

Mais conteúdo relacionado

Destaque

人工知能をビジネスに活かす
人工知能をビジネスに活かす人工知能をビジネスに活かす
人工知能をビジネスに活かすAkira Shibata
 
PyData Tokyo Tutorial & Hackathon #1
PyData Tokyo Tutorial & Hackathon #1PyData Tokyo Tutorial & Hackathon #1
PyData Tokyo Tutorial & Hackathon #1Akira Shibata
 
The LHC Explained by CNN
The LHC Explained by CNNThe LHC Explained by CNN
The LHC Explained by CNNAkira Shibata
 
Top Cross Section Measurement
Top Cross Section MeasurementTop Cross Section Measurement
Top Cross Section MeasurementAkira Shibata
 
LHCにおける素粒子ビッグデータの解析とROOTライブラリ(Big Data Analysis at LHC and ROOT)
LHCにおける素粒子ビッグデータの解析とROOTライブラリ(Big Data Analysis at LHC and ROOT)LHCにおける素粒子ビッグデータの解析とROOTライブラリ(Big Data Analysis at LHC and ROOT)
LHCにおける素粒子ビッグデータの解析とROOTライブラリ(Big Data Analysis at LHC and ROOT)Akira Shibata
 
リクルートにおけるデータのインフラ化への取組
リクルートにおけるデータのインフラ化への取組リクルートにおけるデータのインフラ化への取組
リクルートにおけるデータのインフラ化への取組Recruit Technologies
 
DataRobot活用状況@リクルートテクノロジーズ
DataRobot活用状況@リクルートテクノロジーズDataRobot活用状況@リクルートテクノロジーズ
DataRobot活用状況@リクルートテクノロジーズRecruit Technologies
 

Destaque (9)

人工知能をビジネスに活かす
人工知能をビジネスに活かす人工知能をビジネスに活かす
人工知能をビジネスに活かす
 
PyData Tokyo Tutorial & Hackathon #1
PyData Tokyo Tutorial & Hackathon #1PyData Tokyo Tutorial & Hackathon #1
PyData Tokyo Tutorial & Hackathon #1
 
The LHC Explained by CNN
The LHC Explained by CNNThe LHC Explained by CNN
The LHC Explained by CNN
 
Top Cross Section Measurement
Top Cross Section MeasurementTop Cross Section Measurement
Top Cross Section Measurement
 
LHC for Students
LHC for StudentsLHC for Students
LHC for Students
 
LHCにおける素粒子ビッグデータの解析とROOTライブラリ(Big Data Analysis at LHC and ROOT)
LHCにおける素粒子ビッグデータの解析とROOTライブラリ(Big Data Analysis at LHC and ROOT)LHCにおける素粒子ビッグデータの解析とROOTライブラリ(Big Data Analysis at LHC and ROOT)
LHCにおける素粒子ビッグデータの解析とROOTライブラリ(Big Data Analysis at LHC and ROOT)
 
20150128 cross2015
20150128 cross201520150128 cross2015
20150128 cross2015
 
リクルートにおけるデータのインフラ化への取組
リクルートにおけるデータのインフラ化への取組リクルートにおけるデータのインフラ化への取組
リクルートにおけるデータのインフラ化への取組
 
DataRobot活用状況@リクルートテクノロジーズ
DataRobot活用状況@リクルートテクノロジーズDataRobot活用状況@リクルートテクノロジーズ
DataRobot活用状況@リクルートテクノロジーズ
 

Semelhante a Analysis Software Development

Analysis Software Benchmark
Analysis Software BenchmarkAnalysis Software Benchmark
Analysis Software BenchmarkAkira Shibata
 
Benchmarker - A Good Friend for Performance
Benchmarker - A Good Friend for PerformanceBenchmarker - A Good Friend for Performance
Benchmarker - A Good Friend for Performancekwatch
 
Ncm2010 ruo ando
Ncm2010 ruo andoNcm2010 ruo ando
Ncm2010 ruo andoRuo Ando
 
[F6]sooin lang vlab
[F6]sooin lang vlab[F6]sooin lang vlab
[F6]sooin lang vlabNAVER D2
 
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and Hive
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and HiveSDEC2011 Replacing legacy Telco DB/DW to Hadoop and Hive
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and HiveKorea Sdec
 
Foreman presentation at NYC puppet users
Foreman presentation at NYC puppet usersForeman presentation at NYC puppet users
Foreman presentation at NYC puppet usersohadlevy
 
NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud
NERD meets NIF:  Lifting NLP Extraction Results to the Linked Data CloudNERD meets NIF:  Lifting NLP Extraction Results to the Linked Data Cloud
NERD meets NIF: Lifting NLP Extraction Results to the Linked Data CloudGiuseppe Rizzo
 
PA Develops an LTE PHY for Catapult
PA Develops an LTE PHY for CatapultPA Develops an LTE PHY for Catapult
PA Develops an LTE PHY for Catapultgrahambell
 
Machine Learning with Mahout
Machine Learning with MahoutMachine Learning with Mahout
Machine Learning with Mahoutbigdatasyd
 
High-Performance Computing Needs Machine Learning... And Vice Versa (NIPS 201...
High-Performance Computing Needs Machine Learning... And Vice Versa (NIPS 201...High-Performance Computing Needs Machine Learning... And Vice Versa (NIPS 201...
High-Performance Computing Needs Machine Learning... And Vice Versa (NIPS 201...npinto
 
Inside Python [OSCON 2012]
Inside Python [OSCON 2012]Inside Python [OSCON 2012]
Inside Python [OSCON 2012]Tom Lee
 
Dataiku pig - hive - cascading
Dataiku   pig - hive - cascadingDataiku   pig - hive - cascading
Dataiku pig - hive - cascadingDataiku
 
WORKS 11 Presentation
WORKS 11 PresentationWORKS 11 Presentation
WORKS 11 Presentationdgarijo
 
Easydd program3
Easydd program3Easydd program3
Easydd program3Taha Sochi
 
産総研におけるプライベートクラウドへの取り組み
産総研におけるプライベートクラウドへの取り組み産総研におけるプライベートクラウドへの取り組み
産総研におけるプライベートクラウドへの取り組みRyousei Takano
 
RSJ2011 OSS Robotics and Tools OpenHRI Intro
RSJ2011 OSS Robotics and Tools OpenHRI IntroRSJ2011 OSS Robotics and Tools OpenHRI Intro
RSJ2011 OSS Robotics and Tools OpenHRI IntroYosuke Matsusaka
 
Debs Presentation 2009 July62009
Debs Presentation 2009 July62009Debs Presentation 2009 July62009
Debs Presentation 2009 July62009Opher Etzion
 
Jpylyzer, a validation and feature extraction tool developed in SCAPE project
Jpylyzer, a validation and feature extraction tool developed in SCAPE projectJpylyzer, a validation and feature extraction tool developed in SCAPE project
Jpylyzer, a validation and feature extraction tool developed in SCAPE projectSCAPE Project
 

Semelhante a Analysis Software Development (20)

Analysis Software Benchmark
Analysis Software BenchmarkAnalysis Software Benchmark
Analysis Software Benchmark
 
Benchmarker - A Good Friend for Performance
Benchmarker - A Good Friend for PerformanceBenchmarker - A Good Friend for Performance
Benchmarker - A Good Friend for Performance
 
Ncm2010 ruo ando
Ncm2010 ruo andoNcm2010 ruo ando
Ncm2010 ruo ando
 
[F6]sooin lang vlab
[F6]sooin lang vlab[F6]sooin lang vlab
[F6]sooin lang vlab
 
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and Hive
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and HiveSDEC2011 Replacing legacy Telco DB/DW to Hadoop and Hive
SDEC2011 Replacing legacy Telco DB/DW to Hadoop and Hive
 
Foreman presentation at NYC puppet users
Foreman presentation at NYC puppet usersForeman presentation at NYC puppet users
Foreman presentation at NYC puppet users
 
NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud
NERD meets NIF:  Lifting NLP Extraction Results to the Linked Data CloudNERD meets NIF:  Lifting NLP Extraction Results to the Linked Data Cloud
NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud
 
The NERD project
The NERD projectThe NERD project
The NERD project
 
PA Develops an LTE PHY for Catapult
PA Develops an LTE PHY for CatapultPA Develops an LTE PHY for Catapult
PA Develops an LTE PHY for Catapult
 
Machine Learning with Mahout
Machine Learning with MahoutMachine Learning with Mahout
Machine Learning with Mahout
 
High-Performance Computing Needs Machine Learning... And Vice Versa (NIPS 201...
High-Performance Computing Needs Machine Learning... And Vice Versa (NIPS 201...High-Performance Computing Needs Machine Learning... And Vice Versa (NIPS 201...
High-Performance Computing Needs Machine Learning... And Vice Versa (NIPS 201...
 
Inside Python
Inside PythonInside Python
Inside Python
 
Inside Python [OSCON 2012]
Inside Python [OSCON 2012]Inside Python [OSCON 2012]
Inside Python [OSCON 2012]
 
Dataiku pig - hive - cascading
Dataiku   pig - hive - cascadingDataiku   pig - hive - cascading
Dataiku pig - hive - cascading
 
WORKS 11 Presentation
WORKS 11 PresentationWORKS 11 Presentation
WORKS 11 Presentation
 
Easydd program3
Easydd program3Easydd program3
Easydd program3
 
産総研におけるプライベートクラウドへの取り組み
産総研におけるプライベートクラウドへの取り組み産総研におけるプライベートクラウドへの取り組み
産総研におけるプライベートクラウドへの取り組み
 
RSJ2011 OSS Robotics and Tools OpenHRI Intro
RSJ2011 OSS Robotics and Tools OpenHRI IntroRSJ2011 OSS Robotics and Tools OpenHRI Intro
RSJ2011 OSS Robotics and Tools OpenHRI Intro
 
Debs Presentation 2009 July62009
Debs Presentation 2009 July62009Debs Presentation 2009 July62009
Debs Presentation 2009 July62009
 
Jpylyzer, a validation and feature extraction tool developed in SCAPE project
Jpylyzer, a validation and feature extraction tool developed in SCAPE projectJpylyzer, a validation and feature extraction tool developed in SCAPE project
Jpylyzer, a validation and feature extraction tool developed in SCAPE project
 

Mais de Akira Shibata

大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん
大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん
大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さんAkira Shibata
 
W&B monthly meetup#7 Intro.pdf
W&B monthly meetup#7 Intro.pdfW&B monthly meetup#7 Intro.pdf
W&B monthly meetup#7 Intro.pdfAkira Shibata
 
20230705 - Optuna Integration (to share).pdf
20230705 - Optuna Integration (to share).pdf20230705 - Optuna Integration (to share).pdf
20230705 - Optuna Integration (to share).pdfAkira Shibata
 
W&B Seminar #5(to share).pdf
W&B Seminar #5(to share).pdfW&B Seminar #5(to share).pdf
W&B Seminar #5(to share).pdfAkira Shibata
 
makoto shing (stability ai) - image model fine-tuning - wandb_event_230525.pdf
makoto shing (stability ai) - image model fine-tuning - wandb_event_230525.pdfmakoto shing (stability ai) - image model fine-tuning - wandb_event_230525.pdf
makoto shing (stability ai) - image model fine-tuning - wandb_event_230525.pdfAkira Shibata
 
LLM Webinar - シバタアキラ to share.pdf
LLM Webinar - シバタアキラ to share.pdfLLM Webinar - シバタアキラ to share.pdf
LLM Webinar - シバタアキラ to share.pdfAkira Shibata
 
Kaggle and data science
Kaggle and data scienceKaggle and data science
Kaggle and data scienceAkira Shibata
 
PyData NYC by Akira Shibata
PyData NYC by Akira ShibataPyData NYC by Akira Shibata
PyData NYC by Akira ShibataAkira Shibata
 
20141127 py datatokyomeetup2
20141127 py datatokyomeetup220141127 py datatokyomeetup2
20141127 py datatokyomeetup2Akira Shibata
 
Top quark physics at the LHC
Top quark physics at the LHCTop quark physics at the LHC
Top quark physics at the LHCAkira Shibata
 

Mais de Akira Shibata (12)

大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん
大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん
大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん
 
W&B monthly meetup#7 Intro.pdf
W&B monthly meetup#7 Intro.pdfW&B monthly meetup#7 Intro.pdf
W&B monthly meetup#7 Intro.pdf
 
20230705 - Optuna Integration (to share).pdf
20230705 - Optuna Integration (to share).pdf20230705 - Optuna Integration (to share).pdf
20230705 - Optuna Integration (to share).pdf
 
W&B Seminar #5(to share).pdf
W&B Seminar #5(to share).pdfW&B Seminar #5(to share).pdf
W&B Seminar #5(to share).pdf
 
makoto shing (stability ai) - image model fine-tuning - wandb_event_230525.pdf
makoto shing (stability ai) - image model fine-tuning - wandb_event_230525.pdfmakoto shing (stability ai) - image model fine-tuning - wandb_event_230525.pdf
makoto shing (stability ai) - image model fine-tuning - wandb_event_230525.pdf
 
LLM Webinar - シバタアキラ to share.pdf
LLM Webinar - シバタアキラ to share.pdfLLM Webinar - シバタアキラ to share.pdf
LLM Webinar - シバタアキラ to share.pdf
 
W&B Seminar #4.pdf
W&B Seminar #4.pdfW&B Seminar #4.pdf
W&B Seminar #4.pdf
 
Kaggle and data science
Kaggle and data scienceKaggle and data science
Kaggle and data science
 
Data x
Data xData x
Data x
 
PyData NYC by Akira Shibata
PyData NYC by Akira ShibataPyData NYC by Akira Shibata
PyData NYC by Akira Shibata
 
20141127 py datatokyomeetup2
20141127 py datatokyomeetup220141127 py datatokyomeetup2
20141127 py datatokyomeetup2
 
Top quark physics at the LHC
Top quark physics at the LHCTop quark physics at the LHC
Top quark physics at the LHC
 

Analysis Software Development

  • 1. The EventView Analysis Framework Akira Shibata, QMUL Amir Farbin, CERN Kyle Cranmer, BNL 1
  • 2. Outline 1. Event Data Model and Analysis Model • Where we are. • Current issues. 2. What is EventView? • The EventView EDM class. • EventView framework. • PhysicsView packages. 3. Where is EventView? • What is the scope of EventView in PAT. • The scope of EventView in DA. • Current use cases. 4. Future developments • Future developments in EDM • Future developments in PANDA 2
  • 3. Our Goal • Is to do physics analysis to the level that we can publish our result. • Great deal of effort went in. Lots of trial and error and it is still evolving very fast. (Computing Service Commissioning, CSC is one such place.) • There are thousands of people trying to analyse huge amount of data and we need the right environment to do this. • Physics analysis tools and distributed analysis are trying to solve the problems. 3
  • 4. Acronyms • Athena: The Atlas Software (algorithms in C++, steering with Python). Used for Monte Carlo, Simulation, Recosntruction, and Analysis. Also used by the High Level Trigger. • RAW: Events as output by the Event Filter for reconstruction. The computing model assumes it is only at Tier0 site (1.6MB/event at rate of 200Hz). • CBNT: “ComBined NTuple” summarizes output of reconstruction. Read into Root, can’t be read in Athena. Not part of the computing model. • ESD: Event Summary Data, is the output of Reconstruction can be read into Athena. Only to be stored at Tier1 sites. (500kB/event). • AOD: Analysis Object Data, distributed to every one and meant for analysis, can be readin Athena(100kB/event). • DPD: Derived Physics Data, included in computing model, something analogous to an ntuple. • Tags: Event-level metadata used to quickly select “interesting events” (1kB/event). This talk focuses mainly on DPD production from AOD. 4
  • 5. tier 0 RAW Production ESD tier 1 EV Code Production AOD Slim AOD Tag tier 2 Athena Code Athena Dump Event View Private Dump Private CBNT EV Ntuple Ntuple Analysis Ntuple home (DPD) impossible no standard standardized personalized waste of resource common prod. Less Analysis in ROOT 5
  • 6. Part 1 Event Data Model and Analysis Model 6
  • 7. ! object usedused by reconstruction algorithm to carry state ! object by reconstruction algorithm to carry state between various tools Event Data Model between various tools History of our data format and Evolution: MigrationCBNT dumper to be dumped into Evolution of information toCBNTto POOL dumped into ! carry all Workflow to the the dumper to history of our analysis. ! carry all information be Evolution “flat” of Workflow Evolution ntuple a of Workflow a “flat”Athena.exe ntuple Root or paw/hbook Ancient History DC1 Athena.exe Athena.exe Root Athena Rootpaw/hbook or or paw/hbook ROOT Signal significance H 5 !! $ L dt = 30 fb!1 ttH (H 5 bb) (no K!factors) H 5 ZZ (*) 5 4l (*) ATLAS H 5 WW 5 l"l" 10 2 (*) Event DC1 qqH 5 qq WW ¯ Hadronic Top Mass qqH 5 qq ## ttbar (MC NLO) b Before we used POOL to save Reconstructionthe input Event reconstruction files, to Total significance Signal significance H 5 !! $ L dt = 30 fb!1 Signal significance Entry H 5 !! ttH (H 5 bb) !1 $ L dt = 30 fb ttH (H 5 bb) (no K!factors) W + Jets (Alpgen) H 5 ZZ (*) 5 4l Selection (*) (no K!factors) H 5 ZZ (*) 5 4 l ATLAS H 5 WW 5 l"l" 300 ¯ H 5 WW 2 5 l"l" (*) (*) Geant3 ATLAS 10 DC1 qqH 5 qq WW q Event 10 2 (*) qqH 5 qq ## DC1 qqH 5 qq WW qqH 5 qq ## Total significance 250 10 Total significance q was a ZEBRA file, and the EDM (egamma, etc.) had two purposes: Analysis Selection Selection Reconstruction Reconstruction 200 Geant3 Geant3 W+ Reconstruction l+ 150 10 t 10 ν 100 1 100 120 140 160 180 mH (GeV/c ) 200 2 50 object used by reconstruction algorithm to carry state b ! 1 100 120 140 160 180 200 1 0 120 2 100 0 50 140 100 160 150 180 200 250200 300 350 400 mH (GeV/c ) 2 mH (GeV/c ) GeV between various tools root Zebra CBNT PostSc ! carry all root information to the CBNT dumper to be dumped into Zebra CBNTRoot root Zebra Zebra CBNT CBNT PostScriptPostScript PS Evolution of Workflow a “flat” ntuple or paw/hbook or PyRoot from athena −i −i Athena or PyRoot ROOTathena Root from DC2/Rome/CSC Athena.py Athena.exe Root Much Athena.py current situation is related to of our bour of Root this historical Much Athena.py ¯current situation is related to this historical Root or PyRoot from athena −i situation. Signal significance Hadronic TopHMass Signal significance ttbar (MC NLO) $ L dt = 30 fb!1 H 5 !! situation. 5 !! ttH (H 5 bb) $ L dt = 30 fb!1 ttH (H 5 bb) (no K!factors) (*) H 5 ZZ 5 4l (no K!factors) H 5 ZZ (*) 5 4l Entry (*) Signal significance ATLAS H 5 WW 5 ll ¯ (*) ATLAS H 5 WW 5 ll W + Jets (Alpgen) H 5 ! ! $ L dt =10 2fb!1 Signal significance (*) 30 q Event Tag 10 2 (*) ttH (H 5 bb) qqH 5 qq WW DC1 H 5 5 ! qq WW qqH ! 300dt = 30 fb!1 $L (no K!factors) (*) qqH 5 qq ## AOD Event DC2 ttH (H5 5 qq ## qqH bb) H 5 ZZ 5 4l (*) (no K!factors) H 5 ZZ (*) 5 4 l ATLAS H 5 WW 5 ll Total significance Total significance H 5 WW 2 5 ll (*) (*) ATLAS 10 qqH 5 qq WW q AOD Tag Analysis Event 10 2 250 (*) qqH 5 qq ## DC2 qqH 5 qq WW Reconstruction AOD Tag Selection Event qqH 5 qq ## Total significance DC2 l+ Reconstruction Reconstruction Builders Selection Total significance Geant3 Geant4 200 + W Reconstruction t 10 10 ν Reconstruction Builders Builders Selection Selection 150 Geant4 Geant4 10 Kyle Cranmer Cranmer (BNL) Kyle (BNL) Analysis Model Workshop, October October 26, 2006 Analysis Model Workshop, 26, 2006 10 4 100 b 1 100 50 120 140 160 180 200 1 100 120 140 160 180 200 2 mH (GeV/c ) 2 0 1200 mH (GeV/c ) 0 50 100 150 250 100 300 120 350 140 400 160 180 200 1 2 100 120 140 160 180 200 GeV mH (GeV/c ) 2 mH (GeV/c ) EVNT Hits ESD/ Zebra SimulDigi CBNT Ntuple PS PostScript root Pool pool SimulDigi Pool SimulDigi ESD ESD AOD AOD RootTuple RootTuple AOD RootTuple PostSc pool pool ESD AOD PostScriptPostScript pool pool pool Tags Tags Tags Much Athena.py current situation is relatedRoot or PyRoot from athena −i of our to this historical 7
  • 8. CBNT • Flat ntuple with all information from reconstruction. • No common look and feel. • No official mechanism to share activity after reconstruction. • AOD/ESD is nice! • It has common interface to all objects. • With it, we can do analysis using Athena. • We need to publish and share analysis code for cross check and reproduction of results. • But why Athena? • All of our reconstruction alg is in Athena. • We’ll need to be able to re-run clustering, vertexing, track extrapolation, calibration etc on AOD. • How about Root • We always need to use Root for our final analysis. • Athena is good for event-level processing but not sample level. • Two stage analysis • We cannot do everything in one place. Some in Athena, rest in Root. • Support flexibility in analysis. But what to do where? 8
  • 9. Where to do Analysis? Root lovers / Athena purists physicists / computing specialists 1. Pre-physics analysis: reconstruction. 2. Baseline analysis: preselection overlap removal, further processing such as jet re-calibration etc. 3. Event level analysis: object reconstruction, event variable calculation etc. 4. Sample level analysis: plot histogram, fitting templates, toy MC etc. Athena Central Athena Analysis on Root Analysis Production Distributed Ana Local or DA 9
  • 10. Two-Stage Analysis A global view Feedback / port analysis to Athena Athena Analysis Development on Lxplus Root Analysis at Home AOD Athena Ntuple Ntuple Histogram DPD Production on DA Publish Monitor and retrieve DA Submission Feed back Backnavigation Re-Submission AOD AOD Ath Ntupl DA Submission analysis Scope of this workshop: Athena analysis and distributed analysis 10
  • 11. Athena analysis should offer: • The official place for common analysis. • Should be easy enough for physicists to use! • Simple mechanism for sharing code - modular analysis. • Should be more than an ntuple dumper, need to do “analysis”. (e.g. tune PID, re-calibrate) • AOD object definition is loose (needs to be): • It has overlapping objects (e.g. electrons and jets). • It has loose object selection (e.g. no isolation cut). • It has multiple algorithms (e.g. Staco/Muid muons). • First step in analysis: Define your “view” of events: • Do this by constructing EventView. • Further baseline and event-level analysis using EventView tools and modules. • Produce Derived Physics Data, DPD (ntuple) for Root analysis: • Private ntuple. • Group standard DPD. 11
  • 12. Part 2 What is EventView? 12
  • 13. What is EventView? EventView is a container which holds objects (or links to them) needed in physics analysis. 13
  • 14. EventView Modular Analysis Framework Calibrate Reconst. AOD AOD Insert Electron W and nu Preselect BJet Calculate Truth Truth AOD AOD Mt Remove Calibrate Elec Associate MET Match Overlap MET B-Jet Truth AOD Muon Once we have defined our container, now analysis can be divided into pieces. Each step of analysis divided into separate tools. Each tool is made independent and general as much as possible. Analysis information is propagated through EventView. Each tool receives EventView and execute something on EventView. Gradually build up EventView with more information. 14
  • 15. EVToolLooper Components of EV EVTool (Inserter) Framework EV • EVTool EVObjTool (ObjLooper) Obj (ObjCalculator) EVToolLooper (or just looper) EV • Schedule all EV components. Tool EVModule • EVTools (or just Tools) • Obj EV Tool Smallest component in EV. Tool • Written in C++. EV • Mainly EV tool and EV obj tool. • Module Tool Obj Tool Other types of tools too. • Tool EV EVModules (or just Modules) Tool • Set of tools and configurations. • Written in Python. DPD • Can be EV level or obj level. 15
  • 16. Extending the Framework • It is crucial that the framework is easily extendable. • EventView framework is a collection of tools, extendibility is its natural feature. • Minimize work needed to add components. • Subdivide usefulness of the tools into different types. • Interface and functionality inherited through class hierarchy. • Skeleton code generated for typical type of tools. Generate skeleton Just implement this function 16
  • 17. AlgTool Object Oriented Design Minimize effort in development EventViewBaseTool bool MultipleOutput bool MultipleInOut virtual StatusCode execute(EventView*) virtual StatusCode executeMO(EventView* EventViewContainer ) virtual StatusCode executeMIMO(EventViewContainer , EventViewContainer ) EVUDCalcBase EVUDEVCalcBase EVUDObjCalcBase bool m_EVUDTool executeEV(EventView*, initializeTools() string m_prefix prefix, postfix) execute(*ev, *UD, *part, prefix, postfix) string m_postfix executeDummy(*ev, *UD, prefix, postfix) execute(ev, previx, postfix) T EVUDObjLooperBase EVUDObjCalcBaseT m_Labels executeObj(*ev, *UD, *part, m_requireLabels prefix, postfix) m_rejectLabels A type of tools which copyUDtoEV calculates object level variables EVUDElectronAll EVUDFinalStateLooper EVUDMuonAll EVUDInferredObjectLooper EVUDParticleJetAll 17
  • 18. Standard EventView Tool/Module Library • EventViewBuilder (Container package) • EventViewBuilderUtils: Base classes of all tools • EventViewBuilderAlgs: Tool loopers • EventViewConfiguration: High level configuration interface • EventViewInserters: Tools for selection and overlap removal • EventViewDumpers: Output to screen, ntuple, and atlantis • EventViewUserData: Calculators and associators • EventViewTrigger: Tools for reading trigger info • EventViewPerformance: Performance study modules • EventViewCombiners: Combinatorics tools • EventViewSelectors: Event selection tools List growing thanks to contributions from people but we need more people to work on them. 18
  • 19. EVToolLooper Multiple View (Branch) An example EV analysis Reco • Branch Tool Electron Branch Tau Branch Tau Branch 2 One can define multiple Electron Inserter Tau Inserter Tau Inserter views of an event: Muon Inserter Electron Inserter Electron Inserter • e.g. use different jet in each view. Tau Inserter Muon Inserter Muon Inserter EVToolLooper • Jet Inserter Jet Inserter Jet Inserter Truth e.g. use different object (cone04) (cone04) (cone07) Vtx selection in each view. Filter Truth Inserter • Truth Same analysis executed on Analysis Combinatorics Analysis Selection each EventView. • NtupleDumper - Truth Each view can talk to each KinematicCalc TruthAssociator other: • compare views. NtupleDumper - Electron Branch Ntuple Output Trees • NtupleDumper - Tau Branch order views. Truth NtupleDumper - Tau Branch 2 • Electron Tau Tau2 match views. End • Each view saved into separate Root trees in ntuple. 19
  • 20. Multiple View (Combinatorics) Another example analysis EVSort Unsorted Sorted Comparator EventView Container EV0 EV1 EV0 EV0 EV0 (ex EV2) Combination EV1 EventView EV1 EV1 Sort EV1 EV2 Tool (ex EV1) EV2 EV2 EV2 (ex EV0) • Multiple EV produced from combinatorics: • e.g. take all combinations of dijets. • e.g. combine one W and a b-jet to obtain top quark. • Manipulation of multiple EV through Multi Input (MI) tools: • e.g. sort EV according to the pt of reconstructed top. • e.g. remove EVs not satisfying condition. • Each view saved to separate tree or all views in one tree. 20
  • 21. DPD Production • EV analysis can easily produce custom ntuple with desired amount of information. • It is not just copying information but a lot more needs to be done: • Selection of objects • Slimming contents • Adding derived variables • Calibration / correction • you name it. • Various physics groups started making their own ntuple: • They are based on the same tools but the format is slightly different. • Need common package to provide interface for producing DPD. HighPtView. • Provide same look and feel but still support difference in contents. • Central production of default ntuples. Good starting point and reference point. 21
  • 22. HighPtView implements a set HighPtView of configurations to manipulate tools/modules in EventViewBuilder. EventView EVBuilder • Provides common interface EventView EventView EventView Performance for DPD building for all Performance Configuration PhysicsView packages. Combiners Study • EventView EventView Jet BuilderUtils Package EventView UserData Electron/Photon Default physics contents Muon (selection cuts) come from EventView EventView BuilderAlgs Dumpers Tau EventView Transformation EventView Selectors EventView Inserters Flavour Tagging performance study. Tools/ Feedback • Physics groups may have their own “view” packages (eg Interface Tools/ HighPtView TopView) for optimization Feedback (override configuration) and Interface Optimization extra functionality (eg tools GroupView for analysis). SUSYView TopView HiggsTo TauTau HiggsTo GamGam • Group view packages will look similar inheriting interface from HighPtView. 22
  • 23. Part 3 Where is EventView? 23
  • 24. People Involved • Core Developers: This is a VERY incomplete list of people. • Kyle Cranmer (BNL) There are lots of other people contributing! • Amir Farbin (CERN) • Akira Shibata (QMUL) • Package Developers: • Attila Krasznahorkay (EventViewTrigger) • Jamie Boyd (EventViewPerformance) • Lorenzo Feligioni (Btagging) • Liza Mijovic (Development for release 13) • GroupArea management: • Akira (Lxplus), Fabien Tarrade (BNL), Sandrine Laplace (Lyon) • We have a lot of interaction with • physics groups and performance groups • distributed analysis and Athena core developers. 24
  • 25. Where is EventView? EventViewTools EventView Trigger HiggsTo HiggsTo Gamma TauTau EventViewCore Gamma EventView TopView Inserters HighPt Physics EventView View Groups EventView BuilderAlgs Performance Groups EventView BuilderAlgs Performance SUSY EventView View Physics Configuration Validation EventView Transformation Event Distributed Analysis View EventView UserData Central Production EventView Combiners EventView Dumpers EventView Selectors Athena Core Developers Analysis EDM Physics Analysis Tools 25
  • 26. Interaction with various groups is VERY important to maintain a functional framework: • Performance groups: Maintenance of relevant tools and input of physics contents (object selection, common performance study.) • EDM: Compatibility of the tools to developing EDM. Optimal DPD format for the future • Physics analysis tools: tool development and maintenance. • Core developers: support for new feature request and help on core development. • Central production: production of HighPtView ntuple. • Distributed analysis: support for EV analysis. • Physics validation: validation on standard DPD and RTT on Physics View packages. • Physics groups: development of group tools and modules in PhysicsVies packages. • These things are just beginning to happen. More input and collaboration is required for: • Development and maintenance of PhysicsView packages. • Standardised performance study and selection tools. 26
  • 27. DA and EventView EventView has a long experience with PANDA distributed framework because It is highly integrated with data management tool (dq2). It supports GroupArea and configurables. It runs athena jobs seamlessly and no book keeping is necessary. 1. Local test with sample fetched with DQ2. athena TopView_Top_jobOptions.py 2. Set up grid/DQ2. source GridSetupLxplus.sh 3. Send job (replace “athena” by “pathena”): pathena TopView_Top_jobOptions.py --inDS csc11.005200.T1_McAtNlo_Jimmy.recon.AOD.v11004103 --outDS user.top_wg.TopView21.T1_McAtNlo_Jimmy.001 27
  • 28. TopView Use Case 1. Develop analysis based on EventView tools under TopView runtime package: Write C++ tools, write python modules Athena and write job options. Analysis Developmnt 2. Run tests on lxplus with test samples fetched with DQ2 end user tool. 3. Commit changes to CVS and tag. (iterate 1 and 2 until I’m happy; use HN if I need help) Athena Ana. 4. Send PANDA job specifying input/output dataset and a few Submission other parameters. on DA 5. Check status on the web. (if job failed, report Savannah bug and iterate 4 and 5) 6. Fetch results (AANTuple) using DQ2 end user tool and Post-Athena analyze with ROOT locally. Analysis 7. Go back to the AOD/ESD using the AANT and do further analysis. (in future). 28
  • 29. Two-stage Iterative Analysis Process Again! 1st stage: Define common optimal preparation for a physics group. Includes 2nd stage: Interactive analysis in ROOT with quick turnaround. Use TMVA, Roofit selection, calibration, b-tagging. Use and other tools outside framework. modular analysis framework and Produce final plots to be published. Back- feedback/tools from performance study. navigate through DA if necessary. Parameterise using common tools and python configuration. Feedback / port analysis to Athena Athena Analysis Development on Lxplus Root Analysis at Home AOD Athena Ntuple Ntuple Histogram DPD Production on DA Publish Monitor and retrieve DA Submission Feed back Backnavigation Re-Submission AOD AOD Ath Ntupl DA Submission analysis Group-wise central production on DA system. 29
  • 31. Faster AOD access: EDM in Release 13 • Factor of ~10 in I/O speed expected thanks to transient persistent separation. • Broadens the use case of Athena analysis. • AOD access from root or structured DPD: • Different technology now under test: pAOD, SAN, and new Scott’s recipe. • Broadens the use case of Root analysis and lowers the wall between Athena and Root: higher portability of analysis code. • We will still need common framework just like before! EventView will write out structured format. • ESD/AOD merger • ESD / AOD classes are now merged: identical interface for ESD and AOD analysis. • Can use ESD based tools on AOD objects: e.g. particle identification. 31
  • 32. Future developments in PANDA • Root analysis support on PANDA • Using PROOF or otherwise • Support for more sites with improvements in operation • BNL has been fully operational for a long time. • Added lcg sites: ANALY_CERN, ANALY_LYON, ANALY_RAL, ANALY_FZK, ANALY_CNAF, ANALY_PIC, ANALY_SARA 32
  • 33. Tutorial? Let’s see how this all works. 33