HNSciCloud CMS status-report

•

0 gostou•119 visualizações

Presentation by Daniele Spiga and Katarzyna Dziedziniewicz-Wojcik at HNSciCloud procurer hosted event in Geneva, 14 June 2018

Software

HNSciCloud CMS
status-report
Daniele Spiga and Katarzyna Dziedziniewicz-Wojcik

Setup
● CMS integrates the HNSCi Cloud resources as two distinct sites that share
the very same storage element (CERN).
○ As a matter of fact there are two distinct entries in the Glidein Factory
● This allows CMS to easily evaluate and compare the two providers
○ But this is not yet so meaningful cause we’ve not enough statistic on RHEA (see later)
● From technical perspective mapping everything under a single CMS_Site is
possible

Workflow
● Current configuration is based on CMS overflow mechanism
○ “Anything that can run at CERN can also run on HNSci Cloud” but we’ve added a filter on top
of this
● For now, we pick the campaigns that do not require to high I/O
○ Main reason for this is to fully validate the setup and make everything working (rock solid)
○ Evaluate both success rate and cpu efficiency
● Now that everything is working we’ll open to any type of workflow
○ To-do list

Resource Usage during past 30 days
CMS way over 1/4 of the resources
Ben’s monitoring page
● CMS integrated first TSystem and couple of week later RHEA.
○ We agreed to fully validate the whole setup with one of the two..
○ RHEA was off for a 1+ week,. then only 4-Core VMs (CMS requires
8 cores)

Performances: early analysis
So most of these numbers refer just to TSystem
- 10k+ jobs completed
- 80%+ success rate
- Most of the failure are not CMS related
(network issues etc.)
- 70%+ CPU efficiency

Early comments …
● From 10Km far “Everything worked out of the box”
● All the issues (very few) where due to misconfigurations (some of mine :) )..
As such everything was fixed easily
○ Many thanks to Ben and the batch team for the effective support and for all the provided
feedbacks which helped to quickly spot mistakes
● CMS uses multicore.. In order to get proper share we need to keep constant
pressure..
○ If we lose the share it takes hours (even 12+ with RHEA) to get a slot.

Mais conteúdo relacionado

Semelhante a HNSciCloud CMS status-report

How to Develop and Operate Cloud First Data PlatformsAlluxio, Inc.

Moodle at scale why assigning a role can cause a catastrophesammarshall_ou

From 10 Deploys Per Year to 4 Per Day at DBS Bank: How Pivotal Platform Can R...VMware Tanzu

【Unite 2017 Tokyo】C#ジョブシステムによるモバイルゲームのパフォーマンス向上テクニックUnity Technologies Japan K.K.

Sprint 73ManageIQ

Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Brian Brazil

Sql server tips from the fieldJoAnna Cheshire

Our Story With ClickHouse at seo.doMetehan Çetinkaya

Successful DevOps implementation for small teams a true storyJakub Paweł Głazik

The hourly network outage - Booking.com.pdfSiteReliabilityEngin

Slack in the Age of PrometheusGeorge Luong

Berlin AWS meetup: here.com on AWSCristian Măgherușan-Stanciu

Csa Summit 2017 - Managing multicloud environmentsCSA Argentina

OSMC 2010 | NSClient++ - what's new? And what's coming! by Michael MedinNETWAYS

Rankwave MOMENT™ (English)HyoungEun Kim

About VisualDNA Architecture @ Rubyslava 2014Michal Harish

Using SaltStack to DevOps the enterpriseChristian McHugh

Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessAnant Corporation

#lspe Building a Monitoring Framework using DTrace and MongoDBdan-p-kimmel

Parallel Processing in TM1 - QueBIT ConsultingQueBIT Consulting

Semelhante a HNSciCloud CMS status-report (20)

How to Develop and Operate Cloud First Data Platforms

Moodle at scale why assigning a role can cause a catastrophe

From 10 Deploys Per Year to 4 Per Day at DBS Bank: How Pivotal Platform Can R...

【Unite 2017 Tokyo】C#ジョブシステムによるモバイルゲームのパフォーマンス向上テクニック

Sprint 73

Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)

Sql server tips from the field

Our Story With ClickHouse at seo.do

Successful DevOps implementation for small teams a true story

The hourly network outage - Booking.com.pdf

Slack in the Age of Prometheus

Berlin AWS meetup: here.com on AWS

Csa Summit 2017 - Managing multicloud environments

OSMC 2010 | NSClient++ - what's new? And what's coming! by Michael Medin

Rankwave MOMENT™ (English)

About VisualDNA Architecture @ Rubyslava 2014

Using SaltStack to DevOps the enterprise

Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness

#lspe Building a Monitoring Framework using DTrace and MongoDB

Parallel Processing in TM1 - QueBIT Consulting

Mais de Helix Nebula The Science Cloud

M-PIL-3.2 Public SessionHelix Nebula The Science Cloud

Deep Learning for Fast SimulationHelix Nebula The Science Cloud

Interactive Data Analysis for End Users on HN Science CloudHelix Nebula The Science Cloud

Container Federation Use CasesHelix Nebula The Science Cloud

CERN Batch in the HNSciCloudHelix Nebula The Science Cloud

LHCb on RHEA and T-SystemsHelix Nebula The Science Cloud

Helix Nebula Science Cloud usage by ALICEHelix Nebula The Science Cloud

Hybrid cloud for scienceHelix Nebula The Science Cloud

HNSciCloud PILOT PLATFORM OVERVIEWHelix Nebula The Science Cloud

HNSciCloud Overview Helix Nebula The Science Cloud

This Helix Nebula Science Cloud Pilot Phase Open SessionHelix Nebula The Science Cloud

Cloud Services for Education - HNSciCloud applied to the UP2U projectHelix Nebula The Science Cloud

Network experiences with Public Cloud Services @ TNC2017Helix Nebula The Science Cloud

EOSC in practice - Silvana Muscella (chair EOSC HLEG)Helix Nebula The Science Cloud

Helix Nebula Science Cloud Pilot Phase, 6 February 2018, Bologna, ItalyHelix Nebula The Science Cloud

Pilot phase Award Ceremony - INFN Introduction and welcomeHelix Nebula The Science Cloud

Early adopter group and closing of webinar - João Fernandes (CERN)Helix Nebula The Science Cloud

HNSciCloud pilot phase - Andrea Chierici (INFN)Helix Nebula The Science Cloud

Pilot phase Award Ceremony - T-SystemsHelix Nebula The Science Cloud

Pilot phase Award Ceremony - RHEAHelix Nebula The Science Cloud

Mais de Helix Nebula The Science Cloud (20)

M-PIL-3.2 Public Session

Deep Learning for Fast Simulation

Interactive Data Analysis for End Users on HN Science Cloud

Container Federation Use Cases

CERN Batch in the HNSciCloud

LHCb on RHEA and T-Systems

Helix Nebula Science Cloud usage by ALICE

Hybrid cloud for science

HNSciCloud PILOT PLATFORM OVERVIEW

HNSciCloud Overview

This Helix Nebula Science Cloud Pilot Phase Open Session

Cloud Services for Education - HNSciCloud applied to the UP2U project

Network experiences with Public Cloud Services @ TNC2017

EOSC in practice - Silvana Muscella (chair EOSC HLEG)

Helix Nebula Science Cloud Pilot Phase, 6 February 2018, Bologna, Italy

Pilot phase Award Ceremony - INFN Introduction and welcome

Early adopter group and closing of webinar - João Fernandes (CERN)

HNSciCloud pilot phase - Andrea Chierici (INFN)

Pilot phase Award Ceremony - T-Systems

Pilot phase Award Ceremony - RHEA

Último

Software Quality Assurance Interview QuestionsArshad QA

TECUNIQUE: Success Stories: IT Service providermohitmore19

Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.

Optimizing AI for immediate response in Smart CCTVshikhaohhpro

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01

W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.

HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai

Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE9953056974 Low Rate Call Girls In Saket, Delhi NCR

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health

Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave

Right Money Management App For Your Financial GoalsJhone kinadey

Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171

Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveCall Girls In Delhi Whatsup 9873940964 Enjoy Unlimited Pleasure

call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls

How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc

SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI

A Secure and Reliable Document Management System is Essential.docxComplianceQuest1

Microsoft AI Transformation Partner Playbook.pdfWilly Marroquin (WillyDevNET)

HNSciCloud CMS status-report

1. HNSciCloud CMS status-report Daniele Spiga and Katarzyna Dziedziniewicz-Wojcik

2. Setup ● CMS integrates the HNSCi Cloud resources as two distinct sites that share the very same storage element (CERN). ○ As a matter of fact there are two distinct entries in the Glidein Factory ● This allows CMS to easily evaluate and compare the two providers ○ But this is not yet so meaningful cause we’ve not enough statistic on RHEA (see later) ● From technical perspective mapping everything under a single CMS_Site is possible

3. Workflow ● Current configuration is based on CMS overflow mechanism ○ “Anything that can run at CERN can also run on HNSci Cloud” but we’ve added a filter on top of this ● For now, we pick the campaigns that do not require to high I/O ○ Main reason for this is to fully validate the setup and make everything working (rock solid) ○ Evaluate both success rate and cpu efficiency ● Now that everything is working we’ll open to any type of workflow ○ To-do list

4. Resource Usage during past 30 days CMS way over 1/4 of the resources Ben’s monitoring page ● CMS integrated first TSystem and couple of week later RHEA. ○ We agreed to fully validate the whole setup with one of the two.. ○ RHEA was off for a 1+ week,. then only 4-Core VMs (CMS requires 8 cores)

5. Performances: early analysis So most of these numbers refer just to TSystem - 10k+ jobs completed - 80%+ success rate - Most of the failure are not CMS related (network issues etc.) - 70%+ CPU efficiency

6. Early comments … ● From 10Km far “Everything worked out of the box” ● All the issues (very few) where due to misconfigurations (some of mine :) ).. As such everything was fixed easily ○ Many thanks to Ben and the batch team for the effective support and for all the provided feedbacks which helped to quickly spot mistakes ● CMS uses multicore.. In order to get proper share we need to keep constant pressure.. ○ If we lose the share it takes hours (even 12+ with RHEA) to get a slot.

HNSciCloud CMS status-report

Recomendados

Recomendados

Mais conteúdo relacionado

Semelhante a HNSciCloud CMS status-report

Semelhante a HNSciCloud CMS status-report (20)

Mais de Helix Nebula The Science Cloud

Mais de Helix Nebula The Science Cloud (20)

Último

Último (20)

HNSciCloud CMS status-report