Data Centre Compute and Overhead Costs
Delivering End-to-end KPIs
Michael Rudgyard (CTO)
Concurrent Thinking Ltd
Our Background
• Background in High Performance Computing & Scale-out Computing
– Gives us a unique perspective on DCIM
• Founded Concurrent Thinking in 2010
– Focussed on tools for operational efficiency in the Data Centre
– Exploit an existing & mature product that was originally developed for HPC
– Investment from Carbon Trust Investments
– Launched new products at DatacenterDynamics, Nov 2011
Bridging the Divides – Facilities, IT & Management
What constitutes an efficient data centre?
• 'It's all about virtualization'
• 'It's all about procurement'
• 'It's about staff efficiency'
• 'It's all about cooling'
What we do…
Data Centre Infrastructure Management
• Continuous monitoring & active management of IT & Facilities systems
– Building management systems
– Environmental systems (temperature, humidity, air-conditioning...)
– Power (at the distribution board, rack PDU and server PSU level...)
– IT equipment (including server health)
– Operating systems & Virtual Machines
– Application Performance
• We leverage standards-based protocols
– OPC, Modbus, 1-wire, SNMP, IPMI, Intel Node Manager, WMI
• …and offer monitoring agents and extensible means to monitor non-standard M&E equipment
Why Data Centre Infrastructure Management ?
• Aims
– To truly understand where operational savings can be made
– To understand how factors vary over time / with load etc.
– To give ample warning of potential (often critical) issues
– To report factual information to management
– To drive continuous iterative improvement over time
• Real energy and productivity savings require a 'joined-up' approach
– Managing buildings, data-centre facilities and IT in a unified manner
– ...opening the door to the possibility of orchestration of the data-centre
Our Approach
• We provide a tool that:
– Tracks power to the server/network (and OS/VM/application) level
– Allows for reporting by department, customer or end-user
– Offers a simple interface to present data for different purposes
– Has integrated IT asset management
– Generates business intelligence on end-to-end service delivery
– Is both user-extensible and built to scale (visually & architecturally)
What are the important data centre metrics ?
• We don't push particular metrics (e.g. PUE, ITUE, ITEE, FVER...)
• DCIM is a tool that should enable a customer to define his own KPIs
[Figure: chart on a 0 to 1 scale with axes Compute Utilisation Effectiveness, Network Utilisation Effectiveness and Storage Utilisation Effectiveness]
Example 1 – OS performance monitoring
• Potential performance metrics:
– CPU utilisation (* CPU benchmark) per watt
– IOPS per watt
– Bytes per watt
• To produce these metrics we monitor:
– OS metrics via SNMP (Linux/MS) or WMI (MS)
– Server power usage (via a managed PDU or IPMI)
– (CPU benchmark figure)
– Power overhead for cooling and power distribution etc. (and apportion this for this server)
– Power cost (at different times)
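As a rough illustration of the per-watt metrics listed in Example 1, the sketch below divides a work rate by the server's power draw after adding its share of cooling and distribution overhead. The function names and the single `overhead_factor` multiplier (a PUE-like ratio applied in proportion to server power) are assumptions made for this example; the slides do not say how the overhead is actually apportioned.

```python
def effective_watts(server_watts: float, overhead_factor: float = 1.0) -> float:
    """Server power plus its apportioned share of facility overhead.

    overhead_factor is a hypothetical PUE-like multiplier (1.0 = no overhead).
    """
    if server_watts <= 0:
        raise ValueError("server_watts must be positive")
    if overhead_factor < 1.0:
        raise ValueError("overhead_factor must be at least 1.0")
    return server_watts * overhead_factor


def per_watt(work_rate: float, server_watts: float, overhead_factor: float = 1.0) -> float:
    """Generic 'work per watt': use IOPS or bytes/s as the work rate."""
    return work_rate / effective_watts(server_watts, overhead_factor)


def cpu_benchmark_per_watt(cpu_utilisation: float, cpu_benchmark: float,
                           server_watts: float, overhead_factor: float = 1.0) -> float:
    """CPU utilisation (0..1) weighted by a benchmark score, per watt."""
    if not 0.0 <= cpu_utilisation <= 1.0:
        raise ValueError("cpu_utilisation must be between 0 and 1")
    return per_watt(cpu_utilisation * cpu_benchmark, server_watts, overhead_factor)
```

For example, a server delivering 3000 IOPS at 200 W with an overhead factor of 1.5 would be rated at `per_watt(3000, 200, 1.5)`, i.e. 3000 IOPS over 300 W.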
Example 2 – Microsoft Exchange
• For a typical MS Exchange service, the most useful metrics might be:
– Power usage per email (OPEX only)
– Cost per email (OPEX or OPEX + CAPEX)
– CO2 per email
• WMI now provides the necessary application performance metrics
– The number of email transactions
– Server power usage (as above)
– Power overhead for cooling and power distribution etc. (as above)
– Power cost (as above)
– Asset depreciation model
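The per-email KPIs in Example 2 could be combined roughly as follows. This is a minimal sketch under stated assumptions: the `Interval` record, the straight-line depreciation model and the `kg_co2_per_kwh` grid factor are all hypothetical choices for illustration, and "power usage per email" is interpreted here as energy (Wh) per email.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One reporting interval for a mail server (all fields are assumed inputs)."""
    emails: int           # email transactions counted in the interval
    energy_kwh: float     # server energy, including apportioned overhead
    price_per_kwh: float  # electricity tariff during the interval
    hours: float          # length of the interval


def straight_line_capex_per_hour(purchase_cost: float, lifetime_years: float) -> float:
    """Hypothetical depreciation model: spread purchase cost evenly over the lifetime."""
    return purchase_cost / (lifetime_years * 365 * 24)


def per_email_kpis(iv: Interval, capex_per_hour: float = 0.0,
                   kg_co2_per_kwh: float = 0.0) -> dict:
    """Energy, OPEX, OPEX + CAPEX and CO2 per email for one interval."""
    if iv.emails <= 0:
        raise ValueError("no email transactions in this interval")
    opex = iv.energy_kwh * iv.price_per_kwh
    capex = capex_per_hour * iv.hours
    return {
        "wh_per_email": iv.energy_kwh * 1000 / iv.emails,
        "opex_per_email": opex / iv.emails,
        "total_cost_per_email": (opex + capex) / iv.emails,
        "kg_co2_per_email": iv.energy_kwh * kg_co2_per_kwh / iv.emails,
    }
```

Because the tariff is stored per interval, reporting over a day with different prices at different times amounts to summing the per-interval costs before dividing by the total email count.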
Example 3 – Linux MySQL Server
• For a web service, the most useful metric might be:
– Power per database query
– Cost per database query
– CO2 per database query
• SNMP now provides the application performance information
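Query counters exposed by a database are typically cumulative, so a per-query figure has to be derived from the difference between samples. The sketch below assumes periodic samples of a cumulative query counter and instantaneous server power, and integrates power with the trapezoidal rule; the `Sample` type and the sampling scheme are assumptions for illustration, not a description of the tool in the slides.

```python
from typing import NamedTuple, Sequence


class Sample(NamedTuple):
    t: float       # sample time in seconds
    queries: int   # cumulative query counter at time t
    watts: float   # server power draw at time t


def joules_per_query(samples: Sequence[Sample]) -> float:
    """Energy used between first and last sample divided by queries served."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    energy_joules = 0.0
    for a, b in zip(samples, samples[1:]):
        if b.t <= a.t:
            raise ValueError("samples must be in increasing time order")
        if b.queries < a.queries:
            raise ValueError("counter went backwards (restart?)")
        # Trapezoidal rule: average power over the step times its duration.
        energy_joules += 0.5 * (a.watts + b.watts) * (b.t - a.t)
    served = samples[-1].queries - samples[0].queries
    if served == 0:
        raise ValueError("no queries served in this window")
    return energy_joules / served
```

Multiplying the result by an electricity price or a carbon factor per joule gives the cost and CO2 variants of the same metric.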
Example 4 – Linux Apache Web Server
• For a web service, the most useful metric might be:
– Power per HTML query
– Cost per HTML query
– CO2 per HTML query
• Unfortunately, SNMP support for Apache is poor
– Best option was to install the Apache 'status module'
– Read the number of web transactions from the status module web page
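Reading the transaction count from the status module page might look like the sketch below. It assumes Apache's `mod_status` is enabled and that its machine-readable variant (the `?auto` query string, which returns `Key: value` lines such as `Total Accesses`) is reachable; the URL is a placeholder, and the slides do not say which page format was actually read.

```python
import urllib.request


def parse_status_auto(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines from a mod_status '?auto' response."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that are not key/value pairs
            fields[key.strip()] = value.strip()
    return fields


def total_accesses(text: str) -> int:
    """Cumulative number of requests served since the server started."""
    return int(parse_status_auto(text)["Total Accesses"])


def fetch_status(url: str = "http://localhost/server-status?auto",
                 timeout: float = 5.0) -> str:
    """Fetch the status page; the default URL is an assumed placeholder."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

`Total Accesses` is a cumulative counter, so the number of web transactions in a reporting window is the difference between two successive readings.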
Application performance on virtual machines
• Assume a single application per virtual machine
• Issue now is: what is the power used by a virtual machine ?
• Our solution: 'inferred metrics'
– Use another metric (e.g. CPU utilisation) as a proxy for power usage
– Attribute the power used by a server to individual VMs
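One way to picture an inferred metric is to split the host's measured power across its VMs in proportion to each VM's CPU utilisation. The sketch below does that, with one extra assumption that is not in the slides: an optional `idle_watts` baseline is shared equally between VMs, and only the remainder is split by CPU share.

```python
def attribute_power(host_watts: float, vm_cpu: dict[str, float],
                    idle_watts: float = 0.0) -> dict[str, float]:
    """Attribute measured host power to VMs using CPU utilisation as a proxy.

    vm_cpu maps VM name to its CPU utilisation (any consistent unit).
    idle_watts is a hypothetical baseline split equally between VMs.
    """
    if not vm_cpu:
        raise ValueError("no virtual machines given")
    if not 0.0 <= idle_watts <= host_watts:
        raise ValueError("idle_watts must be between 0 and host_watts")
    n = len(vm_cpu)
    total_cpu = sum(vm_cpu.values())
    dynamic_watts = host_watts - idle_watts
    result = {}
    for vm, cpu in vm_cpu.items():
        # If every VM is idle, fall back to an equal split.
        share = cpu / total_cpu if total_cpu > 0 else 1.0 / n
        result[vm] = idle_watts / n + dynamic_watts * share
    return result
```

The attributed figures always sum to the measured host power, so per-VM values can be fed into the same per-transaction metrics used for physical servers.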
Using this information (1)
• Which servers are underused/inefficient/should be virtualised ?

• Which servers are better at delivering a particular service ?
– Provides useful procurement information !
– (or which application gives better performance on the same hardware ?)

• When should I retire old servers ?
– Sweating IT assets is often a very bad idea indeed !
Using this information (2)
• Which departments are using their IT resources wisely ?
– Define server groups and report by department

• Charge departments for individual power usage
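Reporting and charging by department reduces to grouping per-server energy by an asset-to-department mapping. The following is a minimal sketch; the mapping, the flat `price_per_kwh` tariff and the `"unassigned"` bucket are assumptions for illustration.

```python
from collections import defaultdict


def charge_by_department(server_kwh: dict[str, float],
                         server_department: dict[str, str],
                         price_per_kwh: float) -> dict[str, float]:
    """Sum energy per department and convert it to a charge.

    Servers missing from the mapping are reported under 'unassigned'
    so that no energy silently disappears from the report.
    """
    totals = defaultdict(float)
    for server, kwh in server_kwh.items():
        totals[server_department.get(server, "unassigned")] += kwh
    return {dept: round(kwh * price_per_kwh, 2) for dept, kwh in totals.items()}
```

As the conclusions note, this simple grouping only holds when application servers are not shared between departments.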
Conclusions and open questions
• It is straightforward to monitor many KPIs for a data centre
– From PUE, to ITUE and "application utilisation efficiency"
– Requires a proper monitoring & reporting tool, with inbuilt asset management
– Requires power monitoring hardware (managed PDUs or modern servers)
– Requires suitable configuration (relatively easy for small numbers of apps)
• It is straightforward to apportion costs by racks, servers and by department (if application servers are not shared)
• The ROI can be very significant
• Can we monitor granular information by user at the app level ?
– Ongoing collaborations with University of Hertfordshire and Surrey University
– Collaboration on HPC with HPC Wales and STFC Daresbury
