© 2014 Uptime Institute
Is your data center on the verge of a crisis?
Julian Kudritzki
Chief Operating Officer
Uptime Institute
What Defines a Crisis?
2
Tour of Operational Computer Room
3
Looking for Clues
4
Tour of ‘Live’ Critical Spaces
5
Daily Practices Compromise Uptime,
Safety, and Security
6
•  Overtime hours exceeding 10%
•  Voice mail boxes full
•  Emails not responded to
•  Email inbox size limit exceeded
•  Meetings missed or routinely cancelled
•  No time for training
•  Shortage of qualified staff
•  Personnel performing work outside their competency
•  Everything is an emergency
•  Personnel turnover
What Else Is Going On?
7
•  Break fix budget exceeded
•  Maintenance budget exceeded
•  Energy cost estimate exceeded or unknown
•  Last minute deployment requirements
•  No organization chart
•  No responsibilities matrix
•  No records of maintenance activities
•  No written policies & procedures
•  No preventive maintenance schedule
•  Back of the server looks like a spaghetti pot exploded
The Issues Add Up
8
•  Cabling is not labeled or, worse, incorrectly labeled
•  Equipment is not uniquely labeled
•  Loads are consistently out of balance
•  Capacities are not managed or tracked
•  Deferred maintenance exceeds 10%
•  Housekeeping: if it looks like a mess, it is a mess
Maybe you don’t have a crisis, but how do you know how well
your data center operation compares to the rest of the industry?
The Issues Add Up
9
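The 10% warning signs listed above (overtime hours, deferred maintenance) reduce to a simple ratio check. The following is a minimal sketch only: the slides do not define the denominators, so overtime is assumed here to be measured against scheduled regular hours and deferred maintenance against scheduled maintenance tasks. The function name and the sample figures are hypothetical.

```python
def ratio_exceeds(actual: float, baseline: float, limit: float = 0.10) -> bool:
    """Return True when actual / baseline is strictly above the limit.

    The 10% default mirrors the thresholds named in the slides; the choice
    of baseline (the denominator) is an assumption, not from the source.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return actual / baseline > limit


# Hypothetical monthly figures for one site.
overtime_flag = ratio_exceeds(actual=150, baseline=1200)  # 150 overtime h vs 1200 regular h
deferred_flag = ratio_exceeds(actual=4, baseline=80)      # 4 deferred tasks vs 80 scheduled
print(overtime_flag, deferred_flag)
```

With these sample numbers overtime is 12.5% of regular hours and is flagged, while deferred maintenance is 5% and is not.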
Are you confident in your Facilities team’s capability to
manage a technologically advanced and highly efficient design
to your 24 x 7 uptime requirements?
•  Can you easily replace any member of that team?
•  Are you protected against poor operations practices
migrating from older sites to higher criticality data centers?
•  Do you have sites that operate in isolation, ignoring global
corporate standards?
•  Do you even have corporate global standards?
•  If you outsource any aspect of your data center operations,
how do you avoid losing responsibility and accountability?
•  Do you manage an outsourcing contract... or direct an
expert team?
Ask the Tough Questions
10
•  Initial review
•  Gap analysis against industry best practices
§  Staffing and Organization
§  Maintenance
§  Training
§  Planning, Coordination & Management
§  Operating Conditions
•  Roadmap to operational excellence
•  Plan changes
•  Implement changes
•  Monitor & refine
•  Annual review
Path to Data Center Operations Success
11
Key Elements of Facilities Management
Staffing and Organization
•  Staffing
•  Qualifications
•  Organization
Maintenance
•  Preventive Maintenance (PM)
Program
•  Housekeeping Policies
•  Maintenance Management
System (MMS)
•  Vendor Support
•  Deferred Maint. Program
•  Predictive Maintenance
•  Life-Cycle Planning
•  Failure Analysis Program
12
Key Elements of Facilities Management
Training
•  Data Center Staff
•  Vendors
Planning, Coordination,
and Management
•  Site Policies
•  Financial Management
•  Reference Library
•  Computer Room Mgmt.
Operating Conditions
•  Load Management
•  Operating Set Points
•  Alternating Use of
Infrastructure Equipment
13
Over the years, Uptime Institute has observed
management issues posing the largest risk to the uptime of
physical infrastructure
•  Inadequate staffing
•  Ineffective or nonexistent maintenance and training programs
•  Lacking processes and procedures
•  Resulting in the majority of outages being caused by
‘human error’
No standard existed to help Owners/Operators determine
•  Common language/vocabulary of data center operations
•  Focus of data center management
•  Resource allocation
•  Resource requirements
Genesis of Industry Best Practices
14
Data Center Owners / Operators / End Users
•  Increased availability and cost savings
•  Multi-site consistency
•  Benchmark for continuous monitoring and refinement
Colocation / Managed Services Sites
•  All of the above plus…
•  Customer assurance of consistency
•  Competitive differentiator (attain & retain certification)
Industry Benchmark
•  No need to rely on opinions and anecdotes
Value of Industry Best Practices
15
Uptime Institute has been conducting Operational
Sustainability Reviews for approximately 3 years—
based upon decades of site operations knowledge and
experience:
•  Operational Sustainability Certifications: Tier + Gold, Silver, or Bronze
•  Management & Operations (M&O) Stamps of Approval
See http://uptimeinstitute.com/publications for
Tier Standard: Operational Sustainability
Best Practices Reviews
16
Staffing
•  Inadequate staffing
•  Excessive overtime (over 10%)
•  No escalation process
Qualification
•  No list of required qualifications
•  No experience with data center specific equipment
Organization
•  Roles and Responsibilities not documented
•  Data center organization not integrated
Staffing and Organization Significant Findings
17
Preventive Maintenance (PM)
•  No list of required PM activities
•  PM activities not fully scripted
•  No quality control process
Housekeeping
•  Combustibles in the data center
•  No documented housekeeping policy
Maintenance Management System (MMS)
•  No list of equipment
•  Missing critical data: warranty info, maintenance history, performance
data, etc.
Maintenance Significant Findings
18
Vendor Support
•  Contracts missing response times, call-in process, detailed SOW, or
technician qualifications
Deferred Maintenance
•  Unable to produce deferred maintenance report from MMS
Predictive Maintenance
•  No predictive maintenance program
•  Not comparing current results with previous results
Maintenance Significant Findings
19
Life-Cycle Planning
•  No life-cycle plan
•  Not using MMS data to develop plan
Failure Analysis
•  No record of outages or near misses
Maintenance Significant Findings
20
Data Center Staff
•  Undocumented On-the-Job Training (OJT) programs
•  No formal qualification program
•  No list of training required by position
•  No formal training program with lesson plans, etc.
Vendors
•  No briefing for escorted vendors
Training Significant Findings
21
Load Management
•  Alarm settings not documented
•  Alarms not set on PDUs to ensure maximum loads are not exceeded
Operating Set Points
•  Cooling set points are not documented or part of the
Change Management Process
•  Changing of set points is not controlled
Operating Conditions Significant Findings
22
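The PDU alarm finding above can be illustrated with a threshold check. This is a sketch under stated assumptions: the slides only say alarms should ensure maximum loads are not exceeded, so the 80% alarm fraction, the function name, and the sample readings are hypothetical and should be replaced with the site's documented set points.

```python
def pdu_alarm(load_amps: float, rated_amps: float, alarm_fraction: float = 0.8) -> bool:
    """Return True when the measured load reaches the alarm threshold.

    alarm_fraction is an assumed set point (80% of rated capacity);
    the source does not specify a value.
    """
    if rated_amps <= 0:
        raise ValueError("rated_amps must be positive")
    if not 0 < alarm_fraction <= 1:
        raise ValueError("alarm_fraction must be in (0, 1]")
    return load_amps >= alarm_fraction * rated_amps


# Hypothetical readings for two PDUs rated at 30 A each.
for name, load in [("PDU-A", 26.0), ("PDU-B", 20.0)]:
    state = "ALARM" if pdu_alarm(load, rated_amps=30.0) else "ok"
    print(name, load, state)
```

Keeping the alarm fraction as a documented parameter, rather than a value typed into each device, is one way to bring it under the same change control the slide asks for with cooling set points.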
Site Policies
•  Missing Site Policies
•  Especially Site Configuration Policy
Reference Library
•  No process for keeping documents up-to-date
Capacity Management
•  No process for forecasting future space, power, and cooling
requirements
•  No active tracking of cooling capacity
•  Ineffective management of Cold Aisles/Hot Aisles
•  Electrical power monitoring (balancing phases)
Planning, Coordination, and Management
Significant Findings
23
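The phase-balancing item above can be tracked with one number per panel or PDU. The sketch below assumes a common definition of imbalance (largest deviation of any phase from the three-phase average, as a percentage of that average); the slides do not give a formula or an acceptable limit, and the function name and readings are hypothetical.

```python
def phase_imbalance_pct(a: float, b: float, c: float) -> float:
    """Percent imbalance of three phase currents.

    Assumed definition: max deviation from the average, divided by the
    average, times 100. Not taken from the source slides.
    """
    avg = (a + b + c) / 3
    if avg <= 0:
        raise ValueError("average phase current must be positive")
    return max(abs(x - avg) for x in (a, b, c)) / avg * 100


# Hypothetical phase currents in amps for one panel.
print(round(phase_imbalance_pct(90.0, 100.0, 110.0), 1))
```

Logging this value over time gives a trend that shows whether loads are consistently out of balance, instead of relying on a one-off reading.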
Facilities
•  Operate and maintain the critical facility infrastructure
•  Support the installation of IT equipment (space, power, & cooling)
IT Management
•  Operate and maintain IT hardware, software, applications, and
network connectivity
•  Manage the installation/de-installation of IT equipment
Security
•  Access Control
•  Physical Security
Typical Data Center Disciplines
24
Functionally Separate Organization
•  Corporate Real Estate (Facilities)
•  IT
•  Security
Communication between organizations was
typically poor
•  Data center activities conducted without coordination
•  Poor future space, power, and cooling planning
No individual responsible for all aspects of operating
a data center
Past Organizational Structures
25
Factors driving changes to organizational structure
•  Rapid changes in technology and speed at which capacity must be
brought online
•  Increased costs associated with IT and Facilities
•  Business objectives of continuous computing availability
Legacy organizations could not accommodate quickly
evolving business requirements
•  Slow to respond
•  Not integrated
Evolving Organizational Structure
26
The value of industry best practices is in the process of
continuous improvement
•  Discovery leads to learning
•  Learning leads to change
•  Change leads to improvement
•  Regular reviews lead to discovery
•  Crises can be avoided
Summary
27
For more information contact:
Julian Kudritzki
jkudritzki@uptimeinstitute.com
206.706.4143
Questions?
28

Streamlining Python Development: A Guide to a Modern Project Setup
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 

Is your data center on the verge of a crisis?

  • 1. Is your data center on the verge of a crisis? Julian Kudritzki, Chief Operating Officer, Uptime Institute. © 2014 Uptime Institute
  • 2. What Defines a Crisis?
  • 3. Tour of Operational Computer Room
  • 4. Looking for Clues
  • 5. Tour of ‘Live’ Critical Spaces
  • 6. Daily Practices Compromise Uptime, Safety, and Security
  • 7. What Else Is Going On?
      • Overtime hours exceeding 10%
      • Voice mail boxes full
      • Emails not responded to
      • Email inbox size limit exceeded
      • Meetings missed or routinely cancelled
      • No time for training
      • Shortage of qualified staff
      • Personnel performing work outside their competency
      • Everything is an emergency
      • Personnel turnover
  • 8. The Issues Add Up
      • Break-fix budget exceeded
      • Maintenance budget exceeded
      • Energy cost estimate exceeded or unknown
      • Last-minute deployment requirements
      • No organization chart
      • No responsibilities matrix
      • No records of maintenance activities
      • No written policies & procedures
      • No preventive maintenance schedule
      • Back of the server looks like a spaghetti pot exploded
  • 9. The Issues Add Up
      • Cabling is not labeled or, worse, incorrectly labeled
      • Equipment is not uniquely labeled
      • Loads are consistently out of balance
      • Capacities are not managed or tracked
      • Deferred maintenance exceeds 10%
      • Housekeeping: if it looks like a mess, it is a mess
    Maybe you don’t have a crisis, but how do you know how well your data center operation compares to the rest of the industry?
  • 10. Ask the Tough Questions
    Are you confident in your Facilities team’s capability to manage a technologically advanced and highly efficient design to your 24 x 7 uptime requirements?
      • Can you easily replace any member of that team?
      • Are you protected against poor operations practices migrating from older sites to higher criticality data centers?
      • Do you have sites that operate in isolation, ignoring global corporate standards?
      • Do you even have corporate global standards?
      • If you outsource any aspect of your data center operations, how do you avoid losing responsibility and accountability?
      • Do you manage an outsourcing contract... or direct an expert team?
  • 11. Path to Data Center Operations Success
      • Initial review
      • Gap analysis against industry best practices
          • Staffing and Organization
          • Maintenance
          • Training
          • Planning, Coordination & Management
          • Operating Conditions
      • Roadmap to operational excellence
      • Plan changes
      • Implement changes
      • Monitor & refine
      • Annual review
  • 12. Key Elements of Facilities Management
    Staffing and Organization
      • Staffing
      • Qualifications
      • Organization
    Maintenance
      • Preventive Maintenance (PM) Program
      • Housekeeping Policies
      • Maintenance Management System (MMS)
      • Vendor Support
      • Deferred Maintenance Program
      • Predictive Maintenance
      • Life-Cycle Planning
      • Failure Analysis Program
  • 13. Key Elements of Facilities Management
    Training
      • Data Center Staff
      • Vendors
    Planning, Coordination, and Management
      • Site Policies
      • Financial Management
      • Reference Library
      • Computer Room Management
    Operating Conditions
      • Load Management
      • Operating Set Points
      • Alternating Use of Infrastructure Equipment
  • 14. Genesis of Industry Best Practices
    Over the years, Uptime Institute has observed that management issues pose the largest risk to the uptime of physical infrastructure
      • Inadequate staffing
      • Ineffective or nonexistent maintenance and training programs
      • Lacking processes and procedures
      • Resulting in the majority of outages being caused by ‘human error’
    No standard existed to help Owners/Operators determine
      • Common language/vocabulary of data center operations
      • Focus of data center management
      • Resource allocation
      • Resource requirements
  • 15. Value of Industry Best Practices
    Data Center Owners / Operators / End Users
      • Increased availability and cost savings
      • Multi-site consistency
      • Benchmark for continuous monitoring and refinement
    Colocation / Managed Services Sites
      • All of the above plus…
      • Customer assurance of consistency
      • Competitive differentiator (attain & retain certification)
    Industry Benchmark
      • No need to rely on opinions and anecdotes
  • 16. Best Practices Reviews
    Uptime Institute has been conducting Operational Sustainability Reviews for approximately 3 years, based upon decades of site operations knowledge and experience:
      • Operational Sustainability Certifications: Tier + Gold, Silver, or Bronze
      • Management & Operations (M&O) Stamps of Approval
    See http://uptimeinstitute.com/publications for Tier Standard: Operational Sustainability
  • 17. Staffing and Organization Significant Findings
    Staffing
      • Inadequate staffing
      • Excessive overtime (over 10%)
      • No escalation process
    Qualification
      • No list of required qualifications
      • No experience with data center specific equipment
    Organization
      • Roles and responsibilities not documented
      • Data center organization not integrated
  • 18. Maintenance Significant Findings
    Preventive Maintenance (PM)
      • No list of required PM activities
      • PM activities not fully scripted
      • No quality control process
    Housekeeping
      • Combustibles in the data center
      • No documented housekeeping policy
    Maintenance Management System (MMS)
      • No list of equipment
      • Missing critical data: warranty info, maintenance history, performance data, etc.
  • 19. Maintenance Significant Findings
    Vendor Support
      • Contracts missing response times, call-in process, detailed SOW, or technician qualifications
    Deferred Maintenance
      • Unable to produce a deferred maintenance report from the MMS
    Predictive Maintenance
      • No predictive maintenance program
      • Not comparing current results with previous results
  • 20. Maintenance Significant Findings
    Life-Cycle Planning
      • No life-cycle plan
      • Not using MMS data to develop plan
    Failure Analysis
      • No record of outages or near misses
  • 21. Training Significant Findings
    Data Center Staff
      • Undocumented On-the-Job Training (OJT) programs
      • No formal qualification program
      • No list of training required by position
      • No formal training program with lesson plans, etc.
    Vendors
      • No briefing for escorted vendors
  • 22. Operating Conditions Significant Findings
    Load Management
      • Alarm settings not documented
      • Alarms not set on PDUs to ensure maximum loads are not exceeded
    Operating Set Points
      • Cooling set points are not documented or part of the Change Management Process
      • Changing of set points is not controlled
  • 23. Planning, Coordination, and Management Significant Findings
    Site Policies
      • Missing Site Policies
      • Especially Site Configuration Policy
    Reference Library
      • No process for keeping documents up-to-date
    Capacity Management
      • No process for forecasting future space, power, and cooling requirements
      • No active tracking of cooling capacity
      • Ineffective management of Cold Aisles / Hot Aisles
      • Electrical power monitoring (balancing phases)
  • 24. Typical Data Center Disciplines
    Facilities
      • Operate and maintain the critical facility infrastructure
      • Support the installation of IT equipment (space, power, & cooling)
    IT Management
      • Operate and maintain IT hardware, software, applications, and network connectivity
      • Manage the installation/de-installation of IT equipment
    Security
      • Access Control
      • Physical Security
  • 25. Past Organizational Structures
    Functionally Separate Organization
      • Corporate Real Estate (Facilities)
      • IT
      • Security
    Communication between organizations was typically poor
      • Data center activities conducted without coordination
      • Poor future space, power, and cooling planning
    No individual responsible for all aspects of operating a data center
  • 26. Evolving Organizational Structure
    Factors driving changes to organizational structure
      • Rapid changes in technology and speed at which capacity must be brought online
      • Increased costs associated with IT and Facilities
      • Business objectives of continuous computing availability
    Legacy organizations could not accommodate quickly evolving business requirements
      • Slow to respond
      • Not integrated
  • 27. Summary
    The value of industry best practices is in the process of continuous improvement
      • Discovery leads to learning
      • Learning leads to change
      • Change leads to improvement
      • Regular reviews lead to discovery
      • Crises can be avoided
  • 28. Questions?
    For more information contact: Julian Kudritzki, jkudritzki@uptimeinstitute.com, 206.706.4143
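
Several of the warning signs in the slides are numeric (overtime above 10%, deferred maintenance above 10%, loads out of balance), so they lend themselves to a simple automated check. The following is a minimal sketch, not anything from the presentation: the `SiteMetrics` fields, the function names, and the sample figures are all hypothetical. It assumes overtime is measured against regular scheduled hours, deferred maintenance against scheduled PM tasks, and phase imbalance as the largest deviation from the mean phase current. The 10% imbalance limit is an assumed placeholder, since the slides give no figure for it.

```python
from dataclasses import dataclass


@dataclass
class SiteMetrics:
    """Hypothetical monthly figures for one site; field names are illustrative."""
    regular_hours: float
    overtime_hours: float
    pm_tasks_scheduled: int
    pm_tasks_deferred: int
    phase_amps: tuple  # current on each of the three phases, in amps


def phase_imbalance(amps):
    """Largest deviation from the mean phase current, as a fraction of the mean."""
    mean = sum(amps) / len(amps)
    return max(abs(a - mean) for a in amps) / mean


def warning_signs(m, ratio_limit=0.10, imbalance_limit=0.10):
    """Return the warning signs triggered by the given metrics.

    ratio_limit mirrors the 10% figure quoted in the slides;
    imbalance_limit is an assumed value, not taken from the slides.
    """
    signs = []
    if m.overtime_hours / m.regular_hours > ratio_limit:
        signs.append("overtime above limit")
    if m.pm_tasks_deferred / m.pm_tasks_scheduled > ratio_limit:
        signs.append("deferred maintenance above limit")
    if phase_imbalance(m.phase_amps) > imbalance_limit:
        signs.append("phase loads out of balance")
    return signs


if __name__ == "__main__":
    # Made-up sample data: 12.5% overtime, 7.5% deferred PM, uneven phases.
    sample = SiteMetrics(
        regular_hours=1600,
        overtime_hours=200,
        pm_tasks_scheduled=40,
        pm_tasks_deferred=3,
        phase_amps=(100.0, 100.0, 130.0),
    )
    for sign in warning_signs(sample):
        print(sign)
```

A check like this only covers the measurable items; most of the findings listed above (missing policies, undocumented roles, no escalation process) still need a review by people.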