The secret to Major Incident Management is minor incident management
Bob Fishman
RobertFishman25@gmail.com
508-259-1467
A well-run Network Operations Center (NOC) keeps your business running smoothly
Performance
Minimize service interruptions
Rapid recovery
Ongoing support and maintenance
Well supported business functions
Prevent, detect, respond
A good NOC should be able to deal with even catastrophic
situations, like natural disasters, smoothly, confidently and
quickly.
How do we make a NOC run smoothly? By managing the
little stuff very, very well.
What makes a NOC Rock?
MONITORING
Smart monitoring means the right alerts with the right information.
TICKETING
Ticket each manual request and only actionable monitored events. Post-closure review of tickets is how we learn and improve.
PROCESS
Every system needs a well-documented process. Good processes mean good responses. Good documentation means a consistent response no matter who’s on duty.
TRAINING
Training is as much about expectations and approach as it is about specific knowledge and processes. Good training makes for good teams.
COMMUNICATION
Communication is essential between team members, OEMs, and the business. Leaders set the tone and the process, and everyone participates.
PEOPLE
A NOC needs capable, dedicated, trained people who feel like a team even when they aren’t all in the same location.
Monitoring
Smart monitoring means the right alerts with the right information.
Alerts should either be:
• Automatically ticketed and properly assigned
• Automatically ticketed and closed when cleared
• Discarded
“Eyes on glass” monitoring should be avoided where possible:
• Monotonous periods of inactivity occur, which lead to less-than-optimal human performance
• If eyes on glass are needed, a strict process must be adhered to for deciding which events get ticketed
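The three alert outcomes listed above can be sketched as a small routing function. This is only an illustration: the `Alert` fields `needs_human` and `keep_record` are hypothetical classification flags, not something the deck defines, and a real monitoring platform would derive them from its own rules.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    TICKET_AND_ASSIGN = "ticket_and_assign"        # ticketed and routed to an owner
    TICKET_AND_AUTOCLOSE = "ticket_and_autoclose"  # ticketed, closed when the alert clears
    DISCARD = "discard"                            # never ticketed


@dataclass(frozen=True)
class Alert:
    source: str
    needs_human: bool   # hypothetical flag: an engineer must act on this alert
    keep_record: bool   # hypothetical flag: worth a ticket even if it clears by itself


def route_alert(alert: Alert) -> Disposition:
    """Map every alert to exactly one of the three dispositions."""
    if alert.needs_human:
        return Disposition.TICKET_AND_ASSIGN
    if alert.keep_record:
        return Disposition.TICKET_AND_AUTOCLOSE
    return Disposition.DISCARD
```

Because every alert falls into exactly one branch, nothing is left for a person to triage by watching a screen, which is the point of avoiding eyes on glass.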
Ticketing
Ticket each manual request and only actionable monitored events. Post-closure review of tickets is how we learn and improve.
• No work should be done that isn’t ticketed. Why? Tickets should contain the trail left by engineers. The ticket is an important record of what was done, by whom and why.
• Un-ticketed work leads to memory and procedural gaps that cause issues. Furthermore, it means your team is losing track of how they spend their time. That rarely ends well.
• Ticket types should include:
o Incidents, Service Requests, Changes and Problems
• Tickets containing thorough comments can lead to great Knowledge Base articles
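As a rough sketch of what "a trail left by engineers" could look like in data, the example below models the four ticket types named above and a work-note log recording what was done, by whom and why. The class and field names are hypothetical and not tied to any particular ticketing product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class TicketType(Enum):
    INCIDENT = "Incident"
    SERVICE_REQUEST = "Service Request"
    CHANGE = "Change"
    PROBLEM = "Problem"


@dataclass
class WorkNote:
    engineer: str   # by whom
    action: str     # what was done
    reason: str     # why
    at: datetime    # when (UTC)


@dataclass
class Ticket:
    ticket_type: TicketType
    summary: str
    notes: list[WorkNote] = field(default_factory=list)

    def add_note(self, engineer: str, action: str, reason: str) -> None:
        """Append one entry to the ticket's work trail."""
        self.notes.append(WorkNote(engineer, action, reason, datetime.now(timezone.utc)))

    def trail(self) -> list[str]:
        """Render the trail in order, e.g. for post-closure review."""
        return [f"{n.engineer}: {n.action} ({n.reason})" for n in self.notes]
```

A trail kept this way is also the raw material for the Knowledge Base articles mentioned above.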
Process
Every system needs a well-documented process. Good processes mean good responses. Good documentation means a consistent response no matter who’s on duty.
• Documented processes, particularly for lower-level engineers
• Process leads to repeatable, scalable, measurable outcomes with fewer errors
o The outcomes will contain fewer errors and can also be reported on
• Undocumented process becomes institutional knowledge, and that knowledge may be lost when employees leave
• All work notes must be in the ticket
• If it isn’t in the ticket, it didn’t happen
• When, how and to whom to escalate the incident
• Well-defined shift hand-over steps and documentation
• When, in what format and to whom communications must be sent
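One way to make "when, how and to whom to escalate" concrete is to write the escalation path down as data rather than leaving it as institutional knowledge. The sketch below assumes a simple time-based policy; the thresholds, roles and channels are invented for illustration and would come from the NOC's own documented process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStep:
    after_minutes: int   # when: minutes the incident has been open
    escalate_to: str     # to whom
    channel: str         # how


# Hypothetical policy values, for illustration only.
POLICY = [
    EscalationStep(0, "Tier 1 on-duty engineer", "ticket assignment"),
    EscalationStep(30, "Tier 2 engineer", "phone call"),
    EscalationStep(60, "NOC manager", "phone call and email"),
]


def current_step(policy: list[EscalationStep], minutes_open: int) -> EscalationStep:
    """Return the furthest escalation step whose time threshold has been reached."""
    reached = [step for step in policy if step.after_minutes <= minutes_open]
    if not reached:
        raise ValueError("minutes_open is earlier than the first escalation step")
    return max(reached, key=lambda step: step.after_minutes)
```

With the policy in one place, any engineer on any shift gets the same answer for the same incident age, which is the consistency the slide asks for.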
Training
Training is as much about expectations and approach as it is about specific knowledge and processes. Good training makes for good teams.
• Train for professional development
• A more knowledgeable workforce
• Ability to promote from within
• Train so employees understand the corporate values and responsibilities
• Helps the company communicate legal issues such as sexual harassment and safety to employees
Communication
Communication is essential between team members, OEMs, and the business. Leaders set the tone and the process, and everyone participates.
• Well-documented communication processes
• Escalation to the next level up and notification of that escalation
• Stakeholders
o Internal
o External
• The who, what, when and how of each step
• Verbal communication
o All occurrences should be documented in the ticket
• Shift hand-overs
People
A NOC needs capable, dedicated, trained people who feel like a team even when they aren’t all in the same location.
• The core of any organization is its people
• Retain your best talent
• People must be working towards a common goal defined by the corporate entity
• They need defined duties
• Timely and accurate feedback is essential
• Trained employees feel empowered by the organization and empowered to move up in it
• This is a win-win scenario
What makes a NOC Rock?
MONITORING and TICKETING
Should need no changes if the system ticketing process correctly identifies Major Incidents (MIs) or P1s.
PROCESS
• A validated MI needs its own process and documentation
• Who owns the ticket because it is an MI
• Who notifies Engineering that there is an MI
• There should be a RACI document for a Major Incident
TRAINING
• A Major Incident is no place for training
• Lower-level engineers can join the bridge and/or the shared video of troubleshooting, but this is a higher-level issue
• Engineers should be experienced and trained prior to being part of a Major Incident
COMMUNICATION
• This process and its associated tools should be well defined before an MI can be expected to be handled properly
• Who is responsible for communicating, when, to whom and how
• Who owns the escalation of the incident, if needed
PEOPLE
• People should be prepared for any MI
• They should understand the goals and SLAs
• Major Incidents can be stressful; understand how this may affect your staff
Incident handling is the key to success for proper handling of Major Incidents
Preparing for Major Incidents by taking care of the “normal” incidents will make your NOC Rock
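The deck calls for a RACI document for a Major Incident: who owns the ticket, who notifies Engineering, who communicates, who escalates. A minimal sketch of such a matrix as data follows. The activities echo the questions raised in the slides, but the roles and assignments are hypothetical placeholders, and the check relies on the common RACI convention of exactly one Accountable role per activity.

```python
from enum import Enum


class Raci(Enum):
    RESPONSIBLE = "R"
    ACCOUNTABLE = "A"
    CONSULTED = "C"
    INFORMED = "I"


# Hypothetical roles and assignments; a real RACI names the organization's own.
MI_RACI = {
    "Own the MI ticket": {
        "Incident manager": Raci.ACCOUNTABLE,
        "NOC shift lead": Raci.RESPONSIBLE,
        "Engineering": Raci.CONSULTED,
    },
    "Notify Engineering of the MI": {
        "Incident manager": Raci.ACCOUNTABLE,
        "NOC shift lead": Raci.RESPONSIBLE,
        "Engineering": Raci.INFORMED,
    },
    "Send stakeholder communications": {
        "Incident manager": Raci.ACCOUNTABLE,
        "Communications lead": Raci.RESPONSIBLE,
        "Business stakeholders": Raci.INFORMED,
    },
    "Escalate the incident": {
        "Incident manager": Raci.ACCOUNTABLE,
        "NOC shift lead": Raci.RESPONSIBLE,
        "NOC manager": Raci.INFORMED,
    },
}


def who(raci: dict, activity: str, level: Raci) -> list[str]:
    """List the roles holding a given RACI level for one activity."""
    return [role for role, assigned in raci[activity].items() if assigned is level]


def missing_single_owner(raci: dict) -> list[str]:
    """Return activities that do not have exactly one Accountable role."""
    return [
        activity
        for activity, roles in raci.items()
        if sum(1 for assigned in roles.values() if assigned is Raci.ACCOUNTABLE) != 1
    ]
```

Running `missing_single_owner` against the matrix before an MI ever happens is a cheap way to confirm that ownership questions have already been answered.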
There actually is no secret to
Major Incident Management
To the end user there are
no minor incidents
Bob Fishman
RobertFishman25@gmail.com
508-259-1467
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 

Major Incident - make your NOC Rock

  • 1. Is minor incident management the secret to Major Incident Management? Bob Fishman, RobertFishman25@gmail.com, 508-259-1467
  • 2. A well-run Network Operations Center (NOC) keeps your business running smoothly: performance; minimized service interruptions; rapid recovery; ongoing support and maintenance; well-supported business functions; prevent, detect, respond. A good NOC should be able to deal with even catastrophic situations, like natural disasters, smoothly, confidently and quickly. How do we make a NOC run smoothly? By managing the little stuff very, very well.
  • 3. What makes a NOC Rock?
      • 1. MONITORING: Smart monitoring means the right alerts with the right information.
      • 2. TICKETING: Ticket each manual request and only actionable monitored events. Post-closure review of tickets is how we learn and improve.
  • 4. What makes a NOC Rock?
      • 3. PROCESS: Every system needs a well-documented process. Good processes mean good responses. Good documentation means a consistent response no matter who’s on duty.
      • 4. TRAINING: Training is as much about expectations and approach as it is about specific knowledge and processes. Good training makes for good teams.
  • 5. What makes a NOC Rock?
      • 5. COMMUNICATION: Communication is essential between team members, OEMs, and the business. The leader sets a tone and process, and everyone participates.
      • 6. PEOPLE: A NOC needs capable, dedicated, trained people who feel like a team even when they aren’t all in the same location.
  • 6. Monitoring (1): Smart monitoring means the right alerts with the right information.
      • Alerts should be one of:
          o Automatically ticketed and properly assigned
          o Automatically ticketed and closed when cleared
          o Discarded
      • Eyes on Glass should be avoided where possible:
          o Monotonous periods of inactivity lead to less-than-optimal human performance
          o If Eyes on Glass is needed, a strict process must be followed as to which events get ticketed
  • 7. Ticketing (2): Ticket each manual request and only actionable monitored events. Post-closure review of tickets is how we learn and improve.
      • No work should be done that isn’t ticketed. Why? Tickets should contain a trail left by engineers. The ticket is an important record of what was done, by whom and why.
      • Un-ticketed work leads to memory and procedural gaps that cause issues. Furthermore, it means your team is losing track of how they spend their time. That rarely ends well.
      • Ticket types should include Incidents, Service Requests, Changes and Problems.
      • Tickets containing thorough comments can lead to great Knowledge Base articles.
  • 8. Process (3): Every system needs a well-documented process. Good processes mean good responses. Good documentation means a consistent response no matter who’s on duty.
      • Documented processes, particularly for lower-level engineers
      • Process leads to repeatable, scalable, measurable outcomes with fewer errors
          o The errors that remain can also be reported on
      • Undocumented process becomes institutional knowledge, and that knowledge may be lost when employees leave
      • All work notes must be in the ticket
      • If it isn’t in the ticket, it didn’t happen
      • When, how and to whom to escalate the incident
      • Well-defined shift hand-over steps and documentation
      • When, in what format and to whom communications must be sent
  • 9. Training (4): Training is as much about expectations and approach as it is about specific knowledge and processes. Good training makes for good teams.
      • Train for professional development
      • A more knowledgeable workforce
      • Ability to promote from within
      • Train so employees understand the corporate values and responsibilities
      • Helps the company communicate legal issues such as sexual harassment and safety to employees
  • 10. Communication (5): Communication is essential between team members, OEMs, and the business. The leader sets a tone and process, and everyone participates.
      • Well-documented communication processes
      • Escalation to the next level up and notification of that escalation
      • Stakeholders, internal and external
      • The who, what, when and how of each step
      • Verbal communication
      • All occurrences should be documented with the ticket
      • Shift hand-overs
  • 11. People (6): A NOC needs capable, dedicated, trained people who feel like a team even when they aren’t all in the same location.
      • The core of any organization is its people
      • Retain your best talent
      • People must be working towards a common goal defined by the corporate entity
      • They need defined duties
      • Timely and accurate feedback is essential
      • Trained employees feel empowered by, and to move up in, the organization
      • This is a win-win scenario
  • 12. What makes a NOC Rock? Monitoring, Ticketing, Process, Training, Communication and People.
  • 13. What makes a NOC Rock? Monitoring and Ticketing: should need no changes if the system ticketing process correctly identifies Major Incidents (MIs) or P1s.
  • 14. What makes a NOC Rock? Process:
      • A validated MI needs its own process and documentation
      • Who owns the ticket BECAUSE it is an MI
      • Who notifies Engineering that there is an MI
      • There should be a RACI document for a Major Incident
  • 15. What makes a NOC Rock? Training:
      • A Major Incident is no place for training
      • Lower-level engineers can join the bridge and/or the shared video of troubleshooting, BUT this is a higher-level issue
      • Engineers should be experienced and trained prior to being part of a Major Incident
  • 16. What makes a NOC Rock? Communication:
      • This process and associated tools should be well defined prior to expecting an MI to be handled properly
      • Who is responsible for communicating when, to whom and how
      • Who owns the escalation of the incident, if needed
  • 17. What makes a NOC Rock? People:
      • People should be prepared for any MI
      • They should understand the goals and SLAs
      • Major Incidents can be stressful; understand how this may affect your staff
  • 18. Incident handling is the key to success for proper handling of Major Incidents. Preparing for Major Incidents by taking care of the “normal” incidents will make your NOC Rock.
  • 19. There actually is no secret to Major Incident Management. To the end user there are no minor incidents. Bob Fishman, RobertFishman25@gmail.com, 508-259-1467
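
The Monitoring slide says every alert should end up in one of three places: ticketed and assigned, ticketed and closed when it clears, or discarded. A minimal Python sketch of that routing rule follows. It is an illustration only, not the author's tooling: the `Alert` fields, the `Disposition` names and the `classify` function are all hypothetical, and the deck does not say how a NOC decides that an alert is actionable or self-clearing.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    """The three outcomes the Monitoring slide allows for an alert."""
    TICKET_AND_ASSIGN = "automatically ticketed and properly assigned"
    TICKET_AND_AUTOCLOSE = "automatically ticketed and closed when cleared"
    DISCARD = "discarded"


@dataclass(frozen=True)
class Alert:
    # All three fields are assumptions made for this sketch.
    source: str          # which monitored system raised the alert
    actionable: bool     # does someone (or something) need to act on it?
    self_clearing: bool  # will monitoring send a matching "clear" event?


def classify(alert: Alert) -> Disposition:
    """Route an alert so that nothing is left for a human to watch by eye."""
    if not alert.actionable:
        # Only actionable monitored events are ticketed.
        return Disposition.DISCARD
    if alert.self_clearing:
        # Assumed rule: keep a record, and close it when the clear arrives.
        return Disposition.TICKET_AND_AUTOCLOSE
    return Disposition.TICKET_AND_ASSIGN


if __name__ == "__main__":
    examples = [
        Alert("core-router", actionable=True, self_clearing=False),
        Alert("wan-link", actionable=True, self_clearing=True),
        Alert("debug-log", actionable=False, self_clearing=False),
    ]
    for alert in examples:
        print(f"{alert.source}: {classify(alert).value}")
```

Because every alert maps to exactly one disposition, there is no fourth bucket of events that someone has to watch on a screen, which is the point of avoiding Eyes on Glass.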