1. The secret to Major Incident Management is minor incident management
Bob Fishman
RobertFishman25@gmail.com
508-259-1467
2. A well-run Network Operations Center (NOC) keeps your business running smoothly
Performance
Minimize service interruptions
Rapid recovery
Ongoing support and maintenance
Well supported business functions
Prevent, detect, respond
A good NOC should be able to deal with even catastrophic
situations, like natural disasters, smoothly, confidently and
quickly.
How do we make a NOC run smoothly? By managing the
little stuff very, very well.
3. What makes a NOC Rock?
MONITORING
Smart monitoring means the right alerts with the right information.
TICKETING
Ticket each manual request and only actionable monitored events. Post-closure review of tickets is how we learn and improve.
4. What makes a NOC Rock?
TRAINING
Training is as much about expectations and approach as it is about specific knowledge and processes. Good training makes for good teams.
PROCESS
Every system needs a well-documented process. Good processes mean good responses. Good documentation means a consistent response no matter who’s on duty.
5. What makes a NOC Rock?
PEOPLE
A NOC needs capable, dedicated, trained people who feel like a team even when they aren’t all in the same location.
COMMUNICATION
Communication is essential between team members, OEMs, and the business. The leader sets the tone and process, and everyone participates.
6. Monitoring
Smart monitoring means the right alerts with the right information.
Alerts should be either:
• Automatically ticketed and properly assigned
• Automatically ticketed and closed when cleared
• Discarded
"Eyes on glass" should be avoided where possible
• Monotonous periods of inactivity lead to less-than-optimal human performance
• If eyes on glass are needed, a strict process must govern which events get ticketed
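The three alert outcomes listed above can be sketched as a small routing function. This is only an illustration: the `Alert` fields `actionable` and `self_clearing` are hypothetical flags, and the rule that maps them to an outcome is an assumption, not something the slides specify.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    TICKET_AND_ASSIGN = "ticket and assign"
    TICKET_AUTO_CLOSE = "ticket, close when cleared"
    DISCARD = "discard"


@dataclass(frozen=True)
class Alert:
    source: str
    actionable: bool     # hypothetical flag: does someone need to act on this?
    self_clearing: bool  # hypothetical flag: will the monitor send a "clear" event?


def disposition(alert: Alert) -> Disposition:
    """Map an alert to exactly one of the three outcomes (assumed rule)."""
    if not alert.actionable:
        return Disposition.DISCARD
    if alert.self_clearing:
        return Disposition.TICKET_AUTO_CLOSE
    return Disposition.TICKET_AND_ASSIGN
```

Because every alert falls into exactly one branch, nothing is left for a person to triage by watching a screen.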
7. Ticketing
Ticket each manual request and only actionable monitored events. Post-closure review of tickets is how we learn and improve.
• No work should be done that isn’t ticketed. Why? Tickets should contain a trail left by engineers. The ticket is an important record of what was done, by whom and why.
• Un-ticketed work leads to memory and procedural gaps that cause issues. Furthermore, it means your team is losing track of how they spend their time. That rarely ends well.
• Ticket types should include:
o Incidents, Service Requests, Changes and Problems
• Tickets containing thorough comments can lead to great Knowledge Base articles
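As a minimal sketch of the ticket types and the engineer's trail described above, a ticket can be modeled as a typed record with an append-only list of work notes. The class and field names here are hypothetical and do not come from any particular ticketing product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List


class TicketType(Enum):
    INCIDENT = "Incident"
    SERVICE_REQUEST = "Service Request"
    CHANGE = "Change"
    PROBLEM = "Problem"


@dataclass(frozen=True)
class WorkNote:
    author: str   # by whom
    action: str   # what was done
    reason: str   # why
    at: datetime  # when the note was recorded (UTC)


@dataclass
class Ticket:
    ticket_type: TicketType
    summary: str
    notes: List[WorkNote] = field(default_factory=list)

    def add_note(self, author: str, action: str, reason: str) -> None:
        """Append a note; existing notes are never edited or removed."""
        self.notes.append(WorkNote(author, action, reason, datetime.now(timezone.utc)))

    def trail(self) -> List[str]:
        """Render the trail in the order the work was done."""
        return [f"{n.author}: {n.action} (because {n.reason})" for n in self.notes]
```

A trail recorded this way is also the raw material for the post-closure review and for Knowledge Base articles.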
8. Process
Every system needs a well-documented process. Good processes mean good responses. Good documentation means a consistent response no matter who’s on duty.
• Documented processes, particularly for lower-level engineers
• Process leads to repeatable, scalable, measurable outcomes with fewer errors
o The errors that do occur can also be reported on
• Undocumented process becomes institutional knowledge, and that knowledge may be lost when employees leave
• All work notes must be in the ticket
• If it isn’t in the ticket, it didn’t happen
• When, how and to whom to escalate the incident
• Well-defined shift hand-over steps and documentation
• When, in what format and to whom communications must be sent
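One way to document the "when, how and to whom" of escalation is to write it down as data rather than prose. The sketch below is an assumption about how that could look; the roles, timings and channels in `POLICY` are placeholders, not values from the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EscalationStep:
    after_minutes: int  # when: minutes since the ticket was opened
    contact: str        # to whom
    channel: str        # how


# Hypothetical policy; every value here is a placeholder.
POLICY: List[EscalationStep] = [
    EscalationStep(0, "on-duty NOC engineer", "ticket queue"),
    EscalationStep(30, "shift lead", "phone"),
    EscalationStep(60, "NOC manager", "phone"),
]


def due_escalations(policy: List[EscalationStep], elapsed_minutes: int) -> List[EscalationStep]:
    """Return every step whose time threshold has been reached, earliest first."""
    ordered = sorted(policy, key=lambda step: step.after_minutes)
    return [step for step in ordered if step.after_minutes <= elapsed_minutes]
```

With the policy written this way, whoever is on duty gets the same answer to "who should know by now?" regardless of shift.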
9. Training
Training is as much about expectations and approach as it is about specific knowledge and processes. Good training makes for good teams.
• Train for professional development
• A more knowledgeable workforce
• Ability to promote from within
• Train so employees understand the corporate values and responsibilities
• Helps the company communicate legal issues such as sexual harassment and safety to employees
10. Communication
Communication is essential between team members, OEMs, and the business. The leader sets the tone and process, and everyone participates.
• Well-documented communication processes
• Escalation to the next level up and notification of that escalation
• Stakeholder
• Internal
• External
• The who, what, when and how of each step
• Verbal communication
• All occurrences should be documented in the ticket
• Shift hand-overs
11. People
A NOC needs capable, dedicated, trained people who feel like a team even when they aren’t all in the same location.
• The core of any organization is its people
• Retain your best talent
• People must be working towards a common goal defined by the corporate entity
• They need defined duties
• Timely and accurate feedback is essential
• Trained employees feel empowered by the organization and able to move up in it
• This is a win-win scenario
13. What makes a NOC Rock?
MONITORING AND TICKETING
Monitoring and ticketing should need no changes if the system ticketing process correctly identifies Major Incidents, or P1s.
14. What makes a NOC Rock?
PROCESS
• A validated Major Incident (MI) needs its own process and documentation
• Who owns the ticket BECAUSE it is an MI
• Who notifies Engineering that there is an MI
• There should be a RACI document for a Major Incident
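The RACI document mentioned above (Responsible, Accountable, Consulted, Informed) can be kept as a small table and checked automatically. This is a sketch only: the activities and roles are hypothetical, and the check encodes a common convention (exactly one Accountable and at least one Responsible per activity) rather than a rule stated in the slides.

```python
from typing import Dict, List

# Hypothetical Major Incident RACI: activity -> {role: R, A, C or I}.
RACI: Dict[str, Dict[str, str]] = {
    "Own the MI ticket": {"Incident manager": "A", "NOC engineer": "R"},
    "Notify Engineering": {"Incident manager": "A", "NOC engineer": "R", "Engineering": "I"},
    "Restore service": {"Incident manager": "A", "Engineering": "R", "NOC engineer": "C"},
}


def validate(raci: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the activities that break the assumed convention."""
    problems = []
    for activity, roles in raci.items():
        codes = list(roles.values())
        if codes.count("A") != 1 or codes.count("R") < 1:
            problems.append(activity)
    return problems
```

Running a check like this before an MI occurs surfaces activities that nobody owns.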
15. What makes a NOC Rock?
TRAINING
• A Major Incident is no place for training
• Lower-level engineers can join the bridge and/or the shared video of troubleshooting, BUT this is a higher-level issue
• Engineers should be experienced and trained prior to being part of a Major Incident
16. What makes a NOC Rock?
COMMUNICATION
• This process and its associated tools should be well defined before an MI can be expected to be handled properly
• Who is responsible for communicating, when, to whom and how
• Who owns the escalation of the incident, if needed
17. What makes a NOC Rock?
PEOPLE
• People should be prepared for any MI
• They should understand the goals and SLAs
• Major Incidents can be stressful; understand how this may affect your staff
18. Good incident handling is the key to handling Major Incidents properly
Preparing for Major Incidents by taking care of the “normal” incidents will make your NOC Rock
19. There actually is no secret to Major Incident Management
To the end user there are no minor incidents
Bob Fishman
RobertFishman25@gmail.com
508-259-1467