John Hudson & Matt Fourie
5 November 2012

Go Direct to the Root Cause - itRCA the solution?
Agenda
“Most incident investigators ask the wrong questions, so do not change your people but change the questions they are asking”
Matt Fourie

• Introduction
• Current situation
• Components of a credible approach
• Minimalistic information, being specific and knowledge (wisdom) creation
• The Three critical investigation skills
  1. Service Recovery Analysis
  2. Technical Cause Analysis
  3. Root Cause Analysis
• Client outcomes
• Questions & answers
Thinking Dimensions
• Thinking Dimensions International - operating KEPNERandFOURIE company initiatives for the last 25 years
• Specialising in RCA Methodology for IT Incident and Problem Management

Some of our recent clients...
Barclays IT
ANZ IT Division
Macquarie ITG
Unisys
Polypore IT
Medtronic IT
SITA Global
BT Financial
Westpac IT
McDonalds IT
Queensland Police IT
Lockheed Martin Space Systems
SPARQ IT
Global Presence
• Baxter International
• Blue Cross Blue Shield
• Bosch
• Caltex Oil
• Carraro
• Crown Cork and Seal
• Dometic
• Electrolux
• Federal Judiciary Center
• General Dynamics IT
• Hollister, Inc
• Infineon
• BASF
• Macquarie Bank IT
• BT Financial IT
• Stihl
• Westpac IT
• Maersk
• Norfolk Naval Shipyard
• Selig
• Siemens
• SITA
• SKF

Americas
• Canada
• Chile
• Peru
• USA

EMEA
• Germany
• Italy
• Netherlands
• Poland
• Saudi Arabia
• South Africa
• Spain
• Turkey
• United Kingdom

Asia Pacific
• Australia
• China
• India
• South Korea
• Thailand
• Singapore
The Current Dilemma
PAST
NOW
FUTURE
STANDARD

itTCA® - TECHNICAL CAUSE ANALYSIS
itSRA® - SERVICE RECOVERY ANALYSIS
itRCA® - ROOT CAUSE ANALYSIS
The Three Skills…
Incident
1. itSRA® - Service Recovery Analysis: Recovery & Containment Tools & Templates
2. itTCA® - Technical Cause Analysis: Technical Cause Process and Techniques
3. itRCA® - Root Cause Analysis: Root Cause & FIX Checklist & Templates
Current Default Root Causes
• Hardware
• Software

• “Human Error”
• Environment

Technical Cause

Root Cause
Incisive Thinking
Incident Statement | Technical Cause | Root Cause
Internet Banking Degrading | New browser configuration issue | Integrative testing not done properly
Encrypted “hello” message not returned | ‘Beta’ Certificate used | Policy requirements for “production” environment not adhered to
Incisive Thinking
Incident Statement | Technical Cause | Root Cause
G-Force System Freezing | High volume | Too many users allowed access
G-Force SQL DB thread count exceeding maximum | G-Force program not closing out threads | Vendor implemented an untested program update
Basic phases of problem solving
Procedure for addressing an Incident

Divergent Thinking
1. State the purpose
2. Gather incident/problem detail

Convergent Thinking
3. Evaluate for causes
4. Confirm technical/root cause
   1. Testing
   2. Verifying cause
Good RCA…
YOU NEED TO SOLVE AN INCIDENT:
• QUICKLY [Service Recovery]
• ACCURATELY [Technical Cause]
• PERMANENTLY [Root Cause]
Factors in minimalistic approach
Factor | IS | BUT NOT
What
Where
When
How
Why
Who

I keep six honest serving-men
(They taught me all I knew);
Their names are What and Why and When
And How and Where and Who.
I send them over land and sea,
I send them east and west;
But after they have worked for me,
I give them all a rest.
Rudyard Kipling
Extreme Focus With “Specificity”
“The key to success is to be insistent about specificity - the more specific you are the better your chances to solve an incident.”
KEPNERandFOURIE

Object | Fault
Servers | Not communicating
Servers | Data not transferred. Sent but not received by receiving servers
Data for Large Outlets | Not received
Sales turnover numbers for Large Outlets | Not received

Specificity Rules
• One object one fault
• Single-minded & simplistic
• Highly focused
• Must find the correct entry point
• Ask a question - expect an answer
Creating Intelligence
IS (DATA) | BUT NOT (INFORMATION) | WHY NOT (KNOWLEDGE)
Internet Banking | Intranet Banking | Different routing, SSL handshake
Slow | Freezing | Volume?
APAC users | USA, UK | ADSL lines
Started Oct 1 | Before | New passwords
Continuous | After 4pm | Different routing

Unexpected Outcomes
• “BUT NOT” clarifies the facts
• Creates a curious “contrast”
• Looking at answers at a “granular level”
• Stimulates deductive reasoning
Service Recovery [MTR]
FACTOR | IS | BUT NOT | REQUIREMENT
OBJECT | Mobile website access | PC website access | WHAT TO RESTORE
FAULT | Denied - not authorized | Slow/freezing | WHAT PROBLEMS TO REMOVE
WHO | Blackberry users | Other Smart phones | WHO
WHERE | Asia | ANZ, UK, USA | WHERE
IMPACT | Customer complaints | | TO WHAT EXTENT
PATTERN | Sporadic | Continuous | FOR HOW LONG
ACTIONS TO CONSIDER
Service Recovery [MTR]
Statement: Restore website access to customers

Key Solution Requirements | Various actions to meet key requirements: 1 | 2 | 3 | 4 | 5
1. Provide access to client to at least receive interim non-availability notice | 0 | 3 | 2 | 1 | 3
2. No loss of Data | 3 | 3 | 0 | 0 | 1
3. Should not impact System Performance | 1 | 0 | 3 | 1 | 0
4. ADSL compatible for Asia | 1 | 2 | 0 | 0 | 0
5. Improve reliability | 3 | 0 | 3 | 1 | 1
6. Implementation within the hour | 1 | 3 | 3 | 1 | 2

Possible Actions:
1. Upload or switch on simple site maintenance page
2. Set up or start up back up service
3. Reroute 20/80 service all to back up service
4. Restrict access to low load tasks only
5. Allow access based on region
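
The requirement-by-action scores on the Service Recovery slide can be totalled per action to compare the candidate recovery actions. The sketch below is only an illustration: the slide does not state the scoring scale or how the scores are combined, so it assumes that a higher score means a better fit and that an unweighted sum per action is a reasonable first comparison. The names `SCORES`, `total_per_action` and `rank_actions` are hypothetical.

```python
# Scores copied from the slide: one row per key requirement (1-6),
# one column per possible action (1-5).
ACTIONS = [
    "Upload or switch on simple site maintenance page",
    "Set up or start up back up service",
    "Reroute 20/80 service all to back up service",
    "Restrict access to low load tasks only",
    "Allow access based on region",
]

SCORES = [
    [0, 3, 2, 1, 3],  # 1. Interim non-availability notice reaches the client
    [3, 3, 0, 0, 1],  # 2. No loss of Data
    [1, 0, 3, 1, 0],  # 3. Should not impact System Performance
    [1, 2, 0, 0, 0],  # 4. ADSL compatible for Asia
    [3, 0, 3, 1, 1],  # 5. Improve reliability
    [1, 3, 3, 1, 2],  # 6. Implementation within the hour
]


def total_per_action(scores):
    """Sum each column, giving one total per action (assumed: unweighted)."""
    return [sum(column) for column in zip(*scores)]


def rank_actions(actions, scores):
    """Return (action, total) pairs, highest total first; ties keep slide order."""
    totals = total_per_action(scores)
    return sorted(zip(actions, totals), key=lambda pair: pair[1], reverse=True)


for action, total in rank_actions(ACTIONS, SCORES):
    print(f"{total:2d}  {action}")
```

Under these assumptions actions 2 and 3 tie on 11, followed by action 1 on 9, action 5 on 7 and action 4 on 4. A tie like this is a prompt to look at the individual requirements (for example "No loss of Data") rather than a final answer.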
Technical Cause Analysis [TCA - MTTR]
IS | BUT NOT | WHY NOT
OBJECT: What object and which other object(s) not?
FAULT: What fault and which other typical faults not?
USERS: Who has the problem and who does not?
WHERE: Where are these users and where could they have been but are not?
TIMING: When did it happen first time and when not?
PATTERN: What is the pattern of faults and what could it have been but is not?
CYCLE: In which cycle does the problem occur and in which cycle does it not occur?
Technical Cause Analysis [TCA]
DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | Dropping | Freezing, slow | Time out settings, configuration of drivers
Location of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade
Pattern | Continuous | Sporadic, Periodic | Don't know
Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page
Phase of Work | Just after logging in | Logging in or out | OS configuration issue, DNS issue
Possible Causes & Testing
1. Proxy server tampered with during the Java upgrade on the LAN | X
2. Java upgrade caused driver incompatibility with Fireburst website V2.0 | √ √ X
3. Netscape upgrade caused driver incompatibility with Fireburst website V2.0 | √ √ A1 √ √ √ √
A1 - Only if the staff in Asia did not upgrade to Netscape
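
The testing step above, where each possible cause is checked against the IS / BUT NOT facts and dropped at the first mismatch, can be sketched in a few lines. This is a hedged illustration rather than the KEPNERandFOURIE tooling: it assumes the marks for each cause line up with the table's dimensions from left to right, and the names `Hypothesis` and `verdict` are hypothetical.

```python
from dataclasses import dataclass, field

# Dimension names as they appear in the IS / BUT NOT table.
DIMENSIONS = [
    "Object", "Fault", "Location of Object", "Timing",
    "Pattern", "Life Cycle", "Phase of Work",
]

PASS, FAIL = "√", "X"


@dataclass
class Hypothesis:
    cause: str
    # Marks in slide order; anything other than PASS or FAIL is treated
    # as a named assumption that must hold (for example "A1").
    marks: list = field(default_factory=list)


def verdict(h: Hypothesis) -> str:
    """Summarise one possible cause from its test marks."""
    if FAIL in h.marks:
        # Assumption: marks are ordered like DIMENSIONS, left to right.
        return f"ruled out at {DIMENSIONS[h.marks.index(FAIL)]}"
    if len(h.marks) < len(DIMENSIONS):
        return "not fully tested"
    assumptions = [m for m in h.marks if m != PASS]
    if assumptions:
        return "fits all facts, given " + ", ".join(assumptions)
    return "fits all facts"


hypotheses = [
    Hypothesis("Proxy server tampered with during the Java upgrade on the LAN",
               [FAIL]),
    Hypothesis("Java upgrade caused driver incompatibility with Fireburst website V2.0",
               [PASS, PASS, FAIL]),
    Hypothesis("Netscape upgrade caused driver incompatibility with Fireburst website V2.0",
               [PASS, PASS, "A1", PASS, PASS, PASS, PASS]),
]

for h in hypotheses:
    print(f"{h.cause}: {verdict(h)}")
```

With the marks from the slide, only the Netscape upgrade hypothesis survives, and only if assumption A1 (staff in Asia did not upgrade to Netscape) is verified.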
A Case of a good thinking process
• Deviation Statement
• Factor Analysis
• Possible causal factors
• Testing the causal hypotheses
• Find the underlying reason(s) for incident

“The truth, if it exists, is in the details”
Bartlett - Familiar Quotations
The Right Starting Point
• Find the technical cause first
• Do 5 Whys to get to the systemic level
• Find the root cause(s)
• Fix the incident/problem for good

“If a team has not solved an incident, the person with the information was not invited”
Chuck Kepner
Four Questions to get Started
• Is the object deviation within the control of your own system? Can you fix the root cause with actions under your control?
• Is the technical cause deviation in the vendor's system? Can you only fix the root cause with the vendor's help?
• Is the object deviation within the control of your own system? Can you only fix the root cause with the vendor's help?
• Is the technical cause deviation in the vendor's system? We would only be able to take avoiding actions.
Labels: ITRCA | Max4 | ITRCA Max4 | RiskWise
Root Cause Analysis [RCA]
DIMENSION | IS | BUT NOT
APPLICATION: What application and which other applications not?
DEVIATION: What deviation do we have and which ones not?
FUNCTION: Which job/function/process is involved and which ones not?
WHO (USERS): Who has the problem and who does not?
WHERE: Where are these users and where could they have been but are not?
TIMING: When did it happen first time and when not?
FREQUENCY: How frequent is the fault occurring?
Root Cause Analysis [RCA]
COMPONENT | CAUSAL FACTORS
Decision Making | Process and Collaboration for inputs
Implementation issues | Resources and Scope & Definition of project
Standard Operating Procedures | Applicability of SOP and Awareness of SOP
Management | Management of Work and Staff
Measurement | KPIs and Roles & Responsibilities

CAUSAL ELEMENTS
• Critical stakeholder requirements not consulted for this task
• Inadequate authority levels for making good decisions
• Poor decision process and documentation for this task
• Inadequate standards guiding the decision making
• Time Zone difficulties hampering effective decision making
• Unrealistic time, cost and performance expectations
• Poor initial estimation of resources needed for the project
• Poor updated approval data making the procedure unclear
• Poor work guidance/coaching for correct performance
• Work standards for this task are not enforced
• Poor management support in getting this task done
• KPI and metrics regarding this output not clear or absent
• Poor feedback on this KPI
• Duplication and GAPS making roles and responsibilities difficult
Root Cause Analysis [RCA] (continued)
COMPONENT | CAUSAL FACTORS
Support | Internal and External Vendor support
Communications | Clarity of communications and instructions
Work Environment | Task Interference and consequences
Skills | Complexity and applicability
Testing Practices | Procedures and requirements
Personal | Aptitude and Attitude

CAUSAL ELEMENTS
• Overuse of the SME causing sub-standard work
• Poor continual vendor support for this output
• Continual interruptions in performing the task
• Task performance request not properly understood
• Work environment not conducive for the demands of the task
• Unrealistic task and performance expectation for this task
• Not having enough experience with similar tasks
• No vendor training provided for new product and or service
• Poor risk analysis and decision pressure during testing
• Not all aspects tested and the test was incomplete
• Inadequate problem solving ability for this type of task
• Incumbent does not follow instructions or Standard Procedure
Testing the Hypothesis
The decision making process is too cumbersome to allow for own initiatives and the staff member must make a choice with given alternatives which is not most optimal for the situation

Final Conclusion and Action Plan:
1. The job incumbent did not get the necessary support to do his job under a pressure situation adding to task interference ✗
2. External vendor support for certain technical decisions was not available and that resulted in a less optimized decision choice.
3.
Additional Resources
“SOLVE IT” - Find a way to solve incidents quickly, accurately and permanently.

Introduction to Quantum ComputingGDSC PJATK
 
Spring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfSpring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfAnna Loughnan Colquhoun
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.francesco barbera
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 

Último (20)

Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
Introduction to Quantum Computing
Introduction to Quantum ComputingIntroduction to Quantum Computing
Introduction to Quantum Computing
 
Spring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfSpring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdf
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 

Information Technology - Discover the Root Cause and Develop a solution through structured processes

  • 1. John Hudson & Matt Fourie 5 November 2012 Go Direct to the Root Cause – itRCA the solution?
  • 2. Agenda “Most incident investigators ask the wrong questions, so do not change your people but change the questions they are asking” Matt Fourie • Introduction • Current situation • Components of a credible approach: minimalistic information, being specific and knowledge (wisdom) creation • The three critical investigation skills: 1. Service Recovery Analysis 2. Technical Cause Analysis 3. Root Cause Analysis • Client outcomes • Questions & answers
  • 3. Thinking Dimensions Some of our recent clients... Barclays IT ANZ IT Division Macquarie ITG Unisys Polypore IT Medtronic IT SITA Global BT Financial Westpac IT McDonalds IT Queensland Police IT Lockheed Martin Space Systems SPARQ IT • Thinking Dimensions International - operating KEPNERandFOURIE company initiatives for the last 25 years • Specialising in RCA Methodology for IT Incident and Problem Management
  • 4. Global Presence • Baxter International; Blue Cross Blue Shield; Bosch; Caltex Oil; Carraro; Crown Cork and Seal; Dometic; Electrolux; Federal Judiciary Center; General Dynamics IT; Hollister, Inc; Infineon; BASF; Macquarie Bank IT; BT Financial IT; Stihl; Westpac IT; Maersk; Norfolk Naval Shipyard; Selig; Siemens; SITA; SKF • Americas: Canada, Chile, Peru, USA • EMEA: Germany, Italy, Netherlands, Poland, Saudi Arabia, South Africa, Spain, Turkey, United Kingdom • Asia Pacific: Australia, China, India, South Korea, Thailand, Singapore
  • 5. The Current Dilemma PAST NOW FUTURE STANDARD itTCA® – TECHNICAL CAUSE ANALYSIS itSRA® – SERVICE RECOVERY ANALYSIS itRCA® – ROOT CAUSE ANALYSIS
  • 6. The Three Skills… Incident • 1. itSRA® Service Recovery Analysis: Recovery & Containment; Tools & Templates • 2. itTCA® Technical Cause Analysis: Technical Cause; Process and Techniques • 3. itRCA® Root Cause Analysis: Root Cause & FIX; Checklist & Templates
  • 7. Current Default Root Causes • Hardware • Software • “Human Error” • Environment Technical Cause Root Cause
  • 8. Incisive Thinking Incident Statement Internet Banking Degrading Technical Cause Root Cause
  • 9. Incisive Thinking Incident Statement Technical Cause Root Cause Internet Banking Degrading New browser configuration issue
  • 10. Incisive Thinking Incident Statement Technical Cause Root Cause Internet Banking Degrading New browser configuration issue Integrative testing not done properly
  • 11. Incisive Thinking Incident Statement Technical Cause Root Cause Internet Banking Degrading New browser configuration issue Integrative testing not done properly
  • 12. Incisive Thinking Incident Statement Technical Cause Root Cause Internet Banking Degrading New browser configuration issue Encrypted “hello” message not returned Integrative testing not done properly
  • 13. Incisive Thinking Incident Statement Technical Cause Root Cause Internet Banking Degrading New browser configuration issue Encrypted “hello” message not returned “Beta” Certificate used Integrative testing not done properly
  • 14. Incisive Thinking Incident Statement Technical Cause Root Cause Internet Banking Degrading New browser configuration issue Integrative testing not done properly Encrypted “hello” message not returned “Beta” Certificate used Policy requirements for “production” environment not adhered to
  • 15. Incisive Thinking Incident Statement G-Force System Freezing Technical Cause Root Cause
  • 16. Incisive Thinking Incident Statement Technical Cause Root Cause G-Force System Freezing High volume
  • 17. Incisive Thinking Incident Statement Technical Cause Root Cause G-Force System Freezing High volume Too many users allowed access
  • 18. Incisive Thinking Incident Statement Technical Cause Root Cause G-Force System Freezing High volume G-Force SQL DB thread count exceeding maximum Too many users allowed access
  • 19. Incisive Thinking Incident Statement Technical Cause Root Cause G-Force System Freezing High volume G-Force SQL DB thread count exceeding maximum G-Force program not closing out threads Too many users allowed access
  • 20. Incisive Thinking Incident Statement Technical Cause Root Cause G-Force System Freezing High volume Too many users allowed access G-Force SQL DB thread count exceeding maximum G-Force program not closing out threads Vendor implemented an untested program update
  • 21. Basic phases of problem solving Procedure for addressing an Incident 1. State the purpose Divergent Thinking 2. Gather incident/problem detail 3. Evaluate for causes Convergent Thinking 4. Confirm technical/root cause 1. Testing 2. Verifying cause
  • 22. Basic phases of problem solving Procedure for addressing an Incident 1. State the purpose Divergent Thinking 2. Gather incident/problem detail 3. Evaluate for causes Convergent Thinking 4. Confirm technical/root cause 1. Testing 2. Verifying cause
  • 23. Good RCA… YOU NEED TO SOLVE AN INCIDENT; • QUICKLY [Service Recovery] • ACCURATELY [Technical Cause] • PERMANENTLY [Root Cause]
  • 24. Factors in minimalistic approach Factor | IS | BUT NOT • What • Where • When • How • Why • Who • “I keep six honest serving-men (They taught me all I knew); Their names are What and Why and When And How and Where and Who. I send them over land and sea, I send them east and west; But after they have worked for me, I give them all a rest.” Rudyard Kipling
  • 25. Extreme Focus With “Specificity” Object Servers Fault Not communicating “The key to success is to be insistent about specificity – the more specific you are the better your chances to Solve an incident.” KEPNERandFOURIE Specificity Rules •One object one fault •Single-minded & simplistic •Highly focused •Must find the correct entry point •Ask a question – expect an answer
  • 26. Extreme Focus With “Specificity” Object Servers Fault Not communicating Data not transferred Specificity Rules •One object one fault •Single-minded & simplistic •Highly focused •Must find the correct entry point •Ask a question – expect an answer
  • 27. Extreme Focus With “Specificity” Object Servers Fault Not communicating Data not transferred Sent but not received by receiving servers Specificity Rules • One object one fault • Single-minded & simplistic • Highly focused • Must find the correct entry point • Ask a question – expect an answer
  • 28. Extreme Focus With “Specificity” Object Servers Fault Not communicating Data not transferred. Sent but not received by receiving servers Data for Large Outlets Not received Specificity Rules • One object one fault • Single-minded & simplistic • Highly focused • Must find the correct entry point • Ask a question – expect an answer
  • 29. Extreme Focus With “Specificity” Object Servers Fault Not communicating Data not transferred. Sent but not received by receiving servers Data for Large Outlets Not received Sales turnover numbers for Large Outlets Not received Specificity Rules • One object one fault • Single-minded & simplistic • Highly focused • Must find the correct entry point • Ask a question – expect an answer
  • 30. Creating Intelligence DATA: IS | INFORMATION: BUT NOT | KNOWLEDGE: WHY NOT • Internet Banking | Intranet Banking | Different routing, SSL handshake • Slow | Freezing | Volume? • APAC users | USA, UK | ADSL lines • Started Oct 1 | Before | New passwords • Continuous | After 4pm | Different routing • Unexpected Outcomes • “BUT NOT” clarifies the facts • Creates a curious “contrast” • Looking at answers at a “granular level” • Stimulates deductive reasoning
  • 31. The Current Dilemma PAST NOW FUTURE STANDARD itTCA® – TECHNICAL CAUSE ANALYSIS itSRA ® – SERVICE RECOVERY ANALYSIS itRCA ® – ROOT CAUSE ANALYSIS
  • 32. Service Recovery [MTR] FACTOR | IS | BUT NOT | REQUIREMENT • OBJECT | Mobile website access | PC website access | WHAT TO RESTORE • FAULT | Denied – not authorized | Slow/freezing | WHAT PROBLEMS TO REMOVE • WHO | Blackberry users | Other smart phones | WHO • WHERE | Asia | ANZ, UK, USA | WHERE • IMPACT | Customer complaints | | TO WHAT EXTENT • PATTERN | Sporadic | Continuous | FOR HOW LONG • ACTIONS TO CONSIDER
  • 33. Service Recovery [MTR] Statement: Restore website access to customers • Key Solution Requirements, each scored against possible actions 1 to 5: 1. Provide access to client to at least receive interim non-availability notice: 0, 3, 2, 1, 3 • 2. No loss of data: 3, 3, 0, 0, 1 • 3. Should not impact system performance: 1, 0, 3, 1, 0 • 4. ADSL compatible for Asia: 1, 2, 0, 0, 0 • 5. Improve reliability: 3, 0, 3, 1, 1 • 6. Implementation within the hour: 1, 3, 3, 1, 2 • Possible Actions: 1. Upload or switch on simple site maintenance page 2. Set up or start up back up service 3. Reroute 20/80 service all to back up service 4. Restrict access to low load tasks only 5. Allow access based on region
  • 34. Service Recovery [MTR] Statement: Restore website access to customers • Key Solution Requirements, each scored against possible actions 1 to 5: 1. Provide access to client to at least receive interim non-availability notice: 0, 3, 2, 1, 3 • 2. No loss of data: 3, 3, 0, 0, 1 • 3. Should not impact system performance: 1, 0, 3, 1, 0 • 4. ADSL compatible for Asia: 1, 2, 0, 0, 0 • 5. Improve reliability: 3, 0, 3, 1, 1 • 6. Implementation within the hour: 1, 3, 3, 1, 2 • Possible Actions: 1. Upload or switch on simple site maintenance page 2. Set up or start up back up service 3. Reroute 20/80 service all to back up service 4. Restrict access to low load tasks only 5. Allow access based on region
  • 35. The Current Dilemma PAST NOW FUTURE STANDARD itTCA® – TECHNICAL CAUSE ANALYSIS itSRA ® – SERVICE RECOVERY ANALYSIS itRCA ® – ROOT CAUSE ANALYSIS
  • 36. Technical Cause Analysis [TCA - MTTR] IS | BUT NOT | WHY NOT • OBJECT: What object and which other object(s) not? • FAULT: What fault and which other typical faults not? • USERS: Who has the problem and who does not? • WHERE: Where are these users and where could they have been but are not? • TIMING: When did it happen first time and when not? • PATTERN: What is the pattern of faults and what could it have been but is not? • CYCLE: In which cycle does the problem occur and in which cycle does it not occur?
  • 37. Technical Cause Analysis [TCA] DIMENSION | IS | BUT NOT | WHY NOT • Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue • Fault | Dropping | Freezing, slow | Time out settings, configuration of drivers • Location of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules • Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade • Pattern | Continuous | Sporadic, Periodic | Don't know • Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page • Phase of Work | Just after logging in | Logging in or out | OS configuration issue, DNS issue • Possible Causes & Testing
  • 38. Technical Cause Analysis [TCA] DIMENSION | IS | BUT NOT | WHY NOT • Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue • Fault | Dropping | Freezing, slow | Time out settings, configuration of drivers • Location of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules • Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade • Pattern | Continuous | Sporadic, Periodic | Don't know • Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page • Phase of Work | Just after logging in | Logging in or out | OS configuration issue, DNS issue • Possible Causes & Testing: 1. Proxy server tampered with during the Java upgrade on the LAN 2. Java upgrade caused driver incompatibility with Fireburst website V2.0 3. Netscape upgrade caused driver incompatibility with Fireburst website V2.0
  • 39. Technical Cause Analysis [TCA] DIMENSION | IS | BUT NOT | WHY NOT • Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue • Fault | Dropping | Freezing, slow | Time out settings, configuration of drivers • Location of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules • Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade • Pattern | Continuous | Sporadic, Periodic | Don't know • Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page • Phase of Work | Just after logging in | Logging in or out | OS configuration issue, DNS issue • Possible Causes & Testing: 1. Proxy server tampered with during the Java upgrade on the LAN: X • 2. Java upgrade caused driver incompatibility with Fireburst website V2.0: √ √ X • 3. Netscape upgrade caused driver incompatibility with Fireburst website V2.0: √ √ A1 √ √ √ √ • A1: Only if the staff in Asia did not upgrade to Netscape
  • 40. The Current Dilemma PAST NOW FUTURE STANDARD itTCA® – TECHNICAL CAUSE ANALYSIS itSRA ® – SERVICE RECOVERY ANALYSIS itRCA ® – ROOT CAUSE ANALYSIS
  • 41. A Case of a good thinking process • Deviation Statement • Factor Analysis • Possible causal factors • Testing the causal hypotheses • Find the underlying reason(s) for incident 'The truth, if it exists, is in the details' “Bartlett – Familiar Quotations”
  • 42. The Right Starting Point • Find the technical cause first • Do 5 Whys to get to the systemic level • Find the root cause(s) • Fix the incident/problem for good “If a team has not solved an incident, the person with the information was not invited” Chuck Kepner
  • 43. Four Questions to get Started • Is the object deviation within the control of your own system? Can you fix the root cause with actions under your control? • Is the technical cause deviation in the vendor's system? Can you only fix the root cause with the vendor's help? ITRCA Max4 ITRCA Max4 Is the object deviation within the control of your own system? Can you only fix the root cause with the vendor's help? • RiskWise • Is the technical cause deviation in the vendor's system? We would only be able to take avoiding actions.
  • 44. Root Cause Analysis [RCA] DIMENSION | IS | BUT NOT • APPLICATION: What application and which other applications not? • DEVIATION: What deviation do we have and which ones not? • FUNCTION: Which job/function/process is involved and which ones not? • WHO (USERS): Who has the problem and who does not? • WHERE: Where are these users and where could they have been but are not? • TIMING: When did it happen first time and when not? • FREQUENCY: How frequent is the fault occurring?
  • 45. Root Cause Analysis [RCA] COMPONENT: CAUSAL FACTORS • Decision Making: Process and Collaboration for inputs • Implementation issues: Resources and Scope & Definition of project • Standard Operating Procedures: Applicability of SOP and Awareness of SOP • Management: Management of Work and Staff • Measurement: KPIs and Roles & Responsibilities • CAUSAL ELEMENTS: Critical stakeholder requirements not consulted for this task; Inadequate authority levels for making good decisions; Poor decision process and documentation for this task; Inadequate standards guiding the decision making; Time zone difficulties hampering effective decision making; Unrealistic time, cost and performance expectations; Poor initial estimation of resources needed for the project; Poor updated approval data making the procedure unclear; Poor work guidance/coaching for correct performance; Work standards for this task are not enforced; Poor management support in getting this task done; KPI and metrics regarding this output not clear or absent; Poor feedback on this KPI; Duplication and gaps making roles and responsibilities difficult
  • 46. Root Cause Analysis 2 cont. [RCA] COMPONENT: CAUSAL FACTORS • Support: Internal and External Vendor support • Communications: Clarity of communications and instructions • Work Environment: Task Interference and consequences • Skills: Complexity and applicability • Testing Practices: Procedures and requirements • Personal: Aptitude and Attitude • CAUSAL ELEMENTS: Overuse of the SME causing sub-standard work; Poor continual vendor support for this output; Continual interruptions in performing the task; Task performance request not properly understood; Work environment not conducive for the demands of the task; Unrealistic task and performance expectation for this task; Not having enough experience with similar tasks; No vendor training provided for new product and/or service; Poor risk analysis and decision pressure during testing; Not all aspects tested and the test was incomplete; Inadequate problem solving ability for this type of task; Incumbent does not follow instructions or Standard Procedure
  • 47. Root Cause Analysis [RCA] COMPONENT: CAUSAL FACTORS • Decision Making: Process and Collaboration for inputs • Implementation issues: Resources and Scope & Definition of project • Standard Operating Procedures: Applicability of SOP and Awareness of SOP • Management: Management of Work and Staff • Measurement: KPIs and Roles & Responsibilities • CAUSAL ELEMENTS: Critical stakeholder requirements not consulted for this task; Inadequate authority levels for making good decisions; Poor decision process and documentation for this task; Inadequate standards guiding the decision making; Time zone difficulties hampering effective decision making; Unrealistic time, cost and performance expectations; Poor initial estimation of resources needed for the project; Poor updated approval data making the procedure unclear; Poor work guidance/coaching for correct performance; Work standards for this task are not enforced; Poor management support in getting this task done; KPI and metrics regarding this output not clear or absent; Poor feedback on this KPI; Duplication and gaps making roles and responsibilities difficult
  • 48. Root Cause Analysis [RCA] COMPONENT: CAUSAL FACTORS • Support: Internal and External Vendor support • Communications: Clarity of communications and instructions • Work Environment: Task Interference and consequences • Skills: Complexity and applicability • Testing Practices: Procedures and requirements • Personal: Aptitude and Attitude • CAUSAL ELEMENTS: Overuse of the SME causing sub-standard work; Poor continual vendor support for this output; Continual interruptions in performing the task; Task performance request not properly understood; Work environment not conducive for the demands of the task; Unrealistic task and performance expectation for this task; Not having enough experience with similar tasks; No vendor training provided for new product and/or service; Poor risk analysis and decision pressure during testing; Not all aspects tested and the test was incomplete; Inadequate problem solving ability for this type of task; Incumbent does not follow instructions or Standard Procedure
  • 49. Testing the Hypothesis The decision making process is too cumbersome to allow for own initiatives and the staff member must make a choice with given alternatives which is not most optimal for the situation Final Conclusion and Action Plan: 1. The job incumbent did not get the necessary support to do his job under a pressure situation adding to task interference ✗ 2. External vendor support for certain technical decisions was not available and that resulted in a less optimized decision choice. 3.
  • 50. Additional Resources “SOLVE IT” – Find a way to solve incidents quickly, accurately and permanently.
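
The service recovery matrix on slides 33 and 34 scores five possible actions against six key solution requirements. A minimal Python sketch of one way to read that matrix follows. The scores and action names are copied from the slide; adding up each action's column without weights is an assumption made here for illustration, since the slides do not say how the scores are combined, and the names `SCORES`, `action_totals` and `rank_actions` are hypothetical.

```python
# Possible actions 1 to 5, as listed on slide 33.
ACTIONS = [
    "Upload or switch on simple site maintenance page",
    "Set up or start up back up service",
    "Reroute 20/80 service all to back up service",
    "Restrict access to low load tasks only",
    "Allow access based on region",
]

# One row per key solution requirement; each row holds the slide's
# scores for actions 1 to 5, in that order.
SCORES = {
    "Provide access to at least receive interim non-availability notice": [0, 3, 2, 1, 3],
    "No loss of data": [3, 3, 0, 0, 1],
    "Should not impact system performance": [1, 0, 3, 1, 0],
    "ADSL compatible for Asia": [1, 2, 0, 0, 0],
    "Improve reliability": [3, 0, 3, 1, 1],
    "Implementation within the hour": [1, 3, 3, 1, 2],
}


def action_totals(scores):
    """Sum each action's column. ASSUMPTION: an unweighted sum; the
    slides do not state how the scores are meant to be combined."""
    return [sum(column) for column in zip(*scores.values())]


def rank_actions(actions, totals):
    """Return (action, total) pairs, highest total first.
    Ties keep their original slide order because sorted() is stable."""
    return sorted(zip(actions, totals), key=lambda pair: pair[1], reverse=True)


TOTALS = action_totals(SCORES)
RANKED = rank_actions(ACTIONS, TOTALS)

if __name__ == "__main__":
    for action, total in RANKED:
        print(f"{total:>3}  {action}")
```

Under this unweighted reading two actions tie for the highest total, which suggests the team would still need a tie-breaker, for example by treating one requirement such as "No loss of data" as mandatory.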

Editor's Notes

  1. v
  2. In 2011 we were represented in 20 countries and in 12 different languages. TD has grown steadily over the last 10 years. As the list shows, TD and its network were already working with a formidable list of global clients. 2011 was also the year in which TD officially decided, as part of its strategy, to focus exclusively on the IT market.
  3. The procedure for problem solving is as follows. First, state the problem situation. Once you have the correct statement, and thus the correct “entry point” into the problem situation, you can gather the most relevant information pertaining to the problem. Once you have the information, analyze it and come to a mutually agreed answer.
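
The analysis step in the procedure above can be illustrated with the test marks from slide 39, where each possible cause is checked against the IS / BUT NOT facts. The Python sketch below is a hedged reading of that slide, not the authors' tool. It assumes the marks line up with the seven dimensions in table order and that testing of a cause stops at its first "X"; the function and variable names are hypothetical.

```python
# Dimensions in the order of the slide 39 table.
DIMENSIONS = [
    "Object", "Fault", "Location of Object", "Timing",
    "Pattern", "Life Cycle", "Phase of Work",
]

# Marks as printed on slide 39: "√" = cause explains the facts,
# "X" = cause contradicts them, "A1" = explains them only under an assumption.
# ASSUMPTION: marks map to DIMENSIONS left to right.
MARKS = {
    "Proxy server tampered with during the Java upgrade on the LAN":
        ["X"],
    "Java upgrade caused driver incompatibility with Fireburst website V2.0":
        ["√", "√", "X"],
    "Netscape upgrade caused driver incompatibility with Fireburst website V2.0":
        ["√", "√", "A1", "√", "√", "√", "√"],
}

ASSUMPTIONS = {"A1": "Only if the staff in Asia did not upgrade to Netscape"}


def evaluate(marks):
    """Return (survives, failed_dimension, assumptions_to_verify)."""
    for index, mark in enumerate(marks):
        if mark == "X":
            # Eliminated: the cause cannot explain this dimension.
            return False, DIMENSIONS[index], []
    # A cause survives only if it was tested against every dimension.
    survives = len(marks) == len(DIMENSIONS)
    to_verify = [mark for mark in marks if mark in ASSUMPTIONS]
    return survives, None, to_verify


RESULTS = {cause: evaluate(marks) for cause, marks in MARKS.items()}
SURVIVORS = [cause for cause, result in RESULTS.items() if result[0]]

if __name__ == "__main__":
    for cause, (survives, failed_at, to_verify) in RESULTS.items():
        if survives:
            notes = "; ".join(ASSUMPTIONS[key] for key in to_verify)
            print(f"SURVIVES: {cause} (verify: {notes or 'nothing'})")
        else:
            print(f"ELIMINATED at {failed_at}: {cause}")
```

On this reading only the Netscape upgrade survives, and it remains a hypothesis until assumption A1 has been verified with the staff in Asia.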