NOTICE: Proprietary and Confidential
This material is proprietary to Centric Consulting, LLC. It contains trade secrets and information which is solely the property of Centric Consulting, LLC. This material is
solely for the Client’s internal use. This material shall not be used, reproduced, copied, disclosed, transmitted, in whole or in part, without the express consent of Centric
Consulting, LLC.
© 2013 Centric Consulting, LLC. All rights reserved
Bad Metric. Bad!
Teaching an old dog, nothing new
What are some typical metrics that you measure?
Other Examples of Software Testing Metrics
Test Cases
• Test Case Counts by Execution Status
• Test Case Percentages by Execution Status
• Test Case Execution Status Trend
• Test Case Status Planned vs Executed
• Test Case Coverage
• Test Case Status vs Coverage
• Test Case First Run Failure Counts
• Test Case Re-Run Counts
Automation extras
• Automation Index (Percent Automatable)
• Automation Progress
• Automation Test Coverage
More Examples of Software Testing Metrics
Defects
• Defect Counts by Status
• Defect Counts by Priority
• Defect Status Trend
• Defect Density
• Defect Removal Efficiency
• Defect Leakage
• Average Defect Response Time
Other
• Requirements Volatility Index
• Testing Process Efficiency
Common Themes
Counts
Metric (Counts/Counts)
Trends
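To make the three themes concrete, here is a minimal sketch. The daily log values are hypothetical and are not taken from the talk; the point is only that a metric is one count divided by another, and a trend is how a measure or metric moves between intervals.

```python
# Hypothetical daily execution log: (tests_executed, tests_passed).
daily_log = [(20, 15), (25, 20), (30, 27)]

# Count: a raw tally.
executed_total = sum(executed for executed, _ in daily_log)
passed_total = sum(passed for _, passed in daily_log)

# Metric: one count divided by another.
pass_rate = passed_total / executed_total

# Trend: how the metric changes from one interval to the next.
daily_pass_rates = [passed / executed for executed, passed in daily_log]
pass_rate_changes = [
    later - earlier
    for earlier, later in zip(daily_pass_rates, daily_pass_rates[1:])
]

print(executed_total, passed_total)
print(round(pass_rate, 3))
print([round(change, 3) for change in pass_rate_changes])
```

Every number downstream inherits whatever inconsistency went into the original counts.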
Other Examples of Software Testing Metrics
Test Cases
• Test Case Counts by Execution Status – Count
• Test Case Percentages by Execution Status – Count
• Test Case Execution Status Trend – Trend
• Test Case Executed vs Planned – Metric and Trend
• Test Case Coverage – Metric
• Test Case Status vs Coverage – Metric
• Test Case First Run Failure Counts – Count
• Test Case Re-Run Counts – Count
Automation extras
• Automation Index (Percent Automatable) – Metric
• Automation Progress – Count
• Automation Test Coverage – Metric
More Examples of Software Testing Metrics
Defects
• Defect Counts by Status – Count
• Defect Counts by Priority – Count
• Defect Status Trend – Trend
• Defect Density – Metric
• Defect Removal Efficiency – Metric
• Defect Leakage – Metric
• Average Defect Response Time – Trend
Other
• Requirements Volatility Index – Metric
• Testing Process Efficiency – Metric
The Problem We Typically Face?
They Fail to Communicate
• Present data instead of information
• Offer no interpretation, allow user to draw own conclusion
They Are Often Inaccurate
• The act of measuring lacks consistency
• The measures themselves have inherent variability
• No one reports margins of error
They Do Not Measure a Control
• Can’t make a decision based on the number
• The measurement isn’t a lever to introduce change
They Are Not Tied to Organizational Objectives
• No threshold set for desired goal
• No action or consequence if not achieved
Counting
Exercise #1
1. Need 3 volunteers
2. Assume 1 scoop equals 1 day’s worth of testing effort
3. Hershey Kisses and Tootsie Rolls are tests, Starbursts are
bugs
4. Take a scoop
5. How many tests did you execute?
6. Based on how many tests you ran, how many more scoops
do you need to execute the rest (there are 180 total)?
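The arithmetic behind step 6 can be sketched as follows. The 180-test total comes from the exercise; the three first-scoop counts are hypothetical, and the function name is made up for illustration. The sketch assumes every later scoop holds the same number of tests as the first, which is exactly the assumption the exercise puts under strain.

```python
import math

TOTAL_TESTS = 180  # stated in the exercise


def scoops_remaining(tests_in_first_scoop: int, total_tests: int = TOTAL_TESTS) -> int:
    """Extrapolate from one scoop, assuming every later scoop holds the same number of tests."""
    if tests_in_first_scoop <= 0:
        raise ValueError("need at least one test in the scoop to extrapolate")
    remaining = max(total_tests - tests_in_first_scoop, 0)
    return math.ceil(remaining / tests_in_first_scoop)


# Hypothetical first-scoop counts for three volunteers (not from the talk).
first_scoops = {"volunteer_a": 12, "volunteer_b": 18, "volunteer_c": 25}
estimates = {name: scoops_remaining(count) for name, count in first_scoops.items()}
print(estimates)
```

With these made-up counts the same question yields answers of 14, 9 and 7 more scoops, even though all three volunteers drew from the same bowl.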
Exercise #1 Questions
• Was the same scoop used? Were the results the
same?
• Was there variability in the number of tests run in each scoop?
• Is that typical in testing?
• Was there variability in the estimate of the number
of tests left?
• Is this similar to guessing how much time or effort is left in a test cycle?
• Are these numbers reliable?
• Are they repeatable?
Exercise #2
1. Need 3 volunteers
2. Assume 1 scoop equals 1 day’s worth of testing effort
3. Hershey Kisses and Tootsie Rolls are tests, Starbursts are bugs (Red are
severe)
4. Take a scoop
5. How many tests did you execute?
6. How many defects did you find?
7. Based on how many tests you ran, how many more scoops do you need
to execute the rest?
8. Based on how much effort you put in, how many more scoops do you
need to find the rest of the defects?
Exercise #2 Questions
• Was the same scoop used? Were the results the same?
• With an estimate of the number of tests remaining, is it reasonable to
estimate the number of defects that will be found?
• Do people ask you to guess this type of information?
• If you know how many tests (Kisses and Tootsie Rolls) are left and how many man-hours
you will use (scoop size), can you estimate how many scoops are
needed to execute all tests (find all Kisses and Tootsie Rolls)?
• Is it accurate? Is it close enough?
• Are these numbers reliable?
• Are they repeatable?
• Does encountering defects (Starbursts) reveal anything about the overall
quality (how many Starbursts exist, or what it’ll take to find them)?
Challenges with Counting
Label does not equal content
Inherent variability
Not evenly spaced
Lacks reference for context
Lack of consistency
Metrics (Measure over Measure)
Sampling
Target Population
Matched Samples
Independent Samples
Random Sampling
Simple Random Sampling
Stratified Sampling
Cluster Sampling
Quota Sampling
Spatial Sampling
Sampling Variability
Standard Error
Bias
Precision
For each population there are many possible samples. A sample statistic
gives information about a corresponding population parameter
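As a small illustration of sampling variability and standard error, the sketch below computes the standard error of the mean for two samples. The data are hypothetical (defects found per day), and the helper name is an assumption made for this example.

```python
import math
import statistics


def standard_error(sample: list[float]) -> float:
    """Standard error of the mean: sample standard deviation divided by sqrt(n)."""
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Hypothetical defects-found-per-day values for two samples from the same project.
sample_a = [2, 4, 4, 6]
sample_b = [1, 3, 8, 4]

for name, sample in (("a", sample_a), ("b", sample_b)):
    print(name, statistics.mean(sample), round(standard_error(sample), 3))
```

Both samples have the same mean, yet the second is a much less precise estimate of the population parameter. Reporting the mean alone hides that difference.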
Sampling in Testing
Does testing use sampling?
Consider in most corporate environments:
• We never test the entire application
• It is not realistically possible to find
every defect
• So, does testing use sampling?
Ponder this as we discuss the next section…
Is Testing a Methodical Defect Searching
Activity?
Sampling
Remember, we can’t test everything – not enough time/people/budget
So, which sample approach better approximates an actual measure (e.g.
dots per sq. inch)?
Sample estimates: 5.25 dots/sq. in. and 6.5 dots/sq. in.
Sampling
Which sample approach better approximates an actual measure (e.g. dots
per sq. inch)?
• What is more accurate, random or methodical searching?
Sample estimates: 5.25, 6.5, 4.95 and 6.3 dots/sq. in.
There are actually 6.6 dots/sq. in.
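How far off is each estimate? A minimal sketch, using only the five numbers on the slide. The text of the slide does not say which estimate came from which sampling approach, so the code makes no attempt to label them; the function name is an assumption for this example.

```python
ACTUAL_DENSITY = 6.6  # dots per square inch, from the slide
sample_estimates = [5.25, 6.5, 4.95, 6.3]  # the four estimates shown on the slide


def relative_error(estimate: float, actual: float = ACTUAL_DENSITY) -> float:
    """Absolute error as a fraction of the true value."""
    return abs(estimate - actual) / actual


for estimate in sample_estimates:
    print(f"{estimate:5.2f} dots/sq. in. is off by {relative_error(estimate):.1%}")
```

In testing the true value is never available, so this comparison cannot be made on a real project. Only the spread between samples is visible.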
Exercise #3
1. Need 3 volunteers
2. Assume 1 scoop equals 1 day’s worth of testing effort
3. Hershey Kisses and Tootsie Rolls are tests, Starbursts are bugs (Red are
severe)
4. Each volunteer grab 1 scoop of candy
5. How many (total) tests did you execute?
6. How many (total) defects did you find?
7. Log results
8. Repeat 2 more times
Exercise #3 Questions
• Does this graph represent anything useful?
• Does a trend line help or mean anything?
• Is it possible or reasonable to estimate the # of
defects you’ll see based on the number of
tests, from even 9 samples?
• Compare scoop 1 to scoop 9 – does any scoop
seem to be a reasonable estimate?
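The last two questions can be explored with a short sketch. The nine scoop results and the number of remaining tests are hypothetical, and the projection function is a deliberately naive assumption: it pretends the defects-per-test ratio seen in one scoop holds for everything left.

```python
# Hypothetical log of nine scoops: (tests_executed, defects_found).
scoop_log = [(14, 3), (10, 0), (17, 5), (12, 1), (15, 4), (9, 2), (16, 0), (13, 6), (11, 2)]

TESTS_REMAINING = 100  # hypothetical


def projected_defects(tests: int, defects: int, tests_remaining: int) -> float:
    """Naive projection: assume one scoop's defects-per-test ratio holds for the rest."""
    return defects / tests * tests_remaining


projections = [projected_defects(t, d, TESTS_REMAINING) for t, d in scoop_log]
print([round(p, 1) for p in projections])
print("spread:", round(min(projections), 1), "to", round(max(projections), 1))
```

Depending on which scoop is picked as the basis, the forecast for the same remaining work ranges from zero defects to several dozen.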
Challenges with Metrics (Measure over Measure)
Implied Derivations and Forecasting
Counts over Counts
Denominator Rules
Implies Velocity
Measure over Measure
Trends
Trend
A trend is a change in a measure (or metric) over a time interval.
It has three components:
• Direction/Movement
• Speed/Size
• Cause (implied)
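Two of the three components can be computed; the third cannot. The sketch below fits an ordinary least-squares line to equally spaced observations and reports direction and speed. The data and the function name are hypothetical.

```python
def trend(values: list[float]) -> tuple[str, float]:
    """Return (direction, slope) from a least-squares line over equally spaced intervals."""
    n = len(values)
    if n < 2:
        raise ValueError("a trend needs at least two observations")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    direction = "up" if slope > 0 else "down" if slope < 0 else "flat"
    return direction, slope


# Hypothetical open-defect counts at the end of five consecutive days.
open_defects = [12, 15, 14, 18, 21]
print(trend(open_defects))
```

Nothing in the calculation says why the line moves. The cause is supplied by whoever reads the chart, which is why it is only implied.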
Exercise #4
1. Need 3 volunteers
2. Assume 1 scoop equals 1 day’s worth of testing effort
3. Hershey Kisses and Tootsie Rolls are tests, Starbursts are bugs (Red are
severe)
4. Each volunteer grab 1 scoop of candy
5. How many of EACH type of test did you execute?
6. How many of EACH type of defect did you find?
7. Log results
8. Repeat 2 more times
Exercise #4 Questions
• Does the graph line represent any information of value?
• Is there assurance (control) that simply taking a scoop (e.g.
executing tests in a given day) will result in defects being
found?
• Is the shape of the defect cumulative line representative of
anything?
• If we only look at scoops 1-3 or 7-9, does it tell us anything or
mislead us?
• What if we took 2 scoops per day (added a tester – but still
counted as 1 day), would that affect how things look?
• Do Starbursts per scoop, or Starbursts per Kiss/Tootsie Roll, mean
anything?
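The cumulative line and the "scoops 1-3 or 7-9" question can be sketched like this. The per-scoop defect counts are hypothetical and the helper name is made up for illustration.

```python
from itertools import accumulate

# Hypothetical defects found in each of nine scoops (days).
defects_per_scoop = [5, 4, 6, 1, 0, 2, 0, 1, 0]

# The cumulative line that a defect trend chart would plot.
cumulative = list(accumulate(defects_per_scoop))


def window_rate(counts: list[int], first: int, last: int) -> float:
    """Average defects per scoop for scoops first..last (1-based, inclusive)."""
    window = counts[first - 1:last]
    return sum(window) / len(window)


print(cumulative)
print(window_rate(defects_per_scoop, 1, 3))
print(window_rate(defects_per_scoop, 7, 9))
print(window_rate(defects_per_scoop, 1, 9))
```

With these made-up numbers, scoops 1-3 suggest five defects a day and scoops 7-9 suggest almost none. Neither window says how many defects are still in the bowl.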
Challenges with Trends
Affected by challenges of counting
Affected by challenges of metrics
Time Based Series
Intervals and Activity Pause
Purpose of Metrics
Measure of Performance
Conformance to Best Practice
Deviation from Goal
Issues affecting purpose
Misaligned with strategy
Using metrics as outputs only
Too many metrics
Ease of measure does not equal importance
Lack of context
Limited dimensions
Lack behavioral aspects
Changing the World
How to Leverage Metrics
Explicitly link metrics to goals
Use trends over absolute numbers
Use shorter tracking periods
Change metrics when they stop driving change
Account for error and confidence
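One way to account for error is to report a metric together with its margin. A minimal sketch, using the normal approximation for a proportion at roughly 95% confidence (z = 1.96). The counts and the function name are hypothetical. The approximation also assumes the executed tests behave like a random sample, which is an assumption the sampling section of this talk calls into question.

```python
import math


def pass_rate_with_margin(passed: int, executed: int, z: float = 1.96) -> tuple[float, float]:
    """Return (pass_rate, margin_of_error) using the normal approximation for a proportion."""
    if executed <= 0 or not 0 <= passed <= executed:
        raise ValueError("need 0 <= passed <= executed and executed > 0")
    rate = passed / executed
    margin = z * math.sqrt(rate * (1 - rate) / executed)
    return rate, margin


# Hypothetical result: 45 of 60 executed tests passed.
rate, margin = pass_rate_with_margin(45, 60)
print(f"pass rate {rate:.0%} +/- {margin:.0%}")
```

A pass rate quoted with its margin invites a different conversation than a bare percentage, especially when the margin is wider than the change being celebrated.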
Q&A
Joseph Ours
Email: Joseph.ours@centricconsulting.com
Company Website: https://centricconsulting.com/technology-solutions/software-quality-assurance-and-testing/
Twitter: @justjoehere
LinkedIn: www.linkedin.com/josephours
Personal Blog: http://josephours.blogspot.com

Bad metric, bad! - Joseph Ours

  • 1. NOTICE: Proprietary and Confidential This material is proprietary to Centric Consulting, LLC. It contains trade secrets and informationwhich is solely the property of Centric Consulting,LLC. This material is solely for the Client’sinternaluse. This material shall not be used, reproduced, copied, disclosed, transmitted, in whole or in part, without the express consent of Centric Consulting,LLC. © 2013 Centric Consulting,LLC. All rights reserved Bad Metric. Bad! Teaching an old dog, nothing new
  • 2.
  • 3.
  • 4. What are some typical metrics that you measure?
  • 5. Other Examples of Software Testing Metrics • Test Case Counts by Execution Status • Test Case Percentages by Execution Status • Test Case Execution Status Trend • Test Case Status Planned vs Executed • Test Case Coverage • Test Case Status vs Coverage • Test Case First Run Failure Counts • Test Case Re– Run Counts Test Cases • Automation Index (Percent Automatable) • Automation Progress • Automation Test Coverage Automation extras
  • 6. More Examples of Software Testing Metrics • Defect Counts by Status • Defect Counts by Priority • Defect Status Trend • Defect Density • Defect Remove Efficiency • Defect Leakage • Average Defect Response Time Defects • Requirements Volatility Index • Testing Process Efficiency Other
  • 8. Other Examples of Software Testing Metrics • Test Case Counts by Execution Status – Count • Test Case Percentages by Execution Status – Count • Test Case Execution Status Trend – Trend • Test Case Executed vs Planned – Metric and Trend • Test Case Coverage – Metric • Test Case Status vs Coverage – Metric • Test Case First Run Failure Counts – Count • Test Case Re– Run Counts – Count Test Cases • Automation Index (Percent Automatable) – Metric • Automation Progress – Count • Automation Test Coverage – Metric Automation extras
  • 9. More Examples of Software Testing Metrics • Defect Counts by Status – Count • Defect Counts by Priority – Count • Defect Status Trend – Trend • Defect Density – Metric • Defect Remove Efficiency – Metric • Defect Leakage – Metric • Average Defect Response Time – Trend Defects • Requirements Volatility Index – Metric • Testing Process Efficiency – Metric Other
  • 10. The Problem We Typically Face? They Fail to Communicate • Present data instead of information • Offer no interpretation, allow user to draw own conclusion They Are Often Inaccurate • The act of measuring lacks of consistency • The measures themselves have inherent variability • No one reports margin of errors They Do Not Measure a Control • Can’t make decision based on number • The measurement isn’t a lever to introduce change They Are Not Tied to Organizational Objectives • No threshold set for desired goal • No action or consequence if not achieved
  • 13.
  • 14. Exercise #1 1. Need 3 volunteers 2. Assume 1 scoop equals 1 days worth of testing effort 3. Hershey Kisses and Tootsie Rolls are tests, Starbursts are bugs 4. Take a scoop 5. How many tests did you execute? 6. Based on how many tests you ran, how many more scoops do you need to execute the rest (there are 180 total)?
  • 15. Exercise #1 Questions • Was the same scoop used? Were the results the same? • Was there variability in the number of tests run in each scoop. • Is that typical in testing? • Was there variability in the estimate of the number of tests left? • Is this similar to guessing how much time is effort is left in a test cycle? • Are these numbers reliable? • Are they repeatable?
  • 16. Exercise #2 1. Need 3 volunteers 2. Assume 1 scoop equals 1 day's worth of testing effort 3. Hershey Kisses and Tootsie Rolls are tests, Starbursts are bugs (Red are severe) 4. Take a scoop 5. How many tests did you execute? 6. How many defects did you find? 7. Based on how many tests you ran, how many more scoops do you need to execute the rest? 8. Based on how much effort you put in, how many more scoops do you need to find the rest of the defects?
  • 17. Exercise #2 Questions • Was the same scoop used? Were the results the same? • With an estimate of the number of tests remaining, is it reasonable to estimate the number of defects that will be found? • Do people ask you to guess this type of information? • If you know how many tests (Starbursts) are left and how many man-hours you will use (scoop size), can you estimate how many scoops are needed to execute all tests (find all Starbursts)? • Is it accurate? Is it close enough? • Are these numbers reliable? • Are they repeatable? • Does encountering defects (M&M’s) reveal anything about the overall quality (how many M&M’s exist, or what it’ll take to find them)?
  • 18. Challenges with Counting • Label does not equal content • Inherent variability • Not evenly spaced • Lacks reference for context • Lack of consistency
  • 20. Sampling • Target Population • Matched Samples • Independent Samples • Random Sampling • Simple Random Sampling • Stratified Sampling • Cluster Sampling • Quota Sampling • Spatial Sampling • Sampling Variability • Standard Error • Bias • Precision For each population there are many possible samples. A sample statistic gives information about a corresponding population parameter.
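Of the terms on this slide, standard error is the easiest to show in code. The following is a minimal sketch using the usual estimate for the standard error of a sample mean (sample standard deviation divided by the square root of the sample size). The daily defect counts are invented for illustration.

```python
import math
import statistics


def standard_error(sample):
    """Estimated standard error of the sample mean."""
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Hypothetical defects found on each of five test days.
defects_per_day = [3, 7, 2, 9, 4]
mean = statistics.mean(defects_per_day)
se = standard_error(defects_per_day)
print(f"Mean defects per day: {mean} (standard error {se:.2f})")
```

A different five days would give a different mean, which is the sense in which a sample statistic only gives information about the population parameter.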
  • 21. Sampling in Testing Does testing use sampling? Consider in most corporate environments: • We never test the entire application • It is not realistically possible to find every defect • So, does testing use sampling?
  • 22. Ponder this as we discuss the next section… Is Testing a Methodical Defect Searching Activity?
  • 23. Sampling Remember, we can’t test everything: not enough time/people/budget. So, which sampling approach better approximates an actual measure (e.g., dots per sq. inch)? 5.25 dots/sq. in. 6.5 dots/sq. in.
  • 24. Ponder this as we discuss the next section… Is Testing a Methodical Defect Searching Activity?
  • 25. Sampling Which sampling approach better approximates an actual measure (e.g., dots per sq. inch)? • Which is more accurate, random or methodical searching? 5.25 dots/sq. in. 6.5 dots/sq. in. 4.95 dots/sq. in. 6.3 dots/sq. in. There are actually 6.6 dots/sq. in.
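The dots-per-square-inch comparison can be tried on a made-up field. This is a minimal sketch, not a reconstruction of the slide's picture. It assumes a 10 x 10 inch sheet where dots thicken toward one corner, treats "methodical" as inspecting one contiguous 3 x 3 corner (only one of many possible methodical searches), and treats "random" as nine cells drawn without replacement. All names and values are hypothetical.

```python
import random

SIZE = 10  # hypothetical 10 x 10 inch sheet, one cell per square inch

# Made-up, unevenly spread field: cell (row, col) holds row + col dots.
field = [[row + col for col in range(SIZE)] for row in range(SIZE)]
true_density = sum(map(sum, field)) / (SIZE * SIZE)

# "Methodical" here means inspecting one contiguous 3 x 3 corner.
corner = [field[r][c] for r in range(3) for c in range(3)]
corner_estimate = sum(corner) / len(corner)

# Random sampling: nine cells drawn without replacement.
rng = random.Random(7)  # fixed seed so the run is repeatable
all_cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
cells = rng.sample(all_cells, 9)
random_estimate = sum(field[r][c] for r, c in cells) / len(cells)

print(f"Actual:          {true_density:.2f} dots/sq. in.")
print(f"Corner sample:   {corner_estimate:.2f} dots/sq. in.")
print(f"Random sample:   {random_estimate:.2f} dots/sq. in.")
```

In this contrived field the corner sample reports 2.0 dots/sq. in. against an actual 9.0, because it never leaves the sparse region. A different layout or a different methodical pattern could reverse the comparison, so the sketch illustrates the question rather than answering it.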
  • 27. Exercise #3 1. Need 3 volunteers 2. Assume 1 scoop equals 1 day's worth of testing effort 3. Hershey Kisses and Tootsie Rolls are tests, Starbursts are bugs (Red are severe) 4. Each volunteer grabs 1 scoop of candy 5. How many (total) tests did you execute? 6. How many (total) defects did you find? 7. Log results 8. Repeat 2 more times
  • 28. Exercise #3 Questions • Does this graph represent anything useful? • Does a trend line help or mean anything? • Is it possible or reasonable to estimate the # of defects you’ll see based on the number of tests, from even 9 samples? • Compare scoop 1 to scoop 9 – does any scoop seem to be a reasonable estimate?
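One way to probe whether a trend line through nine scoops means anything is to check how strongly defects track tests. The following is a minimal sketch that computes the Pearson correlation coefficient by hand. The nine (tests, defects) pairs are invented for illustration and are not results from the exercise.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log of nine scoops: tests executed and defects found.
tests = [18, 22, 15, 25, 20, 17, 23, 19, 21]
defects = [4, 1, 3, 2, 5, 0, 3, 2, 4]

r = pearson_r(tests, defects)
print(f"Correlation between tests and defects: {r:.2f}")
```

A value near zero would suggest that the number of tests in a scoop says little about the number of defects in it, whatever line a charting tool draws through the points.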
  • 29. Challenges with Metrics (Measure over Measure) • Implied Derivations and Forecasting • Counts over Counts • Denominator Rules • Implies Velocity • Measure over Measure
  • 31. Trend A trend is a change in a measure (or metric) over a time interval. It has three components: • Direction/Movement • Speed/Size • Cause (Implied)
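Two of the three components, direction and speed, can be read from a fitted slope. The third, cause, cannot be computed from the numbers at all. This is a minimal sketch, assuming evenly spaced observations and an ordinary least-squares line. The function name `trend` and the sample series are hypothetical.

```python
def trend(values):
    """Return (direction, speed) from a least-squares line through values.

    Observations are assumed to be evenly spaced in time (0, 1, 2, ...).
    Speed is the absolute change per interval.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    if slope > 0:
        direction = "up"
    elif slope < 0:
        direction = "down"
    else:
        direction = "flat"
    return direction, abs(slope)


# Hypothetical count of open defects at the end of each day.
open_defects = [10, 8, 6]
direction, speed = trend(open_defects)
print(f"Open defects are trending {direction} by {speed:.1f} per day")
```

The output says nothing about why the count moved, which is why the slide labels cause as implied.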
  • 33. Exercise #4 1. Need 3 volunteers 2. Assume 1 scoop equals 1 day's worth of testing effort 3. Hershey Kisses and Tootsie Rolls are tests, Starbursts are bugs (Red are severe) 4. Each volunteer grabs 1 scoop of candy 5. How many of EACH type of test did you execute? 6. How many of EACH type of defect did you find? 7. Log results 8. Repeat 2 more times
  • 34. Exercise #4 Questions • Does the graph line represent any information of value? • Is there assurance (control) that simply taking a scoop (e.g. executing tests in a given day) will result in defects being found? • Is the shape of the defect cumulative line representative of anything? • If we only look at scoops 1-3 or 7-9, does it tell us anything or mislead us? • What if we took 2 scoops per day (added a tester, but still counted as 1 day), would that affect how things look? • Does M&M’s per scoop or M&M’s per Skittles/Starbursts mean anything?
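The questions about the cumulative line, the 1-3 versus 7-9 windows, and the extra tester can be explored with a small calculation. This is a minimal sketch using invented defect counts for nine scoops. None of the numbers come from the exercise.

```python
from itertools import accumulate

# Hypothetical defects found in each of nine scoops.
defects_per_scoop = [5, 3, 4, 1, 0, 2, 0, 1, 0]

# Cumulative defect line, one point per scoop.
cumulative = list(accumulate(defects_per_scoop))

# Average defects per scoop when only a window of scoops is examined.
first_three = sum(defects_per_scoop[0:3]) / 3
last_three = sum(defects_per_scoop[6:9]) / 3

# Two scoops per "day" (an added tester), still plotted as one day each.
per_day = [sum(defects_per_scoop[i:i + 2])
           for i in range(0, len(defects_per_scoop), 2)]

print("Cumulative:", cumulative)
print(f"Scoops 1-3 average: {first_three:.2f} per scoop")
print(f"Scoops 7-9 average: {last_three:.2f} per scoop")
print("Per day with two scoops:", per_day)
```

With these made-up counts, scoops 1-3 average 4 defects per scoop and scoops 7-9 average about a third of one, so either window alone tells a very different story. Regrouping the same data into two scoops per day also changes the shape of the chart without changing what was found.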
  • 35. Challenges with Trends • Affected by challenges of counting • Affected by challenges of metrics • Time Based Series • Intervals and Activity Pause
  • 37. Purpose of Metrics • Measure of Performance • Conformance to Best Practice • Deviation from Goal
  • 38. Issues affecting purpose • Misaligned with strategy • Using metrics as outputs only • Too many metrics • Ease of measure does not equal importance • Lack of context • Limited dimensions • Lacks behavioral aspects
  • 40. How to Leverage Metrics • Explicitly link metrics to goals • Use trends over absolute numbers • Use shorter tracking periods • Change metrics when they stop driving change • Account for error and confidence
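Two of these recommendations, linking metrics to goals and accounting for error, can be combined in one small structure. This is a minimal sketch, assuming a metric where higher is better and a margin of error supplied by the reporter. The class name `GoalLinkedMetric`, its fields, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GoalLinkedMetric:
    """A metric tied to a goal, a threshold, and an action if it is missed."""
    name: str
    goal: str
    threshold: float
    action_if_missed: str

    def evaluate(self, value: float, margin: float = 0.0) -> str:
        """Classify a reading, treating value +/- margin as the plausible range."""
        if value - margin >= self.threshold:
            return "met"
        if value + margin < self.threshold:
            return "missed"
        return "inconclusive"


# Hypothetical metric definition and reading.
pass_rate = GoalLinkedMetric(
    name="Regression pass rate",
    goal="Release candidate is stable enough for user acceptance testing",
    threshold=0.90,
    action_if_missed="Hold the release and triage failing tests",
)

status = pass_rate.evaluate(value=0.88, margin=0.05)
print(f"{pass_rate.name}: {status}")
if status == "missed":
    print("Action:", pass_rate.action_if_missed)
```

The "inconclusive" outcome is a design choice: when the threshold falls inside the margin of error, the reading cannot support a decision either way, and the sketch says so instead of forcing a pass or fail.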