Data Acquisition: A Key
Challenge for Quality
and Reliability
Improvement
Gerald J. Hahn & Necip Doganaksoy
©2013 ASQ & Presentation Hahn & Doganaksoy

http://reliabilitycalendar.org/webinars/
ASQ Reliability Division
English Webinar Series
One of the monthly webinars
on topics of interest to
reliability engineers.
To view recorded webinars (available to ASQ Reliability
Division members only), visit asq.org/reliability
To sign up for the live webinars, which are free and open to
anyone, visit reliabilitycalendar.org and select English
Webinars to find links to register for upcoming events

DATA ACQUISITION: A KEY CHALLENGE FOR QUALITY
AND RELIABILITY IMPROVEMENT

Gerald J. Hahn
GE Global Research
(Retired)
gerryhahn@yahoo.com

Necip Doganaksoy
GlobalFoundries
necipdoganaksoy@yahoo.com

ASQ RELIABILITY DIVISION WEBINAR
November 14, 2013

3
THE OBVIOUS, THE EXPECTATION AND THE REALITY
• The Obvious
– Statistical quality and reliability analyses are based
upon sample data (and assumptions about
sampled populations, etc.)
– Such analyses are only as good as the data upon
which they are based
– Bad data lead to more complex, less powerful or
invalid analyses
– David Moore: The most important information
about any statistical study is how the data were
produced
• The Expectation: Much attention is given to the data
acquisition process in training and applications
• The Reality: Little or insufficient attention is generally
given to the data acquisition process
4
THE CONSEQUENCES AND THE
CHALLENGE
• The Consequences
– Why is it that every database that I have
encountered is filled with data quality problems?
(Theodore Johnson, 2003 QPRC)
– Common wisdom puts the extent of the total
project effort spent in cleaning the data before
doing any analysis as high as 60-95% (DeVeaux
and Hand, Statistical Science 2005)
• The Challenge: Move data-acquisition to front burner
– Understand limitations of available data
– Emphasize data acquisition
– Use disciplined process
5
WEBINAR TOPICS
• Typical data acquisition situations
• Problems (and opportunities) with observational data
• A disciplined, targeted approach for data acquisition
• Washing machine design reliability example
• Some guidelines for effective data acquisition
• Some practical challenges
• Some relevant further commentaries
• Elevator speech

EMPHASIS ON QUALITY AND RELIABILITY
6
TYPICAL DATA ACQUISITION SITUATIONS
• Control over data acquisition
– Designed experiments
– Random sampling studies (from specified
population)
– Double-blind medical studies
– Systems development studies, e.g.,
• Estimate design reliability
• Evaluate measurement system
• Assess process capability
• Signal changes via control charts
• Anticipate/avoid field failures by automated monitoring

• Observational studies (and data mining) on
existing data, often from Big Data
MANY APPLICATIONS INVOLVE COMBINATIONS

7
PROBLEMS (AND OPPORTUNITIES) WITH
OBSERVATIONAL DATA
• Problems with “available” databases
– Data obtained for purposes other than statistical analysis
– Data reside in different databases
• Some limitations of observational data
– Missing values and events
– Unrepresentative observations
– Inconsistent or imprecise measurements
– Limited variability
– Key impacting variables unrecorded; recorded proxy variables
deemed “significant” (e.g., foot size impacts reading ability)
• Observational studies
– May be helpful for prediction, e.g., credit performance, top selling
items before expected hurricane, finding best time to buy plane ticket
– Misleading or useless for gaining “cause and effect” understanding
– Observation from the trenches (Kati Illouz, GE): Data owners tend to
be overly optimistic about their data
• Data inadequacies (and reasons) define future information needs

QUALITY—NOT QUANTITY—OF DATA IS WHAT COUNTS
8
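The foot-size remark can be made concrete with a small simulation. The sketch below is illustrative only: the ages, coefficients and noise levels are invented, and age stands in for the unrecorded key variable. Foot size and reading score never influence each other in the simulated data, yet they correlate strongly until age is accounted for.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals(y, x):
    """Residuals from a least-squares straight-line fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

rng = random.Random(1)
# Hypothetical children aged 6 to 12; age is the lurking variable.
age = [rng.randint(6, 12) for _ in range(2000)]
foot = [15 + 1.0 * a + rng.gauss(0, 1) for a in age]    # foot size depends on age only
reading = [10 * a + rng.gauss(0, 10) for a in age]      # reading score depends on age only

raw = pearson(foot, reading)
adjusted = pearson(residuals(foot, age), residuals(reading, age))
print(f"correlation ignoring age:         {raw:.2f}")
print(f"correlation after adjusting for age: {adjusted:.2f}")
```

If age had not been recorded, only the first number would be available, and foot size would look like a "significant" driver of reading ability.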
IN SUMMARY
• Even the most sophisticated statistical analysis cannot
compensate for or rescue inadequate data
• It’s not that there is lack of data. Instead, it is that the
data are inadequate to answer the questions (NY Times
article on “How Safe is Cycling?” October 22, 2013)
• Massive data does not guarantee success…Knowing
how the data were collected (the “data pedigree”) is
critical (Snee, Union College Mathematics
Conference, October 2013)
• A good principle to remember is that data are guilty
until proven innocent, not the other way around (Snee
and Hoerl, QP Dec 2012)
• Observational data have an important role in pointing
the way forward, but they should not be a primary
ingredient for making final decisions (Anderson-Cook
and Borror, QP April 2013)
9
DISCIPLINED, TARGETED PROCESS FOR DATA
ACQUISITION (DEUPM) FOR SYSTEMS
DEVELOPMENT STUDY
• Proposed process:
– Step 1: D: Define the problem
– Step 2: E: Evaluate the existing data
– Step 3: U: Understand data acquisition opportunities
and limitations
– Step 4: P: Plan data acquisition and analysis
– Step 5: M: Monitor, clean data, analyze and validate
• Example: Demonstrate desired ten-year reliability for
new washing machine design in 6 months elapsed time

10
STEP 1: DEFINE THE PROBLEM

• Define specific questions to be answered
Washing machine design example:
– Stated objective: Show within 6 months and with 95% confidence
that the following can be met:
• 95% reliability after one year of operation
• 90% reliability after five years
• 80% reliability after ten years
(“reliability” defined as no repair or servicing need)
– Added question: How can reliability be improved further?

• Identify resulting actions
Washing machine design example: Go to full scale
production if validated and make identified improvements

• State population or process of interest
Washing machine design example: 6 million machines to
be built in next 5 years
11
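One way to keep the three reliability targets from drifting during a project is to write them down as a checkable specification. The sketch below is an assumption-laden illustration, not the presenters' method: it assumes a two-parameter Weibull life model (the deck names Weibull analysis only in Step 4), and the parameter values are hypothetical. It also checks point values only, whereas the stated objective requires the 95% lower confidence bound on reliability to clear each target.

```python
import math

# Targets from the stated objective: years of operation -> minimum reliability.
TARGETS = {1: 0.95, 5: 0.90, 10: 0.80}

def weibull_reliability(t_years, shape, scale):
    """R(t) = exp(-(t/scale)**shape): probability of no repair or service by time t."""
    return math.exp(-((t_years / scale) ** shape))

def check_targets(shape, scale, targets=TARGETS):
    """Return {years: (reliability, target_met)} for a candidate Weibull model."""
    result = {}
    for years, goal in targets.items():
        r = weibull_reliability(years, shape, scale)
        result[years] = (r, r >= goal)
    return result

# Hypothetical parameter pairs, chosen only to show a passing and a failing case.
for shape, scale in [(1.2, 50.0), (1.0, 20.0)]:
    print(f"shape={shape}, scale={scale} years")
    for years, (r, met) in check_targets(shape, scale).items():
        print(f"  {years:>2} yr: R = {r:.3f}  target {TARGETS[years]:.2f}  {'met' if met else 'NOT met'}")
```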
STEP 2: EVALUATE THE EXISTING DATA
• Understand the process and its physical basis
Washing machine design example: Study up and participate in
design reviews, FMEA’s (Failure Mode and Effects Analyses), etc.
• Determine and analyze existing data
Washing machine design example
– Previous design
• Existing data
– In-house component, sub-assembly and system tests
– Field failure and servicing data
• Conclusion: Previous design does not meet current reliability goals

– New design
• Proposed new design aims to correct key past problems
• Possible concern: Introduction of new failure modes
• Existing data: Component and sub-assembly test results
• Data identified one new failure mode; rapidly addressed and corrected
• Conclusion: Proposed new design appears to correct past problems
without introducing new ones; reliability goals appear to be met

• Identify data inadequacies
Washing machine design example: No information about system
performance in realistic use environment
12
STEP 3: UNDERSTAND DATA ACQUISITION
OPPORTUNITIES AND LIMITATIONS
• Gain understanding of data that can be acquired and how
Washing machine example: In-house accelerated use rate systems testing
• Simulate 3.5 years of operation per month
• Evaluate weekly for failures
• Sample unfailed units and measure degradation (destructive test)

• Determine practical considerations and limitations in data
acquisition
Washing machine design example:
• 6 months of testing
• 3 prototype lots initially (and one more subsequently)
• 36 available test stands

• Assess relevance of resulting data to meet study goals and
underlying assumptions
Washing machine design example:
• Assume prototype lots representative of 5-year high volume production
• Assume failures are cycle (and not elapsed time) dependent
• Assume realistic simulation of field environment
Conclusion: This is analytic (not enumerative) study; statistical
confidence bounds capture only statistical uncertainty
13
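The accelerated use rate quoted on this slide translates directly into simulated field exposure. The arithmetic below uses only the slide's figures (3.5 years per test month, 6 months of testing); the weeks-per-month conversion is an added assumption, and the whole calculation is meaningful only if failures depend on cycles rather than elapsed time, as the slide assumes.

```python
YEARS_PER_TEST_MONTH = 3.5    # from the slide: 3.5 years of operation simulated per month
TEST_MONTHS = 6               # from the slide: 6 months of testing
WEEKS_PER_MONTH = 52 / 12     # assumption: average calendar month

def simulated_years(months_on_test):
    """Field-equivalent years accumulated after a given time on test."""
    return YEARS_PER_TEST_MONTH * months_on_test

def months_on_test_for(field_years):
    """Test months needed to accumulate a given number of field-equivalent years."""
    return field_years / YEARS_PER_TEST_MONTH

print(f"Exposure after {TEST_MONTHS} months: {simulated_years(TEST_MONTHS):.1f} years")
for target_years in (1, 5, 10):
    print(f"{target_years:>2} field years reached after {months_on_test_for(target_years):.2f} test months")
print(f"Each weekly inspection covers about {YEARS_PER_TEST_MONTH / WEEKS_PER_MONTH:.2f} field years")
```

Under these figures the ten-year target is reached a little before the 3-month withdrawal point, which is why the 6-month test can speak to ten-year reliability at all.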
STEP 4: PLAN DATA ACQUISITION AND IMPLEMENT

• Specify test conditions or operational environment
Washing machine design example: Run washing machines with full load of soiled
towels, mixed with sand, wrapped in plastic bag
• Specify sample size and selection process
Washing machine design example: Select 12 units randomly from each of 3 prototype
lots and put on life test
• Specify protocol and operational details
Washing machine design example:
– Record failures and determine failure mode
– After 3 months and again after 6 months
• Remove 4 units from each of 3 lots and measure degradation
• Replace 3 month withdrawals with 12 units from 4th prototype lot
– Assure high-precision measurements, meaningful failure definition, complete and
consistent data recording procedures, etc.
• Specify data analysis plan and assess expected statistical precision
Washing machine design example:
– Do Weibull distribution analysis on time to failure data after 6 months
– Conduct supplementary analysis using degradation data
– Simulation study demonstrated proposed plan provides desired statistical precision
• Specify pilot study
Washing machine design example: Run three washing machines for one week
[Figure: plot of Percent Failing versus Years]

14
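The selection and withdrawal protocol above can be sketched as a reproducible random draw. This is a simplified illustration, not the presenters' procedure: the lot size and unit labels are hypothetical, failures are ignored (in practice withdrawals would be sampled from unfailed units), and the fixed seed is there only so the selection can be audited.

```python
import random

UNITS_PER_LOT = 12        # from the slide: 12 units from each of 3 prototype lots
WITHDRAWN_PER_LOT = 4     # from the slide: 4 units per lot removed to measure degradation
LOT_SIZE = 40             # assumption: prototypes available per lot (not given in the slides)

def draw_units(lot, n, rng, lot_size=LOT_SIZE):
    """Randomly select n distinct units from one prototype lot."""
    serials = rng.sample(range(1, lot_size + 1), n)
    return [f"lot{lot}-unit{s:02d}" for s in sorted(serials)]

rng = random.Random(2013)  # fixed seed so the draw can be reproduced

# Start of test: 3 lots x 12 units fill the 36 test stands.
on_test = {lot: draw_units(lot, UNITS_PER_LOT, rng) for lot in (1, 2, 3)}
print("stands in use at start:", sum(len(units) for units in on_test.values()))

# After 3 months: withdraw 4 units per lot at random, then add 12 units from lot 4.
withdrawn_3mo = {}
for lot in (1, 2, 3):
    withdrawn_3mo[lot] = rng.sample(on_test[lot], WITHDRAWN_PER_LOT)
    on_test[lot] = [u for u in on_test[lot] if u not in withdrawn_3mo[lot]]
on_test[4] = draw_units(4, 3 * WITHDRAWN_PER_LOT, rng)

stands_in_use = sum(len(units) for units in on_test.values())
print("stands in use after 3-month swap:", stands_in_use)
```

Drawing units by a seeded random number generator, rather than by convenience, is what lets the prototype lots stand in for production when the results are analyzed.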
STEP 5: MONITOR, CLEAN DATA, ANALYZE AND
VALIDATE

• Monitor implementation to ensure that process is being followed
Washing machine design example: Continue involvement

• Clean data—as gathered
Washing machine design example: Develop proactive checks for missing or
inconsistent data

• Conduct preliminary analyses; act thereon, as appropriate
Washing machine design example: Analyze failure data after 1 week, 1 month
and 3 months; identify failure modes for correction

• Conduct final data analysis and report findings
Washing machine design example: Do final analyses after 6 months (failure
and degradation data)

• Validate: Propose appropriate validation testing
Washing machine design example:
– Continue 6 of 36 units on test beyond 6 months
– Test 100 machines with company employees and 60 machines in
laundromats
– Audit sample 6 production units each week: Test five for 1 week; one for
3 months
– Develop system for capturing and analyzing field reliability data
– Provide current data access to engineers and management
15
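The final failure-data analysis could look roughly like the following. This is a minimal sketch of a Weibull maximum-likelihood fit with right-censored units (those still running at the end of the test), not the presenters' actual analysis: the data are simulated from hypothetical parameters, all 36 units are treated as one population run to 21 simulated years, and the staggered fourth lot, the degradation data and confidence bounds are left out.

```python
import math
import random

def fit_weibull(failure_times, censored_times):
    """Maximum-likelihood Weibull fit for right-censored data.

    Returns (shape, scale). The shape solves the profile-likelihood
    equation, found here by bisection; the scale then has a closed form.
    """
    r = len(failure_times)
    if r == 0:
        raise ValueError("at least one observed failure is required")
    all_times = list(failure_times) + list(censored_times)
    mean_log_fail = sum(math.log(t) for t in failure_times) / r

    def score(shape):
        powered = [t ** shape for t in all_times]
        weighted = sum(p * math.log(t) for p, t in zip(powered, all_times))
        return weighted / sum(powered) - 1.0 / shape - mean_log_fail

    lo, hi = 0.01, 50.0          # score() is increasing in shape on this bracket
    for _ in range(200):
        mid = (lo + hi) / 2
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = (lo + hi) / 2
    scale = (sum(t ** shape for t in all_times) / r) ** (1 / shape)
    return shape, scale

# Simulated example: hypothetical true life distribution, test ends at 21 simulated years.
rng = random.Random(0)
END_OF_TEST = 21.0
lifetimes = [rng.weibullvariate(30.0, 1.5) for _ in range(36)]   # (scale, shape)
failures = [t for t in lifetimes if t <= END_OF_TEST]
survivors = [END_OF_TEST for t in lifetimes if t > END_OF_TEST]

shape_hat, scale_hat = fit_weibull(failures, survivors)
r10 = math.exp(-((10 / scale_hat) ** shape_hat))
print(f"{len(failures)} failures, {len(survivors)} units unfailed at end of test")
print(f"shape = {shape_hat:.2f}, scale = {scale_hat:.1f} years, estimated R(10 years) = {r10:.3f}")
```

Counting the unfailed units as censored observations, rather than dropping them, is what keeps the estimate from being pessimistic.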
SOME GUIDELINES FOR EFFECTIVE DATA
ACQUISITION (STEP 4)
RECORD KEY VARIABLES AND EVENTS
Example: Using field data to estimate reliability and
speedily identify/address root causes of failures calls
for
• Field data
– Estimate of product usage
– Product performance measurements over time
– Time to failure
– Failure mode information
• Manufacturing data
– Parts and manufacturing lot identification
– Actual process conditions
– Ambient conditions during manufacture
– Unplanned events
– Other potentially important process variables
– End-of-line performance
16
ENSURE CONSISTENT AND
ACCURATE DATA RECORDING
• Strive for precise measurements
• Combat data recording inconsistencies
– Differences between operators
– Differences in qualitative scaling assessments
– Differences in data recording conventions; e.g. date of 2/8

• Address missing values
– Understand reason
– Handle appropriately
– Minimize occurrence

• Conduct timely data cleaning: Identify “errors” in
recorded data (e.g. 999 for missing values) and correct
17
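Two of the recording problems named on this slide, a missing-value code such as 999 and a date written as 2/8, lend themselves to simple proactive checks. The sketch below is hypothetical throughout: the list of sentinel codes, the sample measurements and the function names are invented for illustration.

```python
from datetime import date

SENTINELS = {999, -999, 9999}   # assumption: codes a recording system might use for "missing"

def clean_measurement(value):
    """Return None for missing-value codes so they never enter an average."""
    return None if value is None or value in SENTINELS else value

def date_readings(text, year):
    """List every calendar date an 'a/b' entry could mean (month/day or day/month)."""
    a, b = (int(part) for part in text.split("/"))
    readings = set()
    for month, day in ((a, b), (b, a)):
        try:
            readings.add(date(year, month, day))
        except ValueError:
            pass   # not a valid calendar date under this convention
    return sorted(readings)

raw = [12.1, 11.8, 999, 12.4]   # hypothetical measurements with one missing-value code
kept = [v for v in map(clean_measurement, raw) if v is not None]
print(f"mean with sentinel left in: {sum(raw) / len(raw):.1f}")
print(f"mean with sentinel removed: {sum(kept) / len(kept):.1f}")

for entry in ("2/8", "2/28"):
    options = date_readings(entry, 2013)
    status = "ambiguous" if len(options) > 1 else "unambiguous"
    print(entry, status, [d.isoformat() for d in options])
```

Checks like these are cheap to run as the data are gathered, when the person who recorded the value can still be asked what was meant.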
AVOID SYSTEMATICALLY
UNRECORDED OBSERVATIONS
Some examples:
• Information recorded on failed units
only
• Information only during warranty
period
• Exclusion of “outlier” information
• Purging of “old” (but still relevant) data
18
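A short simulation shows why the first two examples matter. It is a sketch under invented assumptions: product life follows a hypothetical Weibull distribution, the warranty lasts one year, and the database holds only units that failed under warranty. Averaging the recorded failure times then says almost nothing about product life.

```python
import random

rng = random.Random(7)
TRUE_SCALE, TRUE_SHAPE = 30.0, 1.5   # hypothetical Weibull life distribution, in years
WARRANTY_YEARS = 1.0                 # hypothetical warranty period
N_UNITS = 100_000

lifetimes = [rng.weibullvariate(TRUE_SCALE, TRUE_SHAPE) for _ in range(N_UNITS)]

# Only failures inside the warranty period reach the database.
recorded = [t for t in lifetimes if t <= WARRANTY_YEARS]

true_mean = sum(lifetimes) / len(lifetimes)
naive_mean = sum(recorded) / len(recorded)

print(f"units sold: {N_UNITS}, failures recorded: {len(recorded)}")
print(f"mean life of all units:         {true_mean:.1f} years")
print(f"mean of recorded failure times: {naive_mean:.2f} years")
```

The recorded mean cannot exceed the warranty period by construction. Recovering a sensible estimate requires also knowing how many units were in service without failing, and for how long.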
SOME OTHER HINTS
• Strive to obtain continuous data
• Aim for compatibility and integration of
databases
• Consider sampling

19
CHALLENGES
• Some practical challenges
– Added cost and possible delays
– Added bureaucracy
– Diversity of data ownership:
Engineering, Manufacturing, etc.
– Need for added work not evident
Result: Lack of motivation by data recorders and their
management

• Strive to overcome by
– Recognizing perspectives of others
– Understanding consequences of our requests
– Making requests as simple and reasonable as possible
– Automating data acquisition process
– Providing convincing justification (e.g., insurance)
20
SOME RELEVANT FURTHER COMMENTARIES
• Webinar adapted from
– Hahn, G.J. and Doganaksoy, N. (2011), A Career in Statistics: Beyond the
Numbers, Wiley (Chapter 11).
– Doganaksoy, N. and Hahn, G.J. (2012), Getting the Right Data Up Front: A Key
Challenge, Quality Engineering, Vol. 24, No. 4, 446-459.
• Also note
– Anderson-Cook, C.M. and Borror, C.M. (2013), Paving the Way: Seven Data
Collection Strategies to Enhance Your Quality Analyses, Quality Progress, April, 18-29.
– Coleman, D.E. and Montgomery, D.C. (1993), A Systematic Approach for Planning a
Designed Industrial Experiment, Technometrics, Vol. 35, No. 1, 1-12.
– DeVeaux, R.D. and Hand, D.J. (2005), How to Lie with Bad Data, Statistical Science,
Vol. 20, No. 3, 231-238.
– Hahn, G.J. and Doganaksoy, N. (2003), Data Acquisition: Focusing on the
Challenge, Presentation at Joint Statistical Meetings.
– Hahn, G.J. and Doganaksoy, N. (2008), The Role of Statistics in Business and
Industry, Wiley.
– Kenett, R.S. and Shmueli, G. (2013), On Information Quality (with discussion and
rejoinder), Journal of the Royal Statistical Society, Series A (forthcoming).
– Schield, M. (2006), Beware the Lurking Variable, Stats, 46, 14-18.
– Snee, R.D. and Hoerl, R.W. (2012), Inquiry on Pedigree: Do you know the quality and
origin of your data?, Quality Progress, December, 66-68.
– Steiner, S.H. and MacKay, R.J. (2005), Statistical Engineering: An Algorithm for Reducing
Variation in Manufacturing Processes, Milwaukee, WI, ASQ Quality Press.
21
ELEVATOR SPEECH
• We need to put the horse (focus on data acquisition)
before the CART (Classification and Regression
Tree) data analysis
• Specific proposals
– Focus on data acquisition in training programs
– Scrutinize available data to assess relevance
and identify gaps
– Use disciplined, targeted process for added data
acquisition
– Remain constantly cognizant of underlying
assumptions
• Thanks for listening
– Gerry Hahn, gerryhahn@yahoo.com
– Necip Doganaksoy, necipdoganaksoy@yahoo.com
22
Impact of censored data on reliability analysisImpact of censored data on reliability analysis
Impact of censored data on reliability analysis
 
An introduction to weibull analysis
An introduction to weibull analysisAn introduction to weibull analysis
An introduction to weibull analysis
 
A multi phase decision on reliability growth with latent failure modes
A multi phase decision on reliability growth with latent failure modesA multi phase decision on reliability growth with latent failure modes
A multi phase decision on reliability growth with latent failure modes
 
Reliably Solving Intractable Problems
Reliably Solving Intractable ProblemsReliably Solving Intractable Problems
Reliably Solving Intractable Problems
 
Reliably producing breakthroughs
Reliably producing breakthroughsReliably producing breakthroughs
Reliably producing breakthroughs
 
ASQ RD Webinar: Design for reliability a roadmap for design robustness
ASQ RD Webinar: Design for reliability   a roadmap for design robustnessASQ RD Webinar: Design for reliability   a roadmap for design robustness
ASQ RD Webinar: Design for reliability a roadmap for design robustness
 
ASQ RD Webinar: Improved QFN Reliability Process
ASQ RD Webinar: Improved QFN Reliability Process ASQ RD Webinar: Improved QFN Reliability Process
ASQ RD Webinar: Improved QFN Reliability Process
 
A Novel View of Applying FMECA to Software Engineering
A Novel View of Applying FMECA to Software EngineeringA Novel View of Applying FMECA to Software Engineering
A Novel View of Applying FMECA to Software Engineering
 
Astr2013 tutorial by mike silverman of ops a la carte 40 years of halt, wha...
Astr2013 tutorial by mike silverman of ops a la carte   40 years of halt, wha...Astr2013 tutorial by mike silverman of ops a la carte   40 years of halt, wha...
Astr2013 tutorial by mike silverman of ops a la carte 40 years of halt, wha...
 
Comparing Individual Reliability to Population Reliability for Aging Systems
Comparing Individual Reliability to Population Reliability for Aging SystemsComparing Individual Reliability to Population Reliability for Aging Systems
Comparing Individual Reliability to Population Reliability for Aging Systems
 
2013 asq field data analysis & statistical warranty forecasting
2013 asq field data analysis & statistical warranty forecasting2013 asq field data analysis & statistical warranty forecasting
2013 asq field data analysis & statistical warranty forecasting
 
Plan a more effective rdt
Plan a more effective rdtPlan a more effective rdt
Plan a more effective rdt
 

Último

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 

Último (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Data Acquisition: A Key Challenge for Quality and Reliability Improvement

  • 1. Data Acquisition: A Key Challenge for Quality and Reliability Improvement
      Gerald J. Hahn & Necip Doganaksoy
      ©2013 ASQ & Presentation Hahn & Doganaksoy
      http://reliabilitycalendar.org/webinars/
  • 2. ASQ Reliability Division English Webinar Series
      One of the monthly webinars on topics of interest to reliability engineers.
      To view the recorded webinar (available to ASQ Reliability Division members only), visit asq.org/reliability.
      To sign up for the live webinars, which are free and open to anyone, visit reliabilitycalendar.org and select English Webinars to find links to register for upcoming events.
  • 3. DATA ACQUISITION: A KEY CHALLENGE FOR QUALITY AND RELIABILITY IMPROVEMENT
      Gerald J. Hahn, GE Global Research (Retired), gerryhahn@yahoo.com
      Necip Doganaksoy, GlobalFoundries, necipdoganaksoy@yahoo.com
      ASQ RELIABILITY DIVISION WEBINAR, November 14, 2013
  • 4. THE OBVIOUS, THE EXPECTATION AND THE REALITY
      • The Obvious
          – Statistical quality and reliability analyses are based upon sample data (and assumptions about sampled populations, etc.)
          – Such analyses are only as good as the data upon which they are based
          – Bad data lead to more complex, less powerful or invalid analyses
          – David Moore: The most important information about any statistical study is how the data were produced
      • The Expectation: Much attention is given to the data acquisition process in training and applications
      • The Reality: Little or insufficient attention is generally given to the data acquisition process
  • 5. THE CONSEQUENCES AND THE CHALLENGE
      • The Consequences
          – Why is it that every database that I have encountered is filled with data quality problems? (Theodore Johnson, 2003 QPRC)
          – Common wisdom puts the extent of the total project effort spent in cleaning the data before doing any analysis as high as 60-95% (DeVeaux and Hand, Statistical Science 2005)
      • The Challenge: Move data acquisition to front burner
          – Understand limitations of available data
          – Emphasize data acquisition
          – Use disciplined process
  • 6. WEBINAR TOPICS
      • Typical data acquisition situations
      • Problems (and opportunities) with observational data
      • A disciplined, targeted approach for data acquisition
      • Washing machine design reliability example
      • Some guidelines for effective data acquisition
      • Some practical challenges
      • Some relevant further commentaries
      • Elevator speech
      EMPHASIS ON QUALITY AND RELIABILITY
  • 7. TYPICAL DATA ACQUISITION SITUATIONS
      • Control over data acquisition
          – Designed experiments
          – Random sampling studies (from specified population)
          – Double-blind medical studies
          – Systems development studies, e.g.,
              • Estimate design reliability
              • Evaluate measurement system
              • Assess process capability
              • Signal changes via control charts
              • Anticipate/avoid field failures by automated monitoring
      • Observational studies (and data mining) on existing data, often from Big Data
      MANY APPLICATIONS INVOLVE COMBINATIONS
  • 8. PROBLEMS (AND OPPORTUNITIES) WITH OBSERVATIONAL DATA
      • Problems with “available” databases
          – Data obtained for purposes other than statistical analysis
          – Data reside in different databases
      • Some limitations of observational data
          – Missing values and events
          – Unrepresentative observations
          – Inconsistent or imprecise measurements
          – Limited variability
          – Key impacting variables unrecorded; recorded proxy variables deemed “significant” (e.g., foot size impacts reading ability)
      • Observational studies
          – May be helpful for prediction, e.g., credit performance, top selling items before expected hurricane, finding best time to buy plane ticket
          – Misleading or useless for gaining “cause and effect” understanding
          – Observation from the trenches (Kati Illouz, GE): Data owners tend to be overly optimistic about their data
      • Data inadequacies (and reasons) define future information needs
      QUALITY, NOT QUANTITY, OF DATA IS WHAT COUNTS
  • 9. IN SUMMARY
      • Even the most sophisticated statistical analysis cannot compensate for or rescue inadequate data
      • It’s not that there is a lack of data. Instead, it is that the data are inadequate to answer the questions (NY Times article on “How Safe is Cycling?”, October 22, 2013)
      • Massive data does not guarantee success…Knowing how the data were collected (the “data pedigree”) is critical (Snee, Union College Mathematics Conference, October 2013)
      • A good principle to remember is that data are guilty until proven innocent, not the other way around (Snee and Hoerl, QP Dec 2012)
      • Observational data have an important role in pointing the way forward, but they should not be a primary ingredient for making final decisions (Anderson-Cook and Borror, QP April 2013)
  • 10. DISCIPLINED, TARGETED PROCESS FOR DATA ACQUISITION (DEUPM) FOR SYSTEMS DEVELOPMENT STUDY
      • Proposed process:
          – Step 1: D: Define the problem
          – Step 2: E: Evaluate the existing data
          – Step 3: U: Understand data acquisition opportunities and limitations
          – Step 4: P: Plan data acquisition and analysis
          – Step 5: M: Monitor, clean data, analyze and validate
      • Example: Demonstrate desired ten-year reliability for new washing machine design in 6 months elapsed time
  • 11. STEP 1: DEFINE THE PROBLEM
      • Define specific questions to be answered
        Washing machine design example:
          – Stated objective: Show within 6 months and with 95% confidence that the following can be met:
              • 95% reliability after one year of operation
              • 90% reliability after five years
              • 80% reliability after ten years
            (“reliability” defined as no repair or servicing need)
          – Added question: How can reliability be improved further?
      • Identify resulting actions
        Washing machine design example: Go to full scale production if validated and make identified improvements
      • State population or process of interest
        Washing machine design example: 6 million machines to be built in next 5 years
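To get a rough feel for what a "95% confidence" reliability statement demands, the sketch below uses the nonparametric zero-failure (success-run) bound: n units that all survive to the target life demonstrate reliability R with confidence C when R^n ≤ 1 − C. This is not the analysis used in the webinar, which relies on a Weibull analysis of accelerated test data; the function name is hypothetical and the bound is shown only for orientation.

```python
import math

def zero_failure_sample_size(reliability: float, confidence: float) -> int:
    """Smallest n such that n survivors out of n units demonstrate `reliability`
    with one-sided `confidence` (success-run bound: reliability**n <= 1 - confidence)."""
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))

# Reliability targets from the slide, keyed by years of operation.
targets = {1: 0.95, 5: 0.90, 10: 0.80}
sizes = {years: zero_failure_sample_size(r, 0.95) for years, r in targets.items()}

for years, n in sizes.items():
    print(f"{years:>2}-year target: {n} units, all surviving")
```

Under this bound the one-year target is the most demanding in unit count, since each unit would have to survive only one simulated year but many more units are needed.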
  • 12. STEP 2: EVALUATE THE EXISTING DATA
      • Understand the process and its physical basis
        Washing machine design example: Study up and participate in design reviews, FMEAs (Failure Mode and Effects Analyses), etc.
      • Determine and analyze existing data
        Washing machine design example:
          – Previous design
              • Existing data
                  – In-house component, sub-assembly and system tests
                  – Field failure and servicing data
              • Conclusion: Previous design does not meet current reliability goals
          – New design
              • Proposed new design aims to correct key past problems
              • Possible concern: Introduction of new failure modes
              • Existing data: Component and sub-assembly test results
              • Data identified one new failure mode; rapidly addressed and corrected
              • Conclusion: Proposed new design appears to correct past problems without introducing new ones; reliability goals appear to be met
      • Identify data inadequacies
        Washing machine design example: No information about system performance in realistic use environment
  • 13. STEP 3: UNDERSTAND DATA ACQUISITION OPPORTUNITIES AND LIMITATIONS
      • Gain understanding of data that can be acquired and how
        Washing machine design example: In-house accelerated use rate systems testing
          • Simulate 3.5 years of operation per month
          • Evaluate weekly for failures
          • Sample unfailed units and measure degradation (destructive test)
      • Determine practical considerations and limitations in data acquisition
        Washing machine design example:
          • 6 months of testing
          • 3 prototype lots initially (and one more subsequently)
          • 36 available test stands
      • Assess relevance of resulting data to meet study goals and underlying assumptions
        Washing machine design example:
          • Assume prototype lots representative of 5-year high volume production
          • Assume failures are cycle (and not elapsed time) dependent
          • Assume realistic simulation of field environment
      Conclusion: This is an analytic (not enumerative) study; statistical confidence bounds capture only statistical uncertainty
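The use-rate acceleration on this slide reduces to simple arithmetic, sketched below. It rests on the slide's own assumption that failures depend on cycles rather than elapsed time; the constant and function names are illustrative.

```python
SIMULATED_YEARS_PER_MONTH = 3.5  # acceleration stated on the slide
TEST_MONTHS = 6                  # available test duration stated on the slide

def simulated_years(test_months: float) -> float:
    """Field-equivalent years of operation accumulated after `test_months` on test."""
    return SIMULATED_YEARS_PER_MONTH * test_months

def months_to_reach(field_years: float) -> float:
    """Test months needed to accumulate `field_years` of simulated operation."""
    return field_years / SIMULATED_YEARS_PER_MONTH

total = simulated_years(TEST_MONTHS)
milestones = {years: round(months_to_reach(years), 2) for years in (1, 5, 10)}

print(f"Simulated operation after {TEST_MONTHS} months: {total} years")
for years, months in milestones.items():
    print(f"{years:>2}-year point reached after about {months} test months")
```

Six months on test therefore covers roughly twice the ten-year horizon, provided the cycle-dependence assumption holds.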
  • 14. STEP 4: PLAN DATA ACQUISITION AND IMPLEMENT
      • Specify test conditions or operational environment
        Washing machine design example: Run washing machines with full load of soiled towels, mixed with sand, wrapped in plastic bag
      • Specify sample size and selection process
        Washing machine design example: Select 12 units randomly from each of 3 prototype lots and put on life test
      • Specify protocol and operational details
        Washing machine design example:
          – Record failures and determine failure mode
          – After 3 months and again after 6 months
              • Remove 4 units from each of 3 lots and measure degradation
              • Replace 3 month withdrawals with 12 units from 4th prototype lot
          – Assure high-precision measurements, meaningful failure definition, complete and consistent data recording procedures, etc.
      • Specify data analysis plan and assess expected statistical precision
        Washing machine design example:
          – Do Weibull distribution analysis on time to failure data after 6 months
          – Conduct supplementary analysis using degradation data
          – Simulation study demonstrated proposed plan provides desired statistical precision
      • Specify pilot study
        Washing machine design example: Run three washing machines for one week
      [Figure: Percent Failing versus Years]
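The selection and replacement protocol above can be sketched as a seeded random draw, so that the choice of units is reproducible and auditable. The lot sizes and unit IDs are hypothetical (the slide does not give them), and the sketch ignores failures: in practice the 3-month withdrawals would be drawn from unfailed units only.

```python
import random

def select_units(lots: dict, per_lot: int, rng: random.Random) -> dict:
    """Simple random sample of `per_lot` unit IDs from each lot, without replacement."""
    return {lot: sorted(rng.sample(units, per_lot)) for lot, units in lots.items()}

rng = random.Random(2013)  # fixed seed so the selection can be reproduced

# Hypothetical: 50 prototypes per lot, IDs like "L1-007".
lots = {f"lot{k}": [f"L{k}-{i:03d}" for i in range(1, 51)] for k in (1, 2, 3, 4)}

# Start of test: 12 units from each of the 3 initial lots fill the 36 stands.
initial = select_units({k: lots[k] for k in ("lot1", "lot2", "lot3")}, 12, rng)
on_test = [unit for units in initial.values() for unit in units]

# After 3 months: withdraw 4 units per initial lot for destructive degradation measurement.
withdrawn = select_units(initial, 4, rng)
removed = {unit for units in withdrawn.values() for unit in units}

# Refill the freed stands with 12 units from the 4th prototype lot.
replacements = sorted(rng.sample(lots["lot4"], 12))
on_test = [unit for unit in on_test if unit not in removed] + replacements

print(len(on_test), "units on test after the 3-month swap")
```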
  • 15. STEP 5: MONITOR, CLEAN DATA, ANALYZE AND VALIDATE
      • Monitor implementation to ensure that process is being followed
        Washing machine design example: Continue involvement
      • Clean data as gathered
        Washing machine design example: Develop proactive checks for missing or inconsistent data
      • Conduct preliminary analyses; act thereon, as appropriate
        Washing machine design example: Analyze failure data after 1 week, 1 month and 3 months; identify failure modes for correction
      • Conduct final data analysis and report findings
        Washing machine design example: Do final analyses after 6 months (failure and degradation data)
      • Validate: Propose appropriate validation testing
        Washing machine design example:
          – Continue 6 of 36 units on test beyond 6 months
          – Test 100 machines with company employees and 60 machines in laundromats
          – Audit sample 6 production units each week: Test five for 1 week; one for 3 months
          – Develop system for capturing and analyzing field reliability data
          – Provide current data access to engineers and management
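A proactive check for missing or inconsistent data might look like the sketch below, run on each weekly inspection record as it arrives. The record layout, field names, status values and the 26-week window are all assumptions made for illustration; the webinar does not specify a format.

```python
REQUIRED_FIELDS = ("unit_id", "lot", "week", "status")   # hypothetical record layout
KNOWN_STATUSES = ("running", "failed", "withdrawn")      # hypothetical status codes
MAX_WEEK = 26                                            # about 6 months of weekly checks

def check_record(rec: dict) -> list:
    """Return a list of human-readable problems found in one inspection record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if rec.get(field) in (None, ""):
            problems.append(f"missing {field}")
    status = rec.get("status")
    if status not in (None, "") and status not in KNOWN_STATUSES:
        problems.append(f"unknown status {status!r}")
    if status == "failed" and not rec.get("failure_mode"):
        problems.append("failed unit without failure mode")
    if status != "failed" and rec.get("failure_mode"):
        problems.append("failure mode recorded for unfailed unit")
    week = rec.get("week")
    if isinstance(week, int) and not 1 <= week <= MAX_WEEK:
        problems.append(f"week {week} outside 6-month test window")
    return problems

records = [
    {"unit_id": "L1-001", "lot": "lot1", "week": 3, "status": "running"},
    {"unit_id": "L2-007", "lot": "lot2", "week": 5, "status": "failed"},
    {"unit_id": "L3-011", "lot": "", "week": 40, "status": "running",
     "failure_mode": "pump leak"},
]
report = {rec["unit_id"]: check_record(rec) for rec in records}

for unit_id, problems in report.items():
    print(unit_id, "OK" if not problems else "; ".join(problems))
```

Running such checks as data are gathered, rather than at the 6-month analysis, leaves time to go back to the test stand while the facts can still be recovered.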
  • 16. SOME GUIDELINES FOR EFFECTIVE DATA ACQUISITION (STEP 4)
      RECORD KEY VARIABLES AND EVENTS
      Example: Using field data to estimate reliability and to speedily identify and address root causes of failures calls for
      • Field data
          – Estimate of product usage
          – Product performance measurements over time
          – Time to failure
          – Failure mode information
      • Manufacturing data
          – Parts and manufacturing lot identification
          – Actual process conditions
          – Ambient conditions during manufacture
          – Unplanned events
          – Other potentially important process variables
          – End-of-line performance
  • 17. ENSURE CONSISTENT AND ACCURATE DATA RECORDING
      • Strive for precise measurements
      • Combat data recording inconsistencies
          – Differences between operators
          – Differences in qualitative scaling assessments
          – Differences in data recording conventions; e.g. date of 2/8
      • Address missing values
          – Understand reason
          – Handle appropriately
          – Minimize occurrence
      • Conduct timely data cleaning: Identify “errors” in recorded data (e.g. 999 for missing values) and correct
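Two of the pitfalls named above, a 999 code standing in for a missing value and an ambiguous date such as 2/8, can be caught with a few lines of code. The sketch below is illustrative only: the sentinel set, function names and sample readings are invented, and the real missing-value codes should be confirmed with the data owners.

```python
from datetime import date

SENTINELS = {999, 9999, -1}  # hypothetical "missing" codes; confirm with the data owners

def clean_reading(value):
    """Map missing-value sentinels to None so they cannot leak into averages."""
    return None if value in SENTINELS else value

def date_readings(text: str, year: int) -> set:
    """All calendar dates an 'a/b' entry could mean under month/day and day/month conventions."""
    a, b = (int(part) for part in text.split("/"))
    candidates = set()
    for month, day in ((a, b), (b, a)):
        try:
            candidates.add(date(year, month, day))
        except ValueError:
            pass  # not a valid date under this convention
    return candidates

readings = [41.2, 999, 39.8, 40.5]  # invented measurements with one sentinel
valid = [v for v in map(clean_reading, readings) if v is not None]
naive_mean = sum(readings) / len(readings)
clean_mean = sum(valid) / len(valid)

ambiguous = date_readings("2/8", 2013)     # February 8 or August 2
unambiguous = date_readings("2/25", 2013)  # only February 25 is a valid date

print(f"mean with sentinel: {naive_mean:.3f}, mean without: {clean_mean:.3f}")
print(f"'2/8' has {len(ambiguous)} possible readings")
```

An entry with more than one possible reading should be flagged and resolved at the source rather than silently parsed under one convention.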
  • 18. AVOID SYSTEMATICALLY UNRECORDED OBSERVATIONS
      Some examples:
      • Information recorded on failed units only
      • Information only during warranty period
      • Exclusion of “outlier” information
      • Purging of “old” (but still relevant) data
  • 19. SOME OTHER HINTS
      • Strive to obtain continuous data
      • Aim for compatibility and integration of databases
      • Consider sampling
  • 20. CHALLENGES
      • Some practical challenges
          – Added cost and possible delays
          – Added bureaucracy
          – Diversity of data ownership: Engineering, Manufacturing, etc.
          – Need for added work not evident
        Result: Lack of motivation by data recorders and their management
      • Strive to overcome by
          – Recognizing perspectives of others
          – Understanding consequences of our requests
          – Making requests as simple and reasonable as possible
          – Automating data acquisition process
          – Providing convincing justification (e.g., insurance)
  • 21. SOME RELEVANT FURTHER COMMENTARIES
      • Webinar adapted from
          – Hahn, G.J. and Doganaksoy, N. (2011), A Career in Statistics: Beyond the Numbers, Wiley (Chapter 11).
          – Doganaksoy, N. and Hahn, G.J. (2012), Getting the Right Data Up Front: A Key Challenge, Quality Engineering, Vol. 24, No. 4, 446-459.
      • Also note
          – Anderson-Cook, C.M. and Borror, C.M. (2013), Paving the Way: Seven Data Collection Strategies to Enhance Your Quality Analyses, Quality Progress, April, 18-29.
          – Coleman, D.E. and Montgomery, D.C. (1993), A Systematic Approach for Planning a Designed Industrial Experiment, Technometrics, Vol. 35, No. 1, 1-12.
          – DeVeaux, R.D. and Hand, D.J. (2005), How to Lie with Bad Data, Statistical Science, Vol. 20, No. 3, 231-238.
          – Hahn, G.J. and Doganaksoy, N. (2003), Data Acquisition: Focusing on the Challenge, Presentation at Joint Statistical Meetings.
          – Hahn, G.J. and Doganaksoy, N. (2008), The Role of Statistics in Business and Industry, Wiley.
          – Kenett, R.S. and Shmueli, G. (2013), On Information Quality (with discussion and rejoinder), Journal of the Royal Statistical Society, Series A (forthcoming).
          – Schield, M. (2006), Beware the Lurking Variable, Stats, 46, 14-18.
          – Snee, R.D. and Hoerl, R.W. (2012), Inquiry on Pedigree: Do you know the quality and origin of your data?, Quality Progress, December, 66-68.
          – Steiner, S.H. and MacKay, R.J. (2005), Statistical Engineering: An Algorithm for Reducing Variation in Manufacturing Processes, Milwaukee, WI, ASQ Quality Press.
  • 22. ELEVATOR SPEECH
      • We need to put the horse (focus on data acquisition) before the CART (Classification and Regression Tree) data analysis
      • Specific proposals
          – Focus on data acquisition in training programs
          – Scrutinize available data to assess relevance and identify gaps
          – Use disciplined, targeted process for added data acquisition
          – Remain constantly cognizant of underlying assumptions
      • Thanks for listening
          – Gerry Hahn, gerryhahn@yahoo.com
          – Necip Doganaksoy, necipdoganaksoy@yahoo.com