Accelerated reliability techniques in the 21st century

Accelerated
Reliability Techniques
in the 21st Century
Mike Silverman, Lou LaVallee,
Doug Goodman, & Milena Krasich
©2013 ASQ & Presentation Silverman

http://reliabilitycalendar.org/webinars/

ASQ Reliability Division
English Webinar Series
One of the monthly webinars
on topics of interest to
reliability engineers.
To view recorded webinar (available to ASQ Reliability
Division members only) visit asq.org/reliability

To sign up for the live webinars, which are free and open to
anyone, visit reliabilitycalendar.org and select English
Webinars to find links to register for upcoming events

Accelerated Reliability
Techniques
in the 21st Century

by
Lou LaVallee, Senior Reliability Engineer, Ops A La Carte
Doug Goodman, CEO, Ridgetop Group
Mike Silverman, Managing Partner, Ops A La Carte
Milena Krasich, Sr Principal Systems Engineer, Raytheon
Page 3

ABSTRACT
• As product development cycles shorten, the need for
accelerated reliability tools is becoming increasingly
important.
• This webinar covers the reliability tools best suited to
accelerating the reliability learning process. We will
focus on four important accelerated tools:
• Robust design and Reliability Engineering Synergy
• Prognostics as a Tool for Reliable Systems
• HALT
• Accelerated Reliability Growth Testing
The audience will understand a variety of reliability tools that
can be used to accelerate product development from design
through testing. This will also provide a snapshot of the
learning that will be provided as part of the Accelerated
Stress Testing and Reliability Workshop for 2013.
Page 4

Agenda
• Introduction 5 min
• Robust Design and Reliability Engineering
Synergy 10 min
Lou LaVallee, Ops A La Carte

• Prognostics as a Tool for Reliable Systems 10 min
Doug Goodman, Ridgetop Group
• HALT/AST – History and Trends 10 min
Mike Silverman, Ops A La Carte
• Accelerated Reliability Growth Testing 10 min
Milena Krasich, Raytheon
• Tying them all together 5 min
• Questions 10 min
Page 5

Biography for Lou LaVallee
• Mr. LaVallee is the founder of Upstate Reliability Engineering Services,
an upstate New York based consulting firm delivering advanced reliability support to a wide
variety of industries. He joined forces with Ops A La Carte in 2010.
• He has a strong technical background in physics, engineering materials/polymer science and a
solid grounding in consumer product design, development, and delivery. His comprehensive
background includes electronic films, robust design, modeling & analytics, critical parameter
management, six sigma DFSS & DMAIC, optimization of product quality/reliability,
experimental design, reliability test methods, and design tool development and deployment. He
successfully managed systems engineering groups for development of ink jet print heads at
Xerox Corp.
• Mr. LaVallee has held other technical management positions in manufacturing technology and
engineering excellence, where he trained several thousand engineers worldwide. He also managed the
robust engineering center at Xerox for 10 years, managed a high volume printing product
quality and reliability group, and worked extensively with the high volume printing product service
organization.
• He has strong experience validating design quality and reliability through product reviews
and customer interaction. Mr. LaVallee holds a Bachelor of Science (BS) degree in Physics
and an MS in materials/polymer engineering from the University of Rochester.
• He holds several U.S. patents involving fluidics and engineering design processes. He is
currently a senior reliability engineering consultant with Ops A La Carte LLC. Mr. LaVallee is an
ASQ certified reliability engineer.
Page 6

Biography for Doug Goodman
• Mr. Goodman is CEO and Founder of Ridgetop Group, Inc.,
an Arizona-based leader in advanced diagnostic, prognostic,
and health management tools, instrumentation, and rad hard
microelectronics.
• He is accustomed to being a pioneer of innovative electronic technology and
establishing engineering firsts. His comprehensive background encompasses
low-noise instrumentation design, design-for-test (DFT), fault simulation
techniques, and design tool development at firms such as Tektronix and
Honeywell.
• He was also part of the team that developed the first DSP-based IF processing
for spectrum analyzers.
• He successfully steered engineering at Analogy Inc. (electromechanical design
simulation tools) as vice president until its IPO. Afterwards, he moved to co-
found and head Opmaxx Inc., a design-for-test IP firm that later merged with
Credence Systems.
• Mr. Goodman also serves on the Board of Engineering Synthesis Design, Inc.
(ESDI), a waveform and surface metrology instrumentation firm based in
Tucson, Arizona. (ESDI.com).
Page 7

Biography for Mike Silverman
• Mike Silverman is the founder and a managing partner at
Ops A La Carte, a Professional Consulting Company that has an
intense focus on helping customers with end-to-end reliability.
• Mike has over 25 years of experience in reliability engineering, reliability
management and reliability training. He is an experienced leader in reliability
improvement through analysis and testing.
• Through Ops A La Carte, Mike has had extensive experience as a consultant to
high-tech companies, and has consulted for over 500 companies in over 100
different industries in most of the US and 15 countries around the world.
• Mike is an expert in accelerated reliability techniques and owns HALT and
HASS Labs, one of the oldest and most experienced reliability labs in the world.
• Mike has recently completed his first book on reliability entitled “How Reliable Is
Your Product: 50 Ways to Improve Product Reliability”.
• Mike has authored and published 25 papers on reliability techniques and has
presented these around the world including Canada, China, Germany, Japan,
Korea, Singapore, Taiwan, and the USA. He has also developed and currently
teaches over 30 courses on reliability techniques.
• Mike is the chair of this year’s ASTR conference and chair of the Santa Clara
Valley IEEE Reliability Society.
Page 8

Biography for Milena Krasich
• Milena Krasich is a Senior Principal Systems Engineer in Raytheon
Integrated Defense Systems, RAM Engineering Group in MA.
• Prior to joining Raytheon, she was a Senior Technical Lead of Reliability
Engineering in Design Quality Engineering of Bose Corporation, Automotive
Systems Division. Before joining Bose, she was a Member of Technical Staff in the
Reliability Engineering Group of General Dynamics Advanced Technology Systems
(formerly Lucent Technologies), following a five-year tenure at the Jet Propulsion
Laboratory in Pasadena, California.
• While in California, she was a part-time professor at the California State University
Dominguez Hills, where she taught graduate courses in System Reliability,
Advanced Reliability and Maintainability, and Statistical Process Control. At that
time, she was also a part-time professor at the California State Polytechnic
University, Pomona, teaching undergraduate courses in Engineering Statistics,
Reliability, SPC, Environmental Testing, Production Systems Design.
• She holds a BS and MS in EE from the University of Belgrade, Yugoslavia, and is a
California registered Professional Electrical Engineer.
• She is also a member of the IEEE and ASQC Reliability Society, and a Fellow and
President Emeritus of the Institute of Environmental Sciences and Technology.
Currently, she is the Technical Advisor (Chair) to the US Technical Advisory Group
(TAG) to the International Electrotechnical Commission (IEC) Technical Committee
TC56, Dependability.
Page 9

Accelerated Stress Testing and Reliability Workshop
October 9-11, 2013 San Diego, CA
Accelerating Reliability into the 21st Century
Keynote Presenter Day 1: Vice Admiral Walter Massenburg
Keynote Presenter Day 2: Alain Bensoussan, Thales Avionics
CALL FOR PRESENTATIONS: We are now Accepting Abstracts.
Email to: don.gerstle@gmail.com.
Guidelines on website www.ieee-astr.org
For more details, click here to join our LinkedIn Group:
IEEE/CPMT Workshop on Accelerated Stress Testing and Reliability

Introduction
In this webinar, we will introduce four of the most
effective reliability techniques that can accelerate
reliability learning on your product.
• Robust Design and Reliability Engineering Synergy
• Prognostics as a Tool for Reliable Systems
• Highly Accelerated Life Testing (HALT)
• Accelerated Reliability Growth Testing (RGT)

We invite you to determine which can be most
effective for your reliability program.

Page 12

Robust Design and Reliability
Engineering Synergy
by Lou LaVallee

Page 13

Poll Question 1
Have you ever used Design for Robustness
techniques?
a) We use all the time
b) We’ve used a few times
c) We tried once
d) We haven’t used but are planning to
e) We have never used

Page 14

Robust Design &
Reliability Engineering Synergies

Louis LaVallee
Sr Reliability Consultant
Ops A La Carte

Abstract for full tutorial

Robust Design (RD) Methodology is discussed for hardware
development. Comparison is made with reliability engineering (RE)
tools and practices. Differences and similarities are presented.

Proximity to ideal function for robust design is presented and
compared to physics of failure and other reliability modeling and
prediction approaches. Measurement selection is shown to strongly
differentiate RD and reliability engineering methods. When and
how to get the most from each methodology is outlined. Pitfalls for
each set of practices are also covered. (This presentation is a taste
of a larger presentation to be delivered in San Diego.)

Page 16

Many Design Methods & Interfaces
[Figure: design methods that interface with Robust Design: AXD, TRIZ, QFD, DFR, Pugh, DOE, VA/VE, DFSS/6σ, and CP/CS management]
Page 17

[Figure: overlapping tool sets of Robust Design (RD) and Reliability.
Terms shown: life tests, P-diagram, root cause analysis, tolerance design, experiment layout, ideal function, POF, response, DOE, RCM, tuning, engineering, maintainability, CBM, 6σ, flexibility, lean, science, warranty $, robust design, simulation, reliability, quality, testing, models, loss, reuse, FMECA, transformability, HALT/HASS, planning, S/N, RSM, ADT, life prediction, redundancy, online QC, ALT, parameter design, FTA, availability, generic function, RBD]

Page 18

Robustness is…

“The ability to transform input to output as closely to ideal
function as possible. Proximity to ideal function is highly
desirable. A design is more robust if the ratio of useful part to
harmful part [of input energy] is large. A design is more
robust if it operates close to ideal, even when exposed to
various noise factors, including time.”

Reliability is…
“The ability of a system, subsystem, assembly, or component
to perform its required functions under stated conditions
for a specified period of time”
Page 19

Harmful Variation & Countermeasures
• Search for root cause & eliminate it
• Screen out defectives (scrap and rework)
• Feedback/feed forward control systems
• Tighten tolerances (control, noise, signal factors)
• Add a subsystem to balance the problem
• Calibration & adjustment
• Robust design (Parameter design & RSM)
• Change the concept to a better one
• Turn off or turn down the power
• Correct design mistakes (e.g. installing diodes backwards)

Page 20

Robustness Growth
[Figure: three S/N versus time curves showing robustness gains toward the benchmark target, grouped by how quickly factors can be changed: today, in 1 week, and in 2 weeks]
Page 21

Progression of Robustness to Ideal Function Development

[Figure: distributions A, B, and C shown against LSL and USL, with the progression Zero Defects, Cpk, Static S/N, Dynamic S/N Ratio]

When a product’s performance deviates from target, its quality is
considered inferior. Such deviations in performance cause losses to
the user of the product, and in varying degrees to the rest of
society.
Page 22

[Figure: P-diagram. Input signals (Mi) enter the main function, Y = f(x) + ε, which produces a useful output and a harmful output; noise factors and control factors act on the main function.]

Taxonomy of Design Function -- P Diagram
Page 23

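Transformability & Robustness Improvement
[Figure: two plots of response versus signal M, each starting at (0,0) and showing the response lines under noise conditions N1 and N2, before and after improvement]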

Minimizing the effects of noise factors on transformation of input to output
improves reliability. Sensitivity increase can be used for power reduction. Noise
factor here might be fatigue cycles, or stress in one or two directions, or …

Page 24

Typical Failure Modes and Causes for
Mechanical Springs

Static (constant deflection or constant load)
  Failure modes: load loss, creep, compression set, yielding
  Failure causes: parameter change, hydrogen embrittlement

Cyclic (10,000 cycles or more during the life of the spring)
  Failure modes: fracture, damaged spring end, fatigue failure, buckling, surging, complex stress change as a function of time
  Failure causes: corrosive atmosphere, misalignment, excessive stress range of reverse stress **, cycling temperature, …

Dynamic (intermittent occurrences of a load surge)
  Failure modes: fracture, fatigue failure
  Failure causes: surface defects, excessive stress range of reverse stress, resonance surging

Page 25

Ideal Function & Failure Modes

If data remain close to ideal function, even under
predicted stressful conditions of use, and there is no way
for failure to occur without affecting functional variation of
the data, then moving closer to ideal function is highly
desirable.

For example, spring fatigue, if it did occur would
dramatically change force-deflection (F-D) data and inflate
variation. Similarly, for yielding, F-D results would change
and inflate the variation. Other failure modes would follow
in most cases.

Page 26

Measurement System Ideal Function
Y = M + e
M = true value of measurand
Y = measured value

Auto Steering Ideal Function
Y = M + e
M = steering wheel angle
Y = turning radius

Communication System Ideal Function
Y = M + e
M = signal sent
Y = signal received

Cantilever Beam Ideal Function
Y = M/M* + e
M = load
M* = cross sectional area

Fuel Pump Ideal Function
Y = M
Y = fuel volumetric flow rate
M = IV/P (current, voltage, & backpressure)
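
As a rough illustration of how closeness to an ideal function can be quantified, the sketch below fits a zero-intercept line Y = βM to signal/response pairs and reports a simplified dynamic signal-to-noise ratio, 10·log10(β²/σ²). The slope β, the function name, and the data are assumptions made for illustration; the slides write the ideal functions as Y = M + e and do not give an S/N formula.

```python
import math


def dynamic_sn_ratio(signals, responses):
    """Fit Y = beta * M through the origin and return (beta, S/N in dB).

    Simplified dynamic S/N: 10 * log10(beta^2 / residual variance).
    This is an illustrative form, not the formula used by the presenters.
    """
    if len(signals) != len(responses) or len(signals) < 2:
        raise ValueError("need at least two paired observations")
    sum_mm = sum(m * m for m in signals)
    if sum_mm == 0:
        raise ValueError("signals must not all be zero")
    # Least-squares slope for a line forced through (0, 0).
    beta = sum(m * y for m, y in zip(signals, responses)) / sum_mm
    residual_ss = sum((y - beta * m) ** 2 for m, y in zip(signals, responses))
    variance = residual_ss / (len(signals) - 1)  # one fitted parameter
    if variance == 0:
        return beta, math.inf  # data lie exactly on the ideal line
    return beta, 10 * math.log10(beta ** 2 / variance)


# Hypothetical force-deflection style data at two noise conditions.
signals = [1, 2, 3, 1, 2, 3]
responses = [2.1, 3.9, 6.2, 1.8, 4.1, 5.9]
beta, sn_db = dynamic_sn_ratio(signals, responses)
print(f"beta = {beta:.3f}, S/N = {sn_db:.1f} dB")
```

A higher S/N here means the responses stay closer to the fitted ideal line across the noise conditions, which is the sense of "proximity to ideal function" used in these slides.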
Page 27

Summary
• RD methods and Reliability methods both have functionality at their
core. RD methods attempt to optimize the designs toward ideal function,
diverting energy from creating problems and dysfunction. Reliability
methods attempt to minimize dysfunction through mechanistic
understanding and mitigation of the root causes of problems.

• RD methods actively change design parameters to efficiently and cost
effectively explore viable design space. Reliability methods subject the
designs to stresses, accelerating stresses, and even highly accelerated
stresses, [to improve time and cost of testing]. First principle physical
models are considered where available to predict stability.

• Both RE and RD methods have strong merits, and learning when and
how to apply each is a great advantage to product engineering teams.

Page 28

Prognostics as a Tool
for Reliable Systems
by Doug Goodman

Page 29

Poll Question 2
Have you ever used Prognostics?
c) We tried once

Page 30

Reliability “Bathtub Curve”
[Figure: failure rate versus time, showing the infant mortality period, the useful life period, and the normal wearout period. A prognostics trigger point near the onset of wearout gives advanced warning of failure (RUL). Threshold trigger points are selectable.]

Page 31

Prognostic Solutions
• Ridgetop develops
electronic prognostic
solutions for critical
systems:
– Sensor array
detectors
– Harnesses for
“prognostics-enabling”
critical systems, and
– Sentinel software to
comprise a complete
solution.
Page 32

Electronic Prognostics
• Electronics are the keystone to successful deployment of
complex systems (50+ MPUs in an automobile)

• Large MTBF and Statistical Process Control and Centering
methods are not sufficient alone for reliability due to “outliers”
(e.g. Toyota Prius, Deepwater Horizon Drilling Rig, Boeing
787)

• Ridgetop technology exists to pinpoint degrading systems
before they fail; supporting operational readiness objectives
and cost-saving Prognostics/Health Management (PHM) and
Condition Based Maintenance (CBM) initiatives.

Page 33

Prognostics Health Management
(PHM)

Page 34

Degradation Rates Depend on
Environmental Conditions
[Figure: degradation versus time for different usage environments, with the MTBF statistical expected life and the regions T1 (red) and T2 (green) marked]
• Usage monitoring would provide a safety benefit if actual usage is more
severe than predicted (see the red region, T1).
• Service life can be extended beyond normal replacement time if the
actual usage severity is known (see the green region, T2).
PHM enables replacement only upon evidence of need

Page 35

Degradation Example
[Figure: state of health versus time for a good power system and a degraded power system, each with the degraded VR threshold and the end-of-life level marked]

Both supplies provide regulated voltage, but one is degraded and will
soon fail.
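
One simple way to turn a degradation trend like this into an advance warning is to extrapolate the state-of-health readings to a threshold. The sketch below is a minimal illustration that assumes a linear trend; the function name, the sample data, and the threshold value are hypothetical and do not describe Ridgetop's algorithms.

```python
def estimate_rul(times, health, threshold):
    """Estimate remaining useful life by linear extrapolation.

    Fits a least-squares line to (time, state-of-health) samples and returns
    the time left after the last sample until the line crosses `threshold`.
    Returns None when no downward trend is present.
    """
    n = len(times)
    if n < 2 or n != len(health):
        raise ValueError("need at least two paired samples")
    t_mean = sum(times) / n
    h_mean = sum(health) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("sample times must not all be equal")
    slope = sum((t - t_mean) * (h - h_mean) for t, h in zip(times, health)) / sxx
    intercept = h_mean - slope * t_mean
    if slope >= 0:
        return None  # health is not degrading, so no crossing is predicted
    crossing_time = (threshold - intercept) / slope
    return max(0.0, crossing_time - times[-1])


# Hypothetical state-of-health samples (1.0 = new) taken every 100 hours.
hours = [0, 100, 200, 300]
soh = [1.00, 0.90, 0.80, 0.70]
print(estimate_rul(hours, soh, threshold=0.50))
```

Real degradation is often nonlinear, so a practical implementation would choose the trend model per failure mechanism and report an uncertainty band along with the estimate.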

Page 36

Prognostic Advantages
• Prognostics provides advanced warning of impending
failure conditions on critical systems.
• Physical evidence of degradation is the basis for service,
not an arbitrary time interval.
• PHM and CBM maintenance strategies can reduce
support costs through optimized timing of service and
parts replacement.
• Autonomic logistics systems can be established, placing
spare parts and provisions where needed.

Page 37

Faults Occur at Multiple Levels
in Systems

Page 38

Sentinel Network™
• Collection and analysis hub
for PHM
• Scalable, system level State
of Health (SoH) Analysis &
Prognostics
• Automatic SNMP-based
Sensor Network Discovery
• Troubleshooter
• System stability cost
reduction for tactical
networks

Page 39

Airborne Power System Monitoring
• PHM applied to power
systems in harsh
environment
• Apache Helicopter where
vibration, heat, shock all
can reduce lifetime of
deployed systems
• Extracts and processes
eigenvalues as a metric
of health

Page 40

Prognostic Health Management
Ecosystem
[Figure: five-step PHM cycle linking the subsystem parts OEM, an integrated diagnostic/prognostics design platform, and Line Replaceable Unit (LRU) maintenance. Labels shown: real-time health & RUL; communicate PHM sensor data; identified design improvements; address ECRs and improve parts; CBM actions and scheduler; minimize inventory of replacement parts.]
Page 41

HALT/AST
by Mike Silverman

Page 42

Poll Question 3
Have you ever used the HALT technique?
c) We tried once

Page 43

INTRODUCTION

• HALT began 40 years ago with a simple idea of testing
beyond specifications in order to better understand
design margins.
• Over the past 40 years, thousands of engineers around
the world have been exposed to the concepts of HALT
and have tried the techniques.

What have we learned in the past 40 years?

Page 44

HISTORY OF HALT/HASS

• HP started performing Stress for Life (STRIFE) testing in
the early 1970s. Some people consider STRIFE the
predecessor to HALT.
• “Reliability cannot be achieved by adhering to
detailed specifications. Reliability cannot be
achieved by formula or by analysis. Some of
these may help to some extent, but there is only
one road to reliability. Build it, test it, and fix the
things that go wrong. Repeat the process until the
desired reliability is achieved. It is a feedback
process and there is no other way.”
  David Packard, 1972
Page 45

• Dr. Gregg Hobbs officially coined the term HALT in 1988.
• For the next two decades, Gregg traveled around the
world teaching the concepts of HALT and HASS.
• Many of you in this room probably attended that seminar.

Page 46

• Over the next 17 years, HALT labs popped up around the
world. Today I estimate there are about 200 HALT labs in
the world.
• This has exposed literally thousands of engineers to the
processes.
• However, the methodology being practiced is inconsistent.
– Standards (more like guidelines) have been and are
being written
– Books have been published
– Conferences have been formed

Page 47

• Standards/Guidelines/References to HALT/HASS
– IEC 62506: “Accelerated Testing”
– IEST-RP-PR003: “HALT and HASS”
– IPC-9592: “Performance Parameters for Power
Conversion Devices”

Page 48

• Books on HALT/HASS
– “Accelerated Reliability Engineering: HALT and
HASS”, Gregg Hobbs, 2000
– “HALT, HASS, and HASA Explained”, Harry W.
McLean, 2009
– “Improving Product Reliability: Strategies and
Implementation”, Levin and Kalal, 2003
– “Accelerated Testing and Validation”, Alex Porter of
Intertek, 2004
– “How Reliable Is Your Product: 50 Ways to Improve
Your Product Reliability”, Mike Silverman, 2010

Page 49

• Conferences
– The ASTR Conference started in 1995 and has been
held every year since, except 2001. It is the only
conference solely dedicated to accelerated testing
– Other conferences with an ALT track
  – Applied Reliability Symposium (ARS)
  – Reliability and Maintainability Symposium (RAMS)
– Other conferences with a reliability focus
  – IRPS
  – Prognostics Conference (two of them)

Page 50

WHAT IS HALT?

HALT: A design technique used to discover product
weaknesses and improve design margins. The intent is to
systematically subject a product to stress stimuli
well beyond the expected field environments in order to
determine and expand the operating and destruct limits of
your product.

- 50 Ways to Improve Your Product Reliability, Mike Silverman

Page 51

WHAT IS NOT HALT?
Some classic HALT misconceptions:
• My product does not experience vibration, so we can’t use it
• The spec for this component is 70 °C, so we can’t go above that
in HALT
• We can’t drill holes in the product because it will change the
airflow
• We must mount it in the same direction as it will be mounted in
the field
• We don’t need to go above the first failure point because that is
what will fail first
• Run to preset levels (remember, this is not a pass/fail test)
• Don’t stress beyond specifications
• Only perform HALT at system level
• Perform HALT only when diagnostics are fully ready
Page 52

BASICS OF HALT

Start low and step up the
stress, testing the product
during the stressing

Page 53

BASICS OF HALT

Gradually increase
stress level until a
failure occurs

Page 54

BASICS OF HALT

Analyze
the failure
Page 55

BASICS OF HALT

Make
temporary
improvements
Page 56

BASICS OF HALT
Increase
stress and
start
process
over
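
The step-stress sequence on these slides (start low, step up, stop at the first failure, analyze and fix, then continue) can be sketched as a simple loop. This is only an illustration: `passes_functional_test` is a hypothetical callback standing in for the real functional test of the unit, and the stress levels are made-up numbers.

```python
def step_stress(start, step, max_stress, passes_functional_test):
    """Step the stress upward until the unit first fails its functional test.

    Returns (last_passing_level, first_failing_level). The second value is
    None if the unit survives every level up to max_stress.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    level = start
    last_passing = None
    while level <= max_stress:
        if not passes_functional_test(level):
            return last_passing, level
        last_passing = level
        level += step
    return last_passing, None


# Hypothetical unit that stops operating at or above 95 (e.g. degrees C).
result = step_stress(start=20, step=10, max_stress=150,
                     passes_functional_test=lambda s: s < 95)
print(result)  # (90, 100)
```

After the failure is analyzed and a temporary fix is made, the same loop would be restarted from the failing level to look for the next weakness, until a fundamental technological limit is reached.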

Page 57

BASICS OF HALT

Fundamental
Technological
Limit

Page 58

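BASICS OF HALT
Classic S-N Diagram
(stress vs. number of cycles)
[Figure: S-N curve with stress levels S0, S1, S2 (increasing) on the stress axis and the corresponding cycles to failure N2, N1, N0 (increasing) on the cycles axis]
S0 = Normal stress conditions
N0 = Projected normal life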
Page 59

BASICS OF HALT
Classic S-N Diagram
(stress vs. number of cycles)
[Figure: the same S-N curve, with the point at which failures become non-relevant marked]
S0 = Normal stress conditions
N0 = Projected normal life
Page 60

BASICS OF HALT

[Figure: stress axis showing, from low to high stress: lower destruct limit, lower operating limit, product operational specs, upper operating limit, upper destruct limit]

Page 61

BASICS OF HALT

[Figure: the same stress axis (lower destruct limit, lower operating limit, product operational specs, upper operating limit, upper destruct limit), with the operating margin and the destruct margin marked]
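
To make the margin idea concrete, the sketch below computes operating and destruct margins from a set of limits found in HALT. It assumes each margin is measured from the product specification to the corresponding limit, which is one common reading of the diagram; the class name, field names, and temperature values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HaltLimits:
    """Specification and HALT-discovered limits for one stress (same units)."""
    spec_low: float
    spec_high: float
    oper_low: float
    oper_high: float
    destruct_low: float
    destruct_high: float


def margins(limits):
    """Return margins measured from the product spec to each limit (assumed definition)."""
    return {
        "lower_operating_margin": limits.spec_low - limits.oper_low,
        "upper_operating_margin": limits.oper_high - limits.spec_high,
        "lower_destruct_margin": limits.spec_low - limits.destruct_low,
        "upper_destruct_margin": limits.destruct_high - limits.spec_high,
    }


# Hypothetical temperature limits in degrees C.
example = HaltLimits(spec_low=-10, spec_high=60,
                     oper_low=-50, oper_high=110,
                     destruct_low=-80, destruct_high=140)
for name, value in margins(example).items():
    print(f"{name}: {value}")
```

Tracking these numbers from one HALT iteration to the next gives a simple record of how much each design fix expanded the limits.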

Page 62

NEW ADVANCES IN HALT

• Along with improvements in chamber
technology, there have been advances in the
methodology as well.
– Harry McLean’s HALT Calculator
  – To determine “Guard Band” limits during the
  HALT plan
  – To determine AFR after HALT
– Using FMEA to determine specific areas to test for
– Linking HALT to ALT
– Using HALT for software/firmware issues

Page 63

FUTURE OF HALT AND HASS
• The number of companies performing HALT will
continue to rise as more labs obtain HALT equipment
• The need for more education will continue to increase
• Standards/guidance docs will gain more importance as
more companies and labs are doing HALT, many
incorrectly.
• Chambers will need to provide stresses in addition to
temperature and vibration to keep up with the physics
of the failures (especially due to smaller packages and
MEMS devices).
• Move away from people and move to process
• HALT as an acronym will fade away
• Less HALT and more emphasis on DFR including HALT
Page 64

CONCLUSION

• In this presentation
– we took you through 40 years of HALT
– showed you advances that have been made
– pointed out areas where improvements are
still needed

Page 65

Accelerated Reliability Growth Testing
by Milena Krasich

Page 66

Poll Question 4
For the last RGT you performed, did you
have a chance to plan the duration and the
stresses?
a) Planned both
b) Planned duration only
c) Planned test environments
d) Did not plan RG test

Page 67

Tutorial Objectives
Show a synopsis of the tutorial which will be presented at
ASTR 2013.
• Reliability Growth Test objectives
• Explain traditional Reliability Growth test methodology
• Show shortcomings of the traditional methods
– Entire item failure rate not calculated
– Test duration too long for modern high reliability items
– Little or no relationship between reliability and stresses
• Show principles of the Physics of Failure test methodology
• Show how the Reliability Growth test based on PoF is constructed
• Show how the expected stresses are applied and accelerated
• Show reliability measures
• Show advantages of the PoF test design and acceleration
• Show the considerable test cost reduction achieved.
Page 68

Traditional RG Test Methodology
Applied stresses in test: magnitude equal to those in use
• Involved assumptions of stress average magnitudes
Overall test duration is determined based on the initial and goal
reliability measures: failure rates or Mean Time Between
Failures, MTBF (or MTTF)
Environmental or operational stresses are applied in sequence or
simultaneously at the use levels
• Applied stress duration is determined by engineering judgment
• Overall test duration and stress application are unrelated to use profiles
or required life or mission of the product
Additional errors:
• Random failure rates and those of uncorrected failure modes are not
added into the final failure rates

Page 69

Mathematics of Traditional Reliability Growth
Failure mode types in test:
• Systematic: corrected in test (Type B) or not corrected (Type A)
• Random: constant failure rate

\[ \lambda_{Item}(t) = \lambda_B(t) + \lambda_A(t) + \lambda_r(t) \]

Only the type B failure mode failure rates are accounted for in a
reliability test program. They are the only failure modes with
decreasing failure rates, and they show growth expressed by the
power law model; the type A and random failure rates remain constant.

[Figure: failure intensity/failure rate (failures/hour, 0 to 0.06) versus test duration (0 to 6000 hours), showing the total \lambda_S(t) = \lambda_A(t) + \lambda_r(t) + \lambda_B(t), the constant \lambda_r(t) and \lambda_A(t), and the decreasing \lambda_B(t) (power law)]
Page 70

Information Needed – Information Obtained
Test duration is mathematically determined from:
\[ \theta_F = \theta_I \left( \frac{t_F}{t_I} \right)^{\alpha} \]
\[ \ln\theta_F = \ln\theta_I + \alpha \left( \ln t_F - \ln t_I \right) \]
\[ t_F = t_I \, e^{\frac{1}{\alpha} \ln\left( \theta_F / \theta_I \right)} \]
• Where:
– θ_F = final product MTBF (for mitigated, “fixed” failure modes only): given goal
– θ_I = initial product MTBF (for failure modes that will be mitigated): assumed
– t_F = test duration needed to achieve the final MTBF for fixed failure modes
– t_I = initial test time (has various explanations): assumed
– α = parameter initially assumed, then determined from analysis (power law);
the test duration might change depending on failure mitigation success
• The test duration is too long for high reliability items and depends on three
assumed parameters
• Failure rates of the non-mitigated failure modes and those considered random
are assumed or predicted
Information obtained from the mathematical Reliability Growth technique is
NOT product MTBF; it is only the MTBF from the improved, type B, failure modes
• The A type (not fixed) and random failure modes are forgotten!
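
A minimal sketch of the test-duration calculation, assuming a Duane-type growth relation θ_F = θ_I·(t_F/t_I)^α for the mitigated failure modes. The function name is hypothetical, and the value α = 0.6 is an assumption: it is chosen because it reproduces the roughly 4.6×10^3 hour traditional test time quoted in the worked example later in the deck (initial MTBF 100,000 hours, final MTBF 10^6 hours, initial test time 100 hours).

```python
import math


def rg_test_duration(theta_initial, theta_final, t_initial, alpha):
    """Test time needed to grow MTBF from theta_initial to theta_final.

    Assumes theta_final = theta_initial * (t_final / t_initial) ** alpha,
    solved for t_final. All times share one unit (hours here).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha is expected to lie between 0 and 1")
    if theta_final <= theta_initial:
        raise ValueError("final MTBF must exceed initial MTBF")
    return t_initial * math.exp(math.log(theta_final / theta_initial) / alpha)


# Inputs from the deck's example; alpha = 0.6 is an assumed growth rate.
duration = rg_test_duration(theta_initial=1e5, theta_final=1e6,
                            t_initial=100, alpha=0.6)
print(f"traditional test duration: {duration:.0f} hours")
```

Because the result depends exponentially on 1/α, small changes in the assumed growth rate change the planned duration considerably, which is one reason the slide stresses that the duration rests on assumed parameters.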
Page 71

Physics of Failure and Reliability
Failures occur when an item is not strong enough to withstand one or
more attributes of a stress:
• Level, duration, or repetitions of its application
– The higher the level, the shorter the duration or the fewer the repetitions needed to induce a failure
The area of overlap of the strength and stress distributions
represents the probability of failure for each of the stresses:
μ_L, σ_L = mean and standard deviation of the load distribution, with σ_L = b·μ_L
μ_S, σ_S = mean and standard deviation of the strength distribution, with σ_S = a·μ_S

• If the mean of strength is k times the mean of stress (load), and the
standard deviations of each are a and b times their respective mean
values, the reliability of an item regarding each use stress (i), and the total
reliability, will be:
\[ R_i(k, \mu_{L\_i}) = \Phi\left( \frac{k\,\mu_{L\_i} - \mu_{L\_i}}{\sqrt{\left(a\,k\,\mu_{L\_i}\right)^2 + \left(b\,\mu_{L\_i}\right)^2}} \right), \qquad R_{Item}(t_0) = \prod_{i=1}^{S} R_{Stress_i}(t_i) \]
where Φ is the standard normal cumulative distribution function.
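
The stress-strength relationship above can be evaluated numerically. The sketch below assumes that both load and strength are normally distributed, so the mean load cancels and reliability depends only on k, a, and b; the function names are hypothetical. The sample values a = 0.05 and b = 0.2 are the "most common" values mentioned on the following slide.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def stress_reliability(k, a, b):
    """Reliability regarding one stress for strength margin k.

    Assumes normal load and strength with mean strength = k * mean load,
    strength std dev = a * mean strength, load std dev = b * mean load.
    """
    return normal_cdf((k - 1.0) / math.sqrt((a * k) ** 2 + b ** 2))


def item_reliability(stress_reliabilities):
    """Total reliability as the product of the per-stress reliabilities."""
    return math.prod(stress_reliabilities)


for k in (1.0, 1.25, 1.5):
    print(k, round(stress_reliability(k, a=0.05, b=0.2), 4))
```

At k = 1 the two means coincide and the reliability is 0.5, which matches the lower end of the reliability axis on the margin-selection plot in the deck.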

Page 72

Physics of Failure Reliability – Margin k Selection
Allocate reliability regarding each of the expected stresses in use
• The cumulative damage, and ultimately failure, due to a stress is proportional
to the stress level and its duration. For the stress applied at the same level
as in life, the cumulative damage model is:
\[ D(t) \propto \int_{t} S(t)\, dt \]
• For the allocated reliability regarding each stress, select
the value of margin k which would multiply its duration in use
to be applied in test;
• Apply stresses simultaneously whenever possible;
• If the same stress type is applied at different levels in
use, recalculate their durations to the highest level (using
acceleration factors);
• The most common values for a and b are: a = 0.05, b = 0.2
[Figure: reliability (0.50 to 1.00) versus multiplier k (1.00 to 1.50) for (a, b) = (0.05, 0.5), (0.05, 0.2), (0.05, 0.05), (0.02, 0.2), (0.02, 0.1), (0.02, 0.05)]
Page 73

Test Acceleration
Each of the stresses is accelerated in test to allow for a shorter test
duration.
Total item failure rate is the sum of its failure rates regarding each
individual stress (λ_0 is the item total failure rate in use condition and λ_A is
the accelerated item total failure rate (in reliability growth is equivalent
to …)):
\[ \lambda_A = A_{Test}\,\lambda_0 = \sum_{i=1}^{N_S} \Big( \prod_{j} A_j \Big) \lambda_i \]

Product j exists when the stresses 1 to j produce the same failure mode.
Stress acceleration models for different stresses, for example:
• inverse power law model (usually applicable to thermal
cycling, vibration, shock, humidity);
• Arrhenius model (used for temperature acceleration using absolute temperature);
• Eyring model (used also when the thermal stress is a factor in process acceleration);
• step stress model, where the stress is increasing in steps;
• fatigue model representing the degradation due to the repetitious stress.
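
Two of the acceleration models listed above have simple closed forms and can be sketched directly. The function names, the activation energy, and the temperatures below are illustrative assumptions; the vibration levels (1.7 g rms in use, 3.2 g rms in test, exponent 4) are the ones used in the deck's test example.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_af(activation_energy_ev, use_temp_c, test_temp_c):
    """Arrhenius acceleration factor between a use and a test temperature (deg C)."""
    use_k = use_temp_c + 273.15
    test_k = test_temp_c + 273.15
    return math.exp(activation_energy_ev / BOLTZMANN_EV_PER_K
                    * (1.0 / use_k - 1.0 / test_k))


def inverse_power_af(test_stress, use_stress, exponent):
    """Inverse power law acceleration factor, (test stress / use stress) ** exponent."""
    return (test_stress / use_stress) ** exponent


# Assumed example: Ea = 0.7 eV, 55 C in use, 85 C in test.
print(f"Arrhenius AF: {arrhenius_af(0.7, 55, 85):.1f}")
# Vibration levels taken from the deck's test example.
print(f"Vibration AF: {inverse_power_af(3.2, 1.7, 4):.2f}")
```

An acceleration factor greater than 1 means one hour in test represents more than one hour in use; dividing the required use exposure by the factor gives the test exposure for that stress.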

Page 74

Test Example B Failure Modes

Stress/Requirement/Property            Symbol/Value   Units
Product life                           t0             h
Time ON                                ta             h/day
Internal temperature when ON           T_ON           °C
Internal temperature when OFF          T_OFF          °C
Temperature change                     ΔT_Use         °C
Rate of temperature change             ς_Use          °C/min
Number of thermal cycles               c_T            cycles/day
Temperature rise over the ambient      ΔT             °C
Relative humidity                      RH_Use         %
Distance travelled in product life     D              miles
Vibration level in use                 W_Use          g
Operational (ON/OFF) cycling           c              cycles/day

Determination of factor k for major stresses:
\[ R_i(t_0) = R_0(t_0)^{1/4} = 0.946, \qquad k = 1.5 \]
[Figure: reliability (0.5 to 1) versus multiplier k (1.00 to 1.50) for a = 0.1, b = 0.1]

Stresses:
• Thermal cycling
• Thermal exposure (thermal dwell)
• Humidity
• Vibration
• Operational cycling

Thermal dwell (normalize exposure when OFF to duration at ON temperature):
\[ t_{ON\_N} = t_{ON} + t_{OFF} \exp\left[ -\frac{E_a}{k_B} \left( \frac{1}{T_{OFF} + 273} - \frac{1}{T_{ON} + 273} \right) \right] \]
t_ON_N = 8,754 hours

Thermal cycling:
\[ A_{TC}\, A_{Ramp\_Rate} = \left( \frac{\Delta T_{Test}}{\Delta T_{Use}} \right)^{m} \left( \frac{\varsigma_{Test}}{\varsigma_{Use}} \right)^{1/3}, \qquad N_{TC\_Test} = \frac{N_{TC\_Use}\; k}{A_{TC}\, A_{Ramp\_Rate}} \]
One thermal cycle in test = 24 hours in life

Duration of accelerated exposure:
\[ t_{T\_Test} = t_{ON\_N}\; k\; \exp\left[ -\frac{E_a}{k_B} \left( \frac{1}{T_{ON} + 273} - \frac{1}{T_{Test} + 273} \right) \right] \]
t_T_Test = 168.1 h
Page 75

Test Example, Cont.
• The thermal exposure is combined with the thermal cycling, distributed over the high temperature.
• The test cycle profile:
\[ t_{TC} = 2 \cdot (\text{ramp time}) + (\text{temp. stabilization} + \text{thermal dwell}) + \text{dwell at cold} \]
\[ t_{TC} = 2 \cdot \frac{125}{10} + 22.3 + 5 = 52.3 \text{ min} = 0.875 \text{ h} \]
• Humidity: test at 95% RH and temperature T_RH = 85 °C (65 °C chamber + 20 °C internal temperature
rise)
\[ t_{RH\_Test} = t_{ON\_N} \left( \frac{RH_{Use}}{RH_{Test}} \right)^{h} \exp\left[ -\frac{E_a}{k_B} \left( \frac{1}{T_{ON} + 273} - \frac{1}{T_{RH} + 273} \right) \right] \]
h = 2.3
t_RH_Test = 300 h

• Vibration: 150,000 miles, 150 hours per axis vibration at 1.7 g rms. Test level: 3.2 g rms
\[ t_{Vib\_Test} = k\; t_{Vib\_Use} \left( \frac{W_{Use}}{W_{Test}} \right)^{w} \]
With w = 4:
t_Vib_Test = 18 hours per axis
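
The arithmetic behind two of these results can be checked with a few lines of code. This sketch assumes the margin k = 1.5 selected on the previous slide, and it reads the 125 and 10 in the cycle profile as a 125 °C temperature range and a 10 °C/min ramp rate, which is an interpretation of the slide rather than something it states. The function names are hypothetical.

```python
def vibration_test_hours(k, use_hours, use_level, test_level, exponent):
    """Test hours per axis from an inverse power law on the vibration level."""
    return k * use_hours * (use_level / test_level) ** exponent


def thermal_cycle_minutes(temp_range_c, ramp_rate_c_per_min, hot_minutes, cold_minutes):
    """Length of one thermal cycle: two ramps plus the hot and cold portions."""
    return 2 * temp_range_c / ramp_rate_c_per_min + hot_minutes + cold_minutes


# 150 hours per axis at 1.7 g rms in use, tested at 3.2 g rms with w = 4.
vib_hours = vibration_test_hours(k=1.5, use_hours=150, use_level=1.7,
                                 test_level=3.2, exponent=4)
print(f"vibration test time: {vib_hours:.1f} hours per axis")

# 22.3 min at high temperature (stabilization plus dwell) and 5 min at cold.
cycle_min = thermal_cycle_minutes(125, 10, hot_minutes=22.3, cold_minutes=5)
print(f"thermal cycle length: {cycle_min:.1f} min")
```

The vibration result rounds to the 18 hours per axis shown on the slide, and the cycle length matches the 52.3 minutes in the test cycle profile.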
Data for reliability plotting (n = 24):

Failure   Time to failure (h)   Cumulative time to failure (h)   θ(t) (h)      log(t)   log[θ(t)]
1         3,821.33              91,711.92                        91,711.92     4.96     4.96
2         5,781.33              138,751.92                       69,375.96     5.14     4.84
3         14,016                336,384.00                       112,128       5.53     5.05
4         18,563.44             445,522.56                       111,380.64    5.65     5.05
t0*k      131,400               3,153,600                        788,400       6.50     5.90

• Initial B failure modes MTBF: 100,000 hours; final: 10^6 hours
• Initial test time: 100 hours
• Total traditional test time: 4.6×10^3 hours
• Final test reliability (B failure modes): 0.99997
• Final MTBF (improved failure modes): 1,431,964 hours
• Total accelerated test time: 526 hours
Page 76
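
The columns of the plotting table can be reproduced from the failure times alone. The sketch below assumes that "(n=24)" means 24 units on test, so that cumulative time is the failure time multiplied by 24, and that the MTBF column is cumulative time divided by the number of failures so far; both readings are inferred from the numbers, and the function name is hypothetical.

```python
import math

N_UNITS = 24  # assumed meaning of "(n=24)" on the slide
FAILURE_TIMES_H = [3821.33, 5781.33, 14016.0, 18563.44]


def cumulative_mtbf_rows(failure_times, n_units):
    """Return (failure no., cumulative time, cumulative MTBF, log10 t, log10 MTBF)."""
    rows = []
    for count, t in enumerate(failure_times, start=1):
        cumulative = t * n_units      # total unit-hours accumulated at this failure
        mtbf = cumulative / count     # cumulative MTBF after `count` failures
        rows.append((count, cumulative, mtbf,
                     math.log10(cumulative), math.log10(mtbf)))
    return rows


for count, cumulative, mtbf, log_t, log_mtbf in cumulative_mtbf_rows(
        FAILURE_TIMES_H, N_UNITS):
    print(f"{count}  {cumulative:12.2f}  {mtbf:12.2f}  {log_t:.2f}  {log_mtbf:.2f}")
```

Plotting the last two columns against each other gives the log-log line whose slope is the growth parameter used in the reliability growth analysis.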

Why Accelerated Reliability Growth?
The test duration covers the product's entire life
• It allows detection of all design problems, not only those that appear in a small
fraction of product life
• It enables an estimate of the failure rate regarding product random events,
disregarded in traditional RG testing
• The failure rate achieved by design improvement, together with the random failure rate,
provides a realistic estimate of total product reliability
Test duration is determined based on required total reliability in view of
product physical cumulative damage from life stresses in use;
Test acceleration allows achievement of a very reasonable test duration,
shorter than traditional mathematically derived testing
• The reliability improvement through test is no longer cost prohibitive
Test failure times are projected to their appearance in real life and the
analysis uses this data;
Even though it covers the product expected life (durability information), the test is
still considerably shorter than the traditional reliability growth test
Page 77

Summary
What these 4 techniques have in
common is that
1) Each is a progressive accelerated
reliability technique being used today
2) Each will be highlighted in a tutorial at our
Accelerated Stress Testing and Reliability
(ASTR) Workshop Oct 9-11 in San Diego

Page 78

To find out more about this year’s
Accelerated Stress Testing & Reliability (ASTR)
Workshop
October 9-11, 2013 San Diego, CA

EMAIL: mikes@opsalacarte.com
(MIKE IS THIS YEAR’S ASTR GENERAL CHAIR)

ALL THAT RESPOND WILL BE ENTERED INTO A DRAWING
FOR A FREE REGISTRATION TO THE CONFERENCE

Page 80

Contact Information

Ops A La Carte, LLC
Mike Silverman
Managing Partner
(408) 472-3889
mikes@opsalacarte.com
www.opsalacarte.com

Ops A La Carte, LLC
Lou LaVallee
Senior Reliability Engineer
(585) 281-1882
loul@opsalacarte.com
www.opsalacarte.com

Raytheon
Milena Krasich
Sr. System Engineer
978-440-1578
Milena_Krasich@raytheon.com

Ridgetop Group
Doug Goodman
CEO
(520) 742-3300
doug.goodman@ridgetopgroup.com
www.ridgetopgroup.com

Page 82
