Data analysis ( Bio-statistic )

Interpretation of Clinical Biochemical Data
◘ Course Objectives :
(1) To distinguish between people in the population who have the disease
and those who do not
(2) To determine how good the test is in separating populations of people
with and without the disease in question.
◘ The Objectives include:
(1) Define the Key Terms
(2) Discuss how statistics is used to determine normal values and establish
Quality Control ranges
(3) Given numerical observations, calculate the:
a. Mean
b. Range
c. SD
d. CV%
e. CD
◘ INTRODUCTION
(1) When evaluating laboratory results, how do we determine what is
normal or acceptable? That is: What is “normal” or “OK”?
(2) When does a laboratory test result become “weird” or “abnormal”?
When do we become uncomfortable with a result?
(3) At some point we have to draw a “line”… on this side of the line you
are normal … on the other side of the line, you are abnormal

◘ Review of Statistical concepts
• Measures of central tendency: How can numerical values be
expressed as a central value? E.g., mean
• Dispersal about the central value:
How spread out are the numbers? E.g., range or SD
Using these 2 main ideas we can begin to:
(1) Understand how basic statistics are used in clinical chemistry to define
normal values, and
(2) Recognize when our instruments are (or are not) generating expected
numerical results
• Analytical Tests include both
• Screening Tests
• Diagnostic Tests
Both should have Validity and Reliability
◘ Validity and reliability of analytical tests
• Validity (fitness for the intended purpose)
The degree to which the tests accomplish the purpose for which they are
being used.
• Reliability (precision of the test)
The measure of how stable, dependable, trustworthy, and consistent a test
is in measuring the same thing each time
◘ Quality Assurance & Quality Control
• Quality Assurance (QA)
Includes pre-analytic, analytic and post-analytic factors
• Accuracy versus Precision
The laboratory must produce results that are both accurate and
reproducible.

◘ Classification of errors
The variables:
Pre-analytical variables
• Right specimen from right patient and in right condition
Analytical variables
• All parts of testing procedure performed properly, controls in range
Post-analytical variables
• Correct report to correct person, interpreted correctly
• All the phases of the testing process are subject to errors and must be
closely monitored, to maintain quality assurance.
♦ Accuracy: A measure of how close the
observations are to the “true” or “correct” value
♦ Precision: A measure of how close repeated observations are to one
another (reproducibility)
• In the laboratory, we need to report tests with
accuracy and precision.
• But how accurate do we need to be? It is not possible to hit the bulls-eye
every time.
• So how close is “close enough”?
• Figure: 3 possible testing outcomes (hitting the target)

◘ DESCRIBING DATA
♦ Measures of the center of data
Mean
♦ Measure of data variability
Standard deviation (variance)
Range
◘ Establishment of Reference Ranges
Each lab must establish its own reference ranges (normal range) based
on local population.
You should not depend on the kit’s reference range
The following steps show how to calculate the reference range:
Mean & Variance
Step (1): Sample Mean: The Average or Arithmetic Mean
Add up data, then divide by sample size (n)
The sample size n is the number of observations (pieces of data)
Mean: Example
Five systolic blood pressures (mmHg)
• (n = 5)
• 120, 80, 90, 110, 95
• Can be represented with math type notation:
• x1 = 120
• x2 = 80
• x3 = 90
• x4 = 110
• x5 = 95
• The sample mean is easily computed by adding up the five values and
dividing by five.
• In statistical notation the sample mean is frequently represented by a
letter with a line over it

♦ For example, x̄ (pronounced “x bar”)
Notes on Sample Mean
• Generic formula representation: x̄ = (x1 + x2 + … + xn) / n = Σx / n
• In the formula to find the mean, we use the
“summation sign”: Σ
• This is just mathematical shorthand for “add up all of the observations”
• Also called sample average or arithmetic mean
• Sensitive to extreme values
• One data point could make a great change in
sample mean
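The mean calculation and its sensitivity to extreme values can be sketched in a few lines of Python. This is only an illustration: the extra 200 mmHg reading is a hypothetical value added to show the effect of one outlier, not part of the original data.

```python
from statistics import mean

# Five systolic blood pressures from the example (mmHg)
pressures = [120, 80, 90, 110, 95]

# Sample mean: add up the data, then divide by the sample size n
x_bar = sum(pressures) / len(pressures)
print(x_bar)  # 99.0

# Hypothetical extreme reading (assumption, for illustration only)
with_outlier = pressures + [200]
x_bar_outlier = mean(with_outlier)
print(round(x_bar_outlier, 1))  # one data point moves the mean well above 99
```

A single added value shifts the mean by more than 16 mmHg here, which is what "sensitive to extreme values" means in practice.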
Step (2): Describing Variability
Sample variance (s²): The sum of the squared deviations about the
sample mean, divided by n − 1 (roughly the average squared deviation)
Sample standard deviation (s or
SD): The square root of s² (used to determine the
range)

Standard Deviation (SD)
SD = √[ Σ(x − x̄)² / (n − 1) ]
Σ = the sum over all observations
x̄ = the mean value
x = the value of each individual observation
n = number of observations
• The greater the SD, the more spread out the observations are
Example: Describing Variability
Five systolic blood pressures (mmHg) (n = 5): 120, 80, 90, 110, 95
The sample mean x̄ = 99 mmHg

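Example: Describing Variability (continued)
Sample variance (s²) = Σ(x − x̄)² / (n − 1)
= [(120 − 99)² + (80 − 99)² + (90 − 99)² + (110 − 99)² + (95 − 99)²] / (5 − 1)
= (441 + 361 + 81 + 121 + 16) / 4 = 1020 / 4 = 255 mmHg²
Sample standard deviation (SD) = √s² = √255
SD = 15.97 (mmHg)
∴ Blood pressure = 99 ± 16 mmHg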
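The variance and SD for this example can be checked with a short Python sketch. It assumes the n − 1 divisor (the sample variance), which is the divisor that reproduces the SD of 15.97 mmHg quoted here. The standard-library functions `variance` and `stdev` use the same divisor.

```python
from statistics import mean, stdev, variance

pressures = [120, 80, 90, 110, 95]  # mmHg
x_bar = mean(pressures)

# Squared deviations about the sample mean
squared_devs = [(x - x_bar) ** 2 for x in pressures]

# Sample variance: sum of squared deviations divided by (n - 1)
s2 = sum(squared_devs) / (len(pressures) - 1)

# Sample standard deviation: square root of the variance
sd = s2 ** 0.5

print(s2)            # 255.0
print(round(sd, 2))  # 15.97

# The library functions give the same answers as the manual steps
assert variance(pressures) == s2
assert abs(stdev(pressures) - sd) < 1e-9
```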

• Five systolic blood pressures (mmHg) (n = 5): 120, 80, 90, 110, 95
• A normal (Gaussian) curve showing the normal range, which encloses about
95% of the data and is bounded by the mean ± 2 standard deviations:
99 ± 2 × 15.97, i.e. (67 – 131) mmHg

♦ Notes on SD
• The bigger SD is, the more variability there is
• SD measures the spread about the mean
• SD can equal 0 only if there is no spread
• All n observations have the same value
• The units of SD are the same as the units of the data (for example,
mmHg)
◘ Establishment of Reference Ranges
♦ Reference ranges:
Normal ranges (reference ranges) are defined as being within ± 2
Standard Deviations of the mean
Each lab must establish its own reference ranges based on local
population
♦ Factors affecting reference ranges
Age
Sex
Diet
Medications
Physical activity
Pregnancy
Personal habits ( smoking, alcohol )
Geographic location ( altitude )
Body weight
Laboratory instrumentation ( methodologies )
Laboratory reagents

♦ An example of the Standard Deviation to determine Reference Range
• A minimum of 20 observations should be sampled in order to obtain
valid results (but I will use just 6 to save time)
• Let us determine the normal range for fasting plasma glucose using 6
people:
Ahmed glucose = 98 mg/dL
Aly glucose = 100 mg/dL
Fahmy glucose = 105 mg/dL
Magdy glucose = 150 mg/dL
Mohamed glucose = 102 mg/dL
Nabil glucose = 101 mg/dL
- Mean = 109 mg/dL
- SD = 20.0 mg/dL
- 2 SD = 40.0 mg/dL
• That means that the normal range for this group is 109 ± 40 mg/dL, or
from 69 to 149 mg/dL, which is Mean ± 2 SD
• Magdy is considered abnormal if we use these commonly accepted
criteria to define normal and abnormal
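The glucose example can be reproduced with a small Python sketch. The helper name `reference_range` is hypothetical, introduced only for this illustration. The code keeps full precision, so its limits differ slightly from the rounded 69 to 149 mg/dL in the notes.

```python
from statistics import mean, stdev

# Fasting plasma glucose values from the example (mg/dL)
glucose = {
    "Ahmed": 98, "Aly": 100, "Fahmy": 105,
    "Magdy": 150, "Mohamed": 102, "Nabil": 101,
}

def reference_range(values, k=2):
    """Return (low, high) as mean ± k sample standard deviations."""
    m = mean(values)
    sd = stdev(values)  # sample SD, n - 1 divisor
    return m - k * sd, m + k * sd

low, high = reference_range(list(glucose.values()))
print(round(low), round(high))

# Flag anyone whose value falls outside mean ± 2 SD
outside = [name for name, value in glucose.items() if not low <= value <= high]
print(outside)
```

With only 6 observations, one extreme value inflates both the mean and the SD, which is one reason a minimum of 20 observations is recommended.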
Example 2: Use of Standard Deviation to obtain the “range of acceptable
results”
Mean of group of control values = 104 mg/dL
Standard Deviation = ± 5 mg/dL
1. Determine the Range of ± 2 SD; (which will allow you to evaluate the
reference range)
2. Is a control value of 100 mg/dL acceptable?
3. Using 95% confidence limits, how often will a control be out of range
(statistically)?

Answers:
1. Mean ± 2 SD = 104 ± 10, i.e. 94 to 114 mg/dL
2. Yes, 100 mg/dL lies within 94 to 114 mg/dL
3. 5% of the time
That is 1 out of every 20 times!
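As a rough sketch, the ± 2 SD acceptance check for a control value could be written as follows. The function names `control_limits` and `in_control` are hypothetical, chosen only for this illustration. Real quality control schemes often apply additional rules that are not covered here.

```python
def control_limits(mean_value, sd, k=2):
    """Acceptable range for a control: mean ± k standard deviations."""
    return mean_value - k * sd, mean_value + k * sd

def in_control(value, mean_value, sd, k=2):
    """True if the control value lies within mean ± k SD."""
    low, high = control_limits(mean_value, sd, k)
    return low <= value <= high

# Control data from Example 2 (mg/dL)
limits = control_limits(104, 5)
print(limits)                   # (94, 114)
print(in_control(100, 104, 5))  # True: 100 mg/dL is acceptable
print(in_control(116, 104, 5))  # False: outside the ± 2 SD range
```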
What if the control specimen is “out of control?”
“Out of control” means that there is too much dispersion in your result
compared with the rest of the results – it’s “weird”
This suggests that something is wrong with the process that generated
that observation
Patient test results cannot be reported to physicians when there is
something wrong with the testing process that is generating
inaccurate reports
Things that can go wrong, and their corrections
(i.e. corrective methods)
1. Instrumentation malfunction (fix the machine)
2. Reagents deteriorated, contaminated, improperly prepared or simply
used up (get new reagents)
3. Tech error (identify error and repeat the test)
4. Control specimen is deteriorated or improperly prepared (get new
control)
◘ Coefficient of Variation (CV %)
• A way of expressing standard deviation in terms of average value of the
observations used in the calculation:
CV% = (SD / mean) × 100
• The CV allows us to compare different sets of observations relative to
their means.

♦ Example:
• Compare two different procedures for glucose
They use different reagents, have different means, produce different
SDs, etc.
Since they are different and cannot be compared directly, use the CV
formula.
You cannot use the SD to compare different groups of data because
they are measuring different observations - you cannot compare apples to
oranges.
The CV turns each group of observations into a percentage of its
own mean - everything gets turned into “oranges.”
The smaller the CV, the more reproducible the results: more values are
closer to the mean.
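A minimal sketch of the comparison, assuming CV% = (SD / mean) × 100. The replicate results for the two procedures are hypothetical numbers made up for illustration. They are chosen so that the procedure with the larger SD has the smaller CV.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation: SD as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

# Hypothetical replicate glucose results (mg/dL), not from the source
method_a = [98, 100, 102, 99, 101]    # mean 100
method_b = [198, 200, 203, 199, 200]  # mean 200

cv_a = cv_percent(method_a)
cv_b = cv_percent(method_b)
print(round(cv_a, 2), round(cv_b, 2))

# Method B has the larger SD but the smaller CV, so relative to its
# mean it is the more reproducible of the two
print(stdev(method_b) > stdev(method_a), cv_b < cv_a)
```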
◘ Critical Differences (CD)
• The preceding discussion relates to the comparison of measured data
with expected values based on observations on other people.
• When a measurement has been repeated, it is more relevant to consider it
in relation to the previous value (Follow-up).
The relevant question is whether the two values differ significantly.
This will depend on two factors:
1) The change in the parameter level being measured (biological
variation), and
2) The analytical variation
Both of these can be determined from the results of repeated
measurements of the same samples, and a function known as the critical
difference (CD) calculated from the equation:
CD = 2.8 × √(SDA² + SDB²)
where SDA and SDB are the analytical and biological standard
deviations, respectively (they should be available for each Lab)

- Values for some CDs should be available from the local laboratory
♦ Critical differences are independent of normal ranges.
- Indeed, two results may be critically different yet both are within the
normal range.
♦ Example:
• The typical human reference ranges for serum creatinine are
- 0.5 to 1.0 mg/dL for women
- and 0.7 to 1.2 mg/dL for men.
• Consider, for example, an increase in serum creatinine concentration
from 0.9 to 1.13 mg/dL = 0.23 mg/dL
• The CD for creatinine in this range of concentrations is about 0.19
mg/dL (Calculated from different laboratories) (See References).
• Thus, an increase of 0.23 mg/dL is of potential clinical significance
(implying a decrease in renal function), even though both values are
within the normal range.
• This discussion emphasizes the fact that serial measurements in
individuals can be (and in practice frequently are) more informative than
“one-off” measurements.
NB: Creatinine 1 mg/dL = 88.4 μmol/L
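The creatinine example can be sketched in Python. The CD of 0.19 mg/dL is taken from the example above. The `critical_difference` helper assumes the commonly quoted form CD = 2.8 × √(SDA² + SDB²); both function names are hypothetical, and the factor 2.8 should be checked against the local laboratory's own definition.

```python
import math

def critical_difference(sd_analytical, sd_biological, factor=2.8):
    """CD from analytical and biological SDs (assumed factor of 2.8)."""
    return factor * math.sqrt(sd_analytical ** 2 + sd_biological ** 2)

def is_critical_change(previous, current, cd):
    """True if two serial results differ by more than the CD."""
    return abs(current - previous) > cd

# Serum creatinine rising from 0.9 to 1.13 mg/dL, CD about 0.19 mg/dL
significant = is_critical_change(0.9, 1.13, cd=0.19)
print(significant)  # True, although both values are within the normal range

# A smaller rise, 0.9 to 1.0 mg/dL, does not exceed the CD
print(is_critical_change(0.9, 1.0, cd=0.19))  # False
```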

◘ Screening Tests
♦ Terms Related to Screening Tests
Sensitivity - ability of a test to identify those who have disease
Specificity - ability of a test to exclude those who do not have
disease
Tests with dichotomous results – tests that give either positive
or negative results
Tests of continuous variables – tests that do not yield obvious
“positive” or “negative” results, but require a cutoff level to be
established as criteria for distinguishing between “positive” and
“negative” groups
An important public health consideration is:
How good is the test at identifying people with the disease and
without the disease?
♦ In other words:
If we screen a population, what proportion of people who have the
disease will be correctly identified?
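Sensitivity and specificity are usually computed from counts of true and false results. The sketch below uses the standard definitions with hypothetical screening counts invented for illustration.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of people WITH the disease who test positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of people WITHOUT the disease who test negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical screening results (assumed numbers, not from the source):
# 100 people with the disease, 80 of whom test positive
# 900 people without the disease, 810 of whom test negative
sens = sensitivity(true_pos=80, false_neg=20)
spec = specificity(true_neg=810, false_pos=90)
print(sens, spec)  # 0.8 0.9
```

Here the test correctly identifies 80% of the people who have the disease and correctly excludes 90% of those who do not.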

◘ Sample Mean
◘ Sample variance (s²)
◘ Sample standard deviation (s or SD)
◘ The normal range = the mean ± 2 standard deviations.
◘ Critical Differences (CD)
• SDA and SDB are the analytical and biological standard deviations,
respectively
◘ Coefficient of Variation (CV %)
◘ The median
◘ The Mode
◘ Screening Test
