SlideShare a Scribd company logo
1 of 11
Download to read offline
MedPage Tools

                          .com
                                  Guide to Biostatistics
Study Designs                     Here is a compilation of important epidemiologic concepts and
                                  common biostatistical terms used in medical research. You can
                                  use it as a reference guide when reading articles published on
n   How Research Is Classified
                                  MedPage Today or download it to keep near the reading stand
                                  where you keep your print journals. For more detailed infor-
n   Terminology
                                  mation on these topics, use the reference list at the end of this
                                  presentation.
n   Important epidemiologic
    
    concepts
                                  Study Designs in Clinical Research

Descriptive Statistics            How research is classified1
                                                                                       Did
n   Measures of central
                                                                     Yes           researcher
                                                                                      assign
    tendency                                                                        exposures?

n   Measures of spread
                                       Experimental
                                                                                          No
                                           study
                                                                                 Observational
n   Measures of frequency of
    
                                                                                    study
    events
                                               Is          Yes
                                                                 Randomised
n   Measures of Association
                                         allocation
                                                                  controlled
                                          Random?                                    Is there a
                                                                                    Comparison
n   Terms used to describe the
    
                                                                                      group?
    quality of measurements                     No

                                       Non-Randomised
n   Measures of diagnostic test
                                       controlled Trial
                                                                                                    No
                                                                      Yes
    accuracy
                                                                       Analytical           Descriptive
                                                                        Study                 Study
n   Expressions used when
    
    making inferences about
    data
                                                                      Direction of
n   Multivariable Regression
                                                                     the study?
    Methods
                                                       Exposure                Exposure            Exposure and
                                                                                                    Outcome at
                                                       Outcome                 Outcome             the same time
References
                                                Cohort               Case control          Cross-sectional
                                                 study                  study                  study
Terminology
                                  Clinical Trial Experimental study in which the exposure status
                          .com    (e.g. assigned to active drug versus placebo) is determined by
                                  the investigator.

                                  Randomized Controlled Trial A special type of clinical trial
Study Designs                     in which assignment to an exposure is determined purely by
                                  chance.
n   How Research Is Classified    Cohort Study Observational study in which subjects with an
                                  exposure of interest (e.g. hypertension) and subjects without
n   Terminology                   the exposure are identified and then followed forward in time to
                                  determine outcomes (e.g. stroke).
n   Important epidemiologic
    
    concepts                      Case-Control Study Observational study that first identifies a
                                  group of subjects with a certain disease and a control group
                                  without the disease, and then looks to back in time (e.g. chart
                                  review) to find exposure to risk factors for the disease. This type
Descriptive Statistics            of study is well suited for rare diseases.
                                  Cross-Sectional Study Observational study that is done to ex-
n   Measures of central
                                 amine presence or absence of a disease or presence or absence
    tendency                      of an exposure at a particular time. Since exposure and outcome
                                  are ascertained at the same time, it is often unclear if the expo-
n   Measures of spread
                                 sure preceded the outcome.

n   Measures of frequency of
                                 Case Report or Case Series Descriptive study that reports on
    events                        a single or a series of patients with a certain disease. This type
                                  of study usually generates a hypothesis but cannot test a hy-
n   Measures of Association
                                 pothesis because it does not include an appropriate comparison
                                  group.
n   Terms used to describe the
    
    quality of measurements       Important Epidemiologic Concepts
                                  Bias Any systematic error in the design or conduct of a study
n   Measures of diagnostic test
                                 that results in a mistaken estimate of an exposure’s effect on risk
    accuracy                      of disease.

n   Expressions used when
                                 Selection Bias Bias introduced by the way in which participants
    making inferences about       are chosen for a study. For example, in a case-control study using
    data                          different criteria to select cases (e.g. sick, hospitalized population)
                                  versus controls (young, healthy outpatients) other than the pres-
n   Multivariable Regression
                                 ence of disease can lead the investigator to a false conclusion
    Methods                       about an exposure.
                                  Confounding This occurs when an investigator falsely concludes
                                  that a particular exposure is causally related to a disease without
                                  adjusting for other factors that are known risk factors for the
References                        disease and are associated with the exposure.
Descriptive Statistics
                          .com
                                  Measures of Central Tendency
                                  Mean equals the sum of observations divided by the number of
                                  observations.
Study Designs
                                  Median equals the observation in the center when all observa-
                                  tions are ordered from smallest to largest; when there is an even
n   How Research Is Classified
                                  number of observations the median is defined as the average of
                                  the middle two values.
n   Terminology
                                  Mode equals the most frequently occurring value among all
n   Important epidemiologic
                                 observations.
    concepts
                                  Measures of Spread
                                  Spread (or variability) describes the manner in which data are
Descriptive Statistics            scattered around a specific value (such as the mean). The most
                                  commonly used measures of spread are:
n   Measures of central
                                 Range is the difference between the largest observation and
    tendency                      the smallest.

n   Measures of spread
                                 Standard Deviation measures the spread of data around the
                                  mean. One standard deviation includes 68% of the values in a
n   Measures of frequency of
                                 sample population and two standard deviations include 95% of
    events                        the values.
                                  Standard Error of the Mean describes the amount of variability
n   Measures of Association
                                 in the measurement of the population mean from several differ-
                                  ent samples. This is in contrast to the standard deviation which
n   Terms used to describe the
                                 measures the variability of individual observations in a sample.
    quality of measurements
                                  Percentile equals the percentage of a distribution that is below
                                  a specific value. As an example, a child is in the 80th percentile
n   Measures of diagnostic test
    
                                  for height if only 20% of children of the same age are taller than
    accuracy
                                  he is.
n   Expressions used when
                                 Interquartile Range refers to the upper and lower bound-
    making inferences about       ary defining the middle 50 percent of observations. The upper
    data                          boundary is the 75th percentile and the lower boundary is the
                                  25th percentile.
n   Multivariable Regression
    
    Methods                       Measures of Frequency of Events
                                  Incidence The number of new events (e.g. death or a particular
                                  disease) that occur during a specified period of time in a popula-
References                        tion at risk for developing the events.

                                  Incidence Rate A term related to incidence that reports the
                                  number of new events that occur over the sum of time indi-
                                  viduals in the population were at risk for having the event (e.g.
events/person-years).
                                  Prevalence The number of persons in the population affected by
                          .com    a disease at a specific time divided by the number of persons in
                                  the population at the time.


Study Designs                     Measures of Association
                                  The types of measures used to define the association between
n   How Research Is Classified    exposures and outcome depends upon the type of data. For
                                  categorical variables, the relative risk and odds ratio are com-
n   Terminology                   monly used to describe the relationship between exposures and
                                  outcome.
n   Important epidemiologic
    
                                  Relative risk and cohort studies The relative risk (or risk ratio)
    concepts
                                  is defined as the ratio of the incidence of disease in the exposed
                                  group divided by the corresponding incidence of disease in the
                                  unexposed group (Figure 2). Relative risk can be calculated in co-
Descriptive Statistics            hort studies such as the Framingham Heart Study where subjects
                                  with certain exposures (e.g. hypertension, hyperlipidemia) were
n   Measures of central
                                 followed prospectively for cardiovascular outcomes. The inci-
    tendency                      dence of cardiac events in subjects with and without exposures
                                  was then used to calculate relative risk and determine whether
n   Measures of spread
                                 exposures were cardiac risk factors.
                                  Odds ratio and case-control studies The odds ratio is defined
n   Measures of frequency of
                                 as the odds of exposure in the group with disease divided by the
    events                        odds of exposure in the control group (Figure 1). As described
                                  above, subjects are selected on the basis of disease status in
n   Measures of Association
                                 case-control studies, therefore it is not possible to calculate the
                                  rate of development of disease given presence or absence of
n   Terms used to describe the
                                 exposure. So, the odds ratio is often used to approximate the
    quality of measurements
                                                       Cohort Study                    Case Control Study
n   Measures of diagnostic test
                                                        Disease                               Disease
    accuracy                                       Yes                 No               Yes               No

n   Expressions used when
    
    making inferences about                +       A                   B                 A                B
    data
                                    Test




n   Multivariable Regression
    
    Methods                                –       C                   D                 C                D



                                                       Relative Risk                         Odds Ratio
References                                              A/(A+B)                              A/C = A×D
                                                        C/(C+D)                              B/D B×C

                                  Figure 1: In a case-control study, the odds ratio can be used to approximate the
                                             relative risk under the assumption that the disease is rare.
relative risk in case-control studies. For example, a case-control
                                  study was done to evaluate the relationship between artificial
                          .com    sweeteners and bladder cancer. The odds of artificial sweetener
                                  use in the cases and controls were used to calculate an odds
                                  ratio and determine whether sweeteners were associated with
                                  bladder cancer. Under the assumption that the disease under
Study Designs                     consideration is rare (e.g. bladder cancer), the odds ratio gives a
                                  stable, unbiased estimate of the relative risk (Figure 1). The odds
n   How Research Is Classified    ratio from a case-control study nested within a defined cohort
                                  also approximates the relative risk even when the rare disease
n   Terminology                   assumption is not held.
                                  If the disease is rare, AB and CD. So, A/(A + B) is approximated
n   Important epidemiologic
                                 by A/B and C/(C + D) approximated by C/D. In this situation, the
    concepts                      relative risk equals (A/B)/(C/D) which, rearranged, equals the odds
                                  ratio A×D/B×C
                                  Absolute risk The relative risk and odds ratio provide a measure
Descriptive Statistics            of risk compared with a standard. However, it is sometimes desir-
                                  able to know the absolute risk. For example, a 40% increase in
n   Measures of central
                                 risk of heart disease because of a particular exposure does not
    tendency                      provide insight into the likelihood that exposure in an individual
                                  patient will lead to heart disease.
n   Measures of spread
    
                                  The Attributable risk or Risk difference is a measure of abso-
                                  lute risk. It represents the excess risk of disease in those exposed
n   Measures of frequency of
    
                                  taking into account the background rate of disease. The attribut-
    events
                                  able risk is defined as the difference between the incidence rates
                                  in the exposed and non-exposed groups.
n   Measures of Association
    
                                  A related term, the Population Attributable Risk is used to de-
n   Terms used to describe the
                                 scribe the excess rate of disease in the total study population of
    quality of measurements       exposed and non-exposed individuals that is attributable to the
                                  exposure. This measure is calculated by multiplying the Attributable
n   Measures of diagnostic test
                                 risk by the proportion of exposed individuals in the population.
    accuracy                      Number needed to treat (NNT) The number of patients who
                                  would need to be treated to prevent one adverse outcome is
n   Expressions used when
                                 often used to present the results of randomized trials. NNT is the
    making inferences about       reciprocal of the absolute risk reduction (the absolute adverse
    data                          event rate for placebo minus the absolute adverse event rate for
                                  treated patients). This approach can be used in studies of vari-
n   Multivariable Regression
                                 ous interventions including both treatment and prevention. The
    Methods                       estimate for NNT is subject to considerable error and is generally
                                  presented with 95% confidence intervals so that it can be prop-
                                  erly interpreted.
References
                                  Terms Used To Describe The Quality Of Measurements
                                  Reliability The concept of reliability or reproducibility is related
                                  to the amount of error in any measurement (e.g. blood pressure
measurement). A more formal definition of reliability is variability
                                  between subjects divided by inter-subject variability plus mea-
                          .com    surement error. Thus, reliability is greater when measurement er-
                                  ror is minimal. There are several types of reliability including: inter
                                  and intra-observer reliability and test-retest reliability.
                                  Percent agreement and the kappa statistic are often used to re-
Study Designs                     port reliability. The kappa statistic takes into account agreement
                                  that would be seen by chance alone while percent agreement
n   How Research Is Classified    does not. Generally, a kappa greater than 0.75 represents excel-
                                  lent agreement beyond chance, a kappa below 0.40 represents
n   Terminology                   poor agreement and a kappa of 0.40-0.75 represents intermedi-
                                  ate to good agreement.
n   Important epidemiologic
    
    concepts                      Validity refers to the extent to which a test or surrogate is mea-
                                  suring what we think it is measuring. There are several types of
                                  validity that can be measured including content validity (the
                                  extent to which the measure reflects the dimensions of a particu-
Descriptive Statistics            lar problem), construct validity (the extent to which a measure
                                  conforms to an external established phenomenon), and criterion
n   Measures of central
                                 validity (the extent to which a measure correlates with a gold
    tendency                      standard or can predict an observable phenomenon). These
                                  types of validity are often applied to questionnaires in which the
n   Measures of spread
                                 truth is not physically verifiable.

n   Measures of frequency of
                                 Measures Of Diagnostic Test Accuracy
    events                        Sensitivity is defined as the ability of the test to identify cor-
                                  rectly those who have the disease. It is the number of subjects
n   Measures of Association
                                 with a positive test who have disease divided by all subjects who
                                  have the disease. A test with high sensitivity has few false nega-
n   Terms used to describe the
                                 tive results.
    quality of measurements
                                  Specificity is defined as the ability of the test to identify cor-
n   Measures of diagnostic test
                                 rectly those who do not have the disease. It is the number of
    accuracy                      subjects who have a negative test and do not have the disease
                                  divided by the number of subjects who do not have the disease.
n   Expressions used when
                                 A test with high specificity has few false positive results.
    making inferences about       Sensitivity and specificity are test characteristics that are most
    data                          useful when assessing a test used to screen a free-living popula-
                                  tion. These test characteristics are also interdependent: an in-
n   Multivariable Regression
                                 crease in sensitivity is accompanied by a decrease in specificity
    Methods                       and visa versa. This is illustrated best by continuous tests where
                                  the cut-off for a positive test result can be varied. For example,
                                  consider the use of the white blood cell (WBC) count as a test to
                                  diagnose bacterial infection. If one sets a high cut-off for a posi-
References
                                  tive test (e.g. WBC 25,000) then the test will have a low sensitiv-
                                  ity and high specificity compared to the test characteristics if the
                                  cut-off is lower (e.g. WBC10,000).
Predictive values are important for assessing how useful a test
                                  will be in the clinical setting at the individual patient level. The
                          .com    positive predictive value is the probability of disease in a pa-
                                  tient with a positive test. Conversely, the negative predictive
                                  value is the probability that the patient does not have disease if
                                  he has a negative test result.
Study Designs                     Predictive values depend on the prevalence of a disease in a
                                  population. A test with a given sensitivity and specificity can
n   How Research Is Classified    have different predictive values in different patient populations.
                                  If the test is used in a population with a high prevalence, it will
n   Terminology                   have a high positive predictive value and the same test will have
                                  a low positive predictive value when used in a population with
n   Important epidemiologic
                                 low disease prevalence. For example, a positive stool test for oc-
    concepts                      cult blood is much more likely to be predictive of colon cancer in
                                  a population of elderly people compared with twenty year olds.

Descriptive Statistics                                  Disease

                                                  Yes              No
n   Measures of central
                                                                                              Sensitivity
    tendency                                                                                    A/(A+C)
                                          +       A                 B
                                                                                               Specificity
n   Measures of spread
    
                                                                                                D/(B+D)
                                   Test




n   Measures of frequency of
                                                                                     Positive Predictive Value
    events                                -       C                 D
                                                                                               A/(A+B)

                                                                                      Negative Predictive Value
n   Measures of Association
                                                                                             D/(C+D)


n   Terms used to describe the
                                 Figure 2: Calculating sensitivity, specificity, and predictive values
    quality of measurements

n   Measures of diagnostic test
                                 Likelihood ratios Calculating likelihood ratios is another meth-
    accuracy                      od of assessing the accuracy of a test in the clinical setting. Likeli-
                                  hood ratios also offer the advantage of being independent of
n   Expressions used when
                                 disease prevalence.
    making inferences about       The likelihood ratio indicates how much a given diagnostic test
    data                          result will raise or lower the odds of having a disease relative to
                                  the prior probability of disease. Each diagnostic test is character-
n   Multivariable Regression
                                 ized by two likelihood ratios: a positive likelihood ratio that tells
    Methods                       us the odds of disease if the test result is positive and a negative
                                  likelihood ratio that tells us the odds of disease if the test result is
                                  negative:
References                        LR+ = Sensitivity / (1- Specificity)
                                  LR- = (1- Sensitivity) / Specificity
                                  A likelihood ratio greater than 1 increases the odds that the per-
son has the target disease, and the higher the LR the greater this
                                  increase in odds. Conversely, a likelihood ratio less than 1 dimin-
                          .com    ishes the odds that the patient has the target disease.

                                  Expressions Used When Making Inferences About Data
                                  Confidence Intervals The results of any study sample are an
Study Designs                     estimate of the true value in the entire population. The true value
                                  may actually be greater or less than what is observed. A confi-
n   How Research Is Classified    dence interval gives a range of values within which there is a high
                                  probability (95% by convention) that the true population value
n   Terminology                   can be found. The confidence interval takes into consideration the
                                  number of observations and the standard deviation in the sample
n   Important epidemiologic
                                 population. The confidence interval narrows as the number of
    concepts                      observations increases or standard deviation decreases.

                                  Errors In hypothesis testing, there are two types of errors:

Descriptive Statistics              Type I error (alpha) is the probability of incorrectly concluding
                                    there is a statistically significant difference in the population
n   Measures of central
                                   when none exists. This type of error is also called alpha and is
    tendency                        the number after a P-value. A P0.05 means that there is a less
                                    than 5% chance that the difference could have occurred by
n   Measures of spread
                                   chance.
                                    Type II error (beta) is the probability of incorrectly concluding
n   Measures of frequency of
                                   that there is no statistically significant difference in a popula-
    events                          tion when one exists.
                                  Power is a measure of the ability of a study to detect a true dif-
n   Measures of Association
    
                                  ference. It is measured as 1- type II error rate or 1-beta. Every
                                  researcher should perform a power calculation prior to carrying
n   Terms used to describe the
    
                                  out a study to determine the number of observations needed
    quality of measurements
                                  to detect a desired degree of difference. Ideally this difference
                                  should equal the smallest difference that would still be consid-
n   Measures of diagnostic test
    
                                  ered to be clinically important. However, the smaller the dif-
    accuracy
                                  ference, the greater the number of observations needed. For
                                  example, it takes fewer patients to observe a 50% reduction in
n   Expressions used when
                                 mortality from a new therapy than a 5% reduction.
    making inferences about
    data
                                  Multivariable Regression Methods
n   Multivariable Regression
                                 In medical research, one is often interested in studying the inde-
    Methods                       pendent effect of multiple risk factors on outcome. For example,
                                  we may want to know the independent effect of age, gender and
                                  smoking status on the risk of having a myocardial infarction. Fur-
                                  thermore, we may want to know if smoking raises the risk equally
References                        in men and women. Multivariable regression methods allow us
                                  to answer these types of questions by simultaneously account-
                                  ing for multiple variables. The type of regression model used
                                  depends on the type of outcome data being evaluated.
Multiple linear regression is used when the outcome data is
                                  a continuous variable such as weight. For example, one could
                          .com    estimate the effect of a diet on weight after adjusting for the ef-
                                  fect of confounders such as smoking status. Another use of this
                                  method is to predict a linear variable based on known variables.
                                  Logistic regression is used when the outcome data is binary
Study Designs                     such as cure or no cure. Logistic regression can be used to esti-
                                  mate the effect of an exposure on a binary outcome after adjust-
n   How Research Is Classified    ing for confounders. Logistic regression can also be used to find
                                  factors that discriminate two groups or to find prognostic indica-
n   Terminology                   tors for a binary outcome. This method can also be applied to
                                  case-control studies.
n   Important epidemiologic
    
    concepts
                                  Survival Analysis
                                  In survival analysis, one is commonly interested in the time until
                                  some event such as the time from treatment of disease to death.
Descriptive Statistics            In the study population, only some subjects will have the event of
                                  interest (e.g. death, stroke), others will have alternate events or no
n   Measures of central
                                 events. The duration of follow-up will also vary among subjects
    tendency                      and it is important to account for the different follow-up times.

n   Measures of spread
                                 The Kaplan-Meier analysis and a regression method, the Cox pro-
                                  portional hazards analysis are two methods of survival analysis that
n   Measures of frequency of
                                 account for inter-subject variation in events and follow-up time.
    events                        Kaplan-Meier analysis measures the ratio of surviving subjects
                                  (or those without an event) divided by the total number of sub-
n   Measures of Association
                                 jects at risk for the event. Every time a subject has an event, the
                                  ratio is recalculated. These ratios are then used to generate a
n   Terms used to describe the
                                 curve to graphically depict the probability of survival (Figure 3).
    quality of measurements

n   Measures of diagnostic test
                                                   100
    accuracy                                                                          Drug Group

    Expressions used when
    
                                     Percent Survival




n
    making inferences about
    data

n   Multivariable Regression
    
    Methods                                                                       Placebo Group



References                                              0

                                                                     Follow-up Time

                                  Figure 3: Kaplan-Meier Survival Curves
In studies with an intervention arm and a control arm, one can
                                  generate two Kaplan-Meier curves. If the curves are close to-
                          .com    gether or cross, a statistically significant difference is unlikely to
                                  exist. Statistical tests such as the log-rank test can be used to
                                  confirm the presence of a significant difference.
                                  Cox proportional hazards analysis is similar to the logistic
Study Designs                     regression method described above with the added advantage
                                  that it accounts for time to a binary event in the outcome vari-
n   How Research Is Classified    able. Thus, one can account for variation in follow-up time among
                                  subjects. Like the other regression methods described above, it
n   Terminology                   can be used to study the effect of an exposure on outcome after
                                  adjusting for confounders. Cox analysis can also be used to find
n   Important epidemiologic
                                 prognostic indicators for survival in a given disease.
    concepts
                                  The hazard ratio that results from this analysis can be interpreted
                                  as a relative risk (risk ratio). For example, a hazard ratio of 5
                                  means that the exposed group has five times the risk of having
Descriptive Statistics            the event compared to the unexposed group.

n   Measures of central
    
    tendency
                                  Rubeen K. Israni, M.D.
n   Measures of spread
                                 Fellow, Renal-Electrolyte and Hypertension Division,
                                  University of Pennsylvania School of Medicine.
n   Measures of frequency of
    
    events

n   Measures of Association
    

n   Terms used to describe the
    
    quality of measurements

n   Measures of diagnostic test
    
    accuracy

n   Expressions used when
    
    making inferences about
    data

n   Multivariable Regression
    
    Methods



References
References
                                  1. Grimes DA, Schultz KF: An overview of clinical research: the lay of the land.
                          .com       Lancet 359:57-61, 2002

                                  2. Grimes DA, Schultz KF: Bias and causal associations in observational
                                     research. Lancet 359:248-252, 2002.
Study Designs                     3. Gordis L: Epidemiology, 3rd Edition, Philadelphia, Elsevier Saunders, 2004.

■   How Research Is Classified    4. Rosner B: Fundamentals of Biostatistics, 4th Edition, Daxbury Press, 1995.

                                  5. Grimes DA, Schultz KF: Cohort studies: marching towards outcomes. Lancet
■   Terminology                      359: 341-345, 2002.

■   Important epidemiologic       6. Schultz KF, Grimes DA: Case-control studies: research in reverse. Lancet
    concepts                         359:431-434, 2002.

                                  7. Streiner DL, Norman GR: Health Measurement Scales: A Practical Guide to
                                     their Development and Use, 2nd Edition, New York, Oxford University Press,
                                     2000.
Descriptive Statistics
                                  8. Jaeschke R, Guyatt GH, Sackett DL: Users’ guides to the medical literature. III.
■   Measures of central              How to use an article about a diagnostic test. B. What are the results and
    tendency                         will they help me in caring for my patients? The Evidence-Based Medicine
                                     Working Group. Jama 271:703-707, 1994.
■   Measures of spread
                                  9. Guyatt G, Jaeshke R, Heddle N, et al. Basic statistics for clinicians: 2.
                                     Interpreting study results: confidence intervals. Cmaj 152:169-173, 1995.
■   Measures of frequency of
    events                        10. Katz MH: Multivariable analysis: a primer for readers of medical research.
                                      Ann Intern Med 138:644-650, 2003.
■   Measures of Association       11. Campbell MJ: Statistics at Square Two, 4th Edition, London, BMJ Publishing
                                      Group, 2004.
■   Terms used to describe the
    quality of measurements

■   Measures of diagnostic test
    accuracy

■   Expressions used when
    making inferences about
    data

■   Multivariable Regression
    Methods



References

                                  Copyright © 2007 MedPage Today, LLC. All Rights Reserved.
                                  Source: http://www.medpagetoday.com/

More Related Content

What's hot

Seneca psych 100 - class one - introduction to psychology and research methods
Seneca   psych 100 - class one - introduction to psychology and research methodsSeneca   psych 100 - class one - introduction to psychology and research methods
Seneca psych 100 - class one - introduction to psychology and research methodscgoldfried
 
Gemechu keneni(PhD) document
Gemechu keneni(PhD) documentGemechu keneni(PhD) document
Gemechu keneni(PhD) documentgetahun bekana
 
Nursingnotes.info nursing-research-review
Nursingnotes.info nursing-research-reviewNursingnotes.info nursing-research-review
Nursingnotes.info nursing-research-reviewgrey clemente
 
Comparison of effect sizes associated with surrogate and final primary endpoi...
Comparison of effect sizes associated with surrogate and final primary endpoi...Comparison of effect sizes associated with surrogate and final primary endpoi...
Comparison of effect sizes associated with surrogate and final primary endpoi...HTAi Bilbao 2012
 
Qualitative research Mkep
Qualitative research MkepQualitative research Mkep
Qualitative research MkepGalih Priambodo
 
Casual research design experimentation
Casual research design experimentationCasual research design experimentation
Casual research design experimentationRohit Kumar
 
Michael Festing - The Principles of Experimental Design
Michael Festing - The Principles of Experimental DesignMichael Festing - The Principles of Experimental Design
Michael Festing - The Principles of Experimental DesignMedicReS
 
Research Methodology - Study Designs
Research Methodology - Study DesignsResearch Methodology - Study Designs
Research Methodology - Study DesignsAzmi Mohd Tamil
 
Research methods revision 2015
Research methods revision 2015Research methods revision 2015
Research methods revision 2015coburgpsych
 
Two-Factor Experiment and Three or More Factor Experiments
Two-Factor Experiment and Three or More Factor ExperimentsTwo-Factor Experiment and Three or More Factor Experiments
Two-Factor Experiment and Three or More Factor ExperimentsMary Anne (Riyan) Portuguez
 
Research definition (1) word
Research definition (1) wordResearch definition (1) word
Research definition (1) wordMarileen Maglaque
 
Statistics basics for oncologist kiran
Statistics basics for oncologist kiranStatistics basics for oncologist kiran
Statistics basics for oncologist kiranKiran Ramakrishna
 

What's hot (16)

Landau dunn 20march2013
Landau dunn 20march2013Landau dunn 20march2013
Landau dunn 20march2013
 
Seneca psych 100 - class one - introduction to psychology and research methods
Seneca   psych 100 - class one - introduction to psychology and research methodsSeneca   psych 100 - class one - introduction to psychology and research methods
Seneca psych 100 - class one - introduction to psychology and research methods
 
Gemechu keneni(PhD) document
Gemechu keneni(PhD) documentGemechu keneni(PhD) document
Gemechu keneni(PhD) document
 
Nursingnotes.info nursing-research-review
Nursingnotes.info nursing-research-reviewNursingnotes.info nursing-research-review
Nursingnotes.info nursing-research-review
 
Comparison of effect sizes associated with surrogate and final primary endpoi...
Comparison of effect sizes associated with surrogate and final primary endpoi...Comparison of effect sizes associated with surrogate and final primary endpoi...
Comparison of effect sizes associated with surrogate and final primary endpoi...
 
Qualitative research Mkep
Qualitative research MkepQualitative research Mkep
Qualitative research Mkep
 
Basics in Research
Basics in Research Basics in Research
Basics in Research
 
Casual research design experimentation
Casual research design experimentationCasual research design experimentation
Casual research design experimentation
 
Michael Festing - The Principles of Experimental Design
Michael Festing - The Principles of Experimental DesignMichael Festing - The Principles of Experimental Design
Michael Festing - The Principles of Experimental Design
 
Research design and approachs
Research design and approachsResearch design and approachs
Research design and approachs
 
Research Methodology - Study Designs
Research Methodology - Study DesignsResearch Methodology - Study Designs
Research Methodology - Study Designs
 
Research methods revision 2015
Research methods revision 2015Research methods revision 2015
Research methods revision 2015
 
Two-Factor Experiment and Three or More Factor Experiments
Two-Factor Experiment and Three or More Factor ExperimentsTwo-Factor Experiment and Three or More Factor Experiments
Two-Factor Experiment and Three or More Factor Experiments
 
Research definition (1) word
Research definition (1) wordResearch definition (1) word
Research definition (1) word
 
Statistics basics for oncologist kiran
Statistics basics for oncologist kiranStatistics basics for oncologist kiran
Statistics basics for oncologist kiran
 
pengajian malaysia
pengajian malaysiapengajian malaysia
pengajian malaysia
 

Viewers also liked

Ezz eazy biostatistics for crash course
Ezz eazy biostatistics for crash courseEzz eazy biostatistics for crash course
Ezz eazy biostatistics for crash courseBasalama Ali
 
Basic epidemiology by bonita
Basic epidemiology by bonitaBasic epidemiology by bonita
Basic epidemiology by bonitaSM Lalon
 
Chronic obstructive pulmonary disease ppt
Chronic obstructive pulmonary disease   pptChronic obstructive pulmonary disease   ppt
Chronic obstructive pulmonary disease pptMeklelle university
 
Commonly used Statistics in Medical Research Handout
Commonly used Statistics in Medical Research HandoutCommonly used Statistics in Medical Research Handout
Commonly used Statistics in Medical Research HandoutPat Barlow
 
Basics of medical statistics
Basics of medical statisticsBasics of medical statistics
Basics of medical statisticsRamachandra Barik
 
Medical Statistics Pt 1
Medical Statistics Pt 1Medical Statistics Pt 1
Medical Statistics Pt 1Fastbleep
 
Commonly Used Statistics in Medical Research Part I
Commonly Used Statistics in Medical Research Part ICommonly Used Statistics in Medical Research Part I
Commonly Used Statistics in Medical Research Part IPat Barlow
 

Viewers also liked (10)

Ezz eazy biostatistics for crash course
Ezz eazy biostatistics for crash courseEzz eazy biostatistics for crash course
Ezz eazy biostatistics for crash course
 
Epidemiology
EpidemiologyEpidemiology
Epidemiology
 
Basic epidemiology by bonita
Basic epidemiology by bonitaBasic epidemiology by bonita
Basic epidemiology by bonita
 
Chronic obstructive pulmonary disease ppt
Chronic obstructive pulmonary disease   pptChronic obstructive pulmonary disease   ppt
Chronic obstructive pulmonary disease ppt
 
Medical statistics
Medical statisticsMedical statistics
Medical statistics
 
Commonly used Statistics in Medical Research Handout
Commonly used Statistics in Medical Research HandoutCommonly used Statistics in Medical Research Handout
Commonly used Statistics in Medical Research Handout
 
Basics of medical statistics
Basics of medical statisticsBasics of medical statistics
Basics of medical statistics
 
Medical Statistics Pt 1
Medical Statistics Pt 1Medical Statistics Pt 1
Medical Statistics Pt 1
 
Commonly Used Statistics in Medical Research Part I
Commonly Used Statistics in Medical Research Part ICommonly Used Statistics in Medical Research Part I
Commonly Used Statistics in Medical Research Part I
 
INTRODUCTION TO BIO STATISTICS
INTRODUCTION TO BIO STATISTICS INTRODUCTION TO BIO STATISTICS
INTRODUCTION TO BIO STATISTICS
 

Similar to Basic biostatistics

Effective strategies to monitor clinical risks using biostatistics - Pubrica....
Effective strategies to monitor clinical risks using biostatistics - Pubrica....Effective strategies to monitor clinical risks using biostatistics - Pubrica....
Effective strategies to monitor clinical risks using biostatistics - Pubrica....Pubrica
 
Living evidence 3
Living evidence 3Living evidence 3
Living evidence 3stanbridge
 
Summarising a journal article
Summarising a journal articleSummarising a journal article
Summarising a journal articleAaron Poppleton
 
Introduction to educational research
Introduction to educational researchIntroduction to educational research
Introduction to educational researchMuntajab Abbas
 
Medpage guide-to-biostatistics
Medpage guide-to-biostatisticsMedpage guide-to-biostatistics
Medpage guide-to-biostatisticsElsa von Licy
 
Basics statistics
Basics statistics Basics statistics
Basics statistics BITS
 
Research Design for health care students
Research Design for health care studentsResearch Design for health care students
Research Design for health care studentsCharu Parthe
 
Understanding Educational Enquiry
Understanding Educational EnquiryUnderstanding Educational Enquiry
Understanding Educational Enquirysyuserena
 
Experimental research method
Experimental research methodExperimental research method
Experimental research methodKinjal Shah
 
Types of studies
Types of studiesTypes of studies
Types of studiesAbdo_452
 
1_Introduction__lecture.pdfectopic pregnancy
1_Introduction__lecture.pdfectopic pregnancy1_Introduction__lecture.pdfectopic pregnancy
1_Introduction__lecture.pdfectopic pregnancyAmiraNoori
 

Similar to Basic biostatistics (20)

Effective strategies to monitor clinical risks using biostatistics - Pubrica....
Effective strategies to monitor clinical risks using biostatistics - Pubrica....Effective strategies to monitor clinical risks using biostatistics - Pubrica....
Effective strategies to monitor clinical risks using biostatistics - Pubrica....
 
Living evidence 3
Living evidence 3Living evidence 3
Living evidence 3
 
RSS 2012 Study designs
RSS 2012 Study designsRSS 2012 Study designs
RSS 2012 Study designs
 
Summarising a journal article
Summarising a journal articleSummarising a journal article
Summarising a journal article
 
RSS study design
RSS study designRSS study design
RSS study design
 
Introduction to educational research
Introduction to educational researchIntroduction to educational research
Introduction to educational research
 
Medpage guide-to-biostatistics
Medpage guide-to-biostatisticsMedpage guide-to-biostatistics
Medpage guide-to-biostatistics
 
Basics statistics
Basics statistics Basics statistics
Basics statistics
 
1 a).pptx
1 a).pptx1 a).pptx
1 a).pptx
 
Research design
Research designResearch design
Research design
 
Research Design for health care students
Research Design for health care studentsResearch Design for health care students
Research Design for health care students
 
Understanding Educational Enquiry
Understanding Educational EnquiryUnderstanding Educational Enquiry
Understanding Educational Enquiry
 
RESEARCH DESIGN.pptx
RESEARCH DESIGN.pptxRESEARCH DESIGN.pptx
RESEARCH DESIGN.pptx
 
02 basic researchmethod
02 basic researchmethod02 basic researchmethod
02 basic researchmethod
 
Webiner.pptx
Webiner.pptxWebiner.pptx
Webiner.pptx
 
Experimental research method
Experimental research methodExperimental research method
Experimental research method
 
Types of studies
Types of studiesTypes of studies
Types of studies
 
F unit 5.pptx
F unit 5.pptxF unit 5.pptx
F unit 5.pptx
 
1_Introduction__lecture.pdfectopic pregnancy
1_Introduction__lecture.pdfectopic pregnancy1_Introduction__lecture.pdfectopic pregnancy
1_Introduction__lecture.pdfectopic pregnancy
 
Research design mahesh
Research design maheshResearch design mahesh
Research design mahesh
 

Basic biostatistics

  • 1. MedPage Tools .com Guide to Biostatistics Study Designs Here is a compilation of important epidemiologic concepts and common biostatistical terms used in medical research. You can use it as a reference guide when reading articles published on n How Research Is Classified MedPage Today or download it to keep near the reading stand where you keep your print journals. For more detailed infor- n Terminology mation on these topics, use the reference list at the end of this presentation. n Important epidemiologic concepts Study Designs in Clinical Research Descriptive Statistics How research is classified1 Did n Measures of central Yes researcher assign tendency exposures? n Measures of spread Experimental No study Observational n Measures of frequency of study events Is Yes Randomised n Measures of Association allocation controlled Random? Is there a Comparison n Terms used to describe the group? quality of measurements No Non-Randomised n Measures of diagnostic test controlled Trial No Yes accuracy Analytical Descriptive Study Study n Expressions used when making inferences about data Direction of n Multivariable Regression the study? Methods Exposure Exposure Exposure and Outcome at Outcome Outcome the same time References Cohort Case control Cross-sectional study study study
  • 2. Terminology Clinical Trial Experimental study in which the exposure status .com (e.g. assigned to active drug versus placebo) is determined by the investigator. Randomized Controlled Trial A special type of clinical trial Study Designs in which assignment to an exposure is determined purely by chance. n How Research Is Classified Cohort Study Observational study in which subjects with an exposure of interest (e.g. hypertension) and subjects without n Terminology the exposure are identified and then followed forward in time to determine outcomes (e.g. stroke). n Important epidemiologic concepts Case-Control Study Observational study that first identifies a group of subjects with a certain disease and a control group without the disease, and then looks to back in time (e.g. chart review) to find exposure to risk factors for the disease. This type Descriptive Statistics of study is well suited for rare diseases. Cross-Sectional Study Observational study that is done to ex- n Measures of central amine presence or absence of a disease or presence or absence tendency of an exposure at a particular time. Since exposure and outcome are ascertained at the same time, it is often unclear if the expo- n Measures of spread sure preceded the outcome. n Measures of frequency of Case Report or Case Series Descriptive study that reports on events a single or a series of patients with a certain disease. This type of study usually generates a hypothesis but cannot test a hy- n Measures of Association pothesis because it does not include an appropriate comparison group. n Terms used to describe the quality of measurements Important Epidemiologic Concepts Bias Any systematic error in the design or conduct of a study n Measures of diagnostic test that results in a mistaken estimate of an exposure’s effect on risk accuracy of disease. n Expressions used when Selection Bias Bias introduced by the way in which participants making inferences about are chosen for a study. For example, in a case-control study using data different criteria to select cases (e.g. sick, hospitalized population) versus controls (young, healthy outpatients) other than the pres- n Multivariable Regression ence of disease can lead the investigator to a false conclusion Methods about an exposure. Confounding This occurs when an investigator falsely concludes that a particular exposure is causally related to a disease without adjusting for other factors that are known risk factors for the References disease and are associated with the exposure.
  • 3. Descriptive Statistics .com Measures of Central Tendency Mean equals the sum of observations divided by the number of observations. Study Designs Median equals the observation in the center when all observa- tions are ordered from smallest to largest; when there is an even n How Research Is Classified number of observations the median is defined as the average of the middle two values. n Terminology Mode equals the most frequently occurring value among all n Important epidemiologic observations. concepts Measures of Spread Spread (or variability) describes the manner in which data are Descriptive Statistics scattered around a specific value (such as the mean). The most commonly used measures of spread are: n Measures of central Range is the difference between the largest observation and tendency the smallest. n Measures of spread Standard Deviation measures the spread of data around the mean. One standard deviation includes 68% of the values in a n Measures of frequency of sample population and two standard deviations include 95% of events the values. Standard Error of the Mean describes the amount of variability n Measures of Association in the measurement of the population mean from several differ- ent samples. This is in contrast to the standard deviation which n Terms used to describe the measures the variability of individual observations in a sample. quality of measurements Percentile equals the percentage of a distribution that is below a specific value. As an example, a child is in the 80th percentile n Measures of diagnostic test for height if only 20% of children of the same age are taller than accuracy he is. n Expressions used when Interquartile Range refers to the upper and lower bound- making inferences about ary defining the middle 50 percent of observations. The upper data boundary is the 75th percentile and the lower boundary is the 25th percentile. n Multivariable Regression Methods Measures of Frequency of Events Incidence The number of new events (e.g. death or a particular disease) that occur during a specified period of time in a popula- References tion at risk for developing the events. Incidence Rate A term related to incidence that reports the number of new events that occur over the sum of time indi- viduals in the population were at risk for having the event (e.g.
  • 4. events/person-years). Prevalence The number of persons in the population affected by .com a disease at a specific time divided by the number of persons in the population at the time. Study Designs Measures of Association The types of measures used to define the association between n How Research Is Classified exposures and outcome depends upon the type of data. For categorical variables, the relative risk and odds ratio are com- n Terminology monly used to describe the relationship between exposures and outcome. n Important epidemiologic Relative risk and cohort studies The relative risk (or risk ratio) concepts is defined as the ratio of the incidence of disease in the exposed group divided by the corresponding incidence of disease in the unexposed group (Figure 2). Relative risk can be calculated in co- Descriptive Statistics hort studies such as the Framingham Heart Study where subjects with certain exposures (e.g. hypertension, hyperlipidemia) were n Measures of central followed prospectively for cardiovascular outcomes. The inci- tendency dence of cardiac events in subjects with and without exposures was then used to calculate relative risk and determine whether n Measures of spread exposures were cardiac risk factors. Odds ratio and case-control studies The odds ratio is defined n Measures of frequency of as the odds of exposure in the group with disease divided by the events odds of exposure in the control group (Figure 1). As described above, subjects are selected on the basis of disease status in n Measures of Association case-control studies, therefore it is not possible to calculate the rate of development of disease given presence or absence of n Terms used to describe the exposure. So, the odds ratio is often used to approximate the quality of measurements Cohort Study Case Control Study n Measures of diagnostic test Disease Disease accuracy Yes No Yes No n Expressions used when making inferences about + A B A B data Test n Multivariable Regression Methods – C D C D Relative Risk Odds Ratio References A/(A+B) A/C = A×D C/(C+D) B/D B×C Figure 1: In a case-control study, the odds ratio can be used to approximate the relative risk under the assumption that the disease is rare.
  • 5. relative risk in case-control studies. For example, a case-control study was done to evaluate the relationship between artificial .com sweeteners and bladder cancer. The odds of artificial sweetener use in the cases and controls were used to calculate an odds ratio and determine whether sweeteners were associated with bladder cancer. Under the assumption that the disease under Study Designs consideration is rare (e.g. bladder cancer), the odds ratio gives a stable, unbiased estimate of the relative risk (Figure 1). The odds n How Research Is Classified ratio from a case-control study nested within a defined cohort also approximates the relative risk even when the rare disease n Terminology assumption is not held. If the disease is rare, AB and CD. So, A/(A + B) is approximated n Important epidemiologic by A/B and C/(C + D) approximated by C/D. In this situation, the concepts relative risk equals (A/B)/(C/D) which, rearranged, equals the odds ratio A×D/B×C Absolute risk The relative risk and odds ratio provide a measure Descriptive Statistics of risk compared with a standard. However, it is sometimes desir- able to know the absolute risk. For example, a 40% increase in n Measures of central risk of heart disease because of a particular exposure does not tendency provide insight into the likelihood that exposure in an individual patient will lead to heart disease. n Measures of spread The Attributable risk or Risk difference is a measure of abso- lute risk. It represents the excess risk of disease in those exposed n Measures of frequency of taking into account the background rate of disease. The attribut- events able risk is defined as the difference between the incidence rates in the exposed and non-exposed groups. n Measures of Association A related term, the Population Attributable Risk is used to de- n Terms used to describe the scribe the excess rate of disease in the total study population of quality of measurements exposed and non-exposed individuals that is attributable to the exposure. This measure is calculated by multiplying the Attributable n Measures of diagnostic test risk by the proportion of exposed individuals in the population. accuracy Number needed to treat (NNT) The number of patients who would need to be treated to prevent one adverse outcome is n Expressions used when often used to present the results of randomized trials. NNT is the making inferences about reciprocal of the absolute risk reduction (the absolute adverse data event rate for placebo minus the absolute adverse event rate for treated patients). This approach can be used in studies of vari- n Multivariable Regression ous interventions including both treatment and prevention. The Methods estimate for NNT is subject to considerable error and is generally presented with 95% confidence intervals so that it can be prop- erly interpreted. References Terms Used To Describe The Quality Of Measurements Reliability The concept of reliability or reproducibility is related to the amount of error in any measurement (e.g. blood pressure
  • 6. measurement). A more formal definition of reliability is variability between subjects divided by inter-subject variability plus mea- .com surement error. Thus, reliability is greater when measurement er- ror is minimal. There are several types of reliability including: inter and intra-observer reliability and test-retest reliability. Percent agreement and the kappa statistic are often used to re- Study Designs port reliability. The kappa statistic takes into account agreement that would be seen by chance alone while percent agreement n How Research Is Classified does not. Generally, a kappa greater than 0.75 represents excel- lent agreement beyond chance, a kappa below 0.40 represents n Terminology poor agreement and a kappa of 0.40-0.75 represents intermedi- ate to good agreement. n Important epidemiologic concepts Validity refers to the extent to which a test or surrogate is mea- suring what we think it is measuring. There are several types of validity that can be measured including content validity (the extent to which the measure reflects the dimensions of a particu- Descriptive Statistics lar problem), construct validity (the extent to which a measure conforms to an external established phenomenon), and criterion n Measures of central validity (the extent to which a measure correlates with a gold tendency standard or can predict an observable phenomenon). These types of validity are often applied to questionnaires in which the n Measures of spread truth is not physically verifiable. n Measures of frequency of Measures Of Diagnostic Test Accuracy events Sensitivity is defined as the ability of the test to identify cor- rectly those who have the disease. It is the number of subjects n Measures of Association with a positive test who have disease divided by all subjects who have the disease. A test with high sensitivity has few false nega- n Terms used to describe the tive results. quality of measurements Specificity is defined as the ability of the test to identify cor- n Measures of diagnostic test rectly those who do not have the disease. It is the number of accuracy subjects who have a negative test and do not have the disease divided by the number of subjects who do not have the disease. n Expressions used when A test with high specificity has few false positive results. making inferences about Sensitivity and specificity are test characteristics that are most data useful when assessing a test used to screen a free-living popula- tion. These test characteristics are also interdependent: an in- n Multivariable Regression crease in sensitivity is accompanied by a decrease in specificity Methods and visa versa. This is illustrated best by continuous tests where the cut-off for a positive test result can be varied. For example, consider the use of the white blood cell (WBC) count as a test to diagnose bacterial infection. If one sets a high cut-off for a posi- References tive test (e.g. WBC 25,000) then the test will have a low sensitiv- ity and high specificity compared to the test characteristics if the cut-off is lower (e.g. WBC10,000).
  • 7. Predictive values are important for assessing how useful a test will be in the clinical setting at the individual patient level. The .com positive predictive value is the probability of disease in a pa- tient with a positive test. Conversely, the negative predictive value is the probability that the patient does not have disease if he has a negative test result. Study Designs Predictive values depend on the prevalence of a disease in a population. A test with a given sensitivity and specificity can n How Research Is Classified have different predictive values in different patient populations. If the test is used in a population with a high prevalence, it will n Terminology have a high positive predictive value and the same test will have a low positive predictive value when used in a population with n Important epidemiologic low disease prevalence. For example, a positive stool test for oc- concepts cult blood is much more likely to be predictive of colon cancer in a population of elderly people compared with twenty year olds. Descriptive Statistics Disease Yes No n Measures of central Sensitivity tendency A/(A+C) + A B Specificity n Measures of spread D/(B+D) Test n Measures of frequency of Positive Predictive Value events - C D A/(A+B) Negative Predictive Value n Measures of Association D/(C+D) n Terms used to describe the Figure 2: Calculating sensitivity, specificity, and predictive values quality of measurements n Measures of diagnostic test Likelihood ratios Calculating likelihood ratios is another meth- accuracy od of assessing the accuracy of a test in the clinical setting. Likeli- hood ratios also offer the advantage of being independent of n Expressions used when disease prevalence. making inferences about The likelihood ratio indicates how much a given diagnostic test data result will raise or lower the odds of having a disease relative to the prior probability of disease. Each diagnostic test is character- n Multivariable Regression ized by two likelihood ratios: a positive likelihood ratio that tells Methods us the odds of disease if the test result is positive and a negative likelihood ratio that tells us the odds of disease if the test result is negative: References LR+ = Sensitivity / (1- Specificity) LR- = (1- Sensitivity) / Specificity A likelihood ratio greater than 1 increases the odds that the per-
  • 8. son has the target disease, and the higher the LR the greater this increase in odds. Conversely, a likelihood ratio less than 1 dimin- .com ishes the odds that the patient has the target disease. Expressions Used When Making Inferences About Data Confidence Intervals The results of any study sample are an Study Designs estimate of the true value in the entire population. The true value may actually be greater or less than what is observed. A confi- n How Research Is Classified dence interval gives a range of values within which there is a high probability (95% by convention) that the true population value n Terminology can be found. The confidence interval takes into consideration the number of observations and the standard deviation in the sample n Important epidemiologic population. The confidence interval narrows as the number of concepts observations increases or standard deviation decreases. Errors In hypothesis testing, there are two types of errors: Descriptive Statistics Type I error (alpha) is the probability of incorrectly concluding there is a statistically significant difference in the population n Measures of central when none exists. This type of error is also called alpha and is tendency the number after a P-value. A P0.05 means that there is a less than 5% chance that the difference could have occurred by n Measures of spread chance. Type II error (beta) is the probability of incorrectly concluding n Measures of frequency of that there is no statistically significant difference in a popula- events tion when one exists. Power is a measure of the ability of a study to detect a true dif- n Measures of Association ference. It is measured as 1- type II error rate or 1-beta. Every researcher should perform a power calculation prior to carrying n Terms used to describe the out a study to determine the number of observations needed quality of measurements to detect a desired degree of difference. Ideally this difference should equal the smallest difference that would still be consid- n Measures of diagnostic test ered to be clinically important. However, the smaller the dif- accuracy ference, the greater the number of observations needed. For example, it takes fewer patients to observe a 50% reduction in n Expressions used when mortality from a new therapy than a 5% reduction. making inferences about data Multivariable Regression Methods n Multivariable Regression In medical research, one is often interested in studying the inde- Methods pendent effect of multiple risk factors on outcome. For example, we may want to know the independent effect of age, gender and smoking status on the risk of having a myocardial infarction. Fur- thermore, we may want to know if smoking raises the risk equally References in men and women. Multivariable regression methods allow us to answer these types of questions by simultaneously account- ing for multiple variables. The type of regression model used depends on the type of outcome data being evaluated.
  • 9. Multiple linear regression is used when the outcome data is a continuous variable such as weight. For example, one could .com estimate the effect of a diet on weight after adjusting for the ef- fect of confounders such as smoking status. Another use of this method is to predict a linear variable based on known variables. Logistic regression is used when the outcome data is binary Study Designs such as cure or no cure. Logistic regression can be used to esti- mate the effect of an exposure on a binary outcome after adjust- n How Research Is Classified ing for confounders. Logistic regression can also be used to find factors that discriminate two groups or to find prognostic indica- n Terminology tors for a binary outcome. This method can also be applied to case-control studies. n Important epidemiologic concepts Survival Analysis In survival analysis, one is commonly interested in the time until some event such as the time from treatment of disease to death. Descriptive Statistics In the study population, only some subjects will have the event of interest (e.g. death, stroke), others will have alternate events or no n Measures of central events. The duration of follow-up will also vary among subjects tendency and it is important to account for the different follow-up times. n Measures of spread The Kaplan-Meier analysis and a regression method, the Cox pro- portional hazards analysis are two methods of survival analysis that n Measures of frequency of account for inter-subject variation in events and follow-up time. events Kaplan-Meier analysis measures the ratio of surviving subjects (or those without an event) divided by the total number of sub- n Measures of Association jects at risk for the event. Every time a subject has an event, the ratio is recalculated. These ratios are then used to generate a n Terms used to describe the curve to graphically depict the probability of survival (Figure 3). quality of measurements n Measures of diagnostic test 100 accuracy Drug Group Expressions used when Percent Survival n making inferences about data n Multivariable Regression Methods Placebo Group References 0 Follow-up Time Figure 3: Kaplan-Meier Survival Curves
  • 10. In studies with an intervention arm and a control arm, one can generate two Kaplan-Meier curves. If the curves are close to- .com gether or cross, a statistically significant difference is unlikely to exist. Statistical tests such as the log-rank test can be used to confirm the presence of a significant difference. Cox proportional hazards analysis is similar to the logistic Study Designs regression method described above with the added advantage that it accounts for time to a binary event in the outcome vari- n How Research Is Classified able. Thus, one can account for variation in follow-up time among subjects. Like the other regression methods described above, it n Terminology can be used to study the effect of an exposure on outcome after adjusting for confounders. Cox analysis can also be used to find n Important epidemiologic prognostic indicators for survival in a given disease. concepts The hazard ratio that results from this analysis can be interpreted as a relative risk (risk ratio). For example, a hazard ratio of 5 means that the exposed group has five times the risk of having Descriptive Statistics the event compared to the unexposed group. n Measures of central tendency Rubeen K. Israni, M.D. n Measures of spread Fellow, Renal-Electrolyte and Hypertension Division, University of Pennsylvania School of Medicine. n Measures of frequency of events n Measures of Association n Terms used to describe the quality of measurements n Measures of diagnostic test accuracy n Expressions used when making inferences about data n Multivariable Regression Methods References
  • 11. References 1. Grimes DA, Schultz KF: An overview of clinical research: the lay of the land. .com Lancet 359:57-61, 2002 2. Grimes DA, Schultz KF: Bias and causal associations in observational research. Lancet 359:248-252, 2002. Study Designs 3. Gordis L: Epidemiology, 3rd Edition, Philadelphia, Elsevier Saunders, 2004. ■ How Research Is Classified 4. Rosner B: Fundamentals of Biostatistics, 4th Edition, Daxbury Press, 1995. 5. Grimes DA, Schultz KF: Cohort studies: marching towards outcomes. Lancet ■ Terminology 359: 341-345, 2002. ■ Important epidemiologic 6. Schultz KF, Grimes DA: Case-control studies: research in reverse. Lancet concepts 359:431-434, 2002. 7. Streiner DL, Norman GR: Health Measurement Scales: A Practical Guide to their Development and Use, 2nd Edition, New York, Oxford University Press, 2000. Descriptive Statistics 8. Jaeschke R, Guyatt GH, Sackett DL: Users’ guides to the medical literature. III. ■ Measures of central How to use an article about a diagnostic test. B. What are the results and tendency will they help me in caring for my patients? The Evidence-Based Medicine Working Group. Jama 271:703-707, 1994. ■ Measures of spread 9. Guyatt G, Jaeshke R, Heddle N, et al. Basic statistics for clinicians: 2. Interpreting study results: confidence intervals. Cmaj 152:169-173, 1995. ■ Measures of frequency of events 10. Katz MH: Multivariable analysis: a primer for readers of medical research. Ann Intern Med 138:644-650, 2003. ■ Measures of Association 11. Campbell MJ: Statistics at Square Two, 4th Edition, London, BMJ Publishing Group, 2004. ■ Terms used to describe the quality of measurements ■ Measures of diagnostic test accuracy ■ Expressions used when making inferences about data ■ Multivariable Regression Methods References Copyright © 2007 MedPage Today, LLC. All Rights Reserved. Source: http://www.medpagetoday.com/