1. Created by Simpo PDF Creator Pro (unregistered version)
http://www.simpopdf.com
UFOs – Unidentified Flying Objects
Ufology – a neologism coined to describe the collective
efforts of those who study reports and associated evidence of
unidentified flying objects (UFOs).
Ufologist – a UFO investigator
UFO Sightings – eyewitness reports of UFOs
Roswell Incident – also called the Roswell UFO crash (1947)
UFO Conspiracy – worldwide UFO cover-ups and related theories
Alien – an extraterrestrial being
Cautions about Significance Testing as Reported in the News
1. If the word significant is used to try to convince you that there is an important effect or
relationship, determine if the word is being used in the usual sense or in the statistical sense
only.
EX: Colgate ad on TV: “In clinical studies, Colgate Sensitive toothpaste significantly reduced pain
compared to Sensodyne.” We don’t know whether they mean "significantly" in the statistical or the practical sense.
EX: An article in the Sacramento Bee (Philp, 1995) described a study done at the University of
California at Davis Medical Center, in which patients were asked to rate their satisfaction with the
physician they saw during their first visit to the center. The purpose of the study (Bertakis et al.,
1995) was to determine if patients were more satisfied with male or with female physicians. The
study was based on a survey completed by 250 patients rating medical residents on a scale from 1
(very dissatisfied) to 5 (very satisfied). Neither patients nor physicians were told the purpose of the
study.
The Bee reported "The female physicians received an average score of 4.27. The men — a
respectable, yet significantly lower score of 4.05."
The null hypothesis is that there is no difference in satisfaction with male and female physicians in
the population from which this sample of patients was drawn.
The alternative is two-sided. The average difference was only 0.23, but the original article (Bertakis
et al., p. 412) reported it as "small but statistically significant (p = 0.02)." Further results were
found based on videotapes of the sessions. Namely, the male physicians spent more time taking
medical histories while the females spent more time on preventive services and obtaining family
information. The original report noted that "This difference is both statistically and clinically
significant" (p. 414).
2. If a study is based on a very large sample size, statistically significant relationships found
may not have much practical importance.
EX: Newsweek reported a drop in drug use among high school students, based on huge sample sizes.
For instance, alcohol use among tenth graders dropped by nine-tenths of one percentage point (0.9)
in the samples from 1992 to 1993, from 70.2% to 69.3%. The sample size in 1993 was 15,500
tenth graders; assume a similar size for 1992. A one-tailed test would then give a p-value of about 0.04.
So it is a significant drop, but probably only in the statistical sense.
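The p-value quoted above can be checked with a short sketch. It assumes a pooled two-proportion z-test and, as the notes suggest, a sample of 15,500 tenth graders in both years; the function name `two_proportion_z` is only illustrative and does not come from the Newsweek report.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for H0: p1 == p2, using the pooled standard error."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Alcohol use among tenth graders: 70.2% (1992) vs 69.3% (1993).
# n = 15,500 in 1992 is an assumption carried over from the notes.
z = two_proportion_z(0.702, 15500, 0.693, 15500)

# One-tailed p-value for the alternative "use dropped".
p_one_tailed = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, one-tailed p = {p_one_tailed:.3f}")
```

With samples this large, a drop of less than one percentage point is enough to push z past the one-tailed 0.05 cutoff, which is the point of the caution.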
EX: Military recruits, n = 507,000. "Spring birthday conveys height advantage" (Reuters and Nature).
Recruits born in spring had a mean height 0.6 cm (about 1/4 inch) greater than those born in fall.
3. If you read that "no difference" or "no relationship" has been found in a study, try to
determine the sample size used. Unless the sample size was large, remember that it could be
that there is indeed an important relationship in the population, but that not enough data was
collected to detect it. In other words, the test could have had very low power.
EX: Study comparing memory scores for young and old Americans, deaf Americans and Chinese.
Quote in Science News: “Surprisingly, the researchers add, memory scores for older and younger
Chinese did not statistically differ.”
But there were only n = 30 in each group! If n = 60 and the same difference were observed, z = 1.8 and p = .04.
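The effect of sample size on this test can be sketched as follows. The sketch assumes that n = 60 refers to the size of each group, that the observed difference and spread stay the same, and that the quoted p = .04 is one-sided (which matches z = 1.8). Under those assumptions the z statistic scales with the square root of the group size. The helper `rescale_z` is hypothetical and is not taken from the study.

```python
from math import sqrt
from statistics import NormalDist


def rescale_z(z, n_from, n_to):
    """Rescale a z statistic to a new per-group sample size,
    assuming the same observed difference and standard deviation."""
    return z * sqrt(n_to / n_from)


z_60 = 1.8                        # value quoted in the notes for n = 60
z_30 = rescale_z(z_60, 60, 30)    # implied value for the actual n = 30

# One-sided p-values under the standard normal distribution.
p_60 = 1 - NormalDist().cdf(z_60)
p_30 = 1 - NormalDist().cdf(z_30)

print(f"n = 60: z = {z_60:.2f}, p = {p_60:.3f}")
print(f"n = 30: z = {z_30:.2f}, p = {p_30:.3f}")
```

The same observed difference that would be significant with 60 per group falls short with 30 per group, so "did not statistically differ" says more about the sample size than about the effect.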
4. If possible, learn what confidence interval accompanies the hypothesis test, if any. Even
then you can be misled into concluding that there is no effect when there really is, but at least
you will have more information about the magnitude of the possible difference or relationship.
EX: Randomized experiment gave kids sweetener xylitol or placebo for ear infections. The p-value
for comparing proportions with infections was .02 (two-tailed).
Syrup type    n      % with an ear infection
Xylitol       159    28.9% [29%]
Placebo       165    41.2% [41%]
This seems pretty conclusive, BUT a 95% confidence interval for p1 – p2 (placebo minus xylitol) is 2% to 23%,
so it almost covers 0%.
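Both figures in the xylitol example can be reproduced approximately from the table. This is a sketch under stated assumptions, not the researchers' own analysis: it works from the rounded percentages, uses the unpooled standard error for the confidence interval and the pooled standard error for the test, and takes p1 as the placebo proportion. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(p1, n1, p2, n2, level=0.95):
    """Normal-approximation confidence interval for p1 - p2 (unpooled SE)."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


def two_sided_p(p1, n1, p2, n2):
    """Two-sided p-value for H0: p1 == p2 (pooled SE)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Placebo: 41.2% of 165; xylitol: 28.9% of 159 (rounded values from the table).
low, high = diff_ci(0.412, 165, 0.289, 159)
p_value = two_sided_p(0.412, 165, 0.289, 159)

print(f"95% CI for p1 - p2: {low:.1%} to {high:.1%}")
print(f"two-sided p-value: {p_value:.3f}")
```

The interval shows what the p-value alone hides: the data are consistent with a reduction anywhere from about 2 to about 23 percentage points.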
5. Try to determine whether the test was one-sided or two-sided. If a test is one-sided, as in
Case Study 23.1 of the text, and details aren't reported, you could be misled into thinking
there is no difference, when in fact there was one, but in the direction opposite to that
hypothesized.
EX: Case Study 23.1 from book “Study Finds no Abnormality in Those Reporting UFOs.”
In fact, those reporting UFOs scored significantly better on psychological tests.
6. Sometimes researchers will perform a multitude of tests, and the reports will focus on those
tests that achieved statistical significance. Remember that if nothing interesting is happening,
and all of the null hypotheses tested are true, then about 1 in 20 tests should still achieve statistical
significance (at the 0.05 level) by chance. Beware of reports where it is evident that many tests were conducted,
but where results of only one or two are presented as "significant."
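The "1 in 20" figure can be extended to show how quickly false positives accumulate. The sketch below assumes every null hypothesis is true, each test uses the 0.05 level, and the tests are independent. Independence is a simplifying assumption that real studies often violate, and the function name is illustrative.

```python
def prob_at_least_one_false_positive(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent tests is
    'significant' at level alpha when every null hypothesis is true."""
    return 1 - (1 - alpha) ** num_tests


for k in (1, 5, 10, 20):
    chance = prob_at_least_one_false_positive(k)
    print(f"{k:2d} tests: P(at least one false positive) = {chance:.3f}")
```

Under these assumptions, a report that mentions only one or two significant results out of twenty tests is describing something more likely than not to happen by chance alone.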
EX: See Case Study 25.8 for an example of multiple researchers studying the same thing, one of whom
found a result in the direction opposite to the others. (Spinach and lung cancer in smokers.)
EX: In a news release on Feb. 27, 1998, Reuters reported on a study done by researchers at Kaiser
Permanente Medical Care Program of Northern California. The headline read "Tea Doubles
Chances of Conception." The study asked 187 women who were trying to conceive to record their
daily dietary intake. The article reported "drinking one-half cup or more of tea daily approximately
doubled the odds of conception per cycle." However, in the original journal article (Caan,
Quesenberry and Coates, 1998) it is made clear that the researchers did tests on multiple caffeinated
beverages and tea was the only one to yield a statistically significant effect.
They did not adjust for the fact that they conducted multiple tests. In fact, the Reuters article notes
"The California researchers say they found no significant association between coffee intake and
fertility," thus acknowledging that multiple tests were performed.