STATS 101 WK7 NOTE.pptx

UNIVERSITY OF LIBERIA
T.J.R. FAULKNER COLLEGE OF SCIENCE, TECHNOLOGY AND ENVIRONMENTAL SCIENCES
DEPARTMENT OF MATHEMATICS
STATISTICS PROGRAM
STATS 101: INTRODUCTION TO STATISTICS
Instructor:
Mr. Mulbah K.A. Kromah,
Principal Analyst, Office of the DDGSDP &
Part-time instructor, UL department of mathematics

PLAN
01 Introduction
02 Overview of the course outline
03 History of statistics
04 Basic definitions, types of data and other key concepts
05 Descriptive statistics
06 Random variables
07 Probability distributions & statistical inference
08 Correlation & regression
09 Summary

Overview of the course outline
OBJECTIVES OF THE COURSE:
➢ Provide students with a brief history of statistics;
➢ Help students understand the basic definitions used in statistics, the types of data used and the basic sampling methods;
➢ Help students learn how to avoid drawing misleading conclusions;
➢ Introduce students to the field of descriptive statistics (data organization, visualization and summarization);

OBJECTIVES OF THE COURSE (continued):
➢ Introduce students to the concept of random variables;
➢ Help students learn about the basic index numbers;
➢ Introduce students to probability distributions;
➢ Introduce students to statistical inference, correlation and regression.

ORGANIZATION OF DATA
INTRODUCTION TO DESCRIPTIVE STATISTICS
FREQUENCY DISTRIBUTION AND GRAPHS

III. ORGANIZATION OF DATA: Frequency distribution and graphs
- 3.3 Data representation using Graphs
Quantitative Data Representation
When data are quantitative, several types of representations are often used:
➢ Histogram
➢ Frequency polygon
➢ Ogive
➢ Stem and leaf plot
➢ Dot plot
➢ Scatter plot, etc.

In this course, we will provide a brief explanation of the uses of the following graphs:
➢ Histogram
➢ Frequency polygon
➢ Ogive
➢ Stem and leaf plot
➢ Scatter plot (specifically time series graph)
➢ Bar graph


Qualitative and Quantitative Data Representation

Distribution Shapes
➢ When describing data, it is important to recognize the shape of the distribution of the values. This helps in deciding which statistical method to use in analyzing the data.
➢ A distribution can have many shapes, and one method of analyzing a distribution is to draw a histogram or frequency polygon for the distribution.
➢ Distributions are most often not perfectly shaped, so it is not necessary to have an exact shape but rather to identify an overall pattern.

Classification of Distribution Shapes

Avoid using misleading graphs
➢ Changing the units at the starting point on the y axis can convey a very different visual representation of the data.
➢ Avoid exaggerating a one-dimensional increase by showing it in two dimensions.
➢ Avoid omitting labels or units on the axes of the graph.
➢ Always include the basic elements of a graph (titles, units, source and notes).

Summary of Graphs and Uses of Each

Stem and Leaf Plots
➢ A stem and leaf plot is a data plot that uses part of the data value as the stem and part of the data value as the leaf to form groups or classes.
➢ A stem and leaf plot can be used to compare two related distributions (back-to-back stem and leaf plot).
➢ When analyzing a stem and leaf plot, look for peaks and gaps in the distribution. Also analyze the form of the distribution (symmetric or skewed), and check the variability of the data by looking at the spread (range, variance, SD).

III. ORGANIZATION OF DATA: Frequency distribution and graphs
How to construct a Stem and Leaf Plot?
Step 1: Arrange the data in order (optional but very helpful).
Step 2: Separate the data according to the classes.
Step 3: Plot the data using one of the layouts below:
Stem and leaf plot: Leading digit (stem) | Trailing digit (leaf)
Back-to-back stem and leaf plot: Trailing digit for dist. 2 (leaf) | Leading digit (stem) | Trailing digit for dist. 1 (leaf)

Exercise 1
The dataset below shows the ages of 30 Bomi citizens and 30 Bassa citizens extracted from the 2008 NPHC of Liberia. Use this dataset to construct stem and leaf plots for the two counties. Use a back-to-back stem and leaf plot to compare the two distributions.
Ages of Bomi citizens
55 33 5 37 27
31 42 12 45 5
0 44 6 17 8
3 10 42 9 3
26 34 28 7 55
3 3 9 48 2
Ages of Bassa citizens
30 28 2 10 8
40 23 26 8 3
4 62 42 29 35
2 45 5 27 26
3 40 22 0 16
41 26 11 62 6
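The construction steps for a stem and leaf plot can be sketched in Python. This is only an illustrative sketch, not part of the original slides: the function names `stem_and_leaf` and `print_plot` are hypothetical, and the tens digit is assumed to be the stem.

```python
from collections import defaultdict

bomi = [55, 33, 5, 37, 27, 31, 42, 12, 45, 5, 0, 44, 6, 17, 8,
        3, 10, 42, 9, 3, 26, 34, 28, 7, 55, 3, 3, 9, 48, 2]
bassa = [30, 28, 2, 10, 8, 40, 23, 26, 8, 3, 4, 62, 42, 29, 35,
         2, 45, 5, 27, 26, 3, 40, 22, 0, 16, 41, 26, 11, 62, 6]

def stem_and_leaf(values):
    """Return {stem: sorted leaves}, using the tens digit as the stem."""
    groups = defaultdict(list)
    for v in sorted(values):            # Step 1: arrange the data in order
        groups[v // 10].append(v % 10)  # Step 2: separate the data into classes
    # Keep empty stems so that gaps in the distribution stay visible
    return {s: groups.get(s, []) for s in range(min(groups), max(groups) + 1)}

def print_plot(values):
    """Step 3: print one 'stem | leaves' row per class."""
    for stem, leaves in stem_and_leaf(values).items():
        print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")

print("Bomi")
print_plot(bomi)
print("Bassa")
print_plot(bassa)
```

An empty class (stem 5 for Bassa) is printed with no leaves, which matches the convention of leaving the leaf blank rather than writing 0.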

Solution
Distribution of the ages of 30 randomly selected Bomi citizens
Stem | Leaf
0 | 0 2 3 3 3 3 5 5 6 7 8 9 9
1 | 0 2 7
2 | 6 7 8
3 | 1 3 4 7
4 | 2 2 4 5 8
5 | 5 5
Distribution of the ages of 30 randomly selected Bassa citizens
Stem | Leaf
0 | 0 2 2 3 3 4 5 6 8 8
1 | 0 1 6
2 | 2 3 6 6 6 7 8 9
3 | 0 5
4 | 0 0 1 2 5
5 |
6 | 2 2
Note: There are no data in the sixth class for Bassa. Do not put 0 in the leaf for this class; just leave it blank (that is why only the stem number, 5, is written).

Solution
Back-to-back stem and leaf plot
Bassa (leaf) | Stem | Bomi (leaf)
8 8 6 5 4 3 3 2 2 0 | 0 | 0 2 3 3 3 3 5 5 6 7 8 9 9
6 1 0 | 1 | 0 2 7
9 8 7 6 6 6 3 2 | 2 | 6 7 8
5 0 | 3 | 1 3 4 7
5 2 1 0 0 | 4 | 2 2 4 5 8
 | 5 | 5 5
2 2 | 6 |

Summary
Statisticians or researchers collect raw data.
To obtain useful information from these data, they must organize them in some meaningful way.
A frequency distribution using classes is often used for this purpose.
Once a frequency distribution is constructed, the representation of the data by graphs becomes easy.
The most commonly used graphs in research statistics are the histogram, frequency polygon, ogive, bar graph, Pareto chart, time series graph, and pie graph.
Finally, a stem and leaf plot uses part of the data values as stems and part of the data values as leaves. This graph has the advantages of a frequency distribution and a histogram.

Exercise 2
Use the dataset from Exercise 1 (ages of 30 Bomi citizens and 30 Bassa citizens) to construct stem and leaf plots for the two counties using the following age groupings:
0 - 9
10 - 14
15 - 24
25 - 44
45 - 54
55 - 64.

CHAPTER FOUR
DATA DESCRIPTION

IV. ORGANIZATION OF DATA
In the previous chapter, we learned how to obtain useful
information from raw data by organizing them into a
frequency distribution and then presenting the data by using
various graphs.
In this chapter, we will learn about the statistical methods
that can be used to summarize data. Our main objective will
be to find the “central number” or the “most typical case” in
our dataset and then analyze the relationship between this
number and the other numbers in the dataset.

First, we will look at the measures of average, also called
measures of central tendency. They include the mean,
median, mode, and midrange.
Next, we will learn about measures of variation, or
measures of dispersion. These measures include the range,
variance, and standard deviation.
Lastly, we will learn how to compute and interpret measures
of position, which include percentiles, deciles, and
quartiles.

4.1- Measures of Central Tendency
The Mean
The mean, also known as the arithmetic average, is found by adding the values of the data and dividing by the total number of values.
Sample mean:
$$\bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Population mean:
$$\mu = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N}$$
Weighted mean:
$$\bar{X} = \frac{W_1 X_1 + W_2 X_2 + W_3 X_3 + \cdots + W_n X_n}{W_1 + W_2 + W_3 + \cdots + W_n} = \frac{\sum_{i=1}^{n} W_i X_i}{\sum_{i=1}^{n} W_i}$$
where $W_1, W_2, W_3, \ldots, W_n$ are the weights and $X_1, X_2, X_3, \ldots, X_n$ are the values.

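The Mean
Mean for grouped data:
$$\bar{X} = \frac{f_1 M_1 + f_2 M_2 + f_3 M_3 + \cdots + f_k M_k}{n} = \frac{\sum_{j=1}^{k} f_j M_j}{n}$$
where $f_1, f_2, f_3, \ldots, f_k$ are the frequencies, $M_1, M_2, M_3, \ldots, M_k$ are the midpoints of the $k$ classes, and $n = \sum_{j=1}^{k} f_j$ is the total number of values.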

The Mean
Examples
1. The daily transportation costs of 6 UL students are given below:
$150 LD, $450 LD, $600 LD, $200 LD, $700 LD, $150 LD. Find the average daily transportation cost of these students.
2. Find student La Paix's GPA if he has the following grades:
An A in English 201 (4 credits), a C in Statistics 103 (3 credits), a D in Math 202 (4 credits) and an F in Statistics 203 (3 credits), considering that A = 4 points, B = 3 points, C = 2 points, D = 1 point and F = 0 points.
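As a check on the two examples, here is a minimal Python sketch of the sample mean and the weighted mean. The variable names are illustrative only; the GPA is treated as a weighted mean in which the credits are the weights and the grade points are the values.

```python
# Example 1: arithmetic mean of the daily transportation costs (in LD)
costs = [150, 450, 600, 200, 700, 150]
mean_cost = sum(costs) / len(costs)

# Example 2: weighted mean, with credits as weights and grade points as values
# (grade points, credits) for the A, C, D and F grades
grades = [(4, 4), (2, 3), (1, 4), (0, 3)]
total_points = sum(points * credits for points, credits in grades)
total_credits = sum(credits for _, credits in grades)
gpa = total_points / total_credits

print(mean_cost)      # 375.0
print(round(gpa, 2))  # 1.86
```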

The Mean
Examples
3. Find the mean of the grouped data given in the accompanying table (not reproduced here).

The Median
The median is the midpoint of the data array. The symbol for the median is MD.
A data array is a dataset that has been ordered. To find the median, all one needs to do
is to arrange the dataset in order and then locate the middle number. When the
number of data values is even, the median will be the midpoint of the two middle
numbers.
The median tells us that 50% of the data values are above it while 50% are below it.

The Median
We can also find the median for grouped data using the formula below:
$$MD = L_m + \frac{w}{f_m}\,(0.5n - cf_b)$$
where $L_m$ is the lower limit of the median class, $f_m$ is the frequency of the median class, $w$ is the width of the median class, $n$ is the sample size and $cf_b$ is the cumulative frequency of the class before the median class.
Note: The median class is the first class whose cumulative relative frequency reaches or exceeds 50%.

The Median
Examples
Find the median of the following datasets:
Dataset 1: 713, 300, 618, 595, 311, 401, and 292.
Dataset 2: 684, 764, 656, 702, 856, 1133, 1132, 1303.
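The two datasets can be checked with Python's standard `statistics` module. This is a small sketch rather than part of the original notes; `statistics.median` sorts the data and, for an even number of values, returns the midpoint of the two middle values.

```python
import statistics

dataset_1 = [713, 300, 618, 595, 311, 401, 292]         # odd number of values
dataset_2 = [684, 764, 656, 702, 856, 1133, 1132, 1303]  # even number of values

median_1 = statistics.median(dataset_1)  # middle value of the ordered data
median_2 = statistics.median(dataset_2)  # midpoint of 764 and 856

print(sorted(dataset_1), median_1)  # median is 401
print(sorted(dataset_2), median_2)  # median is 810.0
```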

The Mode
The value that occurs most often in a data set is called the mode.
➢ Unimodal: a dataset with one mode.
➢ Bimodal: a dataset with two modes.
➢ Multimodal: a dataset with more than two modes.
➢ No mode: a dataset in which no value occurs more often than the others.
The mode for grouped data is the modal class. The modal class is the class with the largest frequency.

The Mode
Examples:
Find the mode in the following datasets:
Dataset 1: 20.0, 16.0, 34.3, 13, 12.5, 13, 12.4, 13.
Dataset 2: 110, 731, 1031, 84, 20, 118, 1162, 1977, 103, 752.
Dataset 3: 104, 104, 104, 104, 104, 107, 109, 109, 109, 110, 109, 111, 112, 111, 109.
Table 1: Distribution of students by major field of
studies
Table 2: frequency distribution of miles that 20
runners ran in one week.
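For the three raw datasets above, the mode can be found by counting occurrences. The helper `modes` below is a hypothetical name, and it assumes the convention that a dataset in which every value occurs only once has no mode.

```python
from collections import Counter

def modes(values):
    """Return the sorted list of modes; an empty list means 'no mode'."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:          # every value occurs exactly once
        return []
    return sorted(v for v, c in counts.items() if c == top)

dataset_1 = [20.0, 16.0, 34.3, 13, 12.5, 13, 12.4, 13]
dataset_2 = [110, 731, 1031, 84, 20, 118, 1162, 1977, 103, 752]
dataset_3 = [104, 104, 104, 104, 104, 107, 109, 109, 109, 110,
             109, 111, 112, 111, 109]

print(modes(dataset_1))  # [13]        unimodal
print(modes(dataset_2))  # []          no mode
print(modes(dataset_3))  # [104, 109]  bimodal
```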

Comparison of the Mean, Median and Mode
A small company consists of the owner, the manager, the salesperson, and two
technicians, all of whose annual salaries are listed here. (Assume that this is the entire
population.)

The Midrange
The midrange is a rough estimate of the middle. It is found by adding the lowest and highest values in the data set and dividing by 2. It is a very rough estimate of the average and can be affected by one extremely high or low value.
$$MR = \frac{X_{min} + X_{max}}{2}$$

The Midrange
Example
Find the midrange of this dataset and compare it with the mean. What can you say?
Dataset: 18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10
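A short Python sketch of this comparison follows; the variable names are illustrative. It computes the midrange from the lowest and highest values and the mean from all the values.

```python
data = [18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10]

midrange = (min(data) + max(data)) / 2  # (10 + 34.5) / 2
mean = sum(data) / len(data)            # uses every value

print(midrange)        # 22.25
print(round(mean, 3))  # 15.025
```

The single high value 34.5 pulls the midrange well above the mean, which illustrates why the midrange is only a rough estimate of the middle.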

In statistics, several measures can be used for an average. The most common measures
are the mean, median, mode, and midrange. Each has its own specific purpose and
use. However, several other averages, such as the harmonic mean, the geometric
mean, and the quadratic mean exist. Their applications are limited to specific areas.

Properties and Uses of Central Tendency
The mean
1. It is found by using all the values of the data.
2. The mean varies less than the median or mode when samples are taken from
the same population and all three measures are computed for these samples.
3. The mean is used in computing other statistics, such as the variance.
4. The mean for the data set is unique and not necessarily one of the data values.
5. The mean cannot be computed for the data in a frequency distribution that
has an open-ended class.
6. The mean is affected by extremely high or low values, called outliers, and may
not be the appropriate average to use in these situations.

The Median
1. The median is used to find the center or middle value of a data set.
2. The median is used when it is necessary to find out whether the data values
fall into the upper half or lower half of the distribution.
3. The median is used for an open-ended distribution.
4. The median is affected less than the mean by extremely high or extremely low
values.

The Mode
1. The mode is used when the most typical case is desired.
2. The mode is the easiest average to compute.
3. The mode can be used when the data are nominal, such as religious
preference, gender, or political affiliation.
4. The mode is not always unique. A data set can have more than one mode, or
the mode may not exist for a data set.

The Midrange
1. The midrange is easy to compute.
2. The midrange gives the midpoint.
3. The midrange is affected by extremely high or low values in a data set.

Class discussions.
➢ Discuss the relationship between the measures of central tendency and the shape of a distribution.
➢ Give some practical examples of the most commonly seen distributions.
➢ How does the shape of a distribution determine which measure of central tendency to use?

4.2- Measures of Variation
In order to better describe a dataset, statisticians do not only consider measures of central tendency; they also look at other measures, such as measures of variation and position.
In this section, we will learn how to compute and interpret measures of variation such as the range, variance and standard deviation.

Consider this example from the Elementary Statistics
book:
A testing lab wishes to test two experimental brands of
outdoor paint to see how long each will last before
fading. The testing lab makes 6 gallons of each paint to
test. Since different chemical agents are added to each
group and only six cans are involved, these two groups
constitute two small populations. The results (in months)
are shown to the right. Find the mean of each group.

Solution
As seen, the two brands have the same mean, 35 months, but brand B varies less than brand A (indicating that brand B is more consistent).

The Range
The range is the highest value minus the lowest value.
The symbol $R$ is used for the range. $R$ = highest value − lowest value
Note: The range can be greatly affected by outliers. Because of this, statisticians usually use the variance and standard deviation.

The Variance
The variance is the average of the squares of the distance each value is from the mean. The symbol for the population variance is $\sigma^2$ ($\sigma$ is the Greek lowercase letter sigma).
Note: The standard deviation is given by the square root of the variance.
Population variance:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
Sample variance:
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

The Variance
Shortcut (computational) formula for finding the sample variance:
$$s^2 = \frac{n\left(\sum X^2\right) - \left(\sum X\right)^2}{n(n - 1)}$$
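The definitional and shortcut formulas for the sample variance give the same result. The sketch below illustrates this on a small hypothetical sample (the values 2, 4, 4, 6, 9 are made up for illustration and do not come from the course data), and compares both with `statistics.variance` from the Python standard library.

```python
import math
import statistics

sample = [2, 4, 4, 6, 9]  # hypothetical data, for illustration only
n = len(sample)
mean = sum(sample) / n

# Definitional formula: sum of squared deviations divided by n - 1
s2_definition = sum((x - mean) ** 2 for x in sample) / (n - 1)

# Shortcut formula: [n * sum(X^2) - (sum X)^2] / [n * (n - 1)]
s2_shortcut = (n * sum(x * x for x in sample) - sum(sample) ** 2) / (n * (n - 1))

# Standard deviation is the square root of the variance
s = math.sqrt(s2_definition)

print(s2_definition, s2_shortcut, statistics.variance(sample))  # all equal 7
print(round(s, 3))
```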

Example
Find the variances of the two brands
of paints given to the right.

Variance and Standard Deviation for Grouped Data
$$s^2 = \frac{n\left(\sum f \cdot X_m^2\right) - \left(\sum f \cdot X_m\right)^2}{n(n - 1)}$$
where $X_m$ represents the class midpoint.

Variance and Standard Deviation for Grouped Data
Example.
Find the variance and standard deviation of this dataset.

Uses of the Variance and Standard Deviation
1. The variances and standard deviations can be used to determine the spread of the
data. If the variance or standard deviation is large, the data are more dispersed. This
information is useful in comparing two (or more) data sets to determine which is more
(most) variable.
2. The measures of variance and standard deviation are used to determine the
consistency of a variable. For example, in the manufacture of fittings, such as nuts and
bolts, the variation in the diameters must be small, or the parts will not fit together.

Uses of the Variance and Standard Deviation
3. The variance and standard deviation are used to determine the number of data values that fall within a specified interval in a distribution.
4. Finally, the variance and standard deviation are used quite often in inferential statistics.
Note: The range can be used to approximate the standard deviation. The approximation is called the range rule of thumb:
$$s \approx \frac{R}{4}$$

Coefficient of Variation
The Coefficient of Variation (CV) is a statistic that allows us to
compare standard deviations when the units are different.
For example, we might want to compare the standard deviation of the number of hours
that Firestone employees work weekly with the standard deviation of their weekly
earnings.

The CV is the standard deviation divided by the mean. The result is expressed as a percentage.
Sample:
$$CV = \frac{s}{\bar{X}} \cdot 100\%$$
Population:
$$CV = \frac{\sigma}{\mu} \cdot 100\%$$

Example
Suppose the mean of the number of hours that Firestone employees work weekly is 48 hours and the standard deviation is 3 hours. Assume also that the mean of their weekly earnings is $15,250 LD, and the standard deviation is $850 LD. Compare the variations of the two variables.

Solution
Number of hours that Firestone employees work weekly:
$$CV = \frac{3}{48} \cdot 100 = 6.25\%$$
Firestone employees' weekly earnings:
$$CV = \frac{850}{15{,}250} \cdot 100 \approx 5.57\%$$
Interpretation: Since the coefficient of variation is smaller for the weekly earnings, we can say that the weekly earnings of Firestone employees are less variable than the number of hours they work weekly.
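The same comparison can be reproduced with a few lines of Python. The function name `coefficient_of_variation` is an illustrative choice, not something defined in the course material.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation divided by the mean, expressed as a percentage."""
    return std_dev / mean * 100

cv_hours = coefficient_of_variation(3, 48)          # weekly hours worked
cv_earnings = coefficient_of_variation(850, 15250)  # weekly earnings in LD

print(round(cv_hours, 2))     # 6.25
print(round(cv_earnings, 2))  # 5.57
print("earnings less variable:", cv_earnings < cv_hours)
```

Because the CV has no units, the two figures can be compared directly even though one variable is measured in hours and the other in Liberian dollars.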

Group Presentation
➢ Divide the class into two groups;
➢ Each group is to make a presentation on one of the following:
a). Chebyshev’s theorem;
b). The Normal Rule.
➢ Each presentation should highlight the following:
• Brief description of the theorem or rule;
• Importance of the theorem or rule;
• Presentation of formula(s), if any;
• A practical example of how the theorem or rule is used in real-life situations.
Note: Each group will have a maximum of 10 minutes for its presentation, including Q&A.

CHAPTER FIVE
DISCRETE PROBABILITY
DISTRIBUTIONS

Discrete Probability Distributions
OBJECTIVES
After completing this chapter, you should be able to:
1 - Construct a probability distribution for a random variable.
2 - Find the mean, variance, standard deviation, and expected value
for a discrete random variable.
3 - Find the exact probability for X successes in n trials of a binomial
experiment.

OBJECTIVES
4 - Find the mean, variance, and standard deviation for the variable of
a binomial distribution.
5 - Find probabilities for outcomes of variables, using the Poisson,
hypergeometric, and multinomial distributions.

INTRODUCTION
By assigning probabilities to all possible outcomes, we can make many decisions.
For example, a crime statistician at the LNP can compute the probabilities that 0, 1, 2 or more crimes will be committed next month.
A statistician at the MOT might choose to assign probabilities to the number of vehicles that will be registered next year.

INTRODUCTION
Once these probabilities are assigned, statistics such as the mean $\mu$, the variance $\sigma^2$ and the standard deviation $\sigma$ can be computed for these events. With these statistics, various decisions can be made. The crime statistician will be able to compute the average number of crimes next month. The MOT statistician can easily advise the management on how many license plates should be made available next year.

PROBABILITY DISTRIBUTIONS
We first need to review the definition of a variable.
What is a variable?
A variable is a characteristic or attribute that can assume different
values. Various letters of the alphabet, such as X, Y, or Z, are used to
represent variables.

Discrete Probability Distributions
A random variable is a variable whose values are determined by
chance.
A random variable can be discrete or continuous.
Discrete variables have a finite number of possible values or an
infinite number of values that can be counted.

A discrete probability distribution consists of the values a random
variable can assume and the corresponding probabilities of the values.
Discrete probability distributions can be shown by using a graph or a
table. Probability distributions can also be represented by a formula.

EX.
Construct a probability distribution for the number of heads when a
coin is tossed three times.
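One way to build this distribution is to list every outcome of three tosses and count the heads in each. The sketch below assumes a fair coin, so the 8 outcomes are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2 * 2 * 2 = 8 equally likely outcomes, e.g. ('H', 'T', 'H')
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give 0, 1, 2 or 3 heads
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each value of X = number of heads
distribution = {x: Fraction(count, len(outcomes))
                for x, count in sorted(head_counts.items())}

for x, p in distribution.items():
    print(x, p)  # 0 1/8, 1 3/8, 2 3/8, 3 1/8

# The probabilities must each lie in [0, 1] and sum to 1
print(sum(distribution.values()))  # 1
```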

Two Requirements for a Probability Distribution
1. The sum of the probabilities of all the events in the sample space must equal 1; that is, $\sum P(X) = 1$.
2. The probability of each event in the sample space must be between 0 and 1 (or equal to 0 or 1). That is, $0 \le P(X) \le 1$.

MEAN, VARIANCE, STANDARD DEVIATION, AND EXPECTATION
The mean, variance, and standard deviation for a probability distribution are computed differently from the mean, variance, and standard deviation for samples.
How are means calculated for samples or populations?

THE MEAN
Formula for the Mean of a Probability Distribution
$$\mu = X_1 \cdot P(X_1) + X_2 \cdot P(X_2) + X_3 \cdot P(X_3) + \cdots + X_N \cdot P(X_N) = \sum_{i=1}^{N} X_i \cdot P(X_i)$$
where $X_1, X_2, X_3, \ldots, X_N$ are the outcomes and $P(X_1), P(X_2), P(X_3), \ldots, P(X_N)$ are the corresponding probabilities.

THE MEAN
EX.
Find the mean of the number of heads that appear when a coin is
tossed three times.

THE VARIANCE AND STANDARD DEVIATION
Formula for the Variance of a Probability Distribution
$$\sigma^2 = \sum_{i=1}^{N} \left[X_i^2 \cdot P(X_i)\right] - \mu^2$$
The SD is:
$$\sigma = \sqrt{\sum_{i=1}^{N} \left[X_i^2 \cdot P(X_i)\right] - \mu^2}$$

THE VARIANCE AND STANDARD DEVIATION
Compute the variance and standard deviation for the probability
distribution in the previous example.
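A minimal sketch of this computation follows. It assumes the previous example is the number of heads in three tosses of a fair coin, with probabilities 1/8, 3/8, 3/8 and 1/8; the variable names are illustrative.

```python
import math
from fractions import Fraction

# X = number of heads in three tosses of a fair coin (assumed distribution)
distribution = {0: Fraction(1, 8), 1: Fraction(3, 8),
                2: Fraction(3, 8), 3: Fraction(1, 8)}

# Mean: sum of X * P(X)
mu = sum(x * p for x, p in distribution.items())

# Variance: sum of X^2 * P(X), minus the square of the mean
variance = sum(x ** 2 * p for x, p in distribution.items()) - mu ** 2

# Standard deviation: square root of the variance
sd = math.sqrt(variance)

print(mu)            # 3/2
print(variance)      # 3/4
print(round(sd, 3))  # 0.866
```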

EXPECTATION
Another concept related to the mean for a probability distribution is that of expected value or expectation.
Expected value is used in various types of games of chance, in insurance, and in other areas, such as decision theory.

EXPECTATION
The expected value of a discrete random variable of a probability distribution is the theoretical average of the variable.
$$\mu = E(X) = \sum X \cdot P(X)$$

EXPECTATION
EX 1.
One thousand tickets are sold at $1 each for a color television valued
at $350. What is the expected value of the gain if you purchase one
ticket?

EXPECTATION
SOLUTION
            | Win    | Lose
Gain X      | $349   | -$1
Probability | 1/1000 | 999/1000
$$E(X) = \sum_{i=1}^{N} X_i \cdot P(X_i) = 349 \cdot \frac{1}{1000} + (-1) \cdot \frac{999}{1000} = -\$0.65$$

EXPECTATION
EX 2.
One thousand tickets are sold at $1 each for four prizes of $100, $50,
$25, and $10. After each prize drawing, the winning ticket is then
returned to the pool of tickets. What is the expected value if you
purchase two tickets?

EXPECTATION
SOLUTION
            | Win    | Win    | Win    | Win    | Lose
Gain X      | $98    | $48    | $23    | $8     | -$2
Probability | 2/1000 | 2/1000 | 2/1000 | 2/1000 | 992/1000
$$E(X) = \sum_{i=1}^{N} X_i \cdot P(X_i) = 98 \cdot \frac{2}{1000} + 48 \cdot \frac{2}{1000} + 23 \cdot \frac{2}{1000} + 8 \cdot \frac{2}{1000} + (-2) \cdot \frac{992}{1000} = -\$1.63$$
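Both expected values (EX 1 and EX 2) can be verified with a short Python sketch that uses the gains and probabilities from the solution tables. The helper `expected_value` is a hypothetical name; exact fractions are used to avoid rounding error.

```python
from fractions import Fraction

def expected_value(table):
    """Sum of gain * probability over all (gain, probability) pairs."""
    return sum(gain * prob for gain, prob in table)

# EX 1: one ticket, a $350 prize, 1000 tickets at $1 each
ex1 = [(349, Fraction(1, 1000)), (-1, Fraction(999, 1000))]

# EX 2: two tickets, prizes of $100, $50, $25 and $10, drawn with replacement
ex2 = [(98, Fraction(2, 1000)), (48, Fraction(2, 1000)),
       (23, Fraction(2, 1000)), (8, Fraction(2, 1000)),
       (-2, Fraction(992, 1000))]

print(float(expected_value(ex1)))  # -0.65
print(float(expected_value(ex2)))  # -1.63
```

A negative expected value means that, on average, the ticket buyer loses money on each purchase.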

THE BINOMIAL DISTRIBUTION
Many types of probability problems have only two outcomes or can be
reduced to two outcomes.
For example, when a coin is tossed, it can land heads or tails. When a
baby is born, it will be either male or female. In a basketball game, a
team either wins or loses.
A true/false item can be answered in only two ways, true or false.

A binomial experiment is a probability experiment that satisfies the following four requirements:
1. There must be a fixed number of trials.
2. Each trial can have only two outcomes or outcomes that can be reduced to two outcomes. These outcomes can be considered as either success or failure.

3. The outcomes of each trial must be independent of one another.
4. The probability of a success must remain the same for each trial.
A binomial experiment and its results give rise to a special probability
distribution called the binomial distribution.

The outcomes of a binomial experiment and the corresponding
probabilities of these outcomes are called a binomial distribution.

NOTATION FOR THE BINOMIAL DISTRIBUTION
P(S) => probability of success
P(F) => probability of failure
p => numerical probability of a success
q => numerical probability of a failure
P(S) = p and P(F) = 1 - p = q
n => number of trials
X => number of successes in n trials

BINOMIAL PROBABILITY FORMULA
$$P(X) = \frac{n!}{(n - X)!\,X!} \cdot p^X \cdot q^{n - X}$$

BINOMIAL PROBABILITY FORMULA
A coin is tossed 3 times. Find the probability of getting exactly two
heads (Use the binomial probability formula).
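The binomial probability formula translates directly into Python. This is a sketch under the assumption of a fair coin (p = 0.5); `binomial_pmf` is an illustrative name, and `math.comb(n, x)` computes n! / ((n − x)! x!).

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure
    return comb(n, x) * p ** x * q ** (n - x)

# Exactly two heads in three tosses of a fair coin
p_two_heads = binomial_pmf(2, 3, 0.5)
print(p_two_heads)  # 0.375

# The probabilities over X = 0, 1, 2, 3 sum to 1
print(sum(binomial_pmf(x, 3, 0.5) for x in range(4)))  # 1.0
```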

MEAN, VARIANCE, AND STANDARD DEVIATION FOR THE BINOMIAL DISTRIBUTION
Mean: $\mu = n \cdot p$
Variance: $\sigma^2 = n \cdot p \cdot q$
Standard deviation: $\sigma = \sqrt{n \cdot p \cdot q}$

CREATING A BINOMIAL DISTRIBUTION AND GRAPH IN EXCEL
See page 282 of the textbook for step-by-step instructions.

THE MULTINOMIAL DISTRIBUTION
We use the multinomial distribution in cases where each trial has more than two outcomes.
Ex. An experiment in which each respondent chooses a best subject from Math, English, and Biology.

THE MULTINOMIAL DISTRIBUTION
In a multinomial distribution,
➢ the probability of each outcome is constant from trial to trial,
➢ the trials are independent and fixed in number,
➢ the outcomes of a trial are mutually exclusive.

FORMULA FOR THE MULTINOMIAL DISTRIBUTION
$$P(X) = \frac{n!}{X_1! \cdot X_2! \cdot X_3! \cdots X_k!} \cdot p_1^{X_1} \cdot p_2^{X_2} \cdot p_3^{X_3} \cdots p_k^{X_k}$$
where $X_1 + X_2 + X_3 + \cdots + X_k = n$ and $p_1 + p_2 + p_3 + \cdots + p_k = 1$.

EX.
In a large city, 50% of the people choose a movie, 30% choose dinner and a play, and 20% choose shopping as a leisure activity. If a sample of 5 people is randomly selected, find the probability that 3 are planning to go to a movie, 1 to a play, and 1 to a shopping mall.
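This example can be worked with a small Python sketch of the multinomial formula. The function name `multinomial_pmf` is illustrative and not part of any library.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """n! / (X1! * X2! * ... * Xk!) * p1^X1 * p2^X2 * ... * pk^Xk."""
    coefficient = factorial(sum(counts))
    for x in counts:
        coefficient //= factorial(x)  # exact integer division at each step
    return coefficient * prod(p ** x for p, x in zip(probs, counts))

# 3 choose a movie, 1 a play, 1 shopping, out of n = 5 people
probability = multinomial_pmf([3, 1, 1], [0.5, 0.3, 0.2])
print(round(probability, 4))  # 0.15
```

Here the coefficient is 5! / (3! · 1! · 1!) = 20, and 20 · (0.5)³ · (0.3) · (0.2) = 0.15.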

THE POISSON DISTRIBUTION
A discrete probability distribution that is useful when n is large and p
is small and when the independent variables occur over a period of
time is called the Poisson distribution.

THE POISSON DISTRIBUTION
The Poisson distribution can be used when a density of items is
distributed over a given area or volume, such as the number of plants
growing per acre or the number of defects in a given length of
videotape.

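FORMULA FOR THE POISSON DISTRIBUTION
$$P(X; \lambda) = \frac{e^{-\lambda} \lambda^X}{X!}$$
where $X = 0, 1, 2, \ldots$
The letter $e$ is a constant approximately equal to 2.7183, and $\lambda$ is the mean number of occurrences per unit (of time, area, volume, etc.).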

EX 1.
If there are 200 typographical errors randomly distributed in a 500-
page manuscript, find the probability that a given page contains
exactly 3 errors.

EX 2.
A sales firm receives, on average, 3 calls per hour on its toll-free
number. For any given hour, find the probability that it will receive the
following.
a. At most 3 calls
b. At least 3 calls
c. 5 or more calls
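The two Poisson examples (errors per page, and calls per hour) can be sketched in Python as follows. The helper names `poisson_pmf` and `poisson_cdf` are illustrative; for the manuscript example, λ is taken as 200 errors / 500 pages = 0.4 errors per page.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean is lam."""
    return exp(-lam) * lam ** x / factorial(x)

def poisson_cdf(x, lam):
    """Probability of at most x occurrences."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

# EX 1: exactly 3 errors on a page, with lam = 200 / 500 = 0.4
p_three_errors = poisson_pmf(3, 200 / 500)

# EX 2: calls per hour, with lam = 3
at_most_3 = poisson_cdf(3, 3)         # a. P(X <= 3)
at_least_3 = 1 - poisson_cdf(2, 3)    # b. P(X >= 3) = 1 - P(X <= 2)
five_or_more = 1 - poisson_cdf(4, 3)  # c. P(X >= 5) = 1 - P(X <= 4)

print(round(p_three_errors, 4))  # 0.0072
print(round(at_most_3, 4))       # 0.6472
print(round(at_least_3, 4))      # 0.5768
print(round(five_or_more, 4))    # 0.1847
```

"At least" and "or more" probabilities are obtained through the complement, because the Poisson variable has no upper limit.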

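FORMULA FOR THE HYPERGEOMETRIC DISTRIBUTION
$$P(X) = \frac{{}_{a}C_{X} \cdot {}_{b}C_{n - X}}{{}_{a+b}C_{n}}$$
where the population contains $a$ items of one kind and $b$ items of another kind, and $P(X)$ is the probability that a sample of size $n$, selected without replacement, contains $X$ items of the first kind and $n - X$ items of the second kind.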

EX 1.
Ten people apply for a job as assistant manager of a restaurant. Five
have completed college and five have not. If the manager selects 3
applicants at random, find the probability that all 3 are college
graduates.

EX 2.
A recent study found that 2 out of every 10 houses in a neighborhood
have no insurance. If 5 houses are selected from 10 houses, find the
probability that exactly 1 will be uninsured.
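Both hypergeometric examples can be checked with the sketch below. The function name `hypergeometric_pmf` is illustrative, `math.comb` supplies the combinations, and EX 2 is read as a neighborhood of 10 houses of which exactly 2 are uninsured (an assumption about the intended population).

```python
from fractions import Fraction
from math import comb

def hypergeometric_pmf(x, n, a, b):
    """P(x items of the first kind) in a sample of n drawn without
    replacement from a items of the first kind and b of the second."""
    return Fraction(comb(a, x) * comb(b, n - x), comb(a + b, n))

# EX 1: all 3 selected applicants are among the 5 college graduates
p_all_graduates = hypergeometric_pmf(3, 3, 5, 5)

# EX 2: exactly 1 of 5 selected houses is among the 2 uninsured ones
p_one_uninsured = hypergeometric_pmf(1, 5, 2, 8)

print(p_all_graduates, round(float(p_all_graduates), 4))  # 1/12 0.0833
print(p_one_uninsured, round(float(p_one_uninsured), 4))  # 5/9 0.5556
```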
