Lesson 1
INTRODUCTION
SPSS (Statistical Product and Service Solutions) is one of the most widely used software
packages for statistical measurement and analysis. It provides tools for computing descriptive
statistics, presenting data in various graphs, carrying out statistical inference, and many
other tasks.
Manual calculation has many limitations, especially when we have a large number of
samples. It can produce inaccurate results, which in turn affect the accuracy of the
interpretation and analysis. Statistical software therefore helps a lot to improve accuracy and
efficiency.
Besides SPSS, there are other statistical packages such as MINITAB, SAS, Stata,
LISREL, Excel and PSPP. Among them, PSPP is free software that you can download easily from
the internet.
Understanding the software (SPSS, PSPP)
This statistics software consists of 3 main parts (3 windows). The two used in this manual are:
1. Data Editor window
It opens automatically when you start the software. It consists of two views, Data
View and Variable View, and is used in the first step to input the data.
Variable View: used to define the variables and their settings
Data View: used to input the data
2. Output Viewer window
This window pops up automatically after the software executes a data processing
instruction.
How to Define and Input Data
We can use the Data Editor window to input data into SPSS or PSPP. The following steps
guide you through creating a data set.
1. Define the variables in Variable View. Variable View provides several settings
that need to be set up, such as:
a. Variable name: can be any name you choose. If you don't define one, the
software automatically generates var 00001, var 00002, etc. as the variable names.
b. Data type: Numeric, String, Date, etc. The default data type is Numeric.
c. Variable label and value labels: used when your data is categorical.
2. After you have finished the variable settings, go to Data View to input the values of the
variables that have been created.
Lesson 2
Descriptive Statistics
How do we present descriptive statistical measures using SPSS or PSPP?
Study Case 1:
A random sample of 12 joggers was asked to keep track of and report the number of miles they ran
last week. The responses are:
5.5 7.2 1.6 22.0 8.7 2.8 5.3 3.4 12.5 18.6 8.3 6.6
a. Compute all three statistics that measure central tendency
Analyze → Descriptive Statistics → Descriptives / Frequencies
b. Briefly describe what each statistic tells you
c. Compute all the measures of variability
Analyze → Descriptive Statistics → Descriptives
d. What is the interpretation?
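As a cross-check outside SPSS/PSPP, the measures asked for in Study Case 1 can be sketched with Python's standard `statistics` module. This is only an illustrative sketch; the variable names are chosen here and are not part of the manual, and the variance and standard deviation are the sample versions (dividing by n - 1).

```python
import statistics

# Miles run last week by the 12 joggers in Study Case 1.
miles = [5.5, 7.2, 1.6, 22.0, 8.7, 2.8, 5.3, 3.4, 12.5, 18.6, 8.3, 6.6]

# Central tendency.
mean_miles = statistics.mean(miles)
median_miles = statistics.median(miles)
# Every value occurs exactly once, so multimode returns all 12 values
# (in other words, the data has no single mode).
modes = statistics.multimode(miles)

# Variability (sample versions, dividing by n - 1).
data_range = max(miles) - min(miles)
sample_variance = statistics.variance(miles)
sample_std = statistics.stdev(miles)

print(f"mean = {mean_miles:.2f}, median = {median_miles:.2f}")
print(f"number of modal values = {len(modes)}")
print(f"range = {data_range:.1f}, variance = {sample_variance:.2f}, "
      f"std. deviation = {sample_std:.2f}")
```

The mean is pulled upward by the two largest responses (18.6 and 22.0), which is why it is larger than the median for this sample.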
Study Case 2:
Has the educational level of adults changed over 15 years? To help answer this question, the
Bureau of Labor Statistics compiled the following table, which lists the number (in thousands) of
adults 25 years of age and older who are employed. Use a graphical technique to present these figures.
Education level        1992   1995   2000   2004
Less than high school  13418  11972  12486  12513
High school            37910  36692  37699  37790
Some college           27048  30927  33257  34412
College graduate       28113  31149  36619  40418
Answer:
Steps:
1. Create the variables and input the data
2. Create a chart to see the differences
Study Case 3:
Given the raw data below:
Id Num  Name        Gender  Marital St  Height  Weight  DoB
785     Aminah      Female  Married     147.6   55.5    15-Feb-1953
756     Imas        Female  Married     151     42      30-Jun-1986
757     Tn. Rafius  Male    Married     162.4   61.4    30-Jun-1960
788     Ismet       Male    Married     165     64.5    15-Jan-1967
793     Esih        Female  Widowed     158     60      7-May-1950
803     Sumiati     Female  Married     156.5   60.1    19-Aug-1950
811     Romlah      Female  Widowed     152.7   57.7    12-May-1987
856     Dudung      Male    Single      167     56      16-Sep-1988
876     Fernando    Male    Single      170     60      17-Oct-1992
888     Marimar     Female  Married     168     55      17-Dec-1979
a. Input the data into PSPP/SPSS
b. Give some descriptive measures (central tendency and variability) of the Height variable
c. Interpret the standard deviation of the Height variable
d. Show the proportions of Marital Status using a pie chart
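Parts b and d of Study Case 3 can be checked by hand with a short Python sketch. It uses only the Height and Marital St columns of the raw data; the variable names are illustrative, and the proportions it computes are the slice sizes that a pie chart of Marital Status would display.

```python
import statistics
from collections import Counter

# Height and Marital St columns of the Study Case 3 raw data, in row order.
heights = [147.6, 151, 162.4, 165, 158, 156.5, 152.7, 167, 170, 168]
marital = ["Married", "Married", "Married", "Married", "Widowed",
           "Married", "Widowed", "Single", "Single", "Married"]

# Part b: central tendency and variability of Height.
height_mean = statistics.mean(heights)
height_median = statistics.median(heights)
height_std = statistics.stdev(heights)  # sample standard deviation (n - 1)

# Part d: proportion of each marital status (the pie chart slices).
counts = Counter(marital)
proportions = {status: count / len(marital) for status, count in counts.items()}

print(f"Height: mean = {height_mean:.2f}, median = {height_median:.2f}, "
      f"std. deviation = {height_std:.2f}")
for status, share in proportions.items():
    print(f"{status}: {share:.0%}")
```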
Study Case 4
(Xr 04-36). Everyone is familiar with waiting lines or queues. For example, people wait in line at
a supermarket to go through the checkout counter. There are two factors that determine how long
the queue becomes. One is the speed of service. The other is the number of arrivals at the
checkout counter. The mean number of arrivals is an important number, but so is the standard
deviation. Suppose that a consultant for the supermarket counts the number of arrivals per hour
during a sample of 150 hours.
a. Compute the minimum, maximum, mean and standard deviation of the arrivals variable
b. Create a histogram and comment on the skewness of the distribution
c. If the distribution is assumed to be bell shaped, interpret the standard deviation.
Lesson 3
Correlation and Regression
This lesson covers how to obtain several correlation measures (covariance, coefficient of
correlation and coefficient of determination) and how to obtain the regression line for a
scatter of data.
Some equivalent terms (textbook term → label in the SPSS/PSPP output):
Covariance → Covariance
Coefficient of correlation → Pearson Correlation
Coefficient of determination → R square
Study Case 1
1. A retailer wanted to estimate the monthly fixed and variable selling expenses. As a first step, she
collected data from the past 8 months. The total selling expenses (in $ thousands) and the total
sales (in $ thousands) were recorded and are listed below.
Total Sales  Selling Expenses
20           14
40           16
60           18
50           17
50           18
55           18
60           18
70           20
a. Compute the covariance, the coefficient of correlation and the coefficient of determination, and
describe what these statistics tell you
Answer: using PSPP
To obtain the covariance and the coefficient of correlation:
Analyze → Bivariate Correlation
The output table gives the value of the Pearson Correlation (coefficient of correlation), which
is 0.97.
Interpretation: the Pearson correlation is 0.97, which is very close to +1. It means
that Selling Expenses and Total Sales have a very strong positive linear relationship.
Note: in SPSS this correlation table can also report the covariance, but not in
PSPP (unfortunately for us).
To obtain the coefficient of determination (R square):
Analyze → Linear Regression
The output table gives an R square of 0.95. It means that around 95% of the
variation in selling expenses can be explained by the variation in total sales. The
remaining 5% is unexplained.
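Because PSPP does not report the covariance, the three measures can be computed directly from their definitions. The sketch below uses plain Python on the eight data pairs of this study case; the variable names are illustrative and the covariance is the sample covariance (dividing by n - 1).

```python
import math

# Retailer data from Study Case 1 (both in $ thousands).
sales = [20, 40, 60, 50, 50, 55, 60, 70]
expenses = [14, 16, 18, 17, 18, 18, 18, 20]

n = len(sales)
mean_x = sum(sales) / n
mean_y = sum(expenses) / n

# Sums of cross products and squared deviations.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sales, expenses))
sxx = sum((x - mean_x) ** 2 for x in sales)
syy = sum((y - mean_y) ** 2 for y in expenses)

covariance = sxy / (n - 1)        # sample covariance
r = sxy / math.sqrt(sxx * syy)    # coefficient of correlation (Pearson)
r_squared = r ** 2                # coefficient of determination

print(f"covariance = {covariance:.2f}")
print(f"Pearson correlation = {r:.2f}")
print(f"R square = {r_squared:.2f}")
```

The correlation and R square agree with the 0.97 and 0.95 read from the PSPP output, and the covariance fills in the value that PSPP omits.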
b. Determine the least squares line and use it to produce the estimates the retailer wants.
Answer using PSPP:
To obtain the least squares line:
Analyze → Linear Regression
The output table gives the coefficients of the least squares line. Based on the table, the least
squares line is y = 0.11x + 11.66, where y is the selling expenses and x is the total sales.
The retailer wants to estimate the fixed and variable selling expenses using the least squares
line:
The fixed selling expenses are given by the intercept, $11.66 (in thousands). This is the
estimated selling expense that has to be covered even when there are no
sales.
The variable selling expenses are given by the slope, 0.11. It means that every additional
$1 thousand of total sales increases the selling expenses by about $0.11 (in
thousands).
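The coefficients of the least squares line can be reproduced from the usual formulas, slope = Sxy / Sxx and intercept = mean(y) - slope × mean(x). The sketch below is illustrative only: `predict_expenses` is a hypothetical helper, and the sales figure of 65 used at the end is an assumed value, not part of the study case.

```python
# Retailer data from Study Case 1 (both in $ thousands).
sales = [20, 40, 60, 50, 50, 55, 60, 70]
expenses = [14, 16, 18, 17, 18, 18, 18, 20]

n = len(sales)
mean_x = sum(sales) / n
mean_y = sum(expenses) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sales, expenses))
sxx = sum((x - mean_x) ** 2 for x in sales)

slope = sxy / sxx                      # variable selling expenses per $1 thousand of sales
intercept = mean_y - slope * mean_x    # fixed selling expenses


def predict_expenses(total_sales):
    """Estimated selling expenses ($ thousands) for a given total sales figure."""
    return intercept + slope * total_sales


print(f"least squares line: y = {slope:.2f}x + {intercept:.2f}")
# Hypothetical month with total sales of $65 thousand (value assumed for illustration).
print(f"estimated expenses at x = 65: {predict_expenses(65):.2f}")
```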
Lesson 4. Statistical Inference for the Mean
A. One population
The basic idea of inference for the mean of one population is to describe the
population mean using information from a sample. The one-sample t-test is provided in SPSS and PSPP.
The p-value is the quantity to consider when deciding whether to reject the null hypothesis: if the
p-value is less than the significance level, the null hypothesis is rejected.
Study Case:
(Xr 12-23) [ Mean analysis for one population]
A diet doctor claims that the average North American is more than 20 pounds
overweight. To test his claim, a random sample of 20 North Americans was weighed, and
the difference between their actual weight and their ideal weight was calculated.
a. Do the data allow us to infer at the 5% significance level that the doctor’s claim is
true?
b. What is the interval estimate for the average overweight at the 95% confidence
level?
Steps:
1. Input the data in one column (only one population sample)
2. Analyze → Compare Means → One Sample t-test
3. Put the Overweight variable into Test Variable and the hypothesized population mean into Test Value.
Click Options and set your confidence level
4. Click OK, and you will get the result below
One-Sample Test
Test Value = 20
            t     df  Sig. (2-tailed)  Mean Difference  95% CI Lower  95% CI Upper
Overweight  .562  19  .581             .850             -2.31         4.01
(95% CI = 95% Confidence Interval of the Difference)
Interpretation:
a. The appropriate hypotheses for the above case are:
H0: μ = 20
H1: μ > 20
This is a one-tail test, thus the p-value is 0.581/2 = 0.2905
α = 0.05
Since the p-value = 0.2905 is greater than α, the null hypothesis is not
rejected. There is not sufficient evidence to support the doctor's claim.
b. The 95% confidence interval of the difference between the mean overweight and the test value 20 is [-2.31 : 4.01], so the 95% confidence interval of the mean overweight is [17.69 : 24.01]
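The one-sample result can be reproduced from the summary statistics in the One-Sample Statistics output (N = 20, mean = 20.85, standard deviation = 6.761). The sketch below uses only the Python standard library; `t_pdf` and `t_upper_tail` are hypothetical helpers that integrate the t density numerically, and the critical value 2.093 is taken from a t table (it is an assumption, not a value shown in the output).

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on the density from 0 to t."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Summary statistics from the One-Sample Statistics output.
n, mean, std, test_value = 20, 20.85, 6.761, 20

std_error = std / math.sqrt(n)
t_stat = (mean - test_value) / std_error
p_one_tail = t_upper_tail(t_stat, n - 1)    # H1: mu > 20

t_crit = 2.093                              # t(0.025, df = 19), from a t table
ci_mean = (mean - t_crit * std_error, mean + t_crit * std_error)

print(f"t = {t_stat:.3f}, one-tail p-value = {p_one_tail:.4f}")
print(f"95% CI for the mean overweight: [{ci_mean[0]:.2f} : {ci_mean[1]:.2f}]")
```

Subtracting the test value 20 from both ends of this interval gives the interval of the difference reported by the software.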
B. Inference for Two Independent Samples
The basic idea of inference for two independent populations is to describe the
difference between the means of two independent populations using information from the samples. The
independent-samples t-test is provided in SPSS and PSPP. The p-value is the quantity to consider when
deciding whether to reject the null hypothesis: if the p-value is less than the significance level, the
null hypothesis is rejected.
One-Sample Statistics (output of the one-sample t-test in section A)
            N   Mean   Std. Deviation  Std. Error Mean
Overweight  20  20.85  6.761           1.512
Hypothesis testing for two populations is used when a business analyst or researcher wants to
compare two populations. For example:
1. Compare the expenditures on shoes made in 2000 with those from 2010 to determine
whether any change occurred over time.
2. Estimate or test the difference in the market share of two companies, or in the
market share of one company in two different regions.
Study Case:
(Xr 13-08) [Mean analysis for two populations]
A men's softball league is experimenting with a yellow baseball that is easier to see during
night games. One way to judge its effectiveness is to count the number of errors. In a
preliminary experiment, the yellow baseball was used in 10 games and the traditional white
baseball was used in another 10 games. The number of errors in each game was recorded.
a. Can we infer that there are fewer errors on average when the yellow ball is used? (use
α = 5%)
b. What is the interval estimate for the mean difference at the 95% confidence level?
Steps:
1. Input the data into the software
2. Create two additional variables: one that combines the data from both samples, and another
that labels each observation with the group it came from. This layout is needed because the
sample sizes of the two independent populations do not have to be equal.
3. Analyze → Compare Means → Independent Sample t-test
4. Put the combined data into Test Variable and the group variable into Grouping Variable, and then
click Define Groups to set the group values.
5. Click OK, and you will get the result below
Group Statistics
             group   N   Mean  Std. Deviation  Std. Error Mean
observation  Yellow  10  5.10  2.424           .767
             White   10  7.30  2.406           .761
Independent Samples Test
Levene's Test for Equality of Variances (observation): F = .001, Sig. = .974
t-test for Equality of Means (observation):
                             t       df      Sig. (2-tailed)  Mean Difference  Std. Error Difference  95% CI Lower  95% CI Upper
Equal variances assumed      -2.037  18      .057             -2.200           1.080                  -4.469        .069
Equal variances not assumed  -2.037  17.999  .057             -2.200           1.080                  -4.469        .069
(95% CI = 95% Confidence Interval of the Difference)
Interpretation
Hypothesis setup
The appropriate hypotheses for the above case are:
H0: μDIFF = μ1 - μ2 = 0
H1: μDIFF = μ1 - μ2 < 0
where μ1 is the mean number of errors with the yellow ball and μ2 is the mean with the white ball.
α = 0.05
Levene's test: this test is used to decide whether equal variances can be assumed. If the
p-value (Sig.) of Levene's test is greater than the significance level α, equal variances are
assumed.
Based on the above result, Levene's test gives Sig. = 0.974, which is greater than
α = 0.05. Equal variances are therefore assumed, and we use the results in the
"Equal variances assumed" row.
The p-value is 0.057/2 = 0.0285 (one-tail test)
Conclusion: the p-value 0.0285 is less than α = 0.05. Thus, the
null hypothesis is rejected. There is sufficient evidence that fewer errors are made on average when
the yellow ball is used.
The 95% confidence interval for the difference in mean errors between the yellow ball and the white ball is
[-4.469 : 0.069]
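The "Equal variances assumed" row can be reproduced from the Group Statistics table with the pooled-variance formulas. The sketch below is illustrative; the critical values 1.734 and 2.101 are read from a t table for 18 degrees of freedom (assumptions, not values shown in the output), and the decision is made by comparing the t statistic with the one-tail critical value rather than by computing the p-value.

```python
import math

# Group Statistics from the output: yellow ball (group 1), white ball (group 2).
n1, mean1, sd1 = 10, 5.10, 2.424
n2, mean2, sd2 = 10, 7.30, 2.406

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

mean_diff = mean1 - mean2
t_stat = mean_diff / std_error

# One-tail test of H1: mu1 - mu2 < 0 at alpha = 0.05.
t_crit_one_tail = 1.734            # t(0.05, df = 18), from a t table
reject_h0 = t_stat < -t_crit_one_tail

# 95% confidence interval for mu1 - mu2.
t_crit_two_tail = 2.101            # t(0.025, df = 18), from a t table
ci = (mean_diff - t_crit_two_tail * std_error,
      mean_diff + t_crit_two_tail * std_error)

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
print(f"95% CI for the mean difference: [{ci[0]:.3f} : {ci[1]:.3f}]")
```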
C. Paired Sample t-test
Besides the one-sample t-test and the independent-samples t-test, SPSS and PSPP also provide
the paired t-test. The paired t-test is used when we have paired sample data, for example data
gathered from one population that received a treatment. To see the effect of the treatment,
we measure the condition before and after the treatment. Paired samples are also called two
dependent samples.
Study Case:
(Xr 13-44) [Mean analysis for two dependent samples (paired samples)]
The president of a large company is in the process of deciding whether to adopt a lunchtime
exercise program. The purpose of such a program is to improve the health of workers and, in so
doing, reduce medical expenses. To get more information, he instituted an exercise program for
the employees of the office. The president knows that during the winter months medical
expenses are relatively high because of the incidence of colds and flu. Consequently, he decided
to use a matched pairs design by recording medical expenses for the 12 months before the program
and for the 12 months after the program. The "before" and "after" expenses (in thousands of
dollars) are compared on a month-to-month basis and shown in the data.
a. Do the data indicate that exercise programs reduce medical expenses? (use α = 5%)
b. Estimate with 95% confidence the mean savings produced by exercise programs.
Steps:
1. Input the data into the software
2. Analyze → Compare Means → Paired Sample t-Test
3. Put the After variable under Var 1 and the Before variable under Var 2
4. Click OK, and you will get the result below
Paired Samples Statistics
                Mean   N   Std. Deviation  Std. Error Mean
Pair 1  After   43.50  12  18.618          5.375
        Before  46.58  12  16.670          4.812
Paired Samples Correlations
                        N   Correlation  Sig.
Pair 1  After & Before  12  .950         .000
Paired Samples Test
Paired Differences (After - Before):
                        Mean    Std. Deviation  Std. Error Mean  95% CI Lower  95% CI Upper  t       df  Sig. (2-tailed)
Pair 1  After - Before  -3.083  5.885           1.699            -6.822        .656          -1.815  11  .097
(95% CI = 95% Confidence Interval of the Difference)
Interpretation:
Hypothesis setup
The appropriate hypotheses for the above case are:
H0: μDIFF = μafter - μbefore = 0
H1: μDIFF = μafter - μbefore < 0
α = 0.05
The p-value is 0.097/2 = 0.0485 (one-tail test)
Conclusion: the p-value 0.0485 is less than α = 0.05. Thus, the
null hypothesis is rejected. There is sufficient evidence that medical expenses are lower
when the lunchtime exercise program is applied.
The 95% confidence interval for the mean difference in medical expenses (after - before) under the
lunchtime exercise program is [-6.822 : 0.656].
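The paired test is a one-sample t-test applied to the differences (after - before), so it can be reproduced from the Paired Differences columns of the output. The sketch below is illustrative; the critical values 1.796 and 2.201 are read from a t table for 11 degrees of freedom and are assumptions, not values shown in the output.

```python
import math

# Paired Differences (After - Before) from the Paired Samples Test output.
n = 12
mean_diff = -3.083
sd_diff = 5.885

df = n - 1
std_error = sd_diff / math.sqrt(n)
t_stat = mean_diff / std_error

# One-tail test of H1: mean difference < 0 at alpha = 0.05.
t_crit_one_tail = 1.796            # t(0.05, df = 11), from a t table
reject_h0 = t_stat < -t_crit_one_tail

# 95% confidence interval for the mean difference.
t_crit_two_tail = 2.201            # t(0.025, df = 11), from a t table
ci = (mean_diff - t_crit_two_tail * std_error,
      mean_diff + t_crit_two_tail * std_error)

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
print(f"95% CI for the mean difference: [{ci[0]:.3f} : {ci[1]:.3f}]")
```

The t statistic lies only slightly beyond the critical value, which matches a one-tail p-value (0.0485) that is just below 0.05.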
Lesson 5. Chi-Square Goodness-of-Fit Test
The chi-square goodness-of-fit test is used to describe a population of
nominal data. In a binomial experiment, the nominal variable can take only one of two
possible values, such as success or failure; this is the basis of inference about
proportions. The binomial experiment is extended to a multinomial experiment when
there are more than two possible outcomes. The chi-square goodness-of-fit test is the statistical
technique used to make inferences about the proportions of a nominal variable with more than two categories.
Study case:
A machine has a record of producing 80% excellent, 17% good, and 3% unacceptable parts.
After extensive repairs, a sample of 200 produced 157 excellent, 42 good, and 1 unacceptable
part. Have the repairs changed the nature of the output of the machine? Use PSPP with α = 0.05.
Steps:
1. Enter the category data into one variable and the observed frequency into another
variable.
Category data: Quality: 1=excellent, 2=Good, 3=Unacceptable
Figure 5.1
2. Weight the data by the observed frequency: Data → Weight Cases → Weight cases
by Observed_freq
3. Run the chi-square test
Analyze → Nonparametric Tests → Chi-Square
4. The output given is:
5. Output analysis
Step 1: Hypotheses
H0: The repairs did not change the nature of the output of the machine.
[i.e., the proportions remained the same (π1 = 0.80, π2 = 0.17, π3 = 0.03)]
Ha: The repairs did change the nature of the output of the machine.
[i.e., the proportions changed after the repairs (at least one πi ≠ πi,0)]
Step 2: Significance Level
α = 0.05
Step 3: Rejection Region
Reject the null hypothesis if p-value ≤ 0.05 = α.
Step 4.1: Calculate Expected Frequencies
Expected frequency = n × πi: 200 × 0.80 = 160, 200 × 0.17 = 34, 200 × 0.03 = 6.
Step 4.2: Check Assumptions
According to footnote a of the output, all expected frequencies are ≥ 5 (smallest value is 6).
Step 4.3: Test Statistic and P-value
Step 5: Decision
Since p-value = 0.0472 ≤ 0.05, we shall reject the null hypothesis.
Step 6: State conclusion in words
At the α = 0.05 level of significance, there is enough evidence to conclude that the
repairs changed the nature of the output of the machine (the proportions are not what they
used to be)
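The test statistic and p-value of this study case can be verified by hand. The sketch below computes the chi-square statistic as the sum of (observed - expected)² / expected; with 3 categories there are 2 degrees of freedom, and for that special case the upper-tail probability of the chi-square distribution has the closed form exp(-x/2), so no statistics library is needed. Variable names are illustrative.

```python
import math

# Study case: 200 parts produced after the repairs.
observed = [157, 42, 1]              # excellent, good, unacceptable
hypothesized = [0.80, 0.17, 0.03]    # proportions under H0

n = sum(observed)
expected = [n * p for p in hypothesized]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Closed form valid only for df = 2: P(chi-square > x) = exp(-x / 2).
p_value = math.exp(-chi_square / 2)

alpha = 0.05
reject_h0 = p_value <= alpha

print("expected frequencies:", [round(e, 1) for e in expected])
print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0:", reject_h0)
```

Most of the statistic comes from the unacceptable category (1 observed against 6 expected), which is what drives the rejection.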
Lesson 6. Chi-Square Test of a Contingency Table
The chi-square test of a contingency table is used to determine whether there is enough
evidence to infer that two nominal variables are related, and to infer that differences exist
between two or more populations of nominal variables.
Example:
Suppose we conducted a prospective cohort study to investigate the effect of aspirin on heart
disease. A group of patients who are at risk of a heart attack are randomly assigned to either a
placebo or aspirin. At the end of one year, the number of patients suffering a heart attack is
recorded.
H0: the two variables are independent (the medicine taken has no effect on having heart disease)
Ha: the two variables are dependent (the medicine taken has an effect on having heart disease)
         Heart Disease
Group    Yes (+)  No (-)  Total
Placebo  20       80      100
Aspirin  15       135     150
Total    35       215     250
Steps
1. Input the data. Create the variables: Heart_Disease, freq, Factor.
2. Weight the data by its frequency
3. Analyze → Descriptive Statistics → Crosstabs
Put Factor in the Row box and Heart_Disease in the Column box, following the contingency table
4. Results:
5. Analysis
Chi-square = 4.98
p-value = 0.03
Since the p-value = 0.03 < α = 0.05, the null hypothesis is rejected. There is sufficient
evidence that the medicine taken has an effect on having heart disease.
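The chi-square statistic of the contingency table can be checked by hand: each expected count is (row total × column total) / grand total. The sketch below is illustrative and computes the Pearson chi-square without a continuity correction; a 2 × 2 table has 1 degree of freedom, and for that special case the upper-tail probability equals erfc(sqrt(x / 2)), which is available in the standard `math` module.

```python
import math

# Rows: Placebo, Aspirin. Columns: heart disease Yes (+), No (-).
table = [[20, 80],
         [15, 135]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(table))
    for j in range(len(table[0]))
)
df = (len(table) - 1) * (len(table[0]) - 1)

# Closed form valid only for df = 1: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print("expected counts:", expected)
print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.2f}")
```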
Anyone who has never made a mistake has never tried anything new.
Albert Einstein
Do not worry about your difficulties in Mathematics. I can assure you mine are still greater
Albert Einstein
3 sentence for getting success: know more than other, work more than other,
and expect less than other.
William Shakespeare
REFERENCE
Managerial Statistics Abbreviated, by Keller, South Western Cengage Learning,2009.
Modul Praktikum Metode Statistika, by FMIPA Gadjah Mada University, 2003

microwave assisted reaction. General introductionMaksud Ahmed
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 

Último (20)

BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 

Lab manual_statistik

  • 1. Lesson 1. Introduction
    SPSS (Statistical Product and Service Solutions) is one of the most widely used software packages for statistical measurement and analysis. It provides tools for calculating descriptive statistics, presenting data in various graphs, performing statistical inference, and more.
    Manual calculation has many limitations, especially when the sample is large. It can produce inaccurate results, which in turn affect the accuracy of the interpretation and analysis. Statistical software therefore helps to improve both accuracy and efficiency.
    Besides SPSS, there are other statistical packages such as MINITAB, SAS, Stata, LISREL, Excel, and PSPP. Among them, PSPP is free software that you can download easily from the internet.
    Understanding the software (SPSS, PSPP)
    The software consists of three main parts (three windows), including:
    1. Data Editor window
       It opens automatically when you start the software. It has two views, Data View and Variable View, and is used in the first step to enter the data.
       Variable View: used to define the variables and their settings
       Data View: used to enter the data
  • 2.
    2. Output Viewer window
       This window opens automatically after a data-processing instruction is executed in the software.
    How to Define and Input Data
    Use the Data Editor window to enter data into SPSS or PSPP. The following steps show how to create a data set.
    1. Define the variables in Variable View. Variable View provides several settings that need to be set up, such as:
       a. Variable name: you can choose your own name. If you do not, the software automatically generates names such as var00001, var00002, and so on.
       b. Data type: Numeric, String, Date, etc. The default data type is Numeric.
       c. Variable label and value labels: value labels are used when the data are categorical.
    2. When the settings are done, go to Data View to enter the values of the variables you created.
  • 3. Lesson 2. Descriptive Statistics
    How do we present descriptive statistical measures using SPSS or PSPP?
    Study Case 1: A random sample of 12 joggers was asked to keep track of and report the number of miles they ran last week. The responses are:
       5.5  7.2  1.6  22.0  8.7  2.8  5.3  3.4  12.5  18.6  8.3  6.6
    a. Compute the three statistics that measure central tendency.
       Analyze → Descriptive Statistics → Descriptives / Frequencies
    b. Briefly describe what each statistic tells you.
    c. Compute the measures of variability.
       Analyze → Descriptive Statistics → Descriptives
    d. What is the interpretation?
    Study Case 2: Has the educational level of adults changed over 15 years? To help answer this question, the Bureau of Labor Statistics compiled the following table, which lists the number (in thousands) of adults 25 years of age and older who are employed. Use a graphical technique to present these figures.
                                1992    1995    2000    2004
       Less than high school   13418   11972   12486   12513
       High school             37910   36692   37699   37790
       Some college            27048   30927   33257   34412
       College graduate        28113   31149   36619   40418
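As a cross-check on what the Descriptives and Frequencies menus report for Study Case 1, the same measures can be computed with Python's standard `statistics` module. This is only a sketch alongside the SPSS/PSPP workflow, not part of it; the variable names are chosen here for illustration, and the sample (n - 1) formulas are assumed for variance and standard deviation.

```python
import statistics

# Miles run last week by the 12 joggers in Study Case 1.
miles = [5.5, 7.2, 1.6, 22.0, 8.7, 2.8, 5.3, 3.4, 12.5, 18.6, 8.3, 6.6]

# Central tendency.
mean = statistics.mean(miles)
median = statistics.median(miles)
modes = statistics.multimode(miles)  # returns every value when none repeats

# Variability, using the sample formulas (divide by n - 1).
data_range = max(miles) - min(miles)
variance = statistics.variance(miles)
std_dev = statistics.stdev(miles)

print(f"mean = {mean:.2f}, median = {median:.2f}")
if len(modes) == len(miles):
    print("no single mode: every value occurs once")
else:
    print(f"mode(s) = {modes}")
print(f"range = {data_range:.1f}, variance = {variance:.2f}, std dev = {std_dev:.2f}")
```

The mean is pulled above the median by the two large values (18.6 and 22.0), which is the kind of observation part b asks for.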
  • 4. Answer:
    Steps:
    1. Create the variables and enter the data.
    2. Create a chart to see the differences.
    Study Case 3
    Given the raw data below:
       Id Num  Name        Gender  Marital St  Height  Weight  DoB
       785     Aminah      Female  Married     147.6   55.5    15-Feb-1953
       756     Imas        Female  Married     151     42      30-Jun-1986
       757     Tn. Rafius  Male    Married     162.4   61.4    30-Jun-1960
       788     Ismet       Male    Married     165     64.5    15-Jan-1967
       793     Esih        Female  Widowed     158     60      7-May-1950
       803     Sumiati     Female  Married     156.5   60.1    19-Aug-1950
       811     Romlah      Female  Widowed     152.7   57.7    12-May-1987
       856     Dudung      Male    Single      167     56      16-Sep-1988
       876     Fernando    Male    Single      170     60      17-Oct-1992
       888     Marimar     Female  Married     168     55      17-Dec-1979
    a. Enter the data into PSPP/SPSS.
    b. Give some descriptive measures (central tendency and variability) of the Height variable.
    c. Interpret the standard deviation of the Height variable.
    d. Show the proportions of Marital Status using a pie chart.
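The pie chart in part d of Study Case 3 is a picture of category proportions. A minimal sketch of the underlying count, using the Marital St column as listed in the raw data (the variable names are illustrative, not taken from the software):

```python
from collections import Counter

# Marital status of the 10 respondents, in the order given in the raw data.
marital_status = [
    "Married", "Married", "Married", "Married", "Widowed",
    "Married", "Widowed", "Single", "Single", "Married",
]

counts = Counter(marital_status)
total = sum(counts.values())

# Each pie slice is proportional to count / total.
proportions = {status: count / total for status, count in counts.items()}

for status, share in proportions.items():
    print(f"{status}: {counts[status]} of {total} ({share:.0%})")
```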
  • 5. Study Case 4 (Xr 04-36)
    Everyone is familiar with waiting lines, or queues. For example, people wait in line at a supermarket to go through the checkout counter. Two factors determine how long the queue becomes. One is the speed of service. The other is the number of arrivals at the checkout counter. The mean number of arrivals is an important number, but so is the standard deviation. Suppose that a consultant for the supermarket counts the number of arrivals per hour during a sample of 150 hours.
    a. Compute the minimum, maximum, mean, and standard deviation of the arrival variable.
    b. Create the histogram and comment on the skewness of the distribution.
    c. If the distribution is assumed to be bell shaped, interpret the standard deviation.
  • 6. Lesson 3. Correlation and Regression
    This lesson shows how to obtain the correlation measures (covariance, coefficient of correlation, and coefficient of determination) and the regression line for a set of data.
    Equivalent terms (textbook term → term in the software output):
       Covariance → Covariance
       Coefficient of correlation → Pearson Correlation
       Coefficient of determination → R Square
    Study Case 1
    A retailer wanted to estimate the monthly fixed and variable selling expenses. As a first step she collected data from the past 8 months. The total selling expenses (in $ thousands) and the total sales (in $ thousands) were recorded and are listed below.
       Total Sales   Selling Expenses
       20            14
       40            16
       60            18
       50            17
       50            18
       55            18
       60            18
       70            20
    a. Compute the covariance, the coefficient of correlation, and the coefficient of determination, and describe what these statistics tell you.
    Answer (using PSPP):
    For the covariance and the coefficient of correlation: Analyze → Bivariate Correlation
  • 7. The output table gives the Pearson Correlation (coefficient of correlation), which is 0.97.
    Interpretation: the Pearson correlation is 0.97, which is very close to +1. It means that Selling Expenses and Total Sales have a very strong positive linear relationship.
    Note: in SPSS this correlation table can also report the covariance, but in PSPP it does not (unfortunately for us).
    For the coefficient of determination (R Square): Analyze → Linear Regression
  • 8. The output table gives an R Square of 0.95. It means that about 95% of the variation in selling expenses is explained by the variation in total sales. The remainder is unexplained.
    b. Determine the least squares line and use it to produce the estimates the retailer wants.
    Answer (using PSPP):
    For the least squares line: Analyze → Linear Regression
    The output table gives the coefficients of the least squares line. Based on the table, the least squares line is y = 0.11x + 11.66, where y is selling expenses and x is total sales.
    The retailer wants to estimate the fixed and variable selling expenses using the least squares line:
    The fixed selling expenses are the intercept, $11.66 thousand. This is the selling expense that has to be covered even when there are no sales.
    The variable selling expenses are determined by the slope, 0.11. It means that every additional $1 thousand of total sales increases selling expenses by $0.11 thousand.
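The values read from the PSPP tables in this lesson (Pearson correlation 0.97, R Square 0.95, line y = 0.11x + 11.66) can be reproduced from the eight data pairs with the textbook formulas. The sketch below is a hand computation under the usual sample (n - 1) definition of covariance, not the software's own routine; all names are illustrative.

```python
import math

# Total sales (x) and selling expenses (y), both in $ thousands.
sales = [20, 40, 60, 50, 50, 55, 60, 70]
expenses = [14, 16, 18, 17, 18, 18, 18, 20]

n = len(sales)
mean_x = sum(sales) / n
mean_y = sum(expenses) / n

# Sums of squares and cross-products around the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sales, expenses))
sxx = sum((x - mean_x) ** 2 for x in sales)
syy = sum((y - mean_y) ** 2 for y in expenses)

covariance = sxy / (n - 1)           # sample covariance
r = sxy / math.sqrt(sxx * syy)       # Pearson coefficient of correlation
r_squared = r ** 2                   # coefficient of determination
slope = sxy / sxx                    # variable cost per $1 thousand of sales
intercept = mean_y - slope * mean_x  # fixed cost in $ thousands

print(f"covariance = {covariance:.2f}")
print(f"r = {r:.2f}, R square = {r_squared:.2f}")
print(f"least squares line: y = {slope:.2f}x + {intercept:.2f}")
```

This also supplies the covariance that the PSPP correlation table does not show.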
  • 9. Lesson 4. Statistical Inference for the Mean
    A. One Population
    The basic idea of inference for the mean of one population is to describe the population mean using information from a sample. The one-sample t-test is provided in SPSS and PSPP. The p-value is the quantity to consider when deciding whether to reject the null hypothesis: if the p-value is less than the significance level, the null hypothesis is rejected.
    Study Case (Xr 12-23) [Mean analysis for one population]
    A diet doctor claims that the average North American is more than 20 pounds overweight. To test his claim, a random sample of 20 North Americans was weighed, and the difference between their actual weight and their ideal weight was calculated.
    a. Do the data allow us to infer at the 5% significance level that the doctor's claim is true?
    b. What is the interval estimate for the average overweight at the 95% confidence level?
    Steps:
    1. Enter the data in one column (one variable, because there is only one sample).
    2. Analyze → Compare Means → One-Sample t-test
    3. Put the Overweight variable into Test Variable and the hypothesized population mean into Test Value. Click Options and set the confidence level.
  • 10. 4. Click OK to get the result below.
    One-Sample Statistics
                    N    Mean    Std. Deviation   Std. Error Mean
       Overweight   20   20.85   6.761            1.512
    One-Sample Test (Test Value = 20)
                    t      df   Sig. (2-tailed)   Mean Difference   95% Confidence Interval of the Difference
                                                                    Lower    Upper
       Overweight   .562   19   .581              .850              -2.31    4.01
    Interpretation:
    a. The appropriate hypotheses for this case are:
       H0: μ = 20
       H1: μ > 20
       α = 0.05
       This is a one-tailed test, so the p-value is 0.581/2 = 0.2905. Because the p-value of 0.2905 is greater than α, the null hypothesis is not rejected. There is not sufficient evidence to support the doctor's claim.
    b. The 95% confidence interval for the mean difference from the test value is [-2.31 : 4.01], so the 95% confidence interval for the mean overweight is [17.69 : 24.01].
    B. Inference for Two Independent Samples
    The basic idea of inference for two independent populations is to describe the difference between the means of the two populations using information from the samples. The independent-samples t-test is provided in SPSS and PSPP. The p-value is the quantity to consider when deciding whether to reject the null hypothesis: if the p-value is less than the significance level, the null hypothesis is rejected.
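The raw overweight data are not listed here, but the one-sample result in part A can be checked from the One-Sample Statistics table alone. The sketch below assumes the standard formulas t = (mean - test value) / (s / sqrt(n)) and mean difference ± t_critical × standard error; the critical value 2.093 (19 degrees of freedom, 0.025 in the upper tail) is taken from a t table rather than computed, and all names are illustrative.

```python
import math

# Summary statistics from the One-Sample Statistics table.
n = 20
sample_mean = 20.85
sample_sd = 6.761
test_value = 20

# Assumption: t table value for df = 19 and a 95% two-sided interval.
t_critical = 2.093

std_error = sample_sd / math.sqrt(n)
mean_diff = sample_mean - test_value
t_stat = mean_diff / std_error

# 95% confidence interval of the difference (mean - test value).
lower = mean_diff - t_critical * std_error
upper = mean_diff + t_critical * std_error

print(f"standard error = {std_error:.3f}")
print(f"t = {t_stat:.3f} with df = {n - 1}")
print(f"95% CI of the difference: [{lower:.2f} : {upper:.2f}]")
print(f"95% CI of the mean: [{test_value + lower:.2f} : {test_value + upper:.2f}]")
```

Adding the test value back to the interval of the difference gives the interval for the mean overweight itself.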
  • 11. Hypothesis testing for two populations is used when a business analyst or researcher wants to observe or compare the conditions of two populations. For example:
    1. Compare the expenditures on shoes made in 2000 with those made in 2010 to determine whether any change occurred over time.
    2. Estimate or test the difference in the market proportions of two companies, or the difference in one company's market share in two different regions.
    Study Case (Xr 13-08) [Mean analysis for two populations]
    A men's softball league is experimenting with a yellow baseball that is easier to see during night games. One way to judge its effectiveness is to count the number of errors. In a preliminary experiment, the yellow baseball was used in 10 games and the traditional white baseball was used in another 10 games. The number of errors in each game was recorded.
    a. Can we infer that there are fewer errors on average when the yellow ball is used? (Use α = 5%.)
    b. What is the interval estimate for the mean difference at the 95% confidence level?
    Steps:
    1. Enter the data into the software.
    2. Create two additional variables: one that combines the data from both samples, and one that identifies which sample each observation belongs to. This layout is needed because the two independent samples do not have to be the same size.
  • 12. 3. Analyze → Compare Means → Independent-Samples t-test
    4. Put the combined variable into Test Variable and the group variable into Grouping Variable, then click Define Groups to set the group values.
    5. Click OK to get the result below.
    Group Statistics
                     group    N    Mean   Std. Deviation   Std. Error Mean
       observation   Yellow   10   5.10   2.424            .767
                     White    10   7.30   2.406            .761
  • 13. Independent Samples Test (variable: observation)
                                      Levene's Test for
                                      Equality of Variances   t-test for Equality of Means
                                      F      Sig.             t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
       Equal variances assumed        .001   .974             -2.037   18       .057              -2.200            1.080                   -4.469         .069
       Equal variances not assumed                            -2.037   17.999   .057              -2.200            1.080                   -4.469         .069
    Interpretation
    Hypothesis setup. The appropriate hypotheses for this case are:
       H0: μDIFF = μ1 − μ2 = 0
       H1: μDIFF = μ1 − μ2 < 0
       α = 0.05
    where μ1 is the mean number of errors with the yellow ball and μ2 is the mean with the white ball.
    Levene's test: this test is used to decide whether equal variances can be assumed. If the p-value (Sig.) of Levene's test is greater than the significance level α, equal variances are assumed. Here Levene's test gives Sig. = 0.974, which is greater than α = 0.05, so equal variances are assumed and we read the results from the "Equal variances assumed" row.
    The p-value is 0.057/2 = 0.0285 (one-tailed test).
    Conclusion: the p-value of 0.0285 is less than α = 0.05, so the null hypothesis is rejected. There is sufficient evidence that there are fewer errors on average when the yellow ball is used.
    The 95% confidence interval for the difference in mean errors between the yellow ball and the white ball is [-4.469 : 0.069].
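The "Equal variances assumed" row can be reproduced from the Group Statistics table with the pooled-variance formulas. This is a sketch under that assumption, not the software's own routine; the critical value 2.101 (18 degrees of freedom, 0.025 in the upper tail) comes from a t table, and the variable names are illustrative.

```python
import math

# Summary statistics from the Group Statistics table.
n_yellow, mean_yellow, sd_yellow = 10, 5.10, 2.424
n_white, mean_white, sd_white = 10, 7.30, 2.406

# Assumption: t table value for df = 18 and a 95% two-sided interval.
t_critical = 2.101

df = n_yellow + n_white - 2

# Pooled variance: weighted average of the two sample variances.
pooled_var = (
    (n_yellow - 1) * sd_yellow ** 2 + (n_white - 1) * sd_white ** 2
) / df

std_error = math.sqrt(pooled_var * (1 / n_yellow + 1 / n_white))
mean_diff = mean_yellow - mean_white
t_stat = mean_diff / std_error

lower = mean_diff - t_critical * std_error
upper = mean_diff + t_critical * std_error

print(f"pooled variance = {pooled_var:.3f}, standard error = {std_error:.3f}")
print(f"t = {t_stat:.3f} with df = {df}")
print(f"95% CI of the difference: [{lower:.3f} : {upper:.3f}]")
```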
  • 14. C. Paired-Samples t-test
    Besides the one-sample t-test and the independent-samples t-test, SPSS and PSPP also provide the paired t-test. The paired t-test is used when we have paired sample data. Paired sample data are gathered from one population that receives a treatment: to see the effect of the treatment, we measure the condition before and after it. Paired samples are also called two dependent samples.
    Study Case (Xr 13-44) [Mean analysis for two dependent samples (paired samples)]
    The president of a large company is deciding whether to adopt a lunchtime exercise program. The purpose of such a program is to improve the health of workers and, in so doing, reduce medical expenses. To get more information, he instituted an exercise program for the employees in one office. The president knows that during the winter months medical expenses are relatively high because of the incidence of colds and flu. Consequently, he decided to use a matched pairs design, recording medical expenses for the 12 months before the program and for the 12 months after the program. The "before" and "after" expenses (in thousands of dollars) are compared on a month-to-month basis and shown in the data.
    a. Do the data indicate that exercise programs reduce medical expenses? (Use α = 5%.)
    b. Estimate with 95% confidence the mean savings produced by exercise programs.
    Steps:
    1. Enter the data into the software.
    2. Analyze → Compare Means → Paired-Samples t-test
  • 15. 3. Put the variable After under Var 1 and the variable Before under Var 2.
    4. Click OK to get the result below.
    Paired Samples Statistics
                         Mean    N    Std. Deviation   Std. Error Mean
       Pair 1   After    43.50   12   18.618           5.375
                Before   46.58   12   16.670           4.812
    Paired Samples Correlations
                                 N    Correlation   Sig.
       Pair 1   After & Before   12   .950          .000
    Paired Samples Test
                                 Paired Differences
                                 Mean     Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t        df   Sig. (2-tailed)
       Pair 1   After - Before   -3.083   5.885            1.699             -6.822         .656           -1.815   11   .097
  • 16. Interpretation
    Hypothesis setup. The appropriate hypotheses for this case are:
       H0: μDIFF = μafter − μbefore = 0
       H1: μDIFF = μafter − μbefore < 0
       α = 0.05
    The p-value is 0.097/2 = 0.0485 (one-tailed test).
    Conclusion: the p-value of 0.0485 is less than α = 0.05, so the null hypothesis is rejected. There is sufficient evidence that medical expenses are lower when the lunchtime exercise program is applied.
    The 95% confidence interval for the difference in medical expenses (after minus before the lunchtime exercise program) is [-6.822 : 0.656].
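All three t-test cases in this lesson halve the reported "Sig. (2-tailed)" value to get a one-tailed p-value. That shortcut is only valid when the sign of t agrees with the direction of the alternative hypothesis; otherwise the one-tailed p-value is 1 minus half the two-tailed value. The helper below is a hypothetical illustration of that rule (the function name and its arguments are not part of SPSS or PSPP), applied to the three results reported in this lesson.

```python
def one_tailed_p(two_tailed_p, t_stat, alternative):
    """Convert a two-tailed t-test p-value into a one-tailed p-value.

    alternative is "less" for H1: parameter < hypothesized value,
    or "greater" for H1: parameter > hypothesized value.
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    if alternative == "less":
        agrees_with_h1 = t_stat < 0
    else:
        agrees_with_h1 = t_stat > 0
    half = two_tailed_p / 2
    # Halving applies only when t points in the direction stated by H1.
    return half if agrees_with_h1 else 1 - half


alpha = 0.05
# (label, Sig. (2-tailed), t, direction of H1) as reported in this lesson.
cases = [
    ("one-sample", 0.581, 0.562, "greater"),
    ("independent samples", 0.057, -2.037, "less"),
    ("paired samples", 0.097, -1.815, "less"),
]

for label, sig_two_tailed, t_stat, alternative in cases:
    p = one_tailed_p(sig_two_tailed, t_stat, alternative)
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{label}: one-tailed p = {p:.4f}, {decision}")
```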
  • 17. Lesson 5. Chi-Square Goodness-of-Fit Test
    The chi-square goodness-of-fit test is used to describe a population of nominal data. In a binomial experiment, the nominal variable can assume only one of two possible values, such as failure or success; this concept leads to inference about proportions. A binomial experiment is extended to a multinomial experiment when there are more than two possible outcomes. The chi-square goodness-of-fit test is the statistical test used for inference about a nominal variable with more than two categories.
    Study case: A machine has a record of producing 80% excellent, 17% good, and 3% unacceptable parts. After extensive repairs, a sample of 200 produced 157 excellent, 42 good, and 1 unacceptable part. Have the repairs changed the nature of the output of the machine? Use PSPP with α = 0.05.
    Steps:
    1. Enter the category data into one variable and the observed frequencies into another variable.
       Category data: Quality: 1 = Excellent, 2 = Good, 3 = Unacceptable
       Figure 5.1
    2. Weight the data by the observed frequencies: Data → Weight Cases → Weight cases by Observed_freq
  • 18. 3. Run the chi-square test: Analyze → Nonparametric Test → Chi-Square
    4. The output is:
  • 19. 5. Output analysis
    Step 1: Hypotheses
       H0: The repairs did not change the nature of the output of the machine, i.e., the proportions remained the same (π1 = 0.80, π2 = 0.17, π3 = 0.03).
       Ha: The repairs did change the nature of the output of the machine, i.e., the proportions changed after the repairs (at least one πi ≠ πi,0).
    Step 2: Significance level
       α = 0.05
    Step 3: Rejection region
       Reject the null hypothesis if p-value ≤ 0.05 = α.
    Step 4.1: Calculate expected frequencies
    Step 4.2: Check assumptions
       According to footnote a of the output, all expected frequencies are ≥ 5 (the smallest value is 6).
    Step 4.3: Test statistic and p-value
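Steps 4.1 to 4.3 can be checked by hand from the study-case numbers. The sketch below computes the expected frequencies, the chi-square statistic, and the p-value; it relies on the fact that for exactly 2 degrees of freedom the upper-tail probability of the chi-square distribution is exp(-x/2), so it does not generalize to other numbers of categories. Variable names are illustrative.

```python
import math

# Observed counts after the repairs: excellent, good, unacceptable.
observed = [157, 42, 1]
# Proportions under H0 (the machine's historical record).
null_proportions = [0.80, 0.17, 0.03]

n = sum(observed)

# Step 4.1: expected frequency = sample size x hypothesized proportion.
expected = [n * p for p in null_proportions]

# Step 4.2: the test assumes every expected frequency is at least 5.
assumption_ok = min(expected) >= 5

# Step 4.3: chi-square statistic and p-value.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
# Valid only because df == 2: P(chi-square > x) = exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print("expected frequencies:", [round(e, 1) for e in expected])
print(f"assumption met: {assumption_ok}")
print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.4f}")
```

The result agrees with the smallest expected frequency of 6 noted in Step 4.2 and with the p-value of 0.0472 used in the decision step.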
  • 20. Step 5: Decision
       Since p-value = 0.0472 ≤ 0.05, we reject the null hypothesis.
    Step 6: State the conclusion in words
       At the α = 0.05 level of significance, there is enough evidence to conclude that the repairs changed the nature of the output of the machine (the proportions are not what they used to be).
    Lesson 6. Chi-Square Test of a Contingency Table
    The chi-square test of a contingency table is used to determine whether there is enough evidence to infer that two nominal variables are related, and to infer that differences exist between two or more populations of nominal variables.
    Example: Suppose we conducted a prospective cohort study to investigate the effect of aspirin on heart disease. A group of patients who are at risk of a heart attack are randomly assigned to either a placebo or aspirin. At the end of one year, the number of patients suffering a heart attack is recorded.
       H0: The two variables are independent (the treatment taken has no effect on having heart disease).
       Ha: The two variables are dependent (the treatment taken has an effect on having heart disease).
                 Heart Disease
       Group     Yes (+)   No (-)   Total
       Placebo   20        80       100
       Aspirin   15        135      150
       Total     35        215      250
    Steps:
    1. Enter the data. Create the variables Heart_Disease, freq, and Factor.
  • 21. 2. Weight the data by the frequency variable.
    3. Analyze → Descriptive Statistics → Crosstabs
       Put Factor in the Row box and Heart_Disease in the Column box, following the contingency table.
    4. Results:
  • 22. 5. Analysis
       Chi-square = 4.98
       p-value = 0.03
    The p-value of 0.03 is less than α = 0.05, so the null hypothesis is rejected. There is sufficient evidence that the treatment taken has an effect on having heart disease.
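The chi-square value of 4.98 can be reproduced from the placebo/aspirin table by computing each expected count as row total × column total / grand total. The sketch below assumes the Pearson chi-square without continuity correction, which is consistent with the reported 4.98. For the p-value it uses the identity P(chi-square > x) = erfc(sqrt(x/2)), which holds only for 1 degree of freedom, that is, for a 2 × 2 table. Names are illustrative.

```python
import math

# Observed counts: rows = Placebo, Aspirin; columns = heart disease Yes, No.
observed = [
    [20, 80],
    [15, 135],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total x column total / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(row_totals) - 1) * (len(col_totals) - 1)
# Valid only because df == 1: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print("expected counts:", expected)
print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.3f}")
```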
  • 23. "Anyone who has never made a mistake has never tried anything new." Albert Einstein
    "Do not worry about your difficulties in Mathematics. I can assure you mine are still greater." Albert Einstein
    "Three sentences for getting success: know more than others, work more than others, and expect less than others." William Shakespeare
    REFERENCE
    Keller, Managerial Statistics Abbreviated, South-Western Cengage Learning, 2009.
    FMIPA Gadjah Mada University, Modul Praktikum Metode Statistika, 2003.