BSTAT 5325 Advanced Statistical Methods Fall 2016
Effect of Action Video Games on Math
Performance
Group 8 Project Report
Team Members
Anish Grandhi
Manikandan Sundarapandian
Sathya Narayanan Manivannan
Smruti Chandan Chhatwani
Table of Contents
1. Project Abstract
2. Project Motivation
3. Questions of Interest
4. Dataset Details
5. Statistical Methods
6. Statistical Analysis, Results, and Interpretation
i. Definitions, Notations, and Assumptions
ii. Details of the Analysis and Interpretations
iii. Diagnostics Checks
7. Criticisms and Possible Extensions
8. Conclusions
9. Appendix
1. Project Abstract
The aim of our project is to analyze a dataset from an experimental study containing 30
observations. The dataset includes the pre-test and post-test performance of students from
different majors. By default, students play a non-action video game before the pre-test. Each
student takes a math test that includes Geometry, Word, and Non-word problems.
Their anxiety level, confidence level, and calculation speed are all measured
(Pre-test scores). The students then play an action video game, after which they take
the math test again and their performance is measured a second time (Post-test scores).
Statistical analysis is performed on the data obtained and conclusions are drawn.
2. Project Motivation
Curiosity is the main driving factor for this project. Two points of interest caught
our attention: whether playing an action game before an exam affects the score, and whether
gender plays a role too. Prior research has reported that playing video games
improves examination scores and that pilots performed better after playing an
action video game. We wanted to test this by conducting a statistical analysis on a group of
students and their math scores, and to check whether gender had any role.
3. Questions of Interest
- Is there a change in the overall performance of the students after playing an AVG (Action
Video Game) compared with a Non-AVG (Non-Action Video Game)?
- Does the 'Gender' attribute play a role in affecting the overall performance?
4. Dataset Details
- The dataset was obtained from the website -
https://figshare.com/articles/BJET_Data_Entry/1167496
- Number of variables in the dataset: 31
- Number of observations in the dataset: 30
- Student Number - A unique number assigned to each student
- Group - Action Video Game (AVG) versus non-Action Video Game (non-AVG) group
- Gender - Female/Male
- Description of the variables:
Attribute        Description                         Values
Student Number   Student ID number                   Numeric value
Group            Type of game the students played    AVG (Action Video Game), Non-AVG (Non-Action Video Game)
Gender           Student gender                      Male, Female
Working Memory Assessment- Pre-Test Measures
WM_AbsoluteOSPAN_Pretest Pretest OSPAN score
WM_TotalCorrect_Pretest Pretest Total number of correct answers
WM_MathErrors_Pretest Pretest Number of Math errors
WM_MathSpeedErrors_Pretest Pretest Math Speed Errors
WM_MathAccuracyErrors_Pretest Pretest Accuracy Errors
Working Memory Post-test Measures:
WM_AbsoluteOSPAN_Posttest Posttest OSPAN score
WM_TotalCorrect_Posttest Post Total number of correct answers
WM_MathErrors_Posttest Posttest Number of Math errors
WM_MathSpeedErrors_Posttest Posttest Math Speed Errors
WM_MathAccuracyErrors_Posttest Posttest Accuracy Errors
Mathematics Anxiety and Confidence in Learning Mathematics
Pretest Measures (1-min; 5-max):
Confidence_Pre Pretest Confidence Measure
Anxiety_Pre Pretest Anxiety Measure
Post-test Measures (1-min; 5-max):
Confidence_Post Posttest Confidence Measure
Anxiety_Post Posttest Anxiety Measure
Mathematics Assessment
The mathematics performance test included geometry, word, and non-word
problems sections.
Pretest Measures:
Math_Geometry_Pre Geometry score in % correct
Math_Word_Pre Mathematics word problems score in % correct
Math_NonWord_Pre Math non-word problems score in % correct
Math_Total_Pre Total score on mathematics test in % correct
TimePre Time spent on mathematics test (in min)
Post-test Measures:
Math_Geometry_Post Geometry score in % correct
Math_Word_Post Mathematics word problems score in % correct
Math_NonWord_Post Math non-word problems score in % correct
Math_Total_Post Total score on mathematics test in % correct
TimePost Time spent on mathematics test (in min)
Perceived Cognitive Load (1-min; 9-max)
CogLoadPre Pretest cognitive load
CogLoadPost Post-test cognitive load
Mental Rotation Test (MRT)
MRT_Pre Pretest MRT score
MRT_Post Post-test MRT score
- Each record contains pre-test and post-test scores for the game played.
5. Statistical Methods:
a) Is there a change in overall performance of the students after playing an AVG (Action
Video Game)?
For this experiment, we considered:
- One dependent variable that is continuous – Math_Total_Post
- One independent variable with two categories – Group (Non-Action Video Game,
Action Video Game)
To answer the question of interest, we need to determine whether action video gamers and
non-action video gamers perform equally in the final math exam. This allows us to
check whether there is a change in the overall performance of the students after playing the
action video game. So, we chose to perform a one-way ANOVA (Analysis of Variance) with
Group as the independent variable and Math_Total_Post as the dependent variable.
b) Does ‘Gender’ attribute play a role in affecting the overall performance?
For this experiment, we considered:
- One independent variable with two categories – Gender (male, female)
- One dependent variable that is continuous – Math_Total_Post
To determine whether the gender attribute has any effect on the overall performance in the
mathematics test, we performed ANOVA on the dependent and independent
variables. To check whether there was a linear dependency, we also fitted a linear regression with
gender as the independent variable and Total Math Score (Post) as the dependent variable.
6. Statistical Analysis, Results, and Interpretation
i. Definitions, Notations, and Assumptions
a) ANOVA definition – Analysis of variance, a statistical method in which the
variation in a set of observations is divided into distinct components.
b) Linear Regression definition – In statistics, linear regression is an approach for
modelling the relationship between a scalar dependent variable y and one or more
explanatory variables (or independent variables) denoted X. The case of one
explanatory variable (independent variable) is called simple linear regression.
c) F-Statistic definition – An F statistic is the value obtained from an ANOVA
test or a regression analysis. It is the ratio of the mean square of treatment to the mean
square of error and is used to assess whether the group means are significantly different.
d) Parameters measured and their notations –
- SST - Sum of Squares Treatment
- SSE - Sum of Squares Error
- SS (Total) – Sum of Squares Total
- B0 - y intercept
- B1 - Coefficient of Independent Variable
- ŷ – Predicted Value
- k – Number of Categories of Independent Variable
- N – Number of Observations
- MST – Mean Square of Treatment
- MSE – Mean Square of Error
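For reference, the quantities listed above are related as follows in a one-way ANOVA with k groups and N observations, and in a simple linear regression. These are the standard textbook relations, given here as a reading aid rather than taken from the SPSS output:

```latex
\begin{align*}
SS(\text{Total}) &= SST + SSE \\
MST &= \frac{SST}{k - 1}, \qquad MSE = \frac{SSE}{N - k} \\
F &= \frac{MST}{MSE} \\
\hat{y} &= B_0 + B_1 x
\end{align*}
```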
e) Assumptions of ANOVA -
- 1 dependent variable that is continuous
- 1 independent variable with two or more categories
- Independence of observations
- No significant outliers
- Normally distributed data
- Homoscedasticity
f) Assumptions of Linear Regression –
- Linear relationship exists between DV and IV
- Independence of observations
- The residuals (errors) of the regression line are approximately Normally
distributed
- Equal Variances (homoscedasticity) – The variance of the residuals
remains similar along the line of best fit.
- No significant outliers
ii. Details of the Analysis and Interpretations
a. Analysis for Question of Interest 1 (Is there a change in overall performance of the
students after playing an Action Video Game?)
We conducted an Analysis of Variance to answer this question.
Our dataset had 12 students who played the action video game and 18 students who played the
non-action video game. Figure 1 gives a report about the observations.
Figure 1
Important Note – Non AVG group is transformed to 1 and AVG group to 0 for
calculation purposes.
Test for outliers: To check for outliers, we used the following options to create a
box plot for our dependent variable.
Analyze → Descriptive Statistics → Explore (give dependent variable) →
Statistics (check Outliers)
Figure 2
Figure 2 shows the box plot output in SPSS. No points lie beyond the whiskers of the box plot,
so we interpret that there are no outliers.
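The box-plot rule used above can be sketched in a few lines. This is a minimal illustration, assuming the common 1.5 × IQR fences and Python's `statistics.quantiles` with the inclusive method; SPSS computes its hinges slightly differently, and the scores below are hypothetical, not the project data.

```python
import statistics

def iqr_outliers(values, whisker=1.5):
    """Return the values lying outside the box-plot fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical scores, not the project data.
clean_scores = [10, 20, 30, 40, 50, 60, 70, 80]
scores_with_outlier = [1, 2, 3, 4, 100]

print(iqr_outliers(clean_scores))         # no value beyond the fences
print(iqr_outliers(scores_with_outlier))  # flags the value 100
```

An empty list corresponds to the situation described for Figure 2, where no points fall beyond the whiskers.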
Figure 3
Figure 3 shows the extreme values on the higher and lower side.
Test for normality:
Option used in SPSS:
Analyze → Descriptive Statistics → Explore (give dependent variable) → Plots
(Histogram, Normality plot with tests)
Figure 4
Figure 4 shows that the Shapiro-Wilk test has a significance value of 0.030, which is less
than 0.05. This means that the null hypothesis of normality is rejected, so the dependent
variable departs from a normal distribution at the 5% level.
Figure 5
Figure 5 shows the histogram of the dependent variable, which is roughly symmetric
(skewness -0.165 in the Appendix output).
Test for equal variance (Homoscedasticity):
For Levene's test, the null hypothesis (H0) is that the group variances are equal and the
alternative hypothesis (H1) is that the group variances differ from each other.
H0: Group variances are equal
H1: Group variances differ from each other
Option used in SPSS:
Analyze → Compare means → One way ANOVA → Options (Homogeneity of
variance test)
Figure 6
The output of the Homogeneity of variances test is shown in Figure 6. Since the Sig.
value we got is greater than 0.05, we fail to reject H0 and conclude that the assumption
of equal variances is plausible.
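To make the homogeneity test concrete, the sketch below computes the mean-centred form of Levene's statistic, which is a one-way ANOVA F ratio applied to the absolute deviations of each observation from its group mean. The function name and the data are hypothetical, and the sketch stops at the statistic itself; the significance value would still come from the F distribution with (k - 1, N - k) degrees of freedom, as SPSS reports it.

```python
from statistics import mean

def levene_statistic(*groups):
    """Levene's W: a one-way ANOVA F computed on |x - group mean|."""
    # Absolute deviation of each observation from its own group mean.
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    grand_mean = sum(sum(d) for d in deviations) / n_total
    between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in deviations)
    within = sum((z - mean(d)) ** 2 for d in deviations for z in d)
    return (between / (k - 1)) / (within / (n_total - k))

# Hypothetical groups, not the project data.
w_same_spread = levene_statistic([1, 2, 3], [11, 12, 13])
w_different_spread = levene_statistic([1, 2, 3], [0, 10, 20])

print(w_same_spread)       # groups with identical spread give W = 0
print(w_different_spread)  # unequal spread gives a larger W
```

A small W (large Sig. value) is consistent with equal variances; a large W (Sig. below 0.05) would indicate that the variances differ.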
ANOVA results:
Between groups – Treatment
Within groups – Error
SST 3274.175
SSE 12539.42
SS Total 15813.59
k 2
N 30
MST 3274.175
MSE 447.836
F Stat 7.311
P value 0.012
Figure 7
Figure 7 shows that the significance is less than 0.05. This means that the F-
statistic is significant. This indicates that the type of game played has an effect on
Math_Total_Post (the total post score of the math exam).
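As a quick arithmetic check, the mean squares and the F statistic can be recomputed from the sums of squares reported above. This is only a sketch of the calculation; the variable names are ours, and the significance value itself is taken from SPSS rather than recomputed here.

```python
# Values reported in the ANOVA results above (Group as the factor).
sst = 3274.175   # between-groups (treatment) sum of squares
sse = 12539.42   # within-groups (error) sum of squares
k = 2            # number of groups
n = 30           # number of observations

mst = sst / (k - 1)   # mean square of treatment
mse = sse / (n - k)   # mean square of error
f_stat = mst / mse

print(round(mst, 3), round(mse, 3), round(f_stat, 3))
```

The recomputed values agree with the reported MST, MSE, and F statistic to three decimal places.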
To find the value of contrast between the two groups (AVG group and Non-AVG
group):
Option used in SPSS:
Analyze → Compare means → One way ANOVA (Contrasts).
The contrast coefficient is the weight given to each group. The coefficients
must sum to zero.
Figure 8
Figure 8 shows the values of the coefficients, which add up to zero.
Figure 9
From Figure 9, we use the values in the 'Does not assume equal variances' row, which
remain valid whether or not the group variances are equal. The significance value
is 0.017, which is less than 0.05, so we can conclude that the
contrast is significant.
The value of the contrast between the two groups is found to be 21.324.
To determine the size of the difference (Effect size):
Option used in SPSS:
Analyze → Compare means → Options → ANOVA table and ETA
Figure 10
Figure 10 shows the Eta Squared value, which measures the magnitude of the difference in
means. The Eta Squared value is found to be 0.207.
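The contrast value and the effect size reported for the Group analysis can both be cross-checked from the ANOVA sums of squares. The sketch below assumes the standard two-group identity SST = (n1 · n2 / N) · (difference in means)², so it recovers only the magnitude of the contrast; its sign depends on how the groups are coded.

```python
import math

sst = 3274.175        # between-groups sum of squares from the ANOVA results
ss_total = 15813.59   # total sum of squares from the ANOVA results
n_avg, n_non_avg = 12, 18

# Eta squared: share of the total variation explained by the grouping factor.
eta_squared = sst / ss_total

# With two groups, SST = (n1 * n2 / N) * (mean difference)^2,
# so the magnitude of the contrast can be recovered from SST.
n_total = n_avg + n_non_avg
contrast = math.sqrt(sst * n_total / (n_avg * n_non_avg))

print(round(eta_squared, 3), round(contrast, 2))
```

Both values agree with the reported Eta Squared (0.207) and contrast (21.324) up to rounding.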
b. Analysis and Interpretation for Question of Interest 2 (Does ‘Gender’ attribute play
a role in affecting the overall performance?).
We conducted Analysis of Variance to check for this question.
Our dataset contains 14 males and 16 females.
Figure 11
Important Note – Values of males are transformed as 1 and females as 0 for
calculation purposes.
Test for outliers: To check for outliers, we used the following options to create a
box plot for our dependent variable -
Analyze → Descriptive Statistics → Explore (give dependent variable) →
Statistics (check Outliers)
Figure 12
Figure 12 shows the box plot output in SPSS. No points lie beyond the whiskers of the box
plot, so we interpret that there are no outliers.
Test for normality:
To check for normality, we performed Shapiro Wilk’s test.
Option used in SPSS:
Analyze → Descriptive Statistics → Explore (give dependent variable) → Plots
(Histogram, Normality plot with tests)
Figure 13
Figure 13 shows that the significance value is less than 0.05. This means that the
null hypothesis of normality is rejected, so the dependent variable departs from a normal
distribution at the 5% level.
Test for equal variance (Homoscedasticity):
To check for homogeneity of variances, we performed Levene’s test.
For Levene's test, we considered the null hypothesis (H0) and alternate hypothesis (H1) as -
H0: Group variances are equal
H1: Group variances differ from each other
Option used in SPSS:
Analyze → Compare means → One way ANOVA → Options (Homogeneity of
variance test)
Figure 14
Figure 14 shows that the significance value is greater than 0.05 (0.228 in the Appendix
output), so we fail to reject H0 and conclude that the assumption of equal variances is plausible.
ANOVA Test
The ANOVA test helps us determine if there is a difference between the mean total
math scores that males and females have received.
Figure 15
Figure 15 shows us that the significance value is less than 0.05. This means that
the F statistic is significant. We concluded that gender has a role in determining
the final math score.
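The ANOVA table for gender can be reproduced from the group summaries alone. The sketch below uses the N, mean, and standard deviation of each gender group from the Descriptives table in the Appendix; the variable names are ours, and the significance value is not recomputed.

```python
# Group summaries from the Descriptives table in the Appendix
# (Gender coded 0 = female, 1 = male).
groups = [
    {"n": 16, "mean": 30.72115385, "sd": 13.58868525},
    {"n": 14, "mean": 69.78021978, "sd": 11.27481261},
]

n_total = sum(g["n"] for g in groups)
grand_mean = sum(g["n"] * g["mean"] for g in groups) / n_total

# Between-groups and within-groups sums of squares.
ss_between = sum(g["n"] * (g["mean"] - grand_mean) ** 2 for g in groups)
ss_within = sum((g["n"] - 1) * g["sd"] ** 2 for g in groups)

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)
eta_squared = ss_between / (ss_between + ss_within)

print(round(ss_between, 3), round(ss_within, 3))
print(round(f_stat, 3), round(eta_squared, 3))
```

Up to rounding, these match the sums of squares, the F statistic (72.123), and the Eta Squared value (.720) in the Appendix output.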
We then performed a Custom Contrasts test to quantify the difference between the
two groups.
To find the value of contrast between the two genders (male and female):
Option used in SPSS:
Analyze → Compare means → One way ANOVA (Contrasts).
The contrast coefficient is the weight given to each group. The coefficients
must sum to zero.
The value of contrast is 39.059.
Figure 16
Figure 17
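The two rows of the Contrast Tests output can be recomputed from the group summaries in the Appendix. The sketch below assumes the usual formulas: a pooled standard error when equal variances are assumed, and the Welch standard error with Satterthwaite degrees of freedom when they are not. Variable names are ours.

```python
import math

# Group summaries from the Descriptives table in the Appendix.
n_f, mean_f, sd_f = 16, 30.72115385, 13.58868525   # Gender = 0 (female)
n_m, mean_m, sd_m = 14, 69.78021978, 11.27481261   # Gender = 1 (male)
coefficients = (-1, 1)                              # contrast weights, sum to zero

contrast = coefficients[0] * mean_f + coefficients[1] * mean_m

# Equal variances assumed: pooled variance (the ANOVA mean square error).
pooled_var = ((n_f - 1) * sd_f**2 + (n_m - 1) * sd_m**2) / (n_f + n_m - 2)
se_pooled = math.sqrt(pooled_var * (1 / n_f + 1 / n_m))
t_pooled = contrast / se_pooled

# Equal variances not assumed: Welch standard error and Satterthwaite df.
var_f, var_m = sd_f**2 / n_f, sd_m**2 / n_m
se_welch = math.sqrt(var_f + var_m)
t_welch = contrast / se_welch
df_welch = (var_f + var_m) ** 2 / (var_f**2 / (n_f - 1) + var_m**2 / (n_m - 1))

print(round(contrast, 3), round(t_pooled, 3), round(t_welch, 3), round(df_welch, 3))
```

Both rows give the same contrast value (39.059); only the standard error, t statistic, and degrees of freedom differ between them.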
To determine the size of the difference (Effect size):
Option used in SPSS:
Analyze → Compare means → Options → ANOVA table and ETA
Figure 19
The magnitude of the difference in means is given by the Eta Squared value.
Figure 19 shows the Eta Squared value to be 0.720.
c. Linear Regression on Question of Interest 2 (Does ‘Gender’ attribute play a role
in affecting the overall performance?)
We conducted Linear Regression to check if there was a dependency between
gender and the final math score.
The scatter plot shows that the first assumption was violated and that there is no
linear relationship between gender and Math_Total_Post.
Option used in SPSS:
Graphs → Chart builder → Scatter/Dot (Choose dependent and independent
variables)
Figure 20 shows a scatterplot between gender and Math_Total_Post scores.
Figure 18
Figure 20
We have tried different transformations and none of them revealed any linear
dependency.
Figure 21
Figure 21 shows the graph between transformed log value of Math Total Post
score and gender. Observation shows that there is no linear dependency between
the two.
Figure 22
Figure 22 is a plot between log value of gender vs log value of Math Total Post
score. There is no linear dependency between the two variables.
Figure 23
Figure 23 shows a plot between reciprocal value of Math Total Post score and
gender. There is no linear dependency between the variables.
Figure 24
Figure 24 shows a plot between square root of Math Total Post scores and gender
variable. Once again, there is no linear dependency between two variables.
Figure 25
Figure 25 shows a plot between the log value of gender and the log value of Math Total
Post score. Because gender is coded 0/1 and log(0) is undefined, only the cases coded 1
(log 1 = 0) could be plotted, so SPSS placed a single value on the X axis.
To check the residuals, we plotted a scatterplot of the standardized residuals against the
standardized predicted values of the dependent variable. Figure 26 below is the residual plot.
We can see that the residuals are scattered around.
Option used in SPSS:
Analyze → Regression → Plots (ZPRED on x axis, ZRESID on y axis, check Histogram)
Figure 26
Figure 27 is a histogram which shows that the residuals are normally distributed.
Figure 27
To check for autocorrelation and run the Linear Regression analysis:
Option used in SPSS:
Analyze → Regression → Linear (Check R squared change, part and partial
correlations, confidence intervals - 95%, Durbin Watson, Casewise Diagnostics)
The Durbin-Watson statistic, which is used to check for autocorrelation in the
residuals from a regression analysis, gave us 1.856. Since this value is close to 2,
we can conclude that there is little evidence of serial correlation in the residuals.
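For readers unfamiliar with the statistic, the sketch below shows how a Durbin-Watson value is computed from a series of residuals. The residual series are hypothetical and chosen only to show the two extremes; the project residuals are not reproduced here.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residual series, not the project residuals.
alternating = [1, -1, 1, -1]   # strong negative autocorrelation
constant = [1, 1, 1, 1]        # strong positive autocorrelation

print(durbin_watson(alternating))  # 3.0
print(durbin_watson(constant))     # 0.0
```

The statistic ranges from 0 to 4: values near 0 suggest positive autocorrelation, values near 4 suggest negative autocorrelation, and values near 2 suggest little or none.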
The R value, which is the correlation coefficient, is 0.849. This value shows that there
is a high correlation between gender and final math score.
The R² value, which is the coefficient of determination, is 0.72. So gender accounted for
72% of the variation in Math total score.
The ANOVA table showed us that the significance value is less than 0.05. This
indicates that the F statistic is significant and that gender significantly predicts the total
math score.
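The model summary figures follow directly from the regression sums of squares in the Appendix. The sketch below recomputes them under the standard definitions; variable names are ours.

```python
import math

# Sums of squares from the regression ANOVA table in the Appendix.
ss_regression = 11391.226
ss_residual = 4422.364
n, predictors = 30, 1

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total
r = math.sqrt(r_squared)
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - predictors - 1)

ms_residual = ss_residual / (n - predictors - 1)
f_stat = (ss_regression / predictors) / ms_residual
std_error_of_estimate = math.sqrt(ms_residual)

print(round(r, 3), round(r_squared, 3), round(adjusted_r_squared, 3))
print(round(f_stat, 3), round(std_error_of_estimate, 3))
```

With a single predictor, the regression F statistic is the same as the one-way ANOVA F statistic for gender, which is why both analyses report 72.123.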
The linear model is y = 30.721 + 39.059x + e, where 'e' denotes the error from
other factors; the fitted line is ŷ = 30.721 + 39.059x.
Figure 28
Figure 30
Figure 29
The slope of the regression line is 39.059, which means that moving from female (x = 0)
to male (x = 1) changes the average final math score by 39.059.
So, to find the estimated final math score for male and female, we substitute x
= 1 and x = 0 in the above equation respectively (on the assumption that the error
value is equal to 0).
The estimated math total final score for males is 69.78. The estimated math total
final score for females is 30.721.
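The substitution described above can be written out as a short sketch. The function name is ours; the coefficients are the B values from the Coefficients table in the Appendix.

```python
B0 = 30.721   # intercept (Constant)
B1 = 39.059   # slope for Gender

def predict_math_total_post(gender_code):
    """Fitted value for gender_code 0 (female) or 1 (male)."""
    return B0 + B1 * gender_code

female_score = predict_math_total_post(0)
male_score = predict_math_total_post(1)

print(round(female_score, 3), round(male_score, 3))
```

Because the predictor is a 0/1 code, the intercept equals the mean score of the female group and the slope equals the difference between the two group means, which is why the fitted values match the group means in the Descriptives table.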
iii. Diagnostics Checks
a. The ANOVA assumptions were checked as follows:
- 1 dependent variable that is continuous – Math_Total_Post is a continuous
variable.
- 1 independent variable with two or more categories – The independent variable
Group has 2 values: Non-Action Video Game (Non AVG) and Action Video
Game (AVG). We coded Non AVG as 1 and AVG as 0 for
calculation purposes.
- Independence of observations – All the observations in the dataset were
carried out separately and are independent. Every student performed the
experiment alone without any interference.
- No significant outliers – Box plot was used to check for outliers.
- Normally distributed data – The Shapiro-Wilk test was used to check whether the
dependent variable is normally distributed.
- Homoscedasticity – Levene's test was performed to check for equal
variances.
b. The Linear Regression assumptions were checked as follows; the linearity assumption
was not met:
- Coding – We coded males as 1 and females as 0 for
calculation purposes.
- Linear relationship exists between DV and IV – A scatterplot was created to
check whether there was a linear relationship between the dependent and
independent variables. Though this assumption failed, we wanted to
see what the result would have been had there been a linear relationship between gender
and final math score.
- Independence of observations – All the observations in the dataset were
carried out separately and are independent. Every student performed the
experiment alone without any interference. The Durbin-Watson statistic was used to
check for autocorrelation in the residuals.
- The residuals (errors) of the regression line are approximately Normally
distributed – A histogram of the residuals was drawn to check whether they are
approximately Normally distributed.
- Equal Variances (homoscedasticity) – The plot of residuals against predicted values
was used to check for equal variances along the line of best fit.
- No significant outliers – Box plot was used to check for outliers.
7. Criticisms and Possible Extensions
i. For Question of Interest 1 (Is there a change in overall performance of the students
after playing an AVG (Action Video Game)?)
Our experiment was to find whether action video gamers and non-action video gamers
differed in their final math score. Our conclusion is that there
was a difference and that the Group attribute affects the final math score. This conclusion is
limited to these 30 observations. Thus, we cannot generalize it to
gamers at large; a generalized conclusion requires many more observations.
ii. For Question of Interest 2 (Does ‘Gender’ attribute play a role in affecting the overall
performance?)
Our experiment concludes that the final math score depends on gender and that females in
our dataset scored lower than males. In response to criticism of such a finding,
we would explain that these results hold only for these 30 observations. Generalizing
the results of our experiment would lead to a biased conclusion: a sample of only
30 cannot be applied on a large scale, and there are many more factors
that might affect the final math score, such as IQ and memory test score.
We proceeded with Simple Linear Regression under the assumption that there is a linear
relationship between the dependent variable (Math final score) and the independent variable
(Gender), and carried out the rest of the tests on that basis, even though the scatter
plots did not show any linear relationship between gender and final math score. Our
main motive was to try out the whole experiment out of curiosity about 'what
would have happened if gender played a role?'
8. Conclusions
The conclusions that we have come to for each question of interest –
Question of Interest 1 - Is there a change in overall performance of the students after
playing an AVG (Action Video Game)?
Conclusion – We performed ANOVA for this question. The significance is
less than 0.05, so the F statistic (7.311) is significant. This indicates that the type of game
played has an effect on the Math Total Post score.
Question of Interest 2 - Does the 'Gender' attribute play a role in affecting the overall
performance?
Conclusion – Firstly, we performed an ANOVA test to determine if there is a difference
between the mean total math scores of males and females.
The significance value is less than 0.05 and the F statistic (72.123) is significant. We
conclude that gender has a role in determining the Math Total Post score.
Secondly, we wanted to check if there exists a linear dependency between the math
total post scores and gender. One of the assumptions (a linear relationship
exists between math total post scores and gender) was not met. Out of curiosity, we wanted to check
what the result would be if there was a linear dependency.
The linear model that we got is y = 30.721 + 39.059x + e, with fitted line ŷ = 30.721 + 39.059x.
The estimated math total final score for males is 69.78. The estimated math total final
score for females is 30.721.
We cannot generalize the results of our experiment without risking a biased conclusion.
Our dataset had only 30 samples, which cannot be applied on a large scale, and there
are many more factors that might affect the final math score, such as IQ and memory test score.
9. Appendix
The results obtained from SPSS for section –
ONEWAY Math_Total_Post BY Gender
/CONTRAST=-1 1
/STATISTICS DESCRIPTIVES HOMOGENEITY WELCH
/MISSING ANALYSIS.
Oneway
Notes
Output Created 26-NOV-2016 13:15:33
Comments
Input Active Dataset DataSet1
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data
File
30
Missing Value Handling Definition of Missing User-defined missing values are
treated as missing.
Cases Used Statistics for each analysis are
based on cases with no missing
data for any variable in the
analysis.
Syntax ONEWAY Math_Total_Post
BY Gender
/CONTRAST=-1 1
/STATISTICS
DESCRIPTIVES
HOMOGENEITY WELCH
/MISSING ANALYSIS.
Resources Processor Time 00:00:00.00
Elapsed Time 00:00:00.06
Descriptives
Math_Total_Post
Gender  N   Mean            Std. Deviation    Std. Error       95% CI Lower Bound  95% CI Upper Bound
0       16  30.72115385000  13.588685250000   3.397171313000   23.48025460000      37.96205310000
1       14  69.78021978000  11.274812610000   3.013320420000   63.27033679000      76.29010277000
Total   30  48.94871795000  23.351578060000   4.263395353000   40.22909540000      57.66834050000
Descriptives
Math_Total_Post
Gender  Minimum        Maximum
0       7.692307692    53.846153850
1       38.461538460   80.769230770
Total   7.692307692    80.769230770
Test of Homogeneity of Variances
Math_Total_Post
Levene Statistic df1 df2 Sig.
1.518 1 28 .228
ANOVA
Math_Total_Post
Sum of Squares df Mean Square F Sig.
Between Groups 11391.226 1 11391.226 72.123 .000
Within Groups 4422.364 28 157.942
Total 15813.590 29
Math_Total_Post
        Statistic (a)  df1  df2     Sig.
Welch   73.984         1    27.936  .000
a. Asymptotically F distributed.
Contrast Coefficients
Contrast
Gender
0 1
1 -1 1
Contrast Tests
Math_Total_Post                   Contrast  Value of Contrast  Std. Error      t      df      Sig. (2-tailed)
Assume equal variances            1         39.05906593000     4.599226845000  8.493  28      .000
Does not assume equal variances   1         39.05906593000     4.541021128000  8.601  27.936  .000
MEANS TABLES=Math_Total_Post BY Gender
/CELLS=MEAN COUNT STDDEV
/STATISTICS ANOVA.
Means
Notes
Output Created 26-NOV-2016 13:16:28
Comments
Input Active Dataset DataSet1
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data
File
30
Missing Value Handling Definition of Missing For each dependent variable in a
table, user-defined missing
values for the dependent and all
grouping variables are treated as
missing.
Cases Used Cases used for each table have
no missing values in any
independent variable, and not all
dependent variables have
missing values.
Syntax MEANS
TABLES=Math_Total_Post BY
Gender
/CELLS=MEAN COUNT
STDDEV
/STATISTICS ANOVA.
Resources Processor Time 00:00:00.03
Elapsed Time 00:00:00.05
Case Processing Summary
Cases
Included Excluded Total
N Percent N Percent N Percent
Math_Total_Post * Gender 30 100.0% 0 0.0% 30 100.0%
Report
Math_Total_Post
Gender Mean N Std. Deviation
0 30.72115385000 16 13.588685250000
1 69.78021978000 14 11.274812610000
Total 48.94871795000 30 23.351578060000
ANOVA Table
Sum of Squares df Mean Square
Math_Total_Post * Gender Between Groups (Combined) 11391.226 1 11391.226
Within Groups 4422.364 28 157.942
Total 15813.590 29
ANOVA Table
F Sig.
Math_Total_Post * Gender Between Groups (Combined) 72.123 .000
Within Groups
Total
Measures of Association
Eta Eta Squared
Math_Total_Post * Gender .849 .720
EXAMINE VARIABLES=Math_Total_Post
/PLOT BOXPLOT HISTOGRAM NPPLOT
/COMPARE GROUPS
/STATISTICS DESCRIPTIVES EXTREME
/CINTERVAL 95
/MISSING LISTWISE
/NOTOTAL.
Explore
Notes
Output Created 26-NOV-2016 13:17:16
Comments
Input Active Dataset DataSet1
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data
File
30
Missing Value Handling Definition of Missing User-defined missing values for
dependent variables are treated
as missing.
Cases Used Statistics are based on cases with
no missing values for any
dependent variable or factor
used.
Syntax EXAMINE
VARIABLES=Math_Total_Post
/PLOT BOXPLOT
HISTOGRAM NPPLOT
/COMPARE GROUPS
/STATISTICS
DESCRIPTIVES EXTREME
/CINTERVAL 95
/MISSING LISTWISE
/NOTOTAL.
Resources Processor Time 00:00:00.55
Elapsed Time 00:00:00.62
Case Processing Summary
Cases
Valid Missing Total
N Percent N Percent N Percent
Math_Total_Post 30 100.0% 0 0.0% 30 100.0%
Descriptives
Statistic Std. Error
Math_Total_Post Mean 48.94871795000 4.263395353000
95% Confidence Interval for
Mean
Lower Bound 40.22909540000
Upper Bound 57.66834050000
5% Trimmed Mean 49.40170940000
Median 44.23076923000
Variance 545.296
Std. Deviation 23.351578060000
Minimum 7.692307692
Maximum 80.769230770
Range 73.076923080
Interquartile Range 36.538461540
Skewness -.165 .427
Kurtosis -1.266 .833
Extreme Values
Case Number Value
Math_Total_Post Highest 1 5 80.769230770
2 26 80.769230770
3 6 76.923076920
4 10 76.923076920
5 17 76.923076920a
Lowest 1 4 7.692307692
2 15 11.538461540
3 2 11.538461540
4 28 18.461538460
5 14 19.230769230
a. Only a partial list of cases with the value 76.923076920 are shown in the table of
upper extremes.
Tests of Normality
                 Kolmogorov-Smirnov (a)     Shapiro-Wilk
                 Statistic  df  Sig.        Statistic  df  Sig.
Math_Total_Post  .174       30  .021        .922       30  .030
a. Lilliefors Significance Correction
Math_Total_Post
GET DATA
/TYPE=XLS
/FILE='F:MS 1st semBSTATProjectAVGnonAVGDataset_Test.xls'
/SHEET=name 'AVG without q7i Spring 2014 FIN'
/CELLRANGE=FULL
/READNAMES=ON
/DATATYPEMIN PERCENTAGE=95.0.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.
* Chart Builder.
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=Gender Math_Total_Post
MISSING=LISTWISE
REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: Gender=col(source(s), name("Gender"), unit.category())
DATA: Math_Total_Post=col(source(s), name("Math_Total_Post"))
GUIDE: axis(dim(1), label("Gender"))
GUIDE: axis(dim(2), label("Math_Total_Post"))
SCALE: linear(dim(2), include(0))
ELEMENT: point(position(Gender*Math_Total_Post))
END GPL.
GGraph
Notes
Output Created 26-NOV-2016 23:43:42
Comments
Input Active Dataset DataSet1
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data File 30
Syntax GGRAPH
/GRAPHDATASET
NAME="graphdataset"
VARIABLES=Gender
Math_Total_Post
MISSING=LISTWISE
REPORTMISSING=NO
/GRAPHSPEC
SOURCE=INLINE.
BEGIN GPL
SOURCE:
s=userSource(id("graphdataset")
)
DATA: Gender=col(source(s),
name("Gender"),
unit.category())
DATA:
Math_Total_Post=col(source(s),
name("Math_Total_Post"))
GUIDE: axis(dim(1),
label("Gender"))
GUIDE: axis(dim(2),
label("Math_Total_Post"))
SCALE: linear(dim(2),
include(0))
ELEMENT:
point(position(Gender*Math_To
tal_Post))
END GPL.
Resources Processor Time 00:00:02.41
Elapsed Time 00:00:01.26
[DataSet1]
Notes
Output Created 26-NOV-2016 23:45:36
Comments
Input Active Dataset DataSet1
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data File 30
Syntax GGRAPH
/GRAPHDATASET
NAME="graphdataset"
VARIABLES=Gender
Log_Math_Total_Post
MISSING=LISTWISE
REPORTMISSING=NO
/GRAPHSPEC
SOURCE=INLINE.
BEGIN GPL
SOURCE:
s=userSource(id("graphdataset")
)
DATA: Gender=col(source(s),
name("Gender"),
unit.category())
DATA:
Log_Math_Total_Post=col(sour
ce(s),
name("Log_Math_Total_Post"))
GUIDE: axis(dim(1),
label("Gender"))
GUIDE: axis(dim(2),
label("Log_Math_Total_Post"))
SCALE: linear(dim(2),
include(0))
ELEMENT:
point(position(Gender*Log_Mat
h_Total_Post))
END GPL.
Resources Processor Time 00:00:00.72
Elapsed Time 00:00:00.30
REGRESSION
/DESCRIPTIVES MEAN STDDEV CORR SIG N
/MISSING LISTWISE
/STATISTICS COEFF OUTS CI(95) BCOV R ANOVA COLLIN TOL CHANGE ZPP
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT Math_Total_Post
/METHOD=ENTER Gender
/SCATTERPLOT=(*ZRESID ,*ZPRED)
/RESIDUALS DURBIN HISTOGRAM(ZRESID) NORMPROB(ZRESID)
/CASEWISE PLOT(ZRESID) OUTLIERS(3).
Regression
Notes
Output Created 26-NOV-2016 23:47:45
Comments
Input Active Dataset DataSet1
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data
File
30
Missing Value Handling Definition of Missing User-defined missing values are
treated as missing.
Cases Used Statistics are based on cases with
no missing values for any
variable used.
Syntax REGRESSION
/DESCRIPTIVES MEAN
STDDEV CORR SIG N
/MISSING LISTWISE
/STATISTICS COEFF OUTS
CI(95) BCOV R ANOVA
COLLIN TOL CHANGE ZPP
/CRITERIA=PIN(.05)
POUT(.10)
/NOORIGIN
/DEPENDENT
Math_Total_Post
/METHOD=ENTER Gender
/SCATTERPLOT=(*ZRESID
,*ZPRED)
/RESIDUALS DURBIN
HISTOGRAM(ZRESID)
NORMPROB(ZRESID)
/CASEWISE PLOT(ZRESID)
OUTLIERS(3).
Resources Processor Time 00:00:01.58
Elapsed Time 00:00:00.72
Memory Required 2928 bytes
Additional Memory Required for Residual Plots 680 bytes
Descriptive Statistics
Mean Std. Deviation N
Math_Total_Post 48.94871795000 23.351578060000 30
Gender .47 .507 30
Correlations
Math_Total_Post Gender
Pearson Correlation Math_Total_Post 1.000 .849
Gender .849 1.000
Sig. (1-tailed) Math_Total_Post . .000
Gender .000 .
N Math_Total_Post 30 30
Gender 30 30
Variables Entered/Removed (a)
Model  Variables Entered  Variables Removed  Method
1      Gender (b)         .                  Enter
a. Dependent Variable: Math_Total_Post
b. All requested variables entered.
Model Summary (b)
Model  R         R Square  Adjusted R Square  Std. Error of the Estimate
1      .849 (a)  .720      .710               12.567480280000

Model Summary (b): Change Statistics and Durbin-Watson
Model  R Square Change  F Change  df1  df2  Sig. F Change  Durbin-Watson
1      .720             72.123    1    28   .000           1.856
a. Predictors: (Constant), Gender
b. Dependent Variable: Math_Total_Post
ANOVAa
Model Sum of Squares df Mean Square F Sig.
1 Regression 11391.226 1 11391.226 72.123 .000b
Residual 4422.364 28 157.942
Total 15813.590 29
a. Dependent Variable: Math_Total_Post
b. Predictors: (Constant), Gender
Coefficients (a): Unstandardized and Standardized Coefficients
Model         B       Std. Error  Beta  t      Sig.
1 (Constant)  30.721  3.142             9.778  .000
  Gender      39.059  4.599       .849  8.493  .000

Coefficients (a): 95.0% Confidence Interval for B and Correlations
Model         Lower Bound  Upper Bound  Zero-order  Partial  Part
1 (Constant)  24.285       37.157
  Gender      29.638       48.480       .849        .849     .849

Coefficients (a): Collinearity Statistics
Model         Tolerance  VIF
1 (Constant)
  Gender      1.000      1.000
a. Dependent Variable: Math_Total_Post
Coefficient Correlationsa
Model Gender
1 Correlations Gender 1.000
Covariances Gender 21.153
a. Dependent Variable: Math_Total_Post
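As a rough cross-check of the coefficient output, the slope's standard error, t value, and 95% confidence interval can be recomputed by hand. The Python sketch below is not part of the original SPSS workflow. It assumes the group sizes stated in the report (16 females coded 0, 14 males coded 1), the residual mean square 157.942 from the ANOVA table, and a critical value of about 2.0484 for t with 28 degrees of freedom taken from a standard t table. All variable names are illustrative.

```python
import math

# Values copied from the SPSS output; names are illustrative only.
MSE = 157.942          # residual mean square from the ANOVA table
N_FEMALE = 16          # gender coded 0
N_MALE = 14            # gender coded 1
SLOPE = 39.059         # unstandardized coefficient B for Gender
T_CRIT_DF28 = 2.0484   # assumption: two-sided 95% critical t value, df = 28

# For a 0/1 predictor, Var(B) = MSE * (1/n0 + 1/n1).
slope_variance = MSE * (1 / N_FEMALE + 1 / N_MALE)
slope_se = math.sqrt(slope_variance)
t_value = SLOPE / slope_se
ci_lower = SLOPE - T_CRIT_DF28 * slope_se
ci_upper = SLOPE + T_CRIT_DF28 * slope_se

print(f"Var(B) = {slope_variance:.3f}")
print(f"SE(B)  = {slope_se:.3f}")
print(f"t      = {t_value:.3f}")
print(f"95% CI = ({ci_lower:.3f}, {ci_upper:.3f})")
```

The computed variance should agree with the covariance entry for Gender (21.153) and the standard error with the Std. Error column (4.599), up to rounding.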
Collinearity Diagnosticsa
Model Dimension Eigenvalue Condition Index
Variance Proportions
(Constant) Gender
1 1 1.683 1.000 .16 .16
2 .317 2.305 .84 .84
a. Dependent Variable: Math_Total_Post
Residuals Statistics (a)
                      Minimum           Maximum         Mean            Std. Deviation   N
Predicted Value       30.72115326000    69.78022003000  48.94871795000  19.819205290000  30
Residual              -31.318681720000  23.125000000000 .000000000000   12.348898730000  30
Std. Predicted Value  -.920             1.051           .000            1.000            30
Std. Residual         -2.492            1.840           .000            .983             30
a. Dependent Variable: Math_Total_Post
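The standardized rows of this table can be reproduced from the raw rows. The sketch below is an illustration rather than the SPSS computation itself. It assumes that standardized residuals are obtained by dividing by the standard error of the estimate (12.567 from the Model Summary), and that standardized predicted values are ordinary z-scores of the predicted values. The helper names are hypothetical.

```python
# Values copied from the SPSS output; helper names are hypothetical.
STD_ERROR_OF_ESTIMATE = 12.56748028
PREDICTED_MEAN = 48.94871795
PREDICTED_SD = 19.81920529

def standardized_residual(residual):
    """Residual divided by the standard error of the estimate."""
    return residual / STD_ERROR_OF_ESTIMATE

def standardized_predicted(predicted):
    """z-score of a predicted value."""
    return (predicted - PREDICTED_MEAN) / PREDICTED_SD

min_zresid = standardized_residual(-31.31868172)
max_zresid = standardized_residual(23.125)
min_zpred = standardized_predicted(30.72115326)
max_zpred = standardized_predicted(69.78022003)

print(f"Std. Residual:        {min_zresid:.3f} to {max_zresid:.3f}")
print(f"Std. Predicted Value: {min_zpred:.3f} to {max_zpred:.3f}")
```

Both standardized residuals fall inside the plus or minus 3 band used by the CASEWISE OUTLIERS(3) option, which is consistent with no cases being flagged.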
Charts
BSTAT 5325_Group 8_Project Report

  • 3. 1. Project Abstract
    The aim of our project is to conduct an experimental study on a dataset containing 30 observations. The dataset includes the pre-test and post-test performance of students from different majors. By default, students play a non-action video game before the pre-test. Each student takes a math test that includes geometry, word, and non-word problems. Their anxiety level, confidence level, and calculation speed are also measured (pre-test scores). The students then play an action video game, after which they take the math test again and their performance is measured once more (post-test scores). Statistical analysis is performed on the resulting data and conclusions are drawn.
    2. Project Motivation
    Curiosity is the main driving factor for this project. Two points of interest caught our attention: whether playing an action game before an exam affects the score, and whether gender plays a role. Prior research has reported that playing video games improves examination scores and that pilots performed better after playing an action video game. We wanted to test this by running a statistical analysis on the math scores of a group of students, and to check whether gender had any role.
    3. Questions of Interest
    - Is there a change in the overall performance of the students after playing an AVG (Action Video Game) and a Non-AVG (Non-Action Video Game)?
    - Does the 'Gender' attribute play a role in affecting the overall performance?
    4. Dataset Details
    - The dataset was obtained from https://figshare.com/articles/BJET_Data_Entry/1167496
    - Number of variables in the dataset: 31
    - Number of observations in the dataset: 30
    - Student Number: a unique number assigned to each student
    - Group: Action Video Game (AVG) versus non-Action Video Game (non-AVG) group
    - Gender: Female/Male
  • 4. Description of the variables
    Attribute: Description (Values)
    - Student Number: Student ID No (numeric value)
    - Group: Type of game the students played (AVG = Action Video Game, Non-AVG = Non-Action Video Game)
    - Gender: Student gender (Male and Female)
    Working Memory Assessment, Pre-test Measures:
    - WM_AbsoluteOSPAN_Pretest: Pretest OSPAN score
    - WM_TotalCorrect_Pretest: Pretest total number of correct answers
    - WM_MathErrors_Pretest: Pretest number of math errors
    - WM_MathSpeedErrors_Pretest: Pretest math speed errors
    - WM_MathAccuracyErrors_Pretest: Pretest accuracy errors
    Working Memory Assessment, Post-test Measures:
    - WM_AbsoluteOSPAN_Posttest: Posttest OSPAN score
    - WM_TotalCorrect_Posttest: Posttest total number of correct answers
    - WM_MathErrors_Posttest: Posttest number of math errors
    - WM_MathSpeedErrors_Posttest: Posttest math speed errors
    - WM_MathAccuracyErrors_Posttest: Posttest accuracy errors
    Mathematics Anxiety and Confidence in Learning Mathematics, Pre-test Measures (1 = min; 5 = max):
    - Confidence_Pre: Pretest confidence measure
    - Anxiety_Pre: Pretest anxiety measure
    Mathematics Anxiety and Confidence in Learning Mathematics, Post-test Measures (1 = min; 5 = max):
    - Confidence_Post: Posttest confidence measure
    - Anxiety_Post: Posttest anxiety measure
    Mathematics Assessment (the mathematics performance test included geometry, word, and non-word problem sections), Pre-test Measures:
    - Math_Geometry_Pre: Geometry score in % correct
    - Math_Word_Pre: Mathematics word problems score in % correct
    - Math_NonWord_Pre: Math non-word problems score in % correct
    - Math_Total_Pre: Total score on mathematics test in % correct
    - TimePre: Time spent on mathematics test (in min)
    Mathematics Assessment, Post-test Measures:
    - Math_Geometry_Post: Geometry score in % correct
    - Math_Word_Post: Mathematics word problems score in % correct
    - Math_NonWord_Post: Math non-word problems score in % correct
    - Math_Total_Post: Total score on mathematics test in % correct
    - TimePost: Time spent on mathematics test (in min)
    Perceived Cognitive Load (1 = min; 9 = max):
    - CogLoadPre: Pretest cognitive load
    - CogLoadPost: Post-test cognitive load
    Mental Rotation Test (MRT):
    - MRT_Pre: Pretest MRT score
    - MRT_Post: Post-test MRT score
  • 5. Each record contains pre-test and post-test scores relative to the game played.
    5. Statistical Methods:
    a) Is there a change in overall performance of the students after playing an AVG (Action Video Game)?
    For this experiment, we considered:
    - One dependent variable that is continuous: Math_Total_Post
    - One independent variable with two categories: Group (Non-Action Video Game, Action Video Game)
    To answer the question of interest, we need to determine whether action video gamers and non-action video gamers perform equally in the final math exam. This lets us check whether overall performance changes after playing the action video game. We therefore chose to perform an ANOVA (Analysis of Variance) with Group as the independent variable and Math_Total_Post as the dependent variable.
    b) Does the 'Gender' attribute play a role in affecting the overall performance?
    For this experiment, we considered:
    - One independent variable with two categories (male, female)
    - One dependent variable that is continuous: Math_Total_Post
    To determine whether gender has any effect on overall performance in the mathematics test, we performed an ANOVA on the dependent and independent variables. To check whether there was a linear dependency, we also fitted a linear regression with gender as the independent variable and Total Math Score (Post) as the dependent variable.
  • 6. 6. Statistical Analysis, Results, and Interpretation
    i. Definitions, Notations, and Assumptions
    a) ANOVA definition: Analysis of variance, a statistical method in which the variation in a set of observations is divided into distinct components.
    b) Linear Regression definition: In statistics, linear regression is an approach for modelling the relationship between a scalar dependent variable y and one or more explanatory variables (or independent variables) denoted X. The case of one explanatory variable is called simple linear regression.
    c) F-Statistic definition: An F statistic is the value obtained from an ANOVA test or a regression analysis to find out whether the means of two or more populations are significantly different.
    d) Parameters measured and their notations:
    - SST: Sum of Squares Treatment
    - SSE: Sum of Squares Error
    - SS (Total): Sum of Squares Total
    - B0: y intercept
    - B1: Coefficient of the independent variable
    - ŷ: Predicted value
    - k: Number of categories of the independent variable
    - N: Number of observations
    - MST: Mean Square of Treatment
    - MSE: Mean Square of Error
    e) Assumptions of ANOVA:
    - One dependent variable that is continuous
    - One independent variable with two or more categories
    - Independence of observations
    - No significant outliers
    - Normally distributed data
    - Homoscedasticity
    f) Assumptions of Linear Regression:
    - A linear relationship exists between DV and IV
    - Independence of observations
    - The residuals (errors) of the regression line are approximately normally distributed
    - Equal variances (homoscedasticity): the variance around the line of best fit remains similar as we move along the line
    - No significant outliers
  • 7. ii. Details of the Analysis and Interpretations
    a. Analysis for Question of Interest 1 (Is there a change in overall performance of the students after playing an Action Video Game?)
    We conducted an Analysis of Variance to answer this question. Our dataset contains 12 participants who played the action video game and 18 who played the non-action video game. Figure 1 summarizes the observations.
    Figure 1
    Important note: the Non-AVG group is coded as 1 and the AVG group as 0 for calculation purposes.
    Test for outliers: To check for outliers, we used the following options to create a box plot of our dependent variable: Analyze > Descriptive Statistics > Explore (give dependent variable) > Statistics (check Outliers)
    Figure 2
    Figure 2 shows the box plot output in SPSS. No points lie beyond the whiskers of the box plot, so we interpret that there are no outliers.
  • 8. Figure 3
    Figure 3 shows the extreme values on the higher and lower side.
    Test for normality: Option used in SPSS: Analyze > Descriptive Statistics > Explore (give dependent variable) > Plots (Histogram, Normality plot with tests)
    Figure 4
    Figure 4 shows that the Shapiro-Wilk test has a significance value of 0.030, which is less than 0.05. The null hypothesis of this test is that the data are normally distributed, so a value below 0.05 is evidence against normality rather than for it. This test alone therefore does not support the normality assumption for the dependent variable.
  • 9. Figure 5
    Figure 5 shows the histogram of the dependent variable, which appears approximately normally distributed.
    Test for equal variance (Homoscedasticity): The homogeneity of variance test compares group variances, so the hypotheses are:
    H0: Group variances are equal
    H1: Group variances differ from each other
    Option used in SPSS: Analyze > Compare means > One way ANOVA > Options (Homogeneity of variance test)
    Figure 6
    The output of the homogeneity of variances test is shown in Figure 6. Since the Sig. value we got is greater than 0.05, we fail to reject H0, so the equal-variance assumption is plausible.
  • 10. ANOVA results:
    Between groups = Treatment; Within groups = Error
    SST        3274.175
    SSE        12539.42
    SS Total   15813.59
    k          2
    N          30
    MST        3274.175
    MSE        447.836
    F Stat     7.311
    P value    0.012
    Figure 7
    Figure 7 shows that the significance value is less than 0.05, so the F statistic is significant. This indicates that the type of game played has an effect on Math_Total_Post (the total post score of the math exam).
    To find the value of the contrast between the two groups (AVG group and Non-AVG group): Option used in SPSS: Analyze > Compare means > One way ANOVA (Contrasts). The contrast coefficient is the weight given to each group. The coefficient total should always be zero.
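The quantities in the ANOVA table follow from the sums of squares by simple arithmetic, which can be sketched in a few lines of Python. This is only a cross-check and not part of the original SPSS workflow; the function name `one_way_anova` is hypothetical. The p value is not recomputed here, since that needs the F distribution.

```python
def one_way_anova(ss_treatment, ss_error, k, n):
    """Return MST, MSE, F and eta squared from one-way ANOVA sums of squares."""
    df_treatment = k - 1   # between-groups degrees of freedom
    df_error = n - k       # within-groups degrees of freedom
    mst = ss_treatment / df_treatment
    mse = ss_error / df_error
    return {
        "MST": mst,
        "MSE": mse,
        "F": mst / mse,
        "eta_squared": ss_treatment / (ss_treatment + ss_error),
    }

# Sums of squares reported for Question of Interest 1 (Group vs Math_Total_Post).
q1 = one_way_anova(ss_treatment=3274.175, ss_error=12539.42, k=2, n=30)
for name, value in q1.items():
    print(f"{name}: {value:.3f}")
```

With k = 2 and N = 30 the degrees of freedom are 1 and 28, so MST equals SST. The resulting F statistic and eta squared should agree with the reported 7.311 and 0.207 up to rounding.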
  • 11. Figure 8
    Figure 8 shows the values of the contrast coefficients. They should add up to zero.
    Figure 9
    From Figure 9, the values in the 'Does not assume equal variances' row are considered for the analysis. The report treats the test of equal variances as failed, although a Sig. value above 0.05 would ordinarily support the equal-variance assumption. The significance value is 0.017, which is less than 0.05, so we conclude that the contrast is significant. The value of the contrast between the two groups is 21.324.
    To determine the size of the difference (Effect size): Option used in SPSS: Analyze > Compare means > Options > ANOVA table and ETA
    Figure 10
    Figure 10 shows the magnitude of the difference in means, which is given by the Eta Squared value. The Eta Squared value is 0.207.
  • 12. b. Analysis and Interpretation for Question of Interest 2 (Does the 'Gender' attribute play a role in affecting the overall performance?)
    We conducted an Analysis of Variance to answer this question. Our dataset contains 14 males and 16 females.
    Figure 11
    Important note: males are coded as 1 and females as 0 for calculation purposes.
    Test for outliers: To check for outliers, we used the following options to create a box plot of our dependent variable: Analyze > Descriptive Statistics > Explore (give dependent variable) > Statistics (check Outliers)
    Figure 12
    Figure 12 shows the box plot output in SPSS. No points lie beyond the whiskers of the box plot, so we interpret that there are no outliers.
  • 13. Test for normality: To check for normality, we performed the Shapiro-Wilk test. Option used in SPSS: Analyze > Descriptive Statistics > Explore (give dependent variable) > Plots (Histogram, Normality plot with tests)
    Figure 13
    Figure 13 shows that the significance value is less than 0.05. Because the null hypothesis of the Shapiro-Wilk test is that the data are normally distributed, this result is evidence against normality, so the test does not support the normality assumption.
    Test for equal variance (Homoscedasticity): To check for homogeneity of variances, we performed Levene's test. The null hypothesis (H0) and alternative hypothesis (H1) are:
    H0: Group variances are equal
    H1: Group variances differ from each other
    Option used in SPSS: Analyze > Compare means > One way ANOVA > Options (Homogeneity of variance test)
    Figure 14
    Figure 14 shows that the significance value is greater than 0.05 (.228 in the appendix output), so we fail to reject H0 and the equal-variance assumption is plausible.
    ANOVA Test: The ANOVA test helps us determine whether there is a difference between the mean total math scores of males and females.
  • 14. Figure 15
    Figure 15 shows that the significance value is less than 0.05, so the F statistic is significant. We concluded that gender has a role in determining the final math score. We then performed a Custom Contrasts test. (The report describes Levene's test as failed, but its Sig. value of .228 does not indicate unequal variances.)
    To find the value of the contrast between the two genders (male and female): Option used in SPSS: Analyze > Compare means > One way ANOVA (Contrasts). The contrast coefficient is the weight given to each group. The coefficient total should always be zero. The value of the contrast is 39.059.
    Figure 16
    Figure 17
  • 15. To determine the size of the difference (Effect size): Option used in SPSS: Analyze > Compare means > Options > ANOVA table and ETA
    Figure 18
    Figure 19
    The magnitude of the difference in means is given by the Eta Squared value. The report gives the Eta Squared value from Figure 19 as 0.207. Note that the ANOVA sums of squares for gender in the appendix (11391.226 / 15813.590) give about 0.720, which equals the regression R Square; 0.207 is the value obtained for Question of Interest 1.
    c. Linear Regression on Question of Interest 2 (Does the 'Gender' attribute play a role in affecting the overall performance?)
    We conducted a linear regression to check whether there was a dependency between gender and the final math score. From the scatter plot we judged that the first assumption was violated and that there was no linear relationship between gender and Math_Total_Post.
    Option used in SPSS: Graphs > Chart builder > Scatter/Dot (Choose dependent and independent variables)
    Figure 20 shows a scatterplot of gender against Math_Total_Post scores.
  • 16. 14 Figure 20 We have tried different transformations and none of them revealed any linear dependency. Figure 21 Figure 21 shows the graph between transformed log value of Math Total Post score and gender. Observation shows that there is no linear dependency between the two.
  • 17. 15 Figure 22 Figure 22 is a plot between log value of gender vs log value of Math Total Post score. There is no linear dependency between the two variables. Figure 23 Figure 23 shows a plot between reciprocal value of Math Total Post score and gender. There is no linear dependency between the variables.
  • 18. 16 Figure 24 Figure 24 shows a plot between square root of Math Total Post scores and gender variable. Once again, there is no linear dependency between two variables. Figure 25
  • 19. Figure 25 shows a plot of the log value of gender against the log value of Math Total Post score. SPSS plotted only one value on the X axis: because females are coded 0 and log(0) is undefined, only the males (coded 1, log value 0) remain.
    To check the residuals, we plotted the standardized residuals against the standardized predicted values of the dependent variable. Figure 26 below is this residual plot. We can see that the residuals are scattered around without an obvious pattern.
    Option used in SPSS: Analyze > Regression > Plots (ZPRED on x axis, ZRESID on y axis, check Histogram)
    Figure 26
    Figure 27 is a histogram of the residuals, which we used to check normality; it shows that the residuals are approximately normally distributed.
    Figure 27
  • 20. To check for autocorrelation and run the linear regression analysis: Option used in SPSS: Analyze > Regression > Linear (Check R squared change, part and partial correlations, confidence intervals - 95%, Durbin Watson, Casewise Diagnostics)
    The Durbin-Watson statistic, which is used to check for autocorrelation in the residuals of a regression analysis, is 1.856. The statistic ranges from 0 to 4, and values near 2 indicate no first-order autocorrelation. Since 1.856 is close to 2, there is little evidence of serial correlation in the residuals.
    The R value is 0.849, which is the correlation coefficient. This shows a high correlation between gender and final math score. The R2 value is 0.72, which is the coefficient of determination, so gender accounted for 72% of the variation in Math total score in this sample.
    The ANOVA table shows that the significance value is less than 0.05. This indicates that the F statistic is significant and that gender significantly predicts the total math score.
    The fitted model is ŷ = 30.721 + 39.059 x, so an observed score is y = 30.721 + 39.059 x + e, where 'e' denotes the error from other factors.
    Figures 28, 29, and 30
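The fitted line and the fit statistics quoted above can be checked with a short Python sketch. It is an illustration only and uses just the coefficients and sums of squares reported in the SPSS output; the function name `predict_math_total_post` is hypothetical. With a 0/1 predictor, the intercept is the mean of the group coded 0 and the slope is the difference between the two group means.

```python
import math

B0 = 30.721   # intercept: estimated mean score for females (gender = 0)
B1 = 39.059   # slope: estimated male minus female difference

def predict_math_total_post(gender):
    """Predicted Math_Total_Post for gender coded 0 (female) or 1 (male)."""
    return B0 + B1 * gender

# Sums of squares from the regression ANOVA table.
SS_REGRESSION = 11391.226
SS_TOTAL = 15813.590

r_squared = SS_REGRESSION / SS_TOTAL   # coefficient of determination
r = math.sqrt(r_squared)               # correlation for a single predictor

print(f"Predicted score, female: {predict_math_total_post(0):.3f}")
print(f"Predicted score, male:   {predict_math_total_post(1):.3f}")
print(f"R Square: {r_squared:.3f}, R: {r:.3f}")
```

The two predictions coincide with the group means in the Descriptives table of the appendix (30.721 and 69.780), which is what one would expect for a regression on a single binary predictor.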
  • 21. The slope of the regression line is 39.059, which means that moving from female (x = 0) to male (x = 1) changes the average final math score by 39.059. To find the estimated final math score for males and females, we substitute x = 1 and x = 0 in the above equation respectively (assuming the error value is 0). The estimated math total final score for males is 69.78. The estimated math total final score for females is 30.721.
    iii. Diagnostics Checks
    a. The checks that we performed for each ANOVA assumption are:
    - One dependent variable that is continuous: Math_Total_Post is a continuous variable.
    - One independent variable with two or more categories: the independent variable Group has 2 values, Non-Action Video Game (Non-AVG) and Action Video Game (AVG). We coded Non-AVG as 1 and AVG as 0 for calculation purposes.
    - Independence of observations: all the observations in the dataset were collected separately and are independent. Every student performed the experiment alone without any interference.
    - No significant outliers: a box plot was used to check for outliers.
    - Normally distributed data: the Shapiro-Wilk test was used to check whether the dependent variable is normally distributed.
    - Homoscedasticity: Levene's test was performed to check for equal variances.
    b. The checks that we performed for each Linear Regression assumption are:
    - Transformations: we coded males as 1 and females as 0 for calculation purposes.
    - Linear relationship exists between DV and IV: a scatterplot was created to check for a linear relationship between the dependent and independent variables. Though we judged this assumption to have failed, we wanted to see what the result would be if there were a linear relationship between gender and final math score.
    - Independence of observations: all the observations in the dataset were collected separately and are independent. Every student performed the experiment alone without any interference. The Durbin-Watson value was used to check the residuals for autocorrelation.
    - The residuals (errors) of the regression line are approximately normally distributed: the histogram of the residuals was used to check this.
    - Equal variances (homoscedasticity): the plot of standardized residuals against standardized predicted values was used to check for equal variances along the line of best fit.
    - No significant outliers: a box plot was used to check for outliers.
  • 22. 7. Criticisms and Possible Extensions
    i. For Question of Interest 1 (Is there a change in overall performance of the students after playing an AVG (Action Video Game)?)
    Our experiment was to find whether action video gamers and non-action video gamers differed in their final math score. Our conclusion is that there was a difference and that the group attribute affects the final math score. This conclusion applies to these 30 observations only, so we cannot generalize it to gamers at large. A generalized conclusion requires many more observations.
    ii. For Question of Interest 2 (Does the 'Gender' attribute play a role in affecting the overall performance?)
    Our experiment concludes that the final math score depends on gender and that females in our dataset scored lower than males. To defend against the charge of bias, we would explain that these results hold for these 30 observations only. We cannot generalize the results of our experiment without risking a biased conclusion. Our dataset had only 30 samples, which is too few to apply on a large scale. There are also many other factors that might affect the final math score, such as IQ and memory test score.
    We proceeded with Simple Linear Regression under the assumption that there is a linear relationship between the dependent variable (final math score) and the independent variable (gender), and carried out the rest of the tests on that basis. We judged that the scatter plots did not show a linear relationship between gender and final math score. Our main motive was to work through the whole experiment out of curiosity about what would have happened if gender played a role.
    8. Conclusions
    The conclusions that we have come to for each question of interest:
    Question of Interest 1: Is there a change in overall performance of the students after playing an AVG (Action Video Game)?
    Conclusion: We performed an ANOVA for this question. The significance value is less than 0.05, so the F statistic (7.311) is significant. This indicates that the type of game played has an effect on the Math Total Post score.
    Question of Interest 2: Does the 'Gender' attribute play a role in affecting the overall performance?
    Conclusion: First, we performed an ANOVA test to determine whether there is a difference between the mean total math scores of males and females. The significance value is less than 0.05 and the F statistic (72.123) is significant. We conclude that gender has a role in determining the Math Total Post score. Second, we wanted to check whether there is a linear dependency between the math total post scores and gender. We judged that one of the assumptions failed (that a linear relationship exists between math total post scores and gender). Out of curiosity, we wanted to check what the result would be if there were a linear dependency. The fitted linear model is ŷ = 30.721 + 39.059 x.
  • 23. 21 The estimated math total final score for males is 69.78. The estimated math total final score for females is 30.721. We cannot generalize the results of our experiment without risking a biased conclusion: our dataset had only 30 samples, which is too few to apply on a large scale, and many other factors might affect the final math score, such as IQ and memory test score.
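As a worked check, the two estimated scores follow directly from the fitted coefficients. The sketch below is a minimal illustration in Python and is not part of the original SPSS analysis. The function name `predict_math_score` is hypothetical, and the Gender coding (0 = female, 1 = male) is an assumption inferred from the group means in the appendix.

```python
# Coefficients as reported in the SPSS Coefficients table (see Appendix).
INTERCEPT = 30.721  # fitted score when the Gender code is 0
SLOPE = 39.059      # fitted change in score when the Gender code goes from 0 to 1


def predict_math_score(gender_code: int) -> float:
    """Return the fitted Math_Total_Post score for a Gender code of 0 or 1.

    Hypothetical helper for illustration; the 0/1 coding is an assumption.
    """
    if gender_code not in (0, 1):
        raise ValueError("gender_code must be 0 or 1")
    return INTERCEPT + SLOPE * gender_code


female_estimate = predict_math_score(0)
male_estimate = predict_math_score(1)
print(f"Gender = 0: {female_estimate:.3f}")  # prints "Gender = 0: 30.721"
print(f"Gender = 1: {male_estimate:.3f}")    # prints "Gender = 1: 69.780"
```

With a single 0/1 predictor, the intercept equals the mean of the group coded 0 and the slope equals the difference between the two group means, so the regression restates the same comparison as the one-way ANOVA.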
  • 24. 22 9. Appendix
The results obtained from SPSS for section –

ONEWAY Math_Total_Post BY Gender
  /CONTRAST=-1 1
  /STATISTICS DESCRIPTIVES HOMOGENEITY WELCH
  /MISSING ANALYSIS.

Oneway

Notes
Output Created: 26-NOV-2016 13:15:33
Comments:
Input
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 30
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics for each analysis are based on cases with no missing data for any variable in the analysis.
Syntax
  ONEWAY Math_Total_Post BY Gender
    /CONTRAST=-1 1
    /STATISTICS DESCRIPTIVES HOMOGENEITY WELCH
    /MISSING ANALYSIS.
Resources
  Processor Time: 00:00:00.00
  Elapsed Time: 00:00:00.06
  • 25. 23 Descriptives
Math_Total_Post
                                                                       95% Confidence Interval for Mean
Gender  N   Mean            Std. Deviation   Std. Error      Lower Bound     Upper Bound
0       16  30.72115385000  13.588685250000  3.397171313000  23.48025460000  37.96205310000
1       14  69.78021978000  11.274812610000  3.013320420000  63.27033679000  76.29010277000
Total   30  48.94871795000  23.351578060000  4.263395353000  40.22909540000  57.66834050000

Descriptives (continued)
Math_Total_Post
Gender  Minimum       Maximum
0       7.692307692   53.846153850
1       38.461538460  80.769230770
Total   7.692307692   80.769230770

Test of Homogeneity of Variances
Math_Total_Post
Levene Statistic  df1  df2  Sig.
1.518             1    28   .228

ANOVA
Math_Total_Post
                Sum of Squares  df  Mean Square  F       Sig.
Between Groups  11391.226       1   11391.226    72.123  .000
Within Groups   4422.364        28  157.942
Total           15813.590       29
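The ANOVA table above can be rebuilt from the group sizes, means, and standard deviations in the Descriptives table alone. The following is a minimal Python sketch of that arithmetic, not part of the original SPSS run. The variable names are illustrative, and the inputs are the reported values with trailing zeros trimmed.

```python
# Group summaries copied from the SPSS Descriptives table:
# Gender code -> (n, mean, standard deviation).
groups = {
    0: (16, 30.72115385, 13.58868525),
    1: (14, 69.78021978, 11.27481261),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-groups sum of squares: spread of the group means around the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
# Within-groups sum of squares: pooled spread of scores inside each group.
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_statistic = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.3f}, SS within = {ss_within:.3f}")
print(f"F({df_between}, {df_within}) = {f_statistic:.3f}")
```

Up to rounding of the inputs, this should agree with the reported sums of squares (11391.226 and 4422.364) and with F = 72.123.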
  • 26. 24 Math_Total_Post
       Statistic(a)  df1  df2     Sig.
Welch  73.984        1    27.936  .000
a. Asymptotically F distributed.

Contrast Coefficients
          Gender
Contrast  0   1
1         -1  1

Contrast Tests
Math_Total_Post
                                 Contrast  Value of Contrast  Std. Error      t      df      Sig. (2-tailed)
Assume equal variances           1         39.05906593000     4.599226845000  8.493  28      .000
Does not assume equal variances  1         39.05906593000     4.541021128000  8.601  27.936  .000

MEANS TABLES=Math_Total_Post BY Gender
  /CELLS=MEAN COUNT STDDEV
  /STATISTICS ANOVA.

Means

Notes
Output Created: 26-NOV-2016 13:16:28
Comments:
Input
  Active Dataset: DataSet1
  Filter: <none>
  • 27. 25
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 30
Missing Value Handling
  Definition of Missing: For each dependent variable in a table, user-defined missing values for the dependent and all grouping variables are treated as missing.
  Cases Used: Cases used for each table have no missing values in any independent variable, and not all dependent variables have missing values.
Syntax
  MEANS TABLES=Math_Total_Post BY Gender
    /CELLS=MEAN COUNT STDDEV
    /STATISTICS ANOVA.
Resources
  Processor Time: 00:00:00.03
  Elapsed Time: 00:00:00.05

Case Processing Summary
                          Cases
                          Included     Excluded     Total
                          N   Percent  N   Percent  N   Percent
Math_Total_Post * Gender  30  100.0%   0   0.0%     30  100.0%
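The Welch statistic and the unequal-variance contrast test in the ONEWAY output can also be reproduced from the group descriptives. The sketch below is an illustrative Python calculation, not the authors' SPSS procedure. It uses the Welch-Satterthwaite approximation for the degrees of freedom and relies on the fact that, with only two groups, the Welch F equals the square of the unequal-variance t.

```python
import math

# Group summaries from the SPSS Descriptives table (Gender codes 0 and 1).
n0, mean0, sd0 = 16, 30.72115385, 13.58868525
n1, mean1, sd1 = 14, 69.78021978, 11.27481261

# Variance of each group mean (squared standard error).
var_mean0 = sd0 ** 2 / n0
var_mean1 = sd1 ** 2 / n1

# Contrast with coefficients (-1, 1): mean of group 1 minus mean of group 0.
contrast_value = mean1 - mean0
contrast_se = math.sqrt(var_mean0 + var_mean1)
t_unequal = contrast_value / contrast_se

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch = (var_mean0 + var_mean1) ** 2 / (
    var_mean0 ** 2 / (n0 - 1) + var_mean1 ** 2 / (n1 - 1)
)

# With two groups, Welch's F statistic is the squared unequal-variance t.
f_welch = t_unequal ** 2

print(f"contrast = {contrast_value:.3f}, SE = {contrast_se:.3f}")
print(f"t = {t_unequal:.3f}, df = {df_welch:.3f}, Welch F = {f_welch:.3f}")
```

Up to rounding, these values should match the reported contrast of 39.059 with standard error 4.541, t = 8.601 on 27.936 degrees of freedom, and Welch statistic 73.984.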
  • 28. 26 Report
Math_Total_Post
Gender  Mean            N   Std. Deviation
0       30.72115385000  16  13.588685250000
1       69.78021978000  14  11.274812610000
Total   48.94871795000  30  23.351578060000

ANOVA Table
Math_Total_Post * Gender
                           Sum of Squares  df  Mean Square  F       Sig.
Between Groups (Combined)  11391.226       1   11391.226    72.123  .000
Within Groups              4422.364        28  157.942
Total                      15813.590       29

Measures of Association
                          Eta   Eta Squared
Math_Total_Post * Gender  .849  .720

EXAMINE VARIABLES=Math_Total_Post
  /PLOT BOXPLOT HISTOGRAM NPPLOT
  /COMPARE GROUPS
  /STATISTICS DESCRIPTIVES EXTREME
  /CINTERVAL 95
  /MISSING LISTWISE
  • 29. 27
  /NOTOTAL.

Explore

Notes
Output Created: 26-NOV-2016 13:17:16
Comments:
Input
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 30
Missing Value Handling
  Definition of Missing: User-defined missing values for dependent variables are treated as missing.
  Cases Used: Statistics are based on cases with no missing values for any dependent variable or factor used.
Syntax
  EXAMINE VARIABLES=Math_Total_Post
    /PLOT BOXPLOT HISTOGRAM NPPLOT
    /COMPARE GROUPS
    /STATISTICS DESCRIPTIVES EXTREME
    /CINTERVAL 95
    /MISSING LISTWISE
    /NOTOTAL.
Resources
  Processor Time: 00:00:00.55
  Elapsed Time: 00:00:00.62
  • 30. 28 Case Processing Summary
                 Cases
                 Valid        Missing      Total
                 N   Percent  N   Percent  N   Percent
Math_Total_Post  30  100.0%   0   0.0%     30  100.0%

Descriptives
Math_Total_Post
                                               Statistic        Std. Error
Mean                                           48.94871795000   4.263395353000
95% Confidence Interval for Mean, Lower Bound  40.22909540000
95% Confidence Interval for Mean, Upper Bound  57.66834050000
5% Trimmed Mean                                49.40170940000
Median                                         44.23076923000
Variance                                       545.296
Std. Deviation                                 23.351578060000
Minimum                                        7.692307692
Maximum                                        80.769230770
Range                                          73.076923080
Interquartile Range                            36.538461540
Skewness                                       -.165            .427
Kurtosis                                       -1.266           .833

Extreme Values
Math_Total_Post
            Case Number  Value
Highest  1  5            80.769230770
         2  26           80.769230770
         3  6            76.923076920
         4  10           76.923076920
         5  17           76.923076920(a)
Lowest   1  4            7.692307692
  • 31. 29
         2  15           11.538461540
         3  2            11.538461540
         4  28           18.461538460
         5  14           19.230769230
a. Only a partial list of cases with the value 76.923076920 are shown in the table of upper extremes.

Tests of Normality
                 Kolmogorov-Smirnov(a)   Shapiro-Wilk
                 Statistic  df  Sig.     Statistic  df  Sig.
Math_Total_Post  .174       30  .021     .922       30  .030
a. Lilliefors Significance Correction

Math_Total_Post
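The standard errors of skewness and kurtosis in the Explore Descriptives table depend only on the sample size. A minimal Python sketch of the standard formulas is given below, which for n = 30 should reproduce the reported .427 and .833. The ratio of each statistic to its standard error is an informal rule-of-thumb check added here for illustration; it is not part of the original analysis and does not replace the Kolmogorov-Smirnov and Shapiro-Wilk tests reported above.

```python
import math

n = 30               # number of valid cases
skewness = -0.165    # reported sample skewness
kurtosis = -1.266    # reported sample (excess) kurtosis

# Standard error of skewness for a sample of size n.
se_skewness = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
# Standard error of kurtosis, expressed through the standard error of skewness.
se_kurtosis = 2 * se_skewness * math.sqrt((n ** 2 - 1) / ((n - 3) * (n + 5)))

# Informal check (an assumption of this sketch): statistic divided by its standard error.
z_skewness = skewness / se_skewness
z_kurtosis = kurtosis / se_kurtosis

print(f"SE skewness = {se_skewness:.3f}, SE kurtosis = {se_kurtosis:.3f}")
print(f"z skewness = {z_skewness:.2f}, z kurtosis = {z_kurtosis:.2f}")
```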
  • 33. 31
GET DATA
  /TYPE=XLS
  /FILE='F:MS 1st semBSTATProjectAVGnonAVGDataset_Test.xls'
  /SHEET=name 'AVG without q7i Spring 2014 FIN'
  /CELLRANGE=FULL
  /READNAMES=ON
  /DATATYPEMIN PERCENTAGE=95.0.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.

* Chart Builder.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Gender Math_Total_Post MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  • 34. 32
  DATA: Gender=col(source(s), name("Gender"), unit.category())
  DATA: Math_Total_Post=col(source(s), name("Math_Total_Post"))
  GUIDE: axis(dim(1), label("Gender"))
  GUIDE: axis(dim(2), label("Math_Total_Post"))
  SCALE: linear(dim(2), include(0))
  ELEMENT: point(position(Gender*Math_Total_Post))
END GPL.

GGraph

Notes
Output Created: 26-NOV-2016 23:43:42
Comments:
Input
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 30
  • 35. 33
Syntax
  GGRAPH
    /GRAPHDATASET NAME="graphdataset" VARIABLES=Gender Math_Total_Post MISSING=LISTWISE REPORTMISSING=NO
    /GRAPHSPEC SOURCE=INLINE.
  BEGIN GPL
    SOURCE: s=userSource(id("graphdataset"))
    DATA: Gender=col(source(s), name("Gender"), unit.category())
    DATA: Math_Total_Post=col(source(s), name("Math_Total_Post"))
    GUIDE: axis(dim(1), label("Gender"))
    GUIDE: axis(dim(2), label("Math_Total_Post"))
    SCALE: linear(dim(2), include(0))
    ELEMENT: point(position(Gender*Math_Total_Post))
  END GPL.
Resources
  Processor Time: 00:00:02.41
  Elapsed Time: 00:00:01.26

[DataSet1]
  • 36. 34 Notes
Output Created: 26-NOV-2016 23:45:36
Comments:
Input
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 30
  • 37. 35
Syntax
  GGRAPH
    /GRAPHDATASET NAME="graphdataset" VARIABLES=Gender Log_Math_Total_Post MISSING=LISTWISE REPORTMISSING=NO
    /GRAPHSPEC SOURCE=INLINE.
  BEGIN GPL
    SOURCE: s=userSource(id("graphdataset"))
    DATA: Gender=col(source(s), name("Gender"), unit.category())
    DATA: Log_Math_Total_Post=col(source(s), name("Log_Math_Total_Post"))
    GUIDE: axis(dim(1), label("Gender"))
    GUIDE: axis(dim(2), label("Log_Math_Total_Post"))
    SCALE: linear(dim(2), include(0))
    ELEMENT: point(position(Gender*Log_Math_Total_Post))
  END GPL.
Resources
  Processor Time: 00:00:00.72
  Elapsed Time: 00:00:00.30
  • 38. 36
REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS CI(95) BCOV R ANOVA COLLIN TOL CHANGE ZPP
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT Math_Total_Post
  /METHOD=ENTER Gender
  /SCATTERPLOT=(*ZRESID ,*ZPRED)
  /RESIDUALS DURBIN HISTOGRAM(ZRESID) NORMPROB(ZRESID)
  /CASEWISE PLOT(ZRESID) OUTLIERS(3).

Regression

Notes
Output Created: 26-NOV-2016 23:47:45
Comments:
Input
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 30
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics are based on cases with no missing values for any variable used.
  • 39. 37
Syntax
  REGRESSION
    /DESCRIPTIVES MEAN STDDEV CORR SIG N
    /MISSING LISTWISE
    /STATISTICS COEFF OUTS CI(95) BCOV R ANOVA COLLIN TOL CHANGE ZPP
    /CRITERIA=PIN(.05) POUT(.10)
    /NOORIGIN
    /DEPENDENT Math_Total_Post
    /METHOD=ENTER Gender
    /SCATTERPLOT=(*ZRESID ,*ZPRED)
    /RESIDUALS DURBIN HISTOGRAM(ZRESID) NORMPROB(ZRESID)
    /CASEWISE PLOT(ZRESID) OUTLIERS(3).
Resources
  Processor Time: 00:00:01.58
  Elapsed Time: 00:00:00.72
  Memory Required: 2928 bytes
  Additional Memory Required for Residual Plots: 680 bytes

Descriptive Statistics
                 Mean            Std. Deviation   N
Math_Total_Post  48.94871795000  23.351578060000  30
Gender           .47             .507             30
  • 40. 38 Correlations
                                      Math_Total_Post  Gender
Pearson Correlation  Math_Total_Post  1.000            .849
                     Gender           .849             1.000
Sig. (1-tailed)      Math_Total_Post  .                .000
                     Gender           .000             .
N                    Math_Total_Post  30               30
                     Gender           30               30

Variables Entered/Removed(a)
Model  Variables Entered  Variables Removed  Method
1      Gender(b)          .                  Enter
a. Dependent Variable: Math_Total_Post
b. All requested variables entered.

Model Summary(b)
Model  R        R Square  Adjusted R Square  Std. Error of the Estimate
1      .849(a)  .720      .710               12.567480280000

Model Summary(b) (continued)
       Change Statistics
Model  R Square Change  F Change  df1  df2  Sig. F Change  Durbin-Watson
1      .720             72.123    1    28   .000           1.856
a. Predictors: (Constant), Gender
b. Dependent Variable: Math_Total_Post
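The Model Summary figures follow from the regression and residual sums of squares, which are the same between-groups and within-groups sums reported in the one-way ANOVA. The sketch below is an illustrative Python calculation under that assumption; the variable names are hypothetical and it is not part of the original SPSS output.

```python
import math

ss_regression = 11391.226  # between-groups (regression) sum of squares
ss_residual = 4422.364     # within-groups (residual) sum of squares
n_obs = 30                 # number of observations
n_predictors = 1           # Gender is the only predictor

ss_total = ss_regression + ss_residual
df_residual = n_obs - n_predictors - 1

# Proportion of variance explained, and its square root (multiple R).
r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)

# Adjusted R Square penalizes for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual

# Standard error of the estimate: square root of the residual mean square.
std_error_estimate = math.sqrt(ss_residual / df_residual)

print(f"R = {multiple_r:.3f}, R Square = {r_squared:.3f}")
print(f"Adjusted R Square = {adjusted_r_squared:.3f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.3f}")
```

Because Gender is the only predictor, R Square here is the same quantity as the Eta Squared (.720) reported by the MEANS procedure.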
  • 41. 39 ANOVA(a)
Model          Sum of Squares  df  Mean Square  F       Sig.
1  Regression  11391.226       1   11391.226    72.123  .000(b)
   Residual    4422.364        28  157.942
   Total       15813.590       29
a. Dependent Variable: Math_Total_Post
b. Predictors: (Constant), Gender

Coefficients(a)
               Unstandardized Coefficients  Standardized Coefficients
Model          B       Std. Error           Beta                       t      Sig.
1  (Constant)  30.721  3.142                                           9.778  .000
   Gender      39.059  4.599                .849                       8.493  .000

Coefficients(a) (continued)
               95.0% Confidence Interval for B  Correlations               Collinearity Statistics
Model          Lower Bound  Upper Bound         Zero-order  Partial  Part  Tolerance  VIF
1  (Constant)  24.285       37.157
   Gender      29.638       48.480              .849        .849     .849  1.000      1.000
a. Dependent Variable: Math_Total_Post
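The t statistics and 95% confidence intervals in the Coefficients table can be checked from the coefficients and their standard errors. This is a minimal Python sketch, not the SPSS computation itself. The critical value 2.048407 is an assumption taken from a Student's t table (two-sided 95%, 28 degrees of freedom), and small differences in the last digit arise because the inputs are the rounded values printed in the table.

```python
# Two-sided 95% critical value of Student's t with 28 degrees of freedom.
# Assumption: value taken from a t table rather than computed here.
T_CRITICAL_28_DF = 2.048407

# Term -> (unstandardized coefficient B, standard error), as reported by SPSS.
coefficients = {
    "(Constant)": (30.721, 3.142),
    "Gender": (39.059, 4.599),
}

results = {}
for term, (estimate, std_error) in coefficients.items():
    t_statistic = estimate / std_error
    margin = T_CRITICAL_28_DF * std_error
    results[term] = (t_statistic, estimate - margin, estimate + margin)

for term, (t_statistic, lower, upper) in results.items():
    print(f"{term}: t = {t_statistic:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

The interval for Gender excludes zero, which is consistent with the significance value below 0.05 reported for that coefficient.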
  • 42. 40 Coefficient Correlations(a)
Model                    Gender
1  Correlations  Gender  1.000
   Covariances   Gender  21.153
a. Dependent Variable: Math_Total_Post

Collinearity Diagnostics(a)
                                            Variance Proportions
Model  Dimension  Eigenvalue  Condition Index  (Constant)  Gender
1      1          1.683       1.000            .16         .16
       2          .317        2.305            .84         .84
a. Dependent Variable: Math_Total_Post

Residuals Statistics(a)
                      Minimum           Maximum          Mean            Std. Deviation   N
Predicted Value       30.72115326000    69.78022003000   48.94871795000  19.819205290000  30
Residual              -31.318681720000  23.125000000000  .000000000000   12.348898730000  30
Std. Predicted Value  -.920             1.051            .000            1.000            30
Std. Residual         -2.492            1.840            .000            .983             30
a. Dependent Variable: Math_Total_Post
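The residual rows of the Residuals Statistics table are linked to the ANOVA table through the residual sum of squares. The sketch below is an illustrative Python check under the assumption that standardized residuals are raw residuals divided by the standard error of the estimate; the variable names are hypothetical.

```python
import math

ss_residual = 4422.364   # residual sum of squares from the ANOVA table
n_obs = 30               # number of observations
df_residual = 28         # residual degrees of freedom
min_residual = -31.31868172
max_residual = 23.125

# Standard error of the estimate (divides by the residual degrees of freedom).
std_error_estimate = math.sqrt(ss_residual / df_residual)

# Sample standard deviation of the residuals (divides by n - 1; their mean is zero).
residual_sd = math.sqrt(ss_residual / (n_obs - 1))

# Standardized residuals: raw residuals scaled by the standard error of the estimate.
std_residual_min = min_residual / std_error_estimate
std_residual_max = max_residual / std_error_estimate
std_residual_sd = residual_sd / std_error_estimate

print(f"Residual SD = {residual_sd:.3f}")
print(f"Std. Residual: min = {std_residual_min:.3f}, max = {std_residual_max:.3f}, "
      f"SD = {std_residual_sd:.3f}")
```

No standardized residual exceeds 3 in absolute value, which matches the OUTLIERS(3) criterion in the REGRESSION syntax flagging no cases.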