The document contains a regression analysis of house prices, starting from a model with three predictor variables (X1, X2 and X4). It includes:
1) The regression equation estimating house prices from the predictor variables.
2) Statistical tests showing two of the three predictor variables (X1 and X4) are significant while one (X2) is not.
3) Analysis of variance tables and calculations showing the regression model is significant overall.
4) Comparison of three regression models, finding the second model is superior to the first but the third is not an improvement on the second.
5) Using the second model to estimate the price of a detached house with specific characteristics.
The second section analyzes the relationship between advertising expenditure and sales, finding a curvilinear relationship and estimating sales for
2. 1. House price (Again)
Predictor (Variable)   Coefficient (B)   SE (B)
Constant               -2.5              41.4
X1                     1.62              0.21
X2                     0.257             1.88
X4                     -0.027            0.008
Analysis of Variance (ANOVA)
Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares
Regression            277,895
Residual              34,727
3. 1 (a)
(i) Write out the estimated regression equation
$\hat{Y} = -2.5 + 1.62 X_1 + 0.257 X_2 - 0.027 X_4$
4. 1 (a)
(ii) Test for the significance of regression equation
Step 1: Critical value at the 1% level (α = 0.01):
$t_{\alpha/2,\,df} = t_{0.01/2,\,15-4} = t_{0.005,\,11} = 3.1058$
Step 2: t-statistic:
$t_{\beta_i} = \dfrac{\beta_i}{SE(\beta_i)}$
5. 1 (a)
(ii) Test for the significance of regression equation
Step 1: Critical value at the 1% level (α = 0.01): $t_{0.005,\,11} = 3.1058$
Step 2: t-statistic, $t_{\beta_i} = \beta_i / SE(\beta_i)$:
$t_1 = \dfrac{1.62}{0.21} = 7.71 > 3.1058$: Reject H0
$t_2 = \dfrac{0.257}{1.88} = 0.137 < 3.1058$: Do NOT reject H0
$t_4 = \dfrac{-0.027}{0.008} = -3.375 < -3.1058$: Reject H0
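The three t-tests above follow one rule, so they can be checked with a short script. This is a minimal sketch: the coefficients, standard errors and the critical value 3.1058 are copied from the slides, while the names `coefficients`, `T_CRITICAL` and `t_statistic` are illustrative only.

```python
# Coefficients and standard errors copied from the table in 1(a).
coefficients = {"X1": (1.62, 0.21), "X2": (0.257, 1.88), "X4": (-0.027, 0.008)}

# Two-tailed critical value t(0.005, 11) as quoted on the slide (not recomputed here).
T_CRITICAL = 3.1058


def t_statistic(beta, se):
    """Return the t-statistic for H0: beta = 0."""
    return beta / se


results = {}
for name, (beta, se) in coefficients.items():
    t = t_statistic(beta, se)
    # Reject H0 when |t| exceeds the critical value (two-tailed test).
    results[name] = (t, abs(t) > T_CRITICAL)

for name, (t, reject) in results.items():
    decision = "reject H0" if reject else "do not reject H0"
    print(f"{name}: t = {t:.3f} -> {decision}")
```

Comparing |t| with the critical value covers both tails at once, which is why the negative statistic for X4 still leads to rejection.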
6. 1. a). (iii) What are DF for SSR & SSE?
Analysis of Variance (ANOVA)
Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares
Regression            277,895          3  (p)
Residual              34,727           11 (n-p-1)
7. 1. a).
(iv) Test for Significant relationship X&Y?
H0: β1 = β2 = β4 = 0
H1: At least one of the coefficients does not equal 0
Analysis of Variance (ANOVA)
Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F Statistic
Regression            277,895          3                    92,631         29.341
Residual              34,727           11                   3,157
Critical value at α = 0.01: $F_{0.01,(3,11)} = 6.217$
Since F = 29.341 > 6.217, we reject the null hypothesis: there is a significant relationship between the Xs and Y.
8. 1. a).
(v) Compute the coefficient of determination
and explain its meaning
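$R^2 = 1 - \dfrac{\text{Sum of Squares Error}}{\text{Sum of Squares Total}}$
Analysis of Variance (ANOVA)
Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F Statistic
Regression            277,895          3                    92,631         29.341
Residual              34,727           11                   3,157
TOTAL                 312,622
$R^2 = 1 - (34{,}727 / 312{,}622) = 1 - 0.111 = 0.889 = 88.9\%$
That is, about 88.9% of the variation in house prices is explained by the regression on X1, X2 and X4.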
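The ANOVA quantities used in parts (iii) to (v) all derive from the two sums of squares and the sample size. The sketch below recomputes them from the figures quoted on the slides. Variable names are illustrative, and the F value may differ from the slide in the third decimal because the slide rounds the mean square first.

```python
# Sums of squares and sample size taken from the ANOVA table.
ssr = 277_895.0   # regression sum of squares
sse = 34_727.0    # residual (error) sum of squares
n, p = 15, 3      # observations and predictors

df_regression = p
df_residual = n - p - 1

msr = ssr / df_regression      # mean square, regression
mse = sse / df_residual        # mean square, residual
f_stat = msr / mse             # overall F statistic

sst = ssr + sse                # total sum of squares
r_squared = 1 - sse / sst      # coefficient of determination

print(f"MSR = {msr:.0f}, MSE = {mse:.0f}, F = {f_stat:.2f}, R^2 = {r_squared:.3f}")
```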
9. 1(b)
Model 1: $\hat{y} = 1.8 + 1.601 x_1 - 0.026 x_4$, $R^2 = 0.880$
Model 2: $\hat{y} = 64.05 + 1.23 x_1 - 0.026 x_4 + 63.794 x_5 - 65.371 x_6$, $R^2 = 0.935$
Model 3: $\hat{y} = 65.2 + 1.22 x_1 - 0.067 x_2 - 0.026 x_4 + 63.447 x_5 - 65.447 x_6$, $R^2 = 0.936$
10. 1(b)
(i) Compute Adjusted Coefficient of determination for
three models
$R^2_{adj} = 1 - (1 - R^2)\left(\dfrac{n-1}{n-p-1}\right)$
Model 1: $R^2_{adj} = 1 - (1 - 0.880)\left(\dfrac{15-1}{15-2-1}\right) = 0.860$
Model 2: $R^2_{adj} = 1 - (1 - 0.935)\left(\dfrac{15-1}{15-4-1}\right) = 0.909$
Model 3: $R^2_{adj} = 1 - (1 - 0.936)\left(\dfrac{15-1}{15-5-1}\right) = 0.900$
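The adjusted coefficient of determination is the same formula applied three times, so a small helper makes the comparison explicit. This is a sketch using the R² values and predictor counts stated for the three models; `adjusted_r_squared` and `models` are illustrative names.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R^2 for a model with p predictors fitted to n observations."""
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)


N_OBS = 15
# (R^2, number of predictors) for each model as given on the slides.
models = {"Model 1": (0.880, 2), "Model 2": (0.935, 4), "Model 3": (0.936, 5)}

adjusted = {
    name: adjusted_r_squared(r2, N_OBS, p) for name, (r2, p) in models.items()
}
for name, value in adjusted.items():
    print(f"{name}: adjusted R^2 = {value:.3f}")
```

Model 3 has the highest raw R² but a lower adjusted R² than Model 2, because the penalty for the extra predictor outweighs the tiny gain in fit.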
11. 1(b)
(ii) Interpret the coefficients on the house type, Beta5
and Beta6
(Model 2) $\hat{y} = 64.05 + 1.23 x_1 - 0.026 x_4 + 63.794 x_5 - 65.371 x_6$
Relative to a semi-detached house, and holding the other variables fixed:
The price of a detached house is higher by £63,794.
The price of a terraced house is lower by £65,371.
12. 1(b)
(iii) At 0.05 level of significance, determine whether
model 2 is superior to model1
Model 1 (restricted, q = 2 predictors): $\hat{y} = 1.8 + 1.601 x_1 - 0.026 x_4$
Model 2 (complete, p = 4 predictors): $\hat{y} = 64.05 + 1.23 x_1 - 0.026 x_4 + 63.794 x_5 - 65.371 x_6$
$F = \dfrac{R^2_{Complete} - R^2_{Restricted}}{1 - R^2_{Complete}} \times \dfrac{n-p-1}{p-q}$
$F = \dfrac{0.935 - 0.880}{1 - 0.935} \times \dfrac{15-4-1}{4-2} = 4.231$
$F_{\alpha,(p-q,\,n-p-1)} = F_{0.05,(4-2,\,15-4-1)} = F_{0.05,(2,10)} = 4.103 < 4.231$
Significant, i.e., Model 2 is better than Model 1.
13. 1(b)
(iv) At 0.05 level of significance, determine whether
model 3 is superior to model 2
Model 2 (restricted, q = 4 predictors): $\hat{y} = 64.05 + 1.23 x_1 - 0.026 x_4 + 63.794 x_5 - 65.371 x_6$
Model 3 (complete, p = 5 predictors): $\hat{y} = 65.2 + 1.22 x_1 - 0.067 x_2 - 0.026 x_4 + 63.447 x_5 - 65.447 x_6$
$F = \dfrac{R^2_{Complete} - R^2_{Restricted}}{1 - R^2_{Complete}} \times \dfrac{n-p-1}{p-q}$
$F = \dfrac{0.936 - 0.935}{1 - 0.936} \times \dfrac{15-5-1}{5-4} = 0.141$
$F_{\alpha,(p-q,\,n-p-1)} = F_{0.05,(5-4,\,15-5-1)} = F_{0.05,(1,9)} = 5.117 > 0.141$
NOT significant, i.e., Model 3 is NOT better than Model 2.
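Both model comparisons use the same partial F statistic, so they can be reproduced with one function. In this sketch the critical values 4.103 and 5.117 are taken from the slides rather than computed, and `partial_f` is an illustrative name.

```python
def partial_f(r2_complete, r2_restricted, n, p, q):
    """Partial F statistic for nested models.

    p = predictors in the complete model, q = predictors in the restricted one.
    """
    return (r2_complete - r2_restricted) / (1 - r2_complete) * (n - p - 1) / (p - q)


N_OBS = 15

# Model 2 (p = 4) against Model 1 (q = 2); critical value F(0.05; 2, 10) from the slide.
f_2_vs_1 = partial_f(0.935, 0.880, N_OBS, p=4, q=2)
model_2_better = f_2_vs_1 > 4.103

# Model 3 (p = 5) against Model 2 (q = 4); critical value F(0.05; 1, 9) from the slide.
f_3_vs_2 = partial_f(0.936, 0.935, N_OBS, p=5, q=4)
model_3_better = f_3_vs_2 > 5.117

print(f"Model 2 vs 1: F = {f_2_vs_1:.3f}, better: {model_2_better}")
print(f"Model 3 vs 2: F = {f_3_vs_2:.3f}, better: {model_3_better}")
```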
14. 1(b)
(v) From Model 2, estimate the price of a 5-year-old
detached house of 250 square meters
$\hat{y} = 64.05 + 1.23 x_1 - 0.026 x_4 + 63.794 x_5 - 65.371 x_6$
$\hat{y} = 64.05 + 1.23(250) - 0.026(250 \times 5) + 63.794(1) - 65.371(0) = 402.844$
Prices are in £000, so $\hat{y}$ = £402,844.
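The substitution above can be wrapped in a small function to try other houses. This is a sketch under one stated assumption: x4 is treated as area multiplied by age, mirroring the worked term 0.026(250 × 5), because the slides do not define x4 explicitly. The function and parameter names are hypothetical.

```python
def predict_price_model_2(area_m2, age_years, detached, terrace):
    """Predicted price (in £000) from Model 2.

    Assumption: x4 = area * age, as in the worked substitution on the slide.
    """
    x1 = area_m2
    x4 = area_m2 * age_years
    x5 = 1 if detached else 0   # dummy: detached house
    x6 = 1 if terrace else 0    # dummy: terraced house
    return 64.05 + 1.23 * x1 - 0.026 * x4 + 63.794 * x5 - 65.371 * x6


price_thousands = predict_price_model_2(250, 5, detached=True, terrace=False)
print(f"Estimated price: £{price_thousands * 1000:,.0f}")
```

Setting both dummies to zero gives the estimate for a semi-detached house, the baseline category.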
15. 2. Advertising expenditure
X, Advertising (£000)   Y, Sales (£000)
5.5                     90
2.0                     40
3.2                     55
6.0                     95
3.8                     70
4.4                     80
6.0                     88
5.0                     85
6.5                     92
7.0                     91
R square                       0.97
Adjusted R Square              0.96
Standard error of regression   3.37
Analysis of variance
             DF   Sum Square   Mean Square
Regression        2,904
Residual          80.0
Variables in the Equation
Variable        B        SE B
Advert          31.79    4.48
Advert-square   -2.30    0.485
(constant)      -17.22   9.65
16. 2.(a) State the regression equation
for the curvilinear model.
Variables in the Equation
Variable        B        SE B
Advert          31.79    4.48
Advert-square   -2.30    0.485
(constant)      -17.22   9.65
$\hat{Y}_t = \beta_0 + \beta_1 X + \beta_2 X^2$
$\hat{Y}_t = -17.22 + 31.79 X - 2.30 X^2$
17. 2.(b) Predict the monthly sales (in pounds)
for a month with total advertising
expenditure of £6,000
$\hat{Y}_t = -17.22 + 31.79 X - 2.30 X^2$
X = 6 (advertising is measured in £000)
$\hat{Y}_t = -17.22 + 31.79(6) - 2.30(6)^2 = 90.720$
Sales = 90.720 × 1,000 = £90,720
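The same evaluation as a short script, which also makes the unit conversion explicit. It is a sketch built on the fitted coefficients quoted above; `predict_sales_thousands` is an illustrative name.

```python
def predict_sales_thousands(advert_thousands):
    """Predicted sales (£000) from the fitted curvilinear model."""
    x = advert_thousands
    return -17.22 + 31.79 * x - 2.30 * x ** 2


# £6,000 of advertising corresponds to X = 6 because X is in £000.
sales_thousands = predict_sales_thousands(6)
sales_pounds = sales_thousands * 1000
print(f"Predicted sales: £{sales_pounds:,.0f}")
```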
18. 2.(c) Determine whether there is a significant relationship
between the sales and advertising expenditure at
the 0.01 level of significance
$\hat{Y}_t = \beta_0 + \beta_1 X + \beta_2 X^2$
H0: β1 = β2 = 0
H1: At least one of the coefficients does not equal 0
Analysis of variance
             DF   Sum Square   Mean Square   F
Regression   2    2,904        1,452         127.05
Residual     7    80.0         11.428
Critical value at α = 0.01: $F_{0.01,(2,7)} = 9.547$
Since F = 127.05 > 9.547, we reject the null hypothesis: there is a significant curvilinear
relationship between sales and advertising expenditure.
19. 2 (d) Fit a linear model to the data
and calculate SSE for this model
$\hat{\beta}_1 = \dfrac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2}$
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
20. 2 (d) Fit a linear model to the data
and calculate SSE for this model
ID X (Advertising) Y (Sales)
1 5.5 90
2 2 40
3 3.2 55
4 6 95
5 3.8 70
6 4.4 80
7 6 88
8 5 85
9 6.5 92
10 7 91
21. 2 (d) Fit a linear model to the data
and calculate SSE for this model
ID X (Advertising) Y (Sales) xy x^2 y^2
1 5.5 90 495 30.25 8100
2 2 40 80 4 1600
3 3.2 55 176 10.24 3025
4 6 95 570 36 9025
5 3.8 70 266 14.44 4900
6 4.4 80 352 19.36 6400
7 6 88 528 36 7744
8 5 85 425 25 7225
9 6.5 92 598 42.25 8464
10 7 91 637 49 8281
Sum 49.4 786 4127 266.54 64764
Average 4.94 78.6 412.7 26.654 6476.4
22. 2 (d) Fit a linear model to the data
and calculate SSE for this model
$\hat{\beta}_1 = \dfrac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \dfrac{4127 - 10(4.94)(78.6)}{266.54 - 10(4.94)^2} = 10.85$
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 78.6 - 10.85(4.94) = 25.0$
$\hat{y} = 25.0 + 10.85 x$
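The slope and intercept can be reproduced directly from the ten observations. This is a minimal sketch of the least-squares formulas used above, with illustrative variable names (`b0`, `b1`) and the standard library only.

```python
# Advertising (X, £000) and sales (Y, £000) from the data table.
x = [5.5, 2.0, 3.2, 6.0, 3.8, 4.4, 6.0, 5.0, 6.5, 7.0]
y = [90, 40, 55, 95, 70, 80, 88, 85, 92, 91]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi * xi for xi in x)

# Least-squares slope and intercept for y = b0 + b1 * x.
b1 = (sum_xy - n * x_bar * y_bar) / (sum_xx - n * x_bar ** 2)
b0 = y_bar - b1 * x_bar

print(f"y-hat = {b0:.1f} + {b1:.2f} x")
```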
24. 2 (d) Fit a linear model to the data
and calculate SSE for this model
$\hat{Y}_t = 25 + 10.85 X$
ID X (Advertising) Y (Sales) xy x^2 y^2 Predicted Y
1 5.5 90 495 30.25 8100 84.68
2 2 40 80 4 1600 46.70
3 3.2 55 176 10.24 3025 59.72
4 6 95 570 36 9025 90.10
5 3.8 70 266 14.44 4900 66.23
6 4.4 80 352 19.36 6400 72.74
7 6 88 528 36 7744 90.10
8 5 85 425 25 7225 79.25
9 6.5 92 598 42.25 8464 95.53
10 7 91 637 49 8281 100.95
Sum 49.4 786 4127 266.54 64764
Average 4.94 78.6 412.7 26.654 6476.4
26. 2 (d) Fit a linear model to the data
and calculate SSE for this model
ID X (Advertising) Y (Sales) xy x^2 y^2 Predicted Y Squared Error
1 5.5 90 495 30.25 8100 84.68 28.35
2 2 40 80 4 1600 46.70 44.92
3 3.2 55 176 10.24 3025 59.72 22.29
4 6 95 570 36 9025 90.10 24.00
5 3.8 70 266 14.44 4900 66.23 14.20
6 4.4 80 352 19.36 6400 72.74 52.69
7 6 88 528 36 7744 90.10 4.41
8 5 85 425 25 7225 79.25 33.05
9 6.5 92 598 42.25 8464 95.53 12.43
10 7 91 637 49 8281 100.95 99.01
Sum 49.4 786 4127 266.54 64764 335.36
Average 4.94 78.6 412.7 26.654 6476.4
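As a cross-check on the squared-error column, the SSE of a simple linear fit can also be obtained from the column totals alone. The sketch uses the standard identity SSE = S_yy - S_xy^2 / S_xx, which is not shown on the slides; the variable names are illustrative.

```python
# Column totals from the table.
n = 10
sum_x, sum_y = 49.4, 786.0
sum_xy, sum_xx, sum_yy = 4127.0, 266.54, 64764.0

x_bar, y_bar = sum_x / n, sum_y / n

# Corrected sums of squares and cross-products.
s_xy = sum_xy - n * x_bar * y_bar
s_xx = sum_xx - n * x_bar ** 2
s_yy = sum_yy - n * y_bar ** 2

# For a simple linear fit, SSE = S_yy - S_xy^2 / S_xx.
sse_linear = s_yy - s_xy ** 2 / s_xx
print(f"SSE (linear model) = {sse_linear:.2f}")
```

The result agrees with the 335.36 in the Sum row, which is the figure carried into part (e).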
27. 2(e) At 0.01 level of significance, determine
whether the curvilinear model is superior to the
linear regression model
Curvilinear model (complete, p = 2): $\hat{Y}_t = -17.22 + 31.79 X - 2.30 X^2$
Linear regression model (restricted, q = 1): $\hat{Y}_t = 25 + 10.85 X$
$F = \dfrac{SSE_{Linear} - SSE_{Curvilinear}}{SSE_{Curvilinear}} \times \dfrac{n-p-1}{p-q}$
$F = \dfrac{335 - 80}{80} \times \dfrac{10-2-1}{2-1} = 22.3125$
$F_{\alpha,(p-q,\,n-p-1)} = F_{0.01,(2-1,\,10-2-1)} = F_{0.01,(1,7)} = 12.25 < 22.3$
Significant, i.e., the curvilinear effect makes a significant
contribution and should be included in the model.
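The SSE-based form of the partial F test can be checked in the same way as the R²-based form used for the house price models. This is a sketch using the rounded SSE values (335 and 80) and the critical value 12.25 as they appear on the slide; `partial_f_from_sse` is an illustrative name.

```python
def partial_f_from_sse(sse_restricted, sse_complete, n, p, q):
    """Partial F statistic from the error sums of squares of nested models.

    p = predictors in the complete model, q = predictors in the restricted one.
    """
    return (sse_restricted - sse_complete) / sse_complete * (n - p - 1) / (p - q)


# Linear model (q = 1) against curvilinear model (p = 2), n = 10 observations.
f_stat = partial_f_from_sse(335.0, 80.0, n=10, p=2, q=1)

# Critical value F(0.01; 1, 7) as quoted on the slide.
F_CRITICAL = 12.25
quadratic_term_significant = f_stat > F_CRITICAL

print(f"F = {f_stat:.4f}, significant: {quadratic_term_significant}")
```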