1. BY
ANWESH BISWAS (17MB4009)
AMAAN ALI (17MB4022)
REGRESSION ANALYSIS
AND ITS APPLICATION IN BUSINESS
2. Regression Analysis. . .
It is the study of the
relationship between
variables.
It is one of the most
commonly used tools for
business analysis.
It is easy to use and
applies to many
situations.
3. TYPES OF REGRESSION…
Simple Regression: single explanatory variable
Multiple Regression: includes any number of
explanatory variables.
4. Dependent variable: the single variable being explained/ predicted by
the regression model
Independent variable: The explanatory variable(s) used to predict the
dependent variable.
Coefficients (β): values, computed by the regression tool, that describe the
relationship between each explanatory variable and the dependent variable.
Residuals (ε): the portion of the dependent variable that isn't explained
by the model; the model's under- and over-predictions.
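The four terms above can be tied together with a small sketch. It assumes the coefficient values reported in the worked example later in this deck (intercept 36.745, slope 5.639) and uses one observation from that example's data (6 hours, 93 marks); the function name `predict` is only illustrative.

```python
# Coefficients assumed from the deck's worked example (rounded Excel output).
INTERCEPT = 36.745   # b0
SLOPE = 5.639        # b1, coefficient of the independent variable

def predict(hours_studied):
    """Predicted value of the dependent variable (marks)."""
    return INTERCEPT + SLOPE * hours_studied

# One observation from the example data: 6 hours studied, 93 marks obtained.
observed_hours = 6      # independent (explanatory) variable
observed_marks = 93     # dependent variable

predicted = predict(observed_hours)
residual = observed_marks - predicted   # part of marks the model does not explain

print(f"predicted = {predicted:.3f}, residual = {residual:.3f}")
```

A positive residual means the model under-predicted this student's marks; a negative one would mean it over-predicted.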
5. TYPES OF REGRESSION ANALYSIS…
Linear Regression: straight-line relationship
Form: y = mx + b
Non-linear: implies curved relationships,
e.g. logarithmic relationships
Cross Sectional: data gathered from the same time
period
Time Series: Involves data observed over equally
spaced points in time.
6. Simple Linear Regression Model. . .
Only one independent
variable, x
Relationship between x
and y is described by a
linear function
Changes in y are
assumed to be caused
by changes in x
9. Estimated Regression Model. . .
The sample regression line provides an estimate of
the population regression line
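In symbols, the relationship between the two lines can be sketched as follows. The notation is an assumption chosen to match the β, ε, b0 and b1 used elsewhere in this deck; the slide itself does not show the formula.

```latex
% Population regression line (unknown parameters \beta_0, \beta_1, error term \varepsilon)
y = \beta_0 + \beta_1 x + \varepsilon

% Sample (estimated) regression line, with b_0 and b_1 computed from the data
\hat{y} = b_0 + b_1 x
```

Here b0 and b1 are the sample estimates of β0 and β1, and ŷ is the predicted value of y for a given x.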
10. EXAMPLE (USING EXCEL)
On a Friday, 22 students
in a class were asked to
record the numbers of
hours they spent
studying for a test on
Monday and the
numbers of hours they
spent watching
television. The results
are shown below.
Book2.xlsx
MARKS HOURS
40 1
44 1
51 2
58 3
49 3
48 4
64 4
55 5
69 5
58 5
75 5
68 6
63 6
93 6
84 7
67 7
90 8
76 8
95 9
72 9
85 9
98 10
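As a cross-check outside Excel, the least-squares line for the table above can be sketched in plain Python. This is a minimal sketch using the textbook formulas slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x); the function name `fit_simple_regression` is hypothetical, and the data lists are transcribed from the table.

```python
# Data transcribed from the MARKS / HOURS table.
hours = [1, 1, 2, 3, 3, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 8, 9, 9, 9, 10]
marks = [40, 44, 51, 58, 49, 48, 64, 55, 69, 58, 75,
         68, 63, 93, 84, 67, 90, 76, 95, 72, 85, 98]

def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

b0, b1 = fit_simple_regression(hours, marks)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

Up to rounding, the result should agree with the intercept and HOURS coefficient that Excel reports for the same data.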
11. GRAPHICAL REPRESENTATION
[Scatter plot: MARKS OBTAINED (vertical axis, 0 to 120) against HOURS STUDIED (horizontal axis, 0 to 12), with linear trendline y = 5.639x + 36.745]
12. ACTUAL ANALYSIS
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.86107
R Square 0.741442
Adjusted R Square 0.728514
Standard Error 8.976161
Observations 22
ANOVA
df SS MS F Significance F
Regression 1 4620.934 4620.934 57.35199 2.69E-07
Residual 20 1611.429 80.57147
Total 21 6232.364
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 36.74539 4.58186 8.019753 1.12E-07 27.18779 46.30298 27.18779 46.30298
HOURS 5.639037 0.744613 7.57311 2.69E-07 4.085801 7.192272 4.085801 7.192272
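The ANOVA rows of this output can be reproduced from the raw data. The sketch below assumes the 22 MARKS / HOURS pairs from the example and the usual decomposition SST = SSR + SSE, with F = MSR / MSE; variable names are illustrative, not Excel's.

```python
# Data transcribed from the example's MARKS / HOURS table.
hours = [1, 1, 2, 3, 3, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 8, 9, 9, 9, 10]
marks = [40, 44, 51, 58, 49, 48, 64, 55, 69, 58, 75,
         68, 63, 93, 84, 67, 90, 76, 95, 72, 85, 98]

n = len(hours)
k = 1  # number of independent variables
mean_x = sum(hours) / n
mean_y = sum(marks) / n

# Least-squares coefficients.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, marks))
sxx = sum((x - mean_x) ** 2 for x in hours)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
fitted = [b0 + b1 * x for x in hours]

# Sums of squares: total, regression (explained), residual (error).
sst = sum((y - mean_y) ** 2 for y in marks)
ssr = sum((f - mean_y) ** 2 for f in fitted)
sse = sum((y - f) ** 2 for y, f in zip(marks, fitted))

# Mean squares and F statistic, with df = k and n - k - 1.
msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse

print(f"SSR = {ssr:.3f}, SSE = {sse:.3f}, SST = {sst:.3f}, F = {f_stat:.3f}")
```

The degrees of freedom used here (1 and 20) correspond to the df column of the ANOVA table.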
13. Interpretation of the Intercept, b0
Marks Obtained = 36.745 + 5.639 (hours studied)
b0 is the estimated average value of Y when the
value of X is zero (if x = 0 is in the range of
observed x values)
Here no student reported zero hours (observed hours run
from 1 to 10), so 36.745 just indicates that, for hours
studied within the range observed, 36.745 is the portion
of the marks not explained by hours studied.
14. Interpretation of the Slope Coefficient, b1
Marks Obtained = 36.745 + 5.639 (hours studied)
b1 measures the estimated change in the average
value of Y as a result of a one-unit change in X
– Here, b1 = 5.639 tells us that marks increase
by 5.639, on average, for each additional hour
studied.
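The slope interpretation can be checked numerically. This sketch assumes the rounded coefficients from the Excel output (36.745 and 5.639); the function name `predicted_marks` is hypothetical.

```python
# Rounded coefficients assumed from the Excel output.
B0 = 36.745  # intercept
B1 = 5.639   # slope: change in average marks per extra hour studied

def predicted_marks(hours_studied):
    """Estimated average marks for a given number of hours studied."""
    return B0 + B1 * hours_studied

# Whatever the starting point, one extra hour adds B1 to the prediction.
for h in (2, 5, 9):
    gain = predicted_marks(h + 1) - predicted_marks(h)
    print(f"{h} -> {h + 1} hours: predicted marks rise by {gain:.3f}")
```

The predictions are most trustworthy for hours between 1 and 10, the range covered by the data.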
15. Coefficient of Determination, R2
Note: In the single independent variable case, the coefficient
of determination is
$R^2 = r^2$
where:
$R^2$ = Coefficient of determination
$r$ = Simple correlation coefficient
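For the study-hours example, this identity can be verified by computing the simple correlation coefficient directly and squaring it. The sketch below uses the standard Pearson formula on the example's 22 data pairs; variable names are illustrative.

```python
import math

# Data transcribed from the example's MARKS / HOURS table.
hours = [1, 1, 2, 3, 3, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 8, 9, 9, 9, 10]
marks = [40, 44, 51, 58, 49, 48, 64, 55, 69, 58, 75,
         68, 63, 93, 84, 67, 90, 76, 95, 72, 85, 98]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(marks) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, marks))
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in marks)

# Pearson (simple) correlation coefficient and its square.
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

print(f"r = {r:.5f}, r^2 = {r_squared:.6f}")
```

The two values correspond to the "Multiple R" and "R Square" lines of the Excel summary output.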
16. Examples of Approximate R2 Values
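$R^2 = 1$ (with $r = +1$ or $r = -1$)
[Two plots of y against x: all points on an upward-sloping line ($r = +1$) and all points on a downward-sloping line ($r = -1$)]
Perfect linear relationship
between x and y:
100% of the variation in y is
explained by variation in x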
17. Examples of Approximate R2 Values
$R^2 = 0$
[Plot of y against x showing no linear pattern]
No linear relationship
between x and y:
The value of y does not
depend on x. (None of the
variation in y is explained
by variation in x)
18. OUTPUT
$R^2 = \frac{SSR}{SST} = \frac{4620.9341}{6232.3636} = 0.7414$
R Square 0.741441681
Adjusted R Square 0.728513765
Standard Error 8.976161388
Observations 22
ANOVA
df SS
Regression 1 4620.934171
Residual 20 1611.429465
Total 21 6232.363636
THIS MEANS THAT
74.14% OF VARIATION
IN MARKS CAN BE
EXPLAINED BY
VARIATION IN STUDY
HOURS
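The 74.14% figure is simple arithmetic on the ANOVA sums of squares. A minimal sketch, using the SS values exactly as printed in the output above:

```python
# Sums of squares copied from the ANOVA table.
ss_regression = 4620.934171   # SSR: variation explained by study hours
ss_residual = 1611.429465     # variation left unexplained
ss_total = 6232.363636        # SST: total variation in marks

r_squared = ss_regression / ss_total
unexplained_share = ss_residual / ss_total

print(f"R^2 = {r_squared:.4f} ({r_squared:.2%} of variation explained)")
print(f"Unexplained share = {unexplained_share:.2%}")
```

The explained and unexplained shares add up to 100%, because the regression and residual sums of squares add up to the total.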
19. Standard Error of Estimate. . .
The standard deviation of the variation of
observations around the regression line is
estimated by
$s_\varepsilon = \sqrt{\frac{ESS}{n - k - 1}}$
where
ESS = error sum of squares
n = sample size
k = number of independent variables in the
model
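Applied to the study-hours example, the formula is a one-line calculation. This sketch takes ESS as the residual sum of squares from the ANOVA output (1611.429465), with n = 22 observations and k = 1 independent variable.

```python
import math

ess = 1611.429465  # error (residual) sum of squares from the ANOVA output
n = 22             # sample size
k = 1              # number of independent variables (hours studied)

# Standard error of estimate: square root of the mean squared error.
standard_error = math.sqrt(ess / (n - k - 1))

print(f"standard error of estimate = {standard_error:.4f}")
```

The divisor n - k - 1 = 20 is the residual degrees of freedom shown in the df column of the ANOVA table.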
20. OUTPUT
R Square 0.741441681
Adjusted R Square 0.728513765
Standard Error 8.976161388
Observations 22
ANOVA
df SS
Regression 1 4620.934171
Residual 20 1611.429465
Total 21 6232.363636
$s_\varepsilon = 8.9761$