Scatter Plot
scatter plot - a graphical representation of the relationship
between two variables

Study Session 3, Reading 11
Correlation and Covariance Analysis
Correlation analysis - expresses the relationship between two
variables with the help of a single number. It measures both
the extent and direction of the linear relationship between
two variables.
Formula: Sample covariance of X and Y for a sample of size n
can be calculated as:
$$\mathrm{Cov}(X,Y) = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{n-1}$$

Correlation and Covariance Analysis (cont.)
Formula: Sample Correlation Coefficient:
$$r = \frac{\mathrm{Cov}(X,Y)}{s_X \, s_Y}$$
Where: $s_X$ - sample standard deviation of variable X
$s_Y$ - sample standard deviation of variable Y

Formula: Sample Standard Deviation
$$s_X = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}}$$

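A minimal sketch of the sample covariance, sample standard deviation and sample correlation coefficient, all using the n - 1 divisor. The function names and the five data points are hypothetical, chosen only for illustration.

```python
import math

def sample_covariance(x, y):
    """Sample covariance of x and y, using the n - 1 divisor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

def sample_std(x):
    """Sample standard deviation: square root of the covariance of x with itself."""
    return math.sqrt(sample_covariance(x, x))

def sample_correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return sample_covariance(x, y) / (sample_std(x) * sample_std(y))

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
cov_xy = sample_covariance(x, y)   # 1.5 for this data
r_xy = sample_correlation(x, y)    # about 0.775 for this data
```

The correlation is unit-free and always lies between -1 and +1, whereas the covariance depends on the units of X and Y.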
Limitations to Correlation Analysis
Outliers
Outliers are a small number of observations that are at either
extreme of a sample
Spurious Correlation
• The correlation between two variables that reflects a chance
relationship in a particular data set is called spurious
correlation.
• The correlation between two variables that arises not from a
direct relationship between them but from their relation to a third
variable is also called spurious correlation.

Hypothesis Testing For Population
Correlation Coefficient
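Proposed Hypothesis:
• null hypothesis - $H_0$, that the population correlation is 0 ($\rho = 0$)
• alternative hypothesis - $H_a$, that the population correlation
is different from 0 ($\rho \neq 0$)
Formula: t-test
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
Where: r - sample correlation coefficient
n - sample size (the test has n - 2 degrees of freedom)
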
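A short sketch of the t statistic for testing whether a population correlation is 0, t = r·sqrt(n - 2) / sqrt(1 - r²) with n - 2 degrees of freedom. The function name and the inputs (r = 0.5, n = 27) are hypothetical; the critical value still has to be looked up in a t table for the chosen significance level.

```python
import math

def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, given sample correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical inputs, for illustration only.
t_stat = correlation_t_stat(r=0.5, n=27)   # about 2.887, with 25 degrees of freedom
```

The null hypothesis is rejected when the absolute value of the statistic exceeds the critical t value.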
Dependent and Independent
Variables in Linear Regression
independent variable (denoted as X) - the variable that is used
to explain changes
dependent variable (denoted as Y) - the variable that is to be
explained.
Linear regression involves the use of one variable to make a
prediction about another variable. It also involves testing
hypotheses about the relation between the two variables and
quantifying the strength of the relationship between them.

Dependent and Independent
Variables in Linear Regression (cont.)
Regression equation that defines the linear relation between the
dependent and independent variable:
$$Y_i = b_0 + b_1 X_i + \varepsilon_i, \quad i = 1, \ldots, n$$
Where: $Y_i$ - dependent variable
$b_0$ - intercept
$b_1$ - slope coefficient
$X_i$ - independent variable
$\varepsilon_i$ - error term

Dependent and Independent
Variables in Linear Regression (cont.)
In linear regression, the estimated or fitted parameters $\hat{b}_0$ and $\hat{b}_1$ are
chosen to minimize the sum of squared errors:
$$\sum_{i=1}^{n}\left(Y_i - \hat{b}_0 - \hat{b}_1 X_i\right)^2$$
cross-sectional data - uses many observations on the dependent
and independent variables for the same time period
time-series data - uses many observations from different time
periods

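A minimal sketch of how the fitted parameters can be obtained. For one independent variable, minimizing the sum of squared errors has a closed-form solution: the slope is the covariance of X and Y divided by the variance of X, and the intercept makes the line pass through the point of means. The function name and data are hypothetical.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1) that minimize the sum of squared errors of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx              # slope coefficient
    b0 = mean_y - b1 * mean_x     # intercept
    return b0, b1

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0_hat, b1_hat = fit_simple_ols(x, y)   # about 2.2 and 0.6 for this data
```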
Assumptions of a Classical Linear
Regression Model
1. There is a linear relationship between the independent and
dependent variable.
2. The independent variable is not random.
3. The expected value of the error term is 0.
4. The error term is normally distributed.
5. The error term is uncorrelated across observations.
6. The variance of the error term is the same for all
observations (Homoskedasticity Assumption).

Standard Error of Estimate
Standard Error of Estimate (also called the standard error of
regression) - used to measure how accurately a regression
model fits the data.
Formula:
$$SEE = \sqrt{\frac{\sum_{i=1}^{n}\left(Y_i - \hat{b}_0 - \hat{b}_1 X_i\right)^2}{n-2}}$$

Coefficient of Determination
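coefficient of determination - measures the proportion of the
variance in the dependent variable that is explained by the
independent variable
Formula:
$$R^2 = \frac{\text{explained variation}}{\text{total variation}} = 1 - \frac{\text{unexplained variation}}{\text{total variation}}$$
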
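A sketch of both fit measures for a regression with one independent variable: the standard error of estimate divides the sum of squared errors by n - 2 and takes the square root, and the coefficient of determination is one minus unexplained over total variation. The function name, the data and the coefficient values (2.2 and 0.6) are hypothetical.

```python
import math

def see_and_r2(x, y, b0, b1):
    """Return (standard error of estimate, R-squared) for the line b0 + b1 * x."""
    n = len(y)
    mean_y = sum(y) / n
    fitted = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
    tss = sum((yi - mean_y) ** 2 for yi in y)                # total variation
    see = math.sqrt(sse / (n - 2))
    r2 = 1 - sse / tss
    return see, r2

# Hypothetical data and hypothetical fitted coefficients.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
see, r2 = see_and_r2(x, y, b0=2.2, b1=0.6)   # about 0.894 and 0.6
```

A smaller standard error of estimate and a larger R² both indicate a closer fit.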
Confidence Interval for Regression
Coefficient
regression coefficient - the average change in the dependent variable
for every one-unit change in the value of the independent variable.
Things needed to estimate a confidence interval for the regression coefficient:
• estimated parameter value for a sample
• standard error of the estimated coefficient
• significance level for the t-distribution
• degrees of freedom (n - 2)

Formula:
$$\hat{b}_1 \pm t_c \, s_{\hat{b}_1}$$
Where: $t_c$ - critical t value at a chosen significance level
$s_{\hat{b}_1}$ - standard error of the estimated coefficient

Hypothesis Testing for a Population
Value of the Regression Coefficient
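Formula: When testing a hypothesis using a regression model
with a t-test of significance, the t statistic is computed as:
$$t = \frac{\hat{b}_1 - b_1}{s_{\hat{b}_1}}$$
Where: $\hat{b}_1$ - estimated coefficient
$b_1$ - hypothesized population value of the coefficient
$s_{\hat{b}_1}$ - standard error of the estimated coefficient
Formula: The confidence interval for the test is given as:
$$\hat{b}_1 \pm t_c \, s_{\hat{b}_1}$$
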
Calculating a Predicted Value for the
Dependent Variable
Two sources of uncertainty in using a regression model:
1. the error term
2. the estimated parameters ($\hat{b}_0$ and $\hat{b}_1$)
Given the regression model $Y_i = b_0 + b_1 X_i + \varepsilon_i$, if the estimated
parameters $\hat{b}_0$ and $\hat{b}_1$ are known, the predicted value of the
dependent variable, Y, can be calculated as:
$$\hat{Y} = \hat{b}_0 + \hat{b}_1 X$$

Calculating a Predicted Value for the
Dependent Variable (cont.)
The prediction interval for a regression equation for a particular
predicted value of the dependent variable is computed as:
$$\hat{Y} \pm t_c \, s_f$$
Where: $s_f$ - square root of the estimated variance of the prediction error
$t_c$ - critical value of the t-statistic at the chosen significance level

The confidence level is taken as $1 - \alpha$, where $\alpha$ is the chosen significance level.

Calculating a Predicted Value for the
Dependent Variable (cont.)
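The estimated variance of the prediction error ($s_f^2$) of Y is
calculated as:
$$s_f^2 = s^2\left[1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{(n-1)\,s_X^2}\right]$$
Where: $s^2$ - squared standard error of estimate
$s_X^2$ - variance of the independent variable
$X$ - value of the independent variable used for the prediction
$n$ - number of observations
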
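A sketch of a prediction interval for one independent variable, using s_f² = s²[1 + 1/n + (X - X̄)² / ((n - 1)·s_X²)] and the interval Ŷ ± t_c·s_f. The function name, the data, the coefficients (2.2 and 0.6) and the standard error of estimate are hypothetical. The critical value 3.182 is the two-tailed 5% t value for 3 degrees of freedom and is passed in by hand rather than computed.

```python
import math

def prediction_interval(x, x_new, y_hat, see, t_crit):
    """Return (lower, upper) bounds of the prediction interval around y_hat."""
    n = len(x)
    mean_x = sum(x) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)   # sample variance of X
    # Estimated variance of the prediction error.
    s_f_sq = see ** 2 * (1 + 1 / n + (x_new - mean_x) ** 2 / ((n - 1) * var_x))
    s_f = math.sqrt(s_f_sq)
    return y_hat - t_crit * s_f, y_hat + t_crit * s_f

# Hypothetical inputs, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
x_new = 6.0
y_hat = 2.2 + 0.6 * x_new            # predicted value from assumed coefficients
low, high = prediction_interval(x, x_new, y_hat, see=math.sqrt(0.8), t_crit=3.182)
```

The interval widens as the chosen X moves further from the sample mean of X, which is why forecasts far outside the estimation range are less reliable.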
Calculating ANOVA
in Regression Analysis
Analysis of Variance (ANOVA) - a statistical procedure that is
used to determine how well the independent variable or
variables explain the variation in the dependent variable.
F-test - the statistical test that is used in the analysis of
variance

F-test
An F-statistic is used to test whether the slope coefficients in a
linear regression are equal to 0 or not.
In a regression equation with one independent variable:
• Null Hypothesis $H_0: b_1 = 0$
• Alternative Hypothesis $H_a: b_1 \neq 0$

Things required to undertake an F-test
1. the total number of observations
2. the total number of parameters to be estimated
3. the sum of squared errors (SSE)
4. the regression sum of squares (RSS)

F-test (cont.)
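Formula: SSE
$$SSE = \sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2$$
Formula: RSS
$$RSS = \sum_{i=1}^{n}\left(\hat{Y}_i - \bar{Y}\right)^2$$
Formula: Total Variation
$$TSS = \sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2 = SSE + RSS$$
Formula: F-statistic in a regression with one independent variable
$$F = \frac{RSS/1}{SSE/(n-2)}$$
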
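A sketch of the ANOVA quantities: SSE is the sum of squared differences between actual and fitted values, RSS is the sum of squared differences between fitted values and the mean of Y, and F = (RSS / k) / (SSE / (n - k - 1)), with k = 1 for one independent variable. The function name, the data and the fitted values are hypothetical.

```python
def anova_f(y, fitted, k=1):
    """Return (SSE, RSS, F) for actual values y and fitted values from k regressors."""
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
    rss = sum((fi - mean_y) ** 2 for fi in fitted)           # explained variation
    f_stat = (rss / k) / (sse / (n - k - 1))
    return sse, rss, f_stat

# Hypothetical actual values and fitted values from the line 2.2 + 0.6 * x.
y = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
sse, rss, f_stat = anova_f(y, fitted)   # about 2.4, 3.6 and 4.5
```

Because the fitted values come from a least-squares line with an intercept, SSE + RSS equals the total variation of Y around its mean (6.0 for this data).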
Limitations to Regression Analysis
Parameter instability: regression relations can change over time.
In investment analysis, regression models can have limited use
because public knowledge of regression relationships can
negate their usefulness in the future.
Violations of assumptions can make hypothesis tests and
predictions invalid.

Multiple Regression Equation
multiple regression equation - used to determine how a dependent
variable is affected by more than one independent variable
log-log regression model - used when proportional changes in the
dependent variable bear a constant relationship to proportional
changes in the independent variables
General Form of the Multiple Regression Model
$$Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki} + \varepsilon_i, \quad i = 1, 2, \ldots, n$$
Where: $Y_i$ - the ith observation of the dependent variable Y
$X_{ji}$ - the ith observation of the independent variable $X_j$, j = 1, 2, …, k
$b_0$ - the intercept of the equation
$b_1, \ldots, b_k$ - the slope coefficients for each of the independent variables
$\varepsilon_i$ - the error term
n - the number of observations

Hypothesis Testing for a Population
Value of a Regression Coefficient
Under the null hypothesis, the hypothesized population value of a
regression coefficient is taken as 0.
The degrees of freedom in the test are the number of observations
minus (the number of independent variables + 1), i.e., n - (k + 1).
Formula: Hypothesis testing using t-test:
$$t = \frac{\hat{b}_j - b_j}{s_{\hat{b}_j}}$$
Where: $\hat{b}_j$ - regression estimate of the coefficient $b_j$
$b_j$ - hypothesized value of the coefficient
$s_{\hat{b}_j}$ - estimated standard error of $\hat{b}_j$
Study Session 3, Reading 12
Hypothesis Testing for a Population
Value of a Regression Coefficient (cont.)
p-value
The p-value for a regression coefficient is the smallest level of
significance at which the null hypothesis that the population
value of the coefficient is 0 can be rejected in a two-sided test.
The lower the p-value, the stronger the evidence against the null hypothesis.

Confidence Interval for the Population
Value and Predicted Value for the
Dependent Variable
Two types of uncertainty in predicting the dependent variable
using a linear regression model:
• the regression model itself, because of the standard error of estimate
• uncertainty about the estimates of the regression model parameters

The computation of the prediction interval to accommodate
these uncertainties is done with the help of matrix algebra.

Points to be considered for predicting
a dependent variable
Assumptions required for using a regression model must be
met.
Caution should be exercised on predictions that are based on
values of the independent variables that are outside the range
of data used for estimating the model.

Steps in predicting the value of the
dependent variable
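1. Obtaining estimates of the regression parameters ($\hat{b}_0, \hat{b}_1, \ldots, \hat{b}_k$).
2. Determining assumed values of the independent variables ($\hat{X}_{1i}, \ldots, \hat{X}_{ki}$).
3. Computing the predicted value of the dependent variable
using the equation:
$$\hat{Y}_i = \hat{b}_0 + \hat{b}_1 \hat{X}_{1i} + \cdots + \hat{b}_k \hat{X}_{ki}$$
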
Assumptions of a Multiple
Regression Model
1. There exists a linear relationship between the dependent variable
and the independent variables.
2. There is no exact linear relationship between two or more of the
independent variables, and the independent variables are not
random.
3. The error term is normally distributed.
4. The error term is uncorrelated across observations.
5. The variance of the error term is the same for all of the
observations.
6. The expected value of the error term, conditioned upon the
independent variables, is 0.

F-statistic in Regression Analysis
F-statistic - used to test whether at least one of the slope
coefficients of the independent variables is not equal to 0
null hypothesis - all the slope coefficients in the multiple
regression model are equal to 0, presented as:
$$H_0: b_1 = b_2 = \cdots = b_k = 0$$
alternative hypothesis - at least one slope coefficient is not
equal to 0:
$$H_a: \text{at least one } b_j \neq 0$$

F-statistic in Regression Analysis (cont.)
Things required for F-test
Total number of observations (n).
Total number of regression coefficients to be estimated (k + 1),
where k is the number of slope coefficients.
Sum of squared errors (SSE) (Unexplained Variation)
$$SSE = \sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2$$
Regression sum of squares (RSS) (Explained Variation)
$$RSS = \sum_{i=1}^{n}\left(\hat{Y}_i - \bar{Y}\right)^2$$

F-statistic in Regression Analysis (cont.)
Calculating the F-statistic
$$F = \frac{RSS/k}{SSE/[n-(k+1)]}$$
Degrees of freedom in the test
1) k (numerator degrees of freedom)
2) n - (k + 1) (denominator degrees of freedom)

R2 and Adjusted R2
in Multiple Regression
$R^2$ measures how well a regression model with one
independent variable fits the data.
Adjusted $R^2$ ($\bar{R}^2$) is used in place of $R^2$ when there is more than
one independent variable.
Relationship:
$$\bar{R}^2 = 1 - \left(\frac{n-1}{n-k-1}\right)\left(1 - R^2\right)$$
Where: n - the number of observations
k - number of independent variables

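A one-function sketch of the relationship between R² and adjusted R², using adjusted R² = 1 - ((n - 1) / (n - k - 1))·(1 - R²). The function name and the inputs (R² = 0.75, n = 30, k = 4) are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k independent variables."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r2)

# Hypothetical inputs, for illustration only.
adj = adjusted_r2(r2=0.75, n=30, k=4)   # about 0.71
```

For k of at least 1, adjusted R² is never larger than R², so adding a variable that explains little can lower it.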
Dummy Variables
Dummy variables - used in regression models to determine
whether a qualitative independent variable explains the
dependent variable
A dummy variable has a value of 1 if a particular qualitative
condition is true and 0 if that condition is false.
In order to distinguish between n categories, n - 1 dummy
variables are required.

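A small sketch of the n - 1 rule for dummy variables: with four quarters, three dummies are enough, and the omitted category (here the last one, by assumption) becomes the base case captured by the intercept. The function name and the quarter labels are hypothetical.

```python
def make_dummies(values, categories):
    """Encode each value as n - 1 zero/one dummies; the last category is omitted."""
    kept = categories[:-1]   # the last category serves as the base case
    return [[1 if value == category else 0 for category in kept] for value in values]

# Hypothetical quarterly observations.
quarters = ["Q1", "Q2", "Q3", "Q4"]
observations = ["Q1", "Q3", "Q4", "Q2"]
rows = make_dummies(observations, quarters)
# rows is [[1, 0, 0], [0, 0, 1], [0, 0, 0], [0, 1, 0]]
```

Including all n dummies together with an intercept would create an exact linear relationship among the independent variables.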
Heteroskedasticity and its Effect on
Statistical Inference
Heteroskedasticity - a violation of the regression assumption
that the variance of the errors in a regression is constant
across observations.
Two types of heteroskedasticity :
1. unconditional heteroskedasticity
2. conditional heteroskedasticity

Breusch-Pagan test - widely used when testing for conditional
heteroskedasticity.
Two methods used for correcting conditional heteroskedasticity:
1. Robust Standard Errors
2. Generalized Least Squares
Heteroskedasticity and its Effect on
Statistical Inference (cont.)
Durbin-Watson test - test used to detect serial correlation, which
generally arises in time-series regressions
Consequences of Heteroskedasticity
The F-test does not provide reliable results.
t-tests for the significance of individual regression coefficients
do not provide reliable results.
Standard errors and test statistics have to be adjusted in
order to derive reliable results.

Unconditional Heteroskedasticity
and Conditional Heteroskedasticity
Unconditional heteroskedasticity arises when the
heteroskedasticity of an error variance does not correlate with
the independent variables. This heteroskedasticity is not a
major problem for statistical inference.
Conditional heteroskedasticity arises when heteroskedasticity
in the error variance is correlated with the independent
variables. This heteroskedasticity is a major problem for
statistical inference.

Methods for Correcting for
Heteroskedasticity
1. Under the robust standard error method, the standard
errors of a linear regression model’s estimated coefficients
are corrected.
2. Under the generalized least squares method, the original
equation is modified and the new modified regression
equation is estimated.

Consequences of Serial Correlation
Incorrect estimates of the regression coefficient standard
errors.
If an independent variable is a lagged value of the dependent
variable, serial correlation makes the parameter estimates invalid.
In positive serial correlation, a positive (negative) error for
one observation increases the chance of a positive (negative) error for
another observation.
Positive serial correlation has no effect on the consistency of
estimated regression coefficients, but affects the validity of
statistical tests.

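Durbin-Watson Test
Formula:
$$DW = \frac{\sum_{t=2}^{T}\left(\hat{\varepsilon}_t - \hat{\varepsilon}_{t-1}\right)^2}{\sum_{t=1}^{T}\hat{\varepsilon}_t^2}$$
Where: $\hat{\varepsilon}_t$ - regression residual for period t
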
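A sketch of the Durbin-Watson statistic: the sum of squared changes in successive residuals divided by the sum of squared residuals. The function name and the two residual series are hypothetical, constructed to show the extremes. Values near 2 suggest no serial correlation, values well below 2 suggest positive serial correlation, and values well above 2 suggest negative serial correlation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a list of regression residuals in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residual series, for illustration only.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])   # 3.0: signs flip each period
dw_persistent = durbin_watson([1.0, 1.0, 1.0, 1.0])      # 0.0: errors fully persistent
```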
Methods to correct
for Serial Correlation
1. The coefficient standard errors for the linear regression
parameter estimates can be adjusted.
2. The regression equation can be modified to eliminate serial
correlation.

Multicollinearity
in Regression Analysis
Multicollinearity - a violation of the regression assumption that
there is no exact linear relationship between two or more
independent variables
Consequences of Multicollinearity
Estimates of regression coefficients become unreliable.
It is not possible to ascertain how individual independent
variables affect dependent variables.

Model Misspecification
in Regression Analysis
Model specification - the set of variables that are included in the
regression and the regression equation’s functional form
Misspecified Functional Form
One or more important variables are omitted from the regression.
One or more regression variables need to be transformed
before estimating the regression.
Data have been pooled from different samples that should not be
pooled.

Model Misspecification
in Regression Analysis (cont.)
Reasons for time-series misspecification
Inclusion of lagged dependent variables as independent variables in
regressions which have serially correlated errors.
The dependent variable being included as an independent variable.
Independent variables that are measured with error.

Models With Qualitative Dependent
Variables
Qualitative dependent variables are dummy variables that are used as
dependent variables.
1. Probit model - used to estimate the probability of a discrete
outcome, given the values of the independent variables used to explain
the outcome, based on the normal distribution
2. Logit model - used to estimate the probability of a discrete outcome,
given the values of the independent variables used to explain the outcome,
based on the logistic distribution

Calculating the Predicted Trend Value
for a Time Series
Linear Trend Models - the dependent variable changes by a
constant amount each period
Formula:
$$y_t = b_0 + b_1 t + \varepsilon_t$$
Where: $y_t$ - value of the time series at time t
$b_0$ - the y-intercept term
$b_1$ - the slope coefficient (trend coefficient)
t - time (independent variable)
$\varepsilon_t$ - a random error term
Study Session 3, Reading 13
Calculating the Predicted Trend Value
for a Time Series (cont.)
Log-Linear Trend Models - used when the time series tends to
grow at a constant rate
Formula:
$$\ln y_t = b_0 + b_1 t + \varepsilon_t$$
Predicted trend value of $y_t$:
$$\hat{y}_t = e^{\hat{b}_0 + \hat{b}_1 t}$$

Limitations of the Use of Trend Models
for a Given Time Series
Trend models can suffer from the limitation of serially
correlated errors.
If trend models have errors that are serially correlated, better
forecast models for such time series are required than trend
models.

Covariance Stationary
The following should be finite and constant in all periods:
• Expected value of the time series.
• Variance of the time series.
• Covariance of the time series with itself for a fixed number of periods in
the past or future.

Implications if the Time Series is not Covariance Stationary
• Estimates of an autoregressive time-series model obtained by linear regression will
not be valid.
• Hypothesis tests will provide invalid results.

Structure of an Autoregressive
Model of Order p
In an autoregressive model, a time series is regressed on its past
values, showing the relationship between current-period values
and past-period values.
pth-order Autoregressive Model:
$$x_t = b_0 + b_1 x_{t-1} + b_2 x_{t-2} + \cdots + b_p x_{t-p} + \varepsilon_t$$
First Order Autoregression
$$x_t = b_0 + b_1 x_{t-1} + \varepsilon_t$$

Autocorrelation for Time Series
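Autocorrelation of a time series - the correlation of the time
series with its past values
Formula (estimated kth-order autocorrelation):
$$\hat{\rho}_k = \frac{\sum_{t=k+1}^{T}(x_t - \bar{x})(x_{t-k} - \bar{x})}{\sum_{t=1}^{T}(x_t - \bar{x})^2}$$
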
Autocorrelation for Error Term
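Error autocorrelation is estimated by using the sample
autocorrelations of the residuals, called residual
autocorrelations, and their sample variance.
Formula:
$$\hat{\rho}_{\varepsilon,k} = \frac{\sum_{t=k+1}^{T}\hat{\varepsilon}_t \, \hat{\varepsilon}_{t-k}}{\sum_{t=1}^{T}\hat{\varepsilon}_t^2}$$
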
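A sketch of a sample autocorrelation at lag k: the sum of cross-products of deviations from the mean, k periods apart, divided by the sum of squared deviations. Applying the same function to regression residuals gives an estimate of the residual autocorrelation; for residuals with a mean of zero, this matches the version without mean adjustment. The function name and the series are hypothetical.

```python
def sample_autocorrelation(series, k):
    """Estimated kth-order autocorrelation of a series given in time order."""
    n = len(series)
    mean = sum(series) / n
    deviations = [value - mean for value in series]
    numerator = sum(deviations[t] * deviations[t - k] for t in range(k, n))
    denominator = sum(d ** 2 for d in deviations)
    return numerator / denominator

# Hypothetical trending series, for illustration only.
rho_1 = sample_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], k=1)   # 0.4
```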
Mean Reversion
A time series shows mean reversion if it tends to rise when its level
is below its mean and fall when its level is above its mean.
Formula: Mean Reverting Level
$$x_t = \frac{b_0}{1 - b_1}$$

Mean Reversion (cont.)
Interpretation of Mean Reversion Level
If the current value of time series is b0 /(1 – b1 ) , it will neither
increase nor decrease.
If the current value is below b0 /(1 – b1 ) , the time series will
increase.
If the current value is above b0 /(1 – b1 ), the time series will
decrease.

Mean Reversion (cont.)
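Multiple Periods of Forecasting and the Chain Rule of Forecasting

Formula: AR Model (one-period-ahead forecast)
$$\hat{x}_{t+1} = \hat{b}_0 + \hat{b}_1 x_t$$
Formula: Two-period ahead forecast
$$\hat{x}_{t+2} = \hat{b}_0 + \hat{b}_1 \hat{x}_{t+1}$$
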
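A sketch of the mean-reverting level b0 / (1 - b1) and of the chain rule of forecasting for a first-order autoregressive model, where each forecast is fed back in as the input for the next period. The function names, the coefficients (b0 = 1.0, b1 = 0.5) and the current value 4.0 are hypothetical.

```python
def mean_reverting_level(b0, b1):
    """Level at which an AR(1) series is expected neither to rise nor to fall."""
    return b0 / (1 - b1)

def chain_forecast(b0, b1, current_value, periods):
    """Forecast several periods ahead by reusing each forecast as the next input."""
    forecasts = []
    value = current_value
    for _ in range(periods):
        value = b0 + b1 * value
        forecasts.append(value)
    return forecasts

# Hypothetical AR(1) coefficients and current value.
level = mean_reverting_level(b0=1.0, b1=0.5)                          # 2.0
forecasts = chain_forecast(b0=1.0, b1=0.5, current_value=4.0, periods=2)
# forecasts is [3.0, 2.5]: the series starts above 2.0 and moves down toward it
```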
In-Sample and Out-of-Sample Forecasts
In-sample forecasts can be defined as the in-sample predicted
values from the estimated time series model.
Out-of-sample forecasts are made from estimated time-series
models for a period that is different from the period from
which the model was estimated.
Root Mean Squared Error (RMSE) (calculated as the square root of the
average squared error) - used for comparing the out-of-sample
forecasting accuracy of different time series models.

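A sketch of the RMSE calculation, the square root of the average squared forecast error. The function name and the actual and forecast values are hypothetical.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error between actual values and forecasts."""
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical out-of-sample values and forecasts.
error = rmse(actual=[1.0, 2.0, 3.0], forecast=[1.0, 2.0, 6.0])   # square root of 3
```

When comparing models on the same out-of-sample period, the model with the smaller RMSE has the more accurate forecasts.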
Instability of Coefficients
in Time-Series Models
Coefficients of time-series models are generally unstable across different sample periods.
Estimated coefficients can differ between models that are estimated on longer
or shorter sample periods.
The estimates therefore depend upon the sample period chosen.

Random Walk
random walk - a time series model in which the value of a series
in one period is calculated as the value of the series in the
previous period plus an unpredictable random error
Formula:
$$x_t = x_{t-1} + \varepsilon_t$$
A random walk with a drift is expected to increase or decrease by a constant
amount in each period
Formula:
$$x_t = b_0 + x_{t-1} + \varepsilon_t, \quad b_0 \neq 0$$

Random Walk (cont.)
First-differencing - differencing a time series by creating a new
time series that in each period is equal to the difference
between $x_t$ and $x_{t-1}$.
Formula:
$$y_t = x_t - x_{t-1}$$

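A sketch of first-differencing: each element of the new series is the current value minus the previous value, so the result is one observation shorter than the original. The function name and the series are hypothetical.

```python
def first_difference(series):
    """Return the series of period-to-period changes."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# Hypothetical series, for illustration only.
changes = first_difference([10.0, 12.0, 11.0, 15.0])
# changes is [2.0, -1.0, 4.0]
```

For a random walk, the first-differenced series is just the error term, plus the drift constant when the walk has a drift.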
Dickey Fuller Unit Root Test
Formula:
$$x_t - x_{t-1} = b_0 + g_1 x_{t-1} + \varepsilon_t$$
Where: $g_1 = (b_1 - 1)$
Null Hypothesis is $H_0: g_1 = 0$ (the series has a unit root)
Alternative Hypothesis is $H_a: g_1 < 0$

Seasonality in a Time-Series Model
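Seasonality of a time series occurs when regular patterns of
movement within the year are observed.
Formula: Seasonal lag in autoregressive model (shown for quarterly data, where the seasonal lag is the fourth lag)
$$x_t = b_0 + b_1 x_{t-1} + b_2 x_{t-4} + \varepsilon_t$$
Formula: Forecasted Value
$$\hat{x}_t = \hat{b}_0 + \hat{b}_1 x_{t-1} + \hat{b}_2 x_{t-4}$$
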
ARCH Models
Autoregressive Conditional Heteroskedasticity (ARCH) - exists if the
variance of errors in a time series model depends on the
variance of previous errors
Formula: Linear regression of the squared errors
$$\hat{\varepsilon}_t^2 = a_0 + a_1 \hat{\varepsilon}_{t-1}^2 + u_t$$
Where: $u_t$ = error term

ARCH Models (cont.)
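Predicting Variance of Errors
Formula:
$$\hat{\sigma}_{t+1}^2 = \hat{a}_0 + \hat{a}_1 \hat{\varepsilon}_t^2$$
Formula: Calculate the variance of the error term in the current period
$$\hat{\sigma}_t^2 = \hat{a}_0 + \hat{a}_1 \hat{\varepsilon}_{t-1}^2$$
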
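A sketch of predicting the error variance under a first-order ARCH model, where next period's variance is estimated as a0 + a1 times the square of the latest residual. The function name, the coefficients (a0 = 0.5, a1 = 0.25) and the residual value 2.0 are hypothetical.

```python
def arch1_next_variance(a0, a1, last_residual):
    """Predicted variance of next period's error under an ARCH(1) model."""
    return a0 + a1 * last_residual ** 2

# Hypothetical ARCH(1) coefficients and latest residual.
predicted_variance = arch1_next_variance(a0=0.5, a1=0.25, last_residual=2.0)   # 1.5
```

A larger residual in the current period leads to a larger predicted variance for the next period whenever a1 is positive.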
Analysis of Time-Series Variables
Prior To Linear Regression
Two time series - said to be cointegrated if there is a long-term
financial or economic relationship between the two
variables such that they do not diverge from each other without
bound in the long run.
The (Engle-Granger) Dickey-Fuller test is used to determine
whether time series are cointegrated.

Analysis of the Appropriate
Time-Series Model Given an
Investment Problem
Regression models or time series models can be used in the
analysis of investment problems.
In a regression model, the future value of a variable
is predicted on the basis of a hypothesized causal
relationship with other variables.
In a time series model, the future behavior of the variable is
predicted on the basis of the past behavior of that variable.

Explanation of the Dependent Variable
by Analysing the Regression Equation
and ANOVA Table Key
analysis of variance (ANOVA) - used to provide information
about a regression model’s explanatory power
F-statistics are used to test the explanatory power of the independent
variables
If the independent variables do not explain the dependent variable, the
value of the F-statistic is 0.
Variability in values of the dependent variable can be divided into
two parts:
Total Sum of Squares = Regression Sum of Squares + Residual Sum of
Squares

Uses of Multiple Regression Analysis
in Financial Analysis
Used in various finance and investment decisions
The effect of various parameters on investment decisions can be
measured
Used to predict the expected return of a fund or portfolio
Dummy variables can be used in various financial analysis models
If there are any violations of assumptions, analysts should
correct for them before making any decisions

Mais conteúdo relacionado

Mais procurados

Logistic Regression in Case-Control Study
Logistic Regression in Case-Control StudyLogistic Regression in Case-Control Study
Logistic Regression in Case-Control StudySatish Gupta
 
Wilcoxon Signed Rank Test
Wilcoxon Signed Rank Test Wilcoxon Signed Rank Test
Wilcoxon Signed Rank Test Sharlaine Ruth
 
General Linear Model | Statistics
General Linear Model | StatisticsGeneral Linear Model | Statistics
General Linear Model | StatisticsTransweb Global Inc
 
Generalized Linear Models for Between-Subjects Designs
Generalized Linear Models for Between-Subjects DesignsGeneralized Linear Models for Between-Subjects Designs
Generalized Linear Models for Between-Subjects Designssmackinnon
 
Regression analysis made easy
Regression analysis made easyRegression analysis made easy
Regression analysis made easyWeam Banjar
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis pptElkana Rorio
 
T-Test for Correlated Groups by STR Grp. 2
T-Test for Correlated Groups by STR Grp. 2T-Test for Correlated Groups by STR Grp. 2
T-Test for Correlated Groups by STR Grp. 2Oj Acopiado
 
Nonparametric statistics ppt @ bec doms
Nonparametric statistics ppt @ bec domsNonparametric statistics ppt @ bec doms
Nonparametric statistics ppt @ bec domsBabasab Patil
 
D-Pubb-TestingExperience_Issue_28_2014-12_Berta
D-Pubb-TestingExperience_Issue_28_2014-12_BertaD-Pubb-TestingExperience_Issue_28_2014-12_Berta
D-Pubb-TestingExperience_Issue_28_2014-12_BertaBerta Danilo
 

Mais procurados (19)

Data Analyst - Interview Guide
Data Analyst - Interview GuideData Analyst - Interview Guide
Data Analyst - Interview Guide
 
Anova copy
Anova   copyAnova   copy
Anova copy
 
Logistic Regression in Case-Control Study
Logistic Regression in Case-Control StudyLogistic Regression in Case-Control Study
Logistic Regression in Case-Control Study
 
Wilcoxon Signed Rank Test
Wilcoxon Signed Rank Test Wilcoxon Signed Rank Test
Wilcoxon Signed Rank Test
 
General Linear Model | Statistics
General Linear Model | StatisticsGeneral Linear Model | Statistics
General Linear Model | Statistics
 
More tabs
More tabsMore tabs
More tabs
 
Generalized Linear Models for Between-Subjects Designs
Generalized Linear Models for Between-Subjects DesignsGeneralized Linear Models for Between-Subjects Designs
Generalized Linear Models for Between-Subjects Designs
 
Regression analysis made easy
Regression analysis made easyRegression analysis made easy
Regression analysis made easy
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis ppt
 
Regression vs ANOVA
Regression vs ANOVARegression vs ANOVA
Regression vs ANOVA
 
Logistic regression analysis
Logistic regression analysisLogistic regression analysis
Logistic regression analysis
 
T-Test for Correlated Groups by STR Grp. 2
T-Test for Correlated Groups by STR Grp. 2T-Test for Correlated Groups by STR Grp. 2
T-Test for Correlated Groups by STR Grp. 2
 
Ch14 multiple regression
Ch14 multiple regressionCh14 multiple regression
Ch14 multiple regression
 
Linearity cochran test
Linearity cochran testLinearity cochran test
Linearity cochran test
 
Chi square & related procedure
Chi square & related procedureChi square & related procedure
Chi square & related procedure
 
Nonparametric statistics ppt @ bec doms
Nonparametric statistics ppt @ bec domsNonparametric statistics ppt @ bec doms
Nonparametric statistics ppt @ bec doms
 
Contingency Tables
Contingency TablesContingency Tables
Contingency Tables
 
D-Pubb-TestingExperience_Issue_28_2014-12_Berta
D-Pubb-TestingExperience_Issue_28_2014-12_BertaD-Pubb-TestingExperience_Issue_28_2014-12_Berta
D-Pubb-TestingExperience_Issue_28_2014-12_Berta
 
Regression
RegressionRegression
Regression
 

Semelhante a L2 flash cards quantitative methods - SS3

Applied statistics lecture_6
Applied statistics lecture_6Applied statistics lecture_6
Applied statistics lecture_6Daria Bogdanova
 
Part C Regression and Correlation AnalysisUse the dependent varia.docx
Part C Regression and Correlation AnalysisUse the dependent varia.docxPart C Regression and Correlation AnalysisUse the dependent varia.docx
Part C Regression and Correlation AnalysisUse the dependent varia.docxdewhirstichabod
 
Practice test ch 10 correlation reg ch 11 gof ch12 anova
Practice test ch 10 correlation reg ch 11 gof ch12 anovaPractice test ch 10 correlation reg ch 11 gof ch12 anova
Practice test ch 10 correlation reg ch 11 gof ch12 anovaLong Beach City College
 
Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.DrBarada Mohanty
 
Chap014 BStat SemA201.ppt
Chap014  BStat SemA201.pptChap014  BStat SemA201.ppt
Chap014 BStat SemA201.pptnajwalyaa
 
Statistics questionsTrueFalse1. The standard error of the est.docx
Statistics questionsTrueFalse1. The standard error of the est.docxStatistics questionsTrueFalse1. The standard error of the est.docx
Statistics questionsTrueFalse1. The standard error of the est.docxdessiechisomjj4
 
Group 5 - Regression Analysis.pdf
Group 5 - Regression Analysis.pdfGroup 5 - Regression Analysis.pdf
Group 5 - Regression Analysis.pdffahlevet40
 
Correlation research
Correlation researchCorrelation research
Correlation researchAmina Tariq
 
QNT 275 Inspiring Innovation / tutorialrank.com
QNT 275 Inspiring Innovation / tutorialrank.comQNT 275 Inspiring Innovation / tutorialrank.com
QNT 275 Inspiring Innovation / tutorialrank.comBromleyz33
 
Multiple Regression.ppt
Multiple Regression.pptMultiple Regression.ppt
Multiple Regression.pptTanyaWadhwani4
 
Multinomial Logistic Regression Analysis
Multinomial Logistic Regression AnalysisMultinomial Logistic Regression Analysis
Multinomial Logistic Regression AnalysisHARISH Kumar H R
 
Simple Linear Regression detail explanation.pdf
Simple Linear Regression detail explanation.pdfSimple Linear Regression detail explanation.pdf
Simple Linear Regression detail explanation.pdfUVAS
 
journal in research
journal in research journal in research
journal in research rikaseorika
 

Semelhante a L2 flash cards quantitative methods - SS3 (20)

Applied statistics lecture_6
Applied statistics lecture_6Applied statistics lecture_6
Applied statistics lecture_6
 
Part C Regression and Correlation AnalysisUse the dependent varia.docx
Part C Regression and Correlation AnalysisUse the dependent varia.docxPart C Regression and Correlation AnalysisUse the dependent varia.docx
Part C Regression and Correlation AnalysisUse the dependent varia.docx
 
Practice test ch 10 correlation reg ch 11 gof ch12 anova
Practice test ch 10 correlation reg ch 11 gof ch12 anovaPractice test ch 10 correlation reg ch 11 gof ch12 anova
Practice test ch 10 correlation reg ch 11 gof ch12 anova
 
Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.
 
Chap014 BStat SemA201.ppt
Chap014  BStat SemA201.pptChap014  BStat SemA201.ppt
Chap014 BStat SemA201.ppt
 
Statistics questionsTrueFalse1. The standard error of the est.docx
Statistics questionsTrueFalse1. The standard error of the est.docxStatistics questionsTrueFalse1. The standard error of the est.docx
Statistics questionsTrueFalse1. The standard error of the est.docx
 
Chapter 10
Chapter 10Chapter 10
Chapter 10
 
Chapter 10
Chapter 10Chapter 10
Chapter 10
 
Group 5 - Regression Analysis.pdf
Group 5 - Regression Analysis.pdfGroup 5 - Regression Analysis.pdf
Group 5 - Regression Analysis.pdf
 
Correlation research
Correlation researchCorrelation research
Correlation research
 
Regression
RegressionRegression
Regression
 
CH3.pdf
CH3.pdfCH3.pdf
CH3.pdf
 
QNT 275 Inspiring Innovation / tutorialrank.com
QNT 275 Inspiring Innovation / tutorialrank.comQNT 275 Inspiring Innovation / tutorialrank.com
QNT 275 Inspiring Innovation / tutorialrank.com
 
Multiple Regression.ppt
Multiple Regression.pptMultiple Regression.ppt
Multiple Regression.ppt
 
Multinomial Logistic Regression Analysis
Multinomial Logistic Regression AnalysisMultinomial Logistic Regression Analysis
Multinomial Logistic Regression Analysis
 
Chapter13
Chapter13Chapter13
Chapter13
 
Simple Linear Regression detail explanation.pdf
Simple Linear Regression detail explanation.pdfSimple Linear Regression detail explanation.pdf
Simple Linear Regression detail explanation.pdf
 
Simple Regression.pptx
Simple Regression.pptxSimple Regression.pptx
Simple Regression.pptx
 
journal in research
journal in research journal in research
journal in research
 
journals public
journals publicjournals public
journals public
 

L2 flash cards quantitative methods - SS3

  • 1. Scatter Plot: A scatter plot is a graphical representation of the relationship between two variables. Study Session 3, Reading 11
  • 2. Correlation and Covariance Analysis: Correlation analysis expresses the relationship between two variables with a single number. It measures both the extent and the direction of the linear relationship between the two variables. Formula: the sample covariance of X and Y for a sample of size n is $\mathrm{Cov}(X,Y) = \dfrac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{n-1}$.
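  • 3. Correlation and Covariance Analysis (cont.): Formula: the sample correlation coefficient is $r = \dfrac{\mathrm{Cov}(X,Y)}{s_X s_Y}$, where $s_X$ is the sample standard deviation of variable X and $s_Y$ is the sample standard deviation of variable Y. Formula: the sample standard deviation is $s_X = \sqrt{\dfrac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{n-1}}$.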
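As a minimal sketch of the covariance, standard deviation, and correlation definitions on slides 2 and 3, the following Python computes each one directly. The function names and the data are hypothetical, chosen only for illustration.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_std(xs):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    return math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))


def sample_correlation(xs, ys):
    """Sample correlation: covariance scaled by both standard deviations."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))


# Hypothetical observations, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
print(sample_covariance(xs, ys))
print(sample_correlation(xs, ys))
```

Because the correlation divides the covariance by both standard deviations, it always lies between -1 and +1, whatever the units of X and Y.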
  • 4. Limitations to Correlation Analysis: Outliers: a small number of observations at either extreme of a sample; they can distort the computed correlation. Spurious correlation: (1) a correlation between two variables that reflects only a chance relationship in a particular data set; (2) a correlation between two variables that arises not from a direct relationship between them but from their relation to a third variable.
  • 5. Hypothesis Testing for the Population Correlation Coefficient: Proposed hypotheses: null hypothesis $H_0$, that the population correlation is 0 ($\rho = 0$); alternative hypothesis $H_a$, that the population correlation is different from 0 ($\rho \neq 0$). Formula: t-test, $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}$, with $n-2$ degrees of freedom.
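The significance test for a correlation can be sketched as follows. This assumes the standard t-statistic for a sample correlation with n - 2 degrees of freedom; the sample values and the critical value (read from a t-table for 16 degrees of freedom at the 5% two-tailed level) are hypothetical inputs.

```python
import math


def correlation_t_stat(r, n):
    """t-statistic for H0: population correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Hypothetical sample: r = 0.6 from n = 18 observations.
t_stat = correlation_t_stat(0.6, 18)
t_critical = 2.120  # assumed table value, 16 degrees of freedom, 5% two-tailed
reject_null = abs(t_stat) > t_critical
print(t_stat, reject_null)
```

The same correlation becomes more significant as n grows, because the numerator scales with the square root of n - 2.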
  • 6. Dependent and Independent Variables in Linear Regression: The independent variable (denoted X) is the variable used to explain changes in the dependent variable. The dependent variable (denoted Y) is the variable to be explained. Linear regression uses one variable to make predictions about another variable. It also involves testing hypotheses about the relation between the two variables and quantifying the strength of that relationship.
  • 7. Dependent and Independent Variables in Linear Regression (cont.): The regression equation that defines the linear relation between the dependent and independent variables is $Y_i = b_0 + b_1 X_i + \varepsilon_i$, where $Y$ is the dependent variable, $b_0$ the intercept, $b_1$ the slope coefficient, $X$ the independent variable, and $\varepsilon$ the error term.
  • 8. Dependent and Independent Variables in Linear Regression (cont.): In linear regression, the estimated (fitted) parameters $\hat{b}_0$ and $\hat{b}_1$ are chosen to minimize the sum of squared errors, $\sum_{i=1}^{n}(Y_i - \hat{b}_0 - \hat{b}_1 X_i)^2$. Cross-sectional data: many observations on the dependent and independent variables for the same time period. Time-series data: many observations from different time periods.
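A minimal sketch of the least-squares fit described on slides 7 and 8, assuming the usual closed-form solution for one independent variable (slope equals the sum of cross-deviations over the sum of squared X deviations). The function name and the data are hypothetical.

```python
def fit_simple_ols(xs, ys):
    """Return (b0_hat, b1_hat) that minimize the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1_hat = s_xy / s_xx
    b0_hat = mean_y - b1_hat * mean_x  # the fitted line passes through the means
    return b0_hat, b1_hat


# Hypothetical observations, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
b0_hat, b1_hat = fit_simple_ols(xs, ys)
print(b0_hat, b1_hat)
```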
  • 9. Assumptions of a Classical Linear Regression Model: 1. There is a linear relationship between the independent and dependent variables. 2. The independent variable is not random. 3. The expected value of the error term is 0. 4. The error term is normally distributed. 5. The error term is uncorrelated across observations. 6. The variance of the error term is the same for all observations (homoskedasticity assumption).
  • 10. Standard Error of Estimate: The standard error of estimate (also called the standard error of the regression) measures how accurately a regression model fits the data. Formula: $\mathrm{SEE} = \sqrt{\dfrac{\sum_{i=1}^{n}(Y_i - \hat{b}_0 - \hat{b}_1 X_i)^2}{n-2}}$.
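  • 11. Coefficient of Determination: The coefficient of determination measures the proportion of the variance in the dependent variable that is explained by the independent variable. Formula: $R^2 = \dfrac{\text{total variation} - \text{unexplained variation}}{\text{total variation}} = \dfrac{\text{explained variation}}{\text{total variation}}$.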
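The two fit measures on slides 10 and 11 can be sketched together. This assumes a regression with one independent variable (so the divisor is n - 2); the fitted coefficients and the data passed in are hypothetical.

```python
import math


def see_and_r_squared(xs, ys, b0_hat, b1_hat):
    """Return (standard error of estimate, coefficient of determination)."""
    n = len(ys)
    mean_y = sum(ys) / n
    # Unexplained variation: squared residuals around the fitted line.
    sse = sum((y - (b0_hat + b1_hat * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared deviations of Y around its mean.
    total = sum((y - mean_y) ** 2 for y in ys)
    see = math.sqrt(sse / (n - 2))
    r_squared = (total - sse) / total
    return see, r_squared


# Hypothetical data and coefficients, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
see, r_squared = see_and_r_squared(xs, ys, b0_hat=2.2, b1_hat=0.6)
print(see, r_squared)
```

A smaller SEE and a larger R-squared both indicate a tighter fit, but SEE is in the units of Y while R-squared is unit-free.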
  • 12. Confidence Interval for a Regression Coefficient: The regression coefficient is the average change in the dependent variable for a one-unit change in the value of the independent variable. Things needed to estimate a confidence interval for the regression coefficient: the estimated parameter value for the sample; the standard error of the estimated coefficient; the significance level for the t-distribution; the degrees of freedom (n - 2). Formula: $\hat{b}_1 \pm t_c\, s_{\hat{b}_1}$, where $t_c$ is the critical t-value at the chosen significance level and $s_{\hat{b}_1}$ is the standard error of the estimated coefficient.
  • 13. Hypothesis Testing for a Population Value of the Regression Coefficient: When testing a hypothesis about a regression coefficient with a t-test of significance, the t-statistic is computed as $t = \dfrac{\hat{b}_1 - b_1}{s_{\hat{b}_1}}$, where $b_1$ is the hypothesized population value and the test has $n-2$ degrees of freedom. The confidence interval for the test is $\hat{b}_1 \pm t_c\, s_{\hat{b}_1}$.
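A short sketch of the confidence interval and t-test for a slope coefficient from slides 12 and 13. The estimated coefficient, its standard error, and the critical value (assumed to be read from a t-table for 3 degrees of freedom at the 5% two-tailed level) are hypothetical inputs rather than values from the slides.

```python
def slope_confidence_interval(b1_hat, se_b1, t_crit):
    """Interval b1_hat +/- t_crit * se_b1."""
    half_width = t_crit * se_b1
    return b1_hat - half_width, b1_hat + half_width


def slope_t_stat(b1_hat, b1_hypothesized, se_b1):
    """t-statistic for H0: population slope equals b1_hypothesized."""
    return (b1_hat - b1_hypothesized) / se_b1


# Hypothetical estimates, for illustration only.
lower, upper = slope_confidence_interval(b1_hat=0.6, se_b1=0.2, t_crit=3.182)
t_stat = slope_t_stat(b1_hat=0.6, b1_hypothesized=0.0, se_b1=0.2)
# The null is rejected exactly when the hypothesized value lies outside the interval.
reject_null = not (lower <= 0.0 <= upper)
print(lower, upper, t_stat, reject_null)
```

The interval and the t-test give the same decision: here the t-statistic of about 3.0 is below the critical value of 3.182, and the interval contains 0.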
  • 14. Calculating a Predicted Value for the Dependent Variable: Two sources of uncertainty arise in using a regression model: 1. the error term; 2. the estimated parameters ($\hat{b}_0$ and $\hat{b}_1$). Given the regression model $Y_i = b_0 + b_1 X_i + \varepsilon_i$, if the estimated parameters $\hat{b}_0$ and $\hat{b}_1$ are known, the predicted value of the dependent variable Y is $\hat{Y} = \hat{b}_0 + \hat{b}_1 X$.
  • 15. Calculating a Predicted Value for the Dependent Variable (cont.): The prediction interval for a particular predicted value of the dependent variable is $\hat{Y} \pm t_c\, s_f$, where $s_f$ is the square root of the estimated variance of the prediction error and $t_c$ is the critical value of the t-statistic at the chosen significance level. The confidence level is taken as $1 - \alpha$, where $\alpha$ is the chosen significance level.
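  • 16. Calculating a Predicted Value for the Dependent Variable (cont.): The estimated variance of the prediction error of Y, $s_f^2$, is calculated as $s_f^2 = s^2\left[1 + \dfrac{1}{n} + \dfrac{(X-\bar{X})^2}{(n-1)\,s_X^2}\right]$, where $s^2$ is the squared standard error of estimate, $n$ the number of observations, $X$ the value of the independent variable used for the prediction, $\bar{X}$ its sample mean, and $s_X^2$ the variance of the independent variable.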
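The prediction steps on slides 14 to 16 can be sketched as one function, assuming the standard one-variable formula for the variance of the prediction error. All inputs, including the critical value, are hypothetical and would normally come from a fitted model and a t-table.

```python
import math


def prediction_interval(x_new, b0_hat, b1_hat, see, n, mean_x, var_x, t_crit):
    """Return (lower, upper) bounds of the prediction interval for Y at x_new."""
    y_hat = b0_hat + b1_hat * x_new
    # Variance of the prediction error: grows as x_new moves away from mean_x.
    var_f = see ** 2 * (1 + 1 / n + (x_new - mean_x) ** 2 / ((n - 1) * var_x))
    s_f = math.sqrt(var_f)
    return y_hat - t_crit * s_f, y_hat + t_crit * s_f


# Hypothetical inputs, for illustration only.
lower, upper = prediction_interval(
    x_new=3.0, b0_hat=2.2, b1_hat=0.6, see=2.0,
    n=5, mean_x=3.0, var_x=2.5, t_crit=1.0,
)
print(lower, upper)
```

The interval is narrowest when the new X equals the sample mean and widens for values far from it, which is one reason to be cautious about predictions outside the range of the estimation data.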
  • 17. Calculating ANOVA in Regression Analysis: Analysis of variance (ANOVA) is a statistical procedure used to determine how well the independent variable or variables explain the variation in the dependent variable. The F-test is the statistical test used in the analysis of variance.
  • 18. F-test: An F-statistic is used to test whether the slope coefficients in a linear regression are equal to 0. In a regression equation with one independent variable: null hypothesis $H_0: b_1 = 0$; alternative hypothesis $H_a: b_1 \neq 0$. Things required to undertake an F-test: 1. the total number of observations; 2. the total number of parameters to be estimated; 3. the sum of squared errors (SSE); 4. the regression sum of squares (RSS).
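  • 19. F-test (cont.): Formula: $\mathrm{SSE} = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$. Formula: $\mathrm{RSS} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2$. Formula: total variation (TSS) = SSE + RSS. Formula: F-statistic in a regression with one independent variable, $F = \dfrac{\mathrm{RSS}/1}{\mathrm{SSE}/(n-2)}$.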
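A minimal sketch of the ANOVA quantities on slides 17 to 19 for a regression with one independent variable. The observed and fitted values are hypothetical; the fitted values are assumed to come from a least-squares line with an intercept, which is what makes TSS equal SSE plus RSS.

```python
def anova_one_variable(ys, fitted):
    """Return (SSE, RSS, F) for a regression with one independent variable."""
    n = len(ys)
    mean_y = sum(ys) / n
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained variation
    rss = sum((f - mean_y) ** 2 for f in fitted)          # explained variation
    f_stat = (rss / 1) / (sse / (n - 2))
    return sse, rss, f_stat


# Hypothetical observations and least-squares fitted values.
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
sse, rss, f_stat = anova_one_variable(ys, fitted)
print(sse, rss, f_stat)
```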
  • 20. Limitations to Regression Analysis: Parameter instability: regression relations can change over time. In investment analysis, regression models can have limited use because public knowledge of regression relationships can negate their usefulness in the future. Violations of the regression assumptions can make hypothesis tests and predictions invalid.
  • 21. Multiple Regression Equation: A multiple regression equation is used to determine how a dependent variable is affected by more than one independent variable. A log-log regression model is used when proportional changes in the dependent variable bear a constant relationship to proportional changes in the independent variables. General form of the multiple regression model: $Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + \varepsilon_i,\ i = 1, 2, \dots, n$, where $Y_i$ is the ith observation of the dependent variable Y; $X_{ji}$ the ith observation of the independent variable $X_j$, j = 1, 2, …, k; $b_0$ the intercept of the equation; $b_1, \dots, b_k$ the slope coefficients for each of the independent variables; $\varepsilon_i$ the error term; and n the number of observations. Study Session 3, Reading 12
  • 22. Hypothesis Testing for a Population Value of a Regression Coefficient: Under the null hypothesis, the hypothesized population value of a regression coefficient is taken as 0. The degrees of freedom in the test are the number of observations minus the number of independent variables plus 1, i.e. n - (k + 1). Formula: hypothesis testing using a t-test, $t = \dfrac{\hat{b}_j - b_j}{s_{\hat{b}_j}}$, where $\hat{b}_j$ is the regression estimate of $b_j$, $b_j$ the hypothesized value of the coefficient, and $s_{\hat{b}_j}$ the estimated standard error of $\hat{b}_j$.
  • 23. Hypothesis Testing for a Population Value of a Regression Coefficient (cont.): p-value: the p-value for a regression coefficient is the smallest level of significance at which the null hypothesis that the population value of the coefficient is 0 can be rejected in a two-sided test. The lower the p-value, the stronger the evidence against the null hypothesis.
  • 24. Confidence Interval for the Population Value and Predicted Value for the Dependent Variable: Two types of uncertainty arise in predicting the dependent variable with a linear regression model: uncertainty in the regression model itself, as reflected in the standard error of estimate; and uncertainty about the estimates of the regression model's parameters. The prediction interval that accommodates both uncertainties is computed with the help of matrix algebra.
  • 25. Points to Be Considered for Predicting a Dependent Variable: The assumptions required for using a regression model must be met. Caution should be exercised with predictions based on values of the independent variables that lie outside the range of the data used to estimate the model.
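  • 26. Steps in Predicting the Value of the Dependent Variable: 1. Obtain estimates of the regression parameters ($\hat{b}_0, \hat{b}_1, \dots, \hat{b}_k$). 2. Determine the assumed values of the independent variables ($\hat{X}_{1i}, \dots, \hat{X}_{ki}$). 3. Compute the predicted value of the dependent variable using the equation $\hat{Y}_i = \hat{b}_0 + \hat{b}_1 \hat{X}_{1i} + \dots + \hat{b}_k \hat{X}_{ki}$.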
  • 27. Assumptions of a Multiple Regression Model: 1. There exists a linear relationship between the dependent variable and the independent variables. 2. There is no exact linear relationship between two or more of the independent variables, and the independent variables are not random. 3. The error term is normally distributed. 4. The error term is uncorrelated across observations. 5. The variance of the error term is the same for all observations. 6. The expected value of the error term, conditional on the independent variables, is 0.
  • 28. F-statistic in Regression Analysis: The F-statistic is used to test whether at least one of the slope coefficients of the independent variables is not equal to 0. Null hypothesis: all the slope coefficients in the multiple regression model are equal to 0, presented as $H_0: b_1 = b_2 = \dots = b_k = 0$. Alternative hypothesis: at least one slope coefficient is not equal to 0.
  • 29. F-statistic in Regression Analysis (cont.): Things required for the F-test: 1. the total number of observations (n); 2. the total number of regression coefficients to be estimated (k + 1), where k is the number of slope coefficients; 3. the sum of squared errors (SSE), the unexplained variation; 4. the regression sum of squares (RSS), the explained variation.
  • 30. F-statistic in Regression Analysis (cont.): Calculating the F-statistic: $F = \dfrac{\mathrm{RSS}/k}{\mathrm{SSE}/[n-(k+1)]}$. Degrees of freedom in the test: 1) k (numerator degrees of freedom); 2) n - (k + 1) (denominator degrees of freedom).
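The F-statistic for a multiple regression on slides 28 to 30 reduces to a ratio of two mean squares. The sketch below assumes RSS and SSE are already known; the numbers are hypothetical.

```python
def multiple_regression_f_stat(rss, sse, n, k):
    """F = (RSS / k) / (SSE / (n - (k + 1))) for k slope coefficients."""
    mean_regression_sum_of_squares = rss / k
    mean_squared_error = sse / (n - (k + 1))
    return mean_regression_sum_of_squares / mean_squared_error


# Hypothetical values: 25 observations, 4 independent variables.
f_stat = multiple_regression_f_stat(rss=60.0, sse=40.0, n=25, k=4)
print(f_stat)  # compared against an F-table value with 4 and 20 degrees of freedom
```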
  • 31. $R^2$ and Adjusted $R^2$ in Multiple Regression: $R^2$ measures how well the regression model fits the data when there is one independent variable. Adjusted $R^2$ ($\bar{R}^2$) is used in place of $R^2$ when there is more than one independent variable. Relationship: $\bar{R}^2 = 1 - \left(\dfrac{n-1}{n-k-1}\right)(1 - R^2)$, where n is the number of observations and k the number of independent variables.
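The adjustment on slide 31 can be sketched in a few lines, assuming the standard relationship between $R^2$ and adjusted $R^2$. The input values are hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return 1 - ((n - 1) / (n - k - 1)) * (1 - r_squared)


# Hypothetical fit: R^2 = 0.80 with 21 observations and 4 independent variables.
print(adjusted_r_squared(0.80, n=21, k=4))
```

Because the penalty factor (n - 1) / (n - k - 1) exceeds 1 whenever k is at least 1, adjusted $R^2$ is never larger than $R^2$, and adding a weak variable can lower it.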
  • 32. Dummy Variables: Dummy variables are used in regression models to determine whether a qualitative independent variable explains the dependent variable. A dummy variable has a value of 1 if a particular qualitative condition is true and 0 if that condition is false. To distinguish between n categories, n - 1 dummy variables are required.
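The n - 1 rule on slide 32 can be illustrated with a small encoder. The function name, the choice of the first category as the omitted baseline, and the quarterly example are assumptions made for illustration.

```python
def dummy_encode(values, categories):
    """Encode each value as n - 1 dummies; categories[0] is the omitted baseline."""
    non_baseline = categories[1:]
    return [[1 if value == category else 0 for category in non_baseline]
            for value in values]


# Hypothetical example: four quarters need only three dummy variables.
quarters = ["Q1", "Q2", "Q3", "Q4"]
rows = dummy_encode(["Q1", "Q3", "Q4"], quarters)
print(rows)  # [[0, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Using all n dummies together with an intercept would make the dummies sum to 1 for every observation, an exact linear relationship among the regressors, which is why one category is left out.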
  • 33. Heteroskedasticity and Its Effect on Statistical Inference: Heteroskedasticity is a violation of the regression assumption that the variance of the errors in a regression is constant across observations. Two types of heteroskedasticity: 1. unconditional heteroskedasticity; 2. conditional heteroskedasticity. The Breusch-Pagan test is widely used when testing for conditional heteroskedasticity. Two methods used for correcting conditional heteroskedasticity: 1. robust standard errors; 2. generalized least squares.
  • 34. Heteroskedasticity and Its Effect on Statistical Inference (cont.): Durbin-Watson test: a test for serial correlation, which generally arises in time-series regressions. Consequences of heteroskedasticity: the F-test does not provide reliable results; t-tests for the significance of individual regression coefficients do not provide reliable results; standard errors and test statistics have to be adjusted in order to derive reliable results.
  • 35. Unconditional Heteroskedasticity and Conditional Heteroskedasticity: Unconditional heteroskedasticity arises when the heteroskedasticity of the error variance is not correlated with the independent variables. This type is not a major problem for statistical inference. Conditional heteroskedasticity arises when the heteroskedasticity of the error variance is correlated with the independent variables. This type is a major problem for statistical inference.
  • 36. Methods for Correcting for Heteroskedasticity: 1. Under the robust standard error method, the standard errors of the linear regression model's estimated coefficients are corrected. 2. Under the generalized least squares method, the original equation is modified and the new, modified regression equation is estimated.
  • 37. Consequences of Serial Correlation: The estimates of the regression coefficient standard errors are incorrect. If an independent variable is a lagged value of the dependent variable, serial correlation also makes the parameter estimates invalid. In positive serial correlation, a positive (negative) error for one observation increases the chance of a positive (negative) error for another observation. Positive serial correlation has no effect on the consistency of the estimated regression coefficients, but it affects the validity of statistical tests.
  • 39. Methods to Correct for Serial Correlation: 1. The coefficient standard errors for the linear regression parameter estimates can be adjusted. 2. The regression equation can be modified to eliminate the serial correlation.
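The slides name the Durbin-Watson test without giving its statistic. As a hedged sketch, the standard form is the sum of squared changes in successive residuals divided by the sum of squared residuals; the function name and the residual series below are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic computed from a list of regression residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residual series, for illustration only.
print(durbin_watson([1.0, 1.0, 1.0, 1.0]))    # persistent errors give a value near 0
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # alternating errors give a value above 2
```

Values near 2 suggest little serial correlation, values well below 2 suggest positive serial correlation, and values well above 2 suggest negative serial correlation. Formal conclusions require the tabulated critical bounds.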
  • 40. Multicollinearity in Regression Analysis: Multicollinearity is a violation of the regression assumption that there is no exact linear relationship between two or more independent variables. Consequences of multicollinearity: estimates of the regression coefficients become unreliable; it is not possible to ascertain how individual independent variables affect the dependent variable.
  • 41. Model Misspecification in Regression Analysis: Model specification is the set of variables included in the regression together with the regression equation's functional form. A functional form is misspecified when: one or more important variables are omitted from the regression; one or more regression variables need to be transformed before the regression is estimated; data have been pooled from different samples that should not be pooled.
  • 42. Model Misspecification in Regression Analysis (cont.): Reasons for time-series misspecification: lagged dependent variables are included as independent variables in regressions that have serially correlated errors; the dependent variable (or a function of it) is included as an independent variable; independent variables are measured with error.
  • 43. Models with Qualitative Dependent Variables: Qualitative dependent variables are dummy variables used as dependent variables. 1. Probit model: estimates the probability of a discrete outcome, given the values of the independent variables used to explain that outcome, based on the normal distribution. 2. Logit model: estimates the probability of a discrete outcome, given the values of the independent variables used to explain that outcome, based on the logistic distribution.
  • 44. Calculating the Predicted Trend Value for a Time Series: Linear trend models: the dependent variable changes by a constant amount each period. Formula: $y_t = b_0 + b_1 t + \varepsilon_t$, where $y_t$ is the value of the time series at time t, $b_0$ the y-intercept term, $b_1$ the slope coefficient (trend coefficient), t time (the independent variable), and $\varepsilon_t$ a random error term. Study Session 3, Reading 13
  • 45. Calculating the Predicted Trend Value for a Time Series (cont.): Log-linear trend models are used when the time series tends to grow at a constant rate. Formula: $\ln y_t = b_0 + b_1 t + \varepsilon_t$. Predicted trend value of $y_t$: $\hat{y}_t = e^{\hat{b}_0 + \hat{b}_1 t}$.
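The two trend models on slides 44 and 45 differ only in whether the level or its natural logarithm is linear in time. A minimal sketch, with hypothetical function names and coefficients:

```python
import math


def linear_trend_value(b0_hat, b1_hat, t):
    """Predicted value under a linear trend: changes by b1_hat each period."""
    return b0_hat + b1_hat * t


def log_linear_trend_value(b0_hat, b1_hat, t):
    """Predicted value under a log-linear trend: ln(y) is linear in t."""
    return math.exp(b0_hat + b1_hat * t)


# Hypothetical coefficients, for illustration only.
print(linear_trend_value(2.0, 0.5, 4))
print(log_linear_trend_value(1.0, 0.1, 10))
```

Under the log-linear model each period multiplies the predicted value by the same factor, exp(b1_hat), which is what a constant growth rate means.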
  • 46. Limitations of the Use of Trend Models for a Given Time Series: Trend models can suffer from serially correlated errors. If the errors of a trend model are serially correlated, a better forecasting model than a trend model is required for that time series.
  • 47. Covariance Stationary: A time series is covariance stationary if the following are finite and constant in all periods: the expected value of the time series; the variance of the time series; the covariance of the time series with itself for a fixed number of periods in the past or future. Implications if the time series is not covariance stationary: estimates of an autoregressive time-series model obtained by linear regression will not be valid; hypothesis tests will give invalid results.
  • 48. Structure of an Autoregressive Model of Order p In an autoregressive model, a time series is regressed on its own past values, so the model shows the relationship between current-period values and past-period values. pth-order Autoregressive Model: $x_t = b_0 + b_1 x_{t-1} + b_2 x_{t-2} + \dots + b_p x_{t-p} + \varepsilon_t$ First-Order Autoregression: $x_t = b_0 + b_1 x_{t-1} + \varepsilon_t$ Study Session 3, Reading 13
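  • 49. Autocorrelation for Time Series Autocorrelation of a time series - the correlation of the time series with its own past values. Formula (sample autocorrelation at lag k): $\hat{\rho}_k = \dfrac{\sum_{t=k+1}^{T} (x_t - \bar{x})(x_{t-k} - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$ Study Session 3, Reading 13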
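  • 50. Autocorrelation for Error Term Autocorrelation of the error term is estimated from the sample autocorrelations of the residuals, called residual autocorrelations, together with their sample variance. Formula (residual autocorrelation at lag k): $\hat{\rho}_{\varepsilon,k} = \dfrac{\sum_{t=k+1}^{T} \hat{\varepsilon}_t \hat{\varepsilon}_{t-k}}{\sum_{t=1}^{T} \hat{\varepsilon}_t^2}$ Study Session 3, Reading 13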
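The residual autocorrelation check can be sketched as below. The sketch assumes the residuals have mean zero and uses the common approximation that the standard error of each residual autocorrelation is 1/sqrt(T), where T is the number of observations; that standard error is an assumption here, not something stated on the slide. The residual values are made up.

```python
import math

def residual_autocorrelation(residuals, k):
    """Lag-k autocorrelation of residuals, treating their mean as zero."""
    numer = sum(residuals[t] * residuals[t - k] for t in range(k, len(residuals)))
    denom = sum(e * e for e in residuals)
    return numer / denom

def autocorrelation_t_stat(rho_k, n_obs):
    """t-statistic using the assumed standard error 1 / sqrt(n_obs)."""
    return rho_k / (1.0 / math.sqrt(n_obs))

# Hypothetical residuals that alternate in sign (negative autocorrelation).
residuals = [0.5, -0.4, 0.6, -0.5, 0.4, -0.6, 0.5, -0.4]
rho_1 = residual_autocorrelation(residuals, 1)
t_1 = autocorrelation_t_stat(rho_1, len(residuals))
```

A t-statistic that is large in absolute value suggests the residuals are serially correlated and the model may need more lags.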
  • 51. Mean Reversion A time series shows mean reversion if it tends to rise when its level is below its mean and to fall when its level is above its mean. Formula: Mean-Reverting Level $x_t = \dfrac{b_0}{1 - b_1}$ Study Session 3, Reading 13
  • 52. Mean Reversion (cont.) Interpretation of the Mean-Reverting Level If the current value of the time series is $b_0/(1 - b_1)$, it will neither increase nor decrease. If the current value is below $b_0/(1 - b_1)$, the time series will increase. If the current value is above $b_0/(1 - b_1)$, the time series will decrease. Study Session 3, Reading 13
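A small numeric illustration of the three cases, assuming a first-order autoregressive model whose next-period prediction is b0 + b1 times the current value. The coefficient values are hypothetical.

```python
def mean_reverting_level(b0, b1):
    """Level b0 / (1 - b1) at which the series is expected to stay put."""
    if b1 == 1.0:
        raise ValueError("b1 = 1 has no finite mean-reverting level")
    return b0 / (1.0 - b1)

def expected_change(b0, b1, x_t):
    """Expected next-period change: (b0 + b1 * x_t) - x_t."""
    return b0 + b1 * x_t - x_t

b0, b1 = 1.0, 0.5                      # hypothetical AR(1) coefficients
level = mean_reverting_level(b0, b1)   # 1.0 / (1 - 0.5) = 2.0
```

With these values, a current value of 1.0 gives an expected change of +0.5, a current value of 3.0 gives -0.5, and a current value of 2.0 gives zero.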
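  • 53. Mean Reversion (cont.) Multiple Periods of Forecasting and the Chain Rule of Forecasting Formula: AR model, one-period-ahead forecast $\hat{x}_{t+1} = \hat{b}_0 + \hat{b}_1 x_t$ Formula: Two-period-ahead forecast $\hat{x}_{t+2} = \hat{b}_0 + \hat{b}_1 \hat{x}_{t+1}$ Study Session 3, Reading 13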
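The chain rule of forecasting can be sketched for a first-order autoregressive model as follows: each forecast is fed back in as the input for the next period. The coefficients and starting value are hypothetical.

```python
def ar1_chain_forecasts(b0, b1, x_t, horizon):
    """Forecasts for 1..horizon periods ahead, each built on the previous one."""
    forecasts = []
    previous = x_t
    for _ in range(horizon):
        previous = b0 + b1 * previous   # next forecast uses the last forecast
        forecasts.append(previous)
    return forecasts

# Hypothetical coefficients b0 = 1.0, b1 = 0.5 and current value x_t = 4.0.
forecasts = ar1_chain_forecasts(1.0, 0.5, 4.0, 3)   # [3.0, 2.5, 2.25]
```

The forecasts move toward the mean-reverting level b0 / (1 - b1) = 2.0 as the horizon lengthens.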
  • 54. In-Sample and Out-of-Sample Forecasts In-sample forecasts are the predicted values from the estimated time-series model for the period used to estimate it. Out-of-sample forecasts are made from the estimated time-series model for a period different from the one used to estimate it. Root Mean Squared Error (RMSE), calculated as the square root of the average squared error, is used to compare the out-of-sample forecasting accuracy of different time-series models. Study Session 3, Reading 13
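A minimal sketch of an RMSE comparison, assuming two competing sets of out-of-sample forecasts for the same actual values. All numbers are invented for illustration.

```python
import math

def rmse(actual, forecast):
    """Square root of the average squared forecast error."""
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

actual = [2.0, 2.5, 3.0, 3.5]
model_a = [2.1, 2.4, 3.2, 3.4]   # hypothetical forecasts from one model
model_b = [1.5, 3.0, 2.0, 4.5]   # hypothetical forecasts from another model

rmse_a = rmse(actual, model_a)
rmse_b = rmse(actual, model_b)
```

The model with the lower out-of-sample RMSE (model_a in this made-up case) is the more accurate forecaster on that sample.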
  • 55. Instability of Coefficients in Time-Series Models The estimated coefficients of time-series models are generally unstable across different sample periods. They can differ between models estimated on longer and on shorter sample periods. The estimates therefore depend on the sample period chosen. Study Session 3, Reading 13
  • 56. Random Walk Random walk - a time-series model in which the value of the series in one period equals its value in the previous period plus an unpredictable random error. Formula: $x_t = x_{t-1} + \varepsilon_t$ A random walk with a drift is expected to increase or decrease by a constant amount in each period. Formula: $x_t = b_0 + x_{t-1} + \varepsilon_t$, with $b_0 \neq 0$ Study Session 3, Reading 13
  • 57. Random Walk (cont.) First-differencing - transforming a time series by creating a new time series that in each period is equal to the difference between $x_t$ and $x_{t-1}$. Formula: $y_t = x_t - x_{t-1}$ Study Session 3, Reading 13
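A sketch of a random walk (optionally with drift) and its first difference, assuming normally distributed errors and a starting value of zero. The function names, the seed, and the parameter values are illustrative choices.

```python
import random

def simulate_random_walk(n, drift=0.0, sigma=1.0, seed=0):
    """Simulate x_t = drift + x_{t-1} + error, starting from x_0 = 0."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(drift + x[-1] + rng.gauss(0.0, sigma))
    return x

def first_difference(x):
    """New series equal to x_t - x_{t-1} in each period."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

walk = simulate_random_walk(200, drift=0.1, sigma=1.0, seed=42)
changes = first_difference(walk)
```

The differenced series contains only the drift plus the random errors, so it has one fewer observation than the original series.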
  • 58. Dickey-Fuller Unit Root Test Formula: $x_t - x_{t-1} = b_0 + g_1 x_{t-1} + \varepsilon_t$ Where: $g_1 = (b_1 - 1)$. Null hypothesis: $H_0: g_1 = 0$ (the series has a unit root). Alternative hypothesis: $H_a: g_1 < 0$. Study Session 3, Reading 13
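The regression behind the test can be sketched as follows, assuming it regresses the first difference of the series on its lagged level so that the slope is g1 = b1 - 1. The sketch only estimates the coefficients; the test statistic has to be compared with Dickey-Fuller critical values rather than standard t-table values, and those are not computed here. The series is a noise-free, made-up example.

```python
def ols_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def dickey_fuller_regression(series):
    """Regress x_t - x_{t-1} on x_{t-1}; returns (b0, g1)."""
    lagged = series[:-1]
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    return ols_line(lagged, diffs)

# Noise-free AR(1) series with b0 = 1.0 and b1 = 0.5, so g1 should be -0.5.
series = [0.0]
for _ in range(20):
    series.append(1.0 + 0.5 * series[-1])

b0_hat, g1_hat = dickey_fuller_regression(series)
```

An estimate of g1 close to zero is consistent with a unit root; a clearly negative estimate, as in this example, points away from one.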
  • 59. Seasonality in a Time-Series Model Seasonality of time series occurs when regular patterns of movement within the year are observed. Formula: Seasonal lag in autoregressive model Formula: Forecasted Value Study Session 3, Reading 13
  • 60. ARCH Models Autoregressive Conditional Heteroskedasticity (ARCH) - exists if the variance of the errors in a time-series model depends on the variance of the errors in previous periods. Formula: Linear regression error Where: $u_t$ = error term Study Session 3, Reading 13
  • 61. ARCH Models (cont.) Predicting Variance of Errors Formula: Formula: Calculate the variance of the error term in the current period Study Session 3, Reading 13
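One common way to sketch this step is an ARCH(1) setup, assumed here: squared residuals are regressed on their own first lag to obtain coefficients a0 and a1, and the predicted error variance for the next period is a0 + a1 times the latest squared residual. The symbols a0 and a1, the function names, and the residuals are all assumptions for illustration.

```python
import math

def ols_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def fit_arch1(residuals):
    """Regress squared residuals on their first lag; returns (a0, a1)."""
    squared = [e * e for e in residuals]
    return ols_line(squared[:-1], squared[1:])

def predict_error_variance(a0, a1, latest_residual):
    """Predicted next-period error variance: a0 + a1 * residual^2."""
    return a0 + a1 * latest_residual ** 2

# Made-up residuals whose squares follow s_t = 0.2 + 0.5 * s_{t-1} exactly.
squares = [4.0]
for _ in range(9):
    squares.append(0.2 + 0.5 * squares[-1])
residuals = [math.sqrt(s) * (-1) ** i for i, s in enumerate(squares)]

a0_hat, a1_hat = fit_arch1(residuals)
next_variance = predict_error_variance(a0_hat, a1_hat, residuals[-1])
```

If the estimated a1 is statistically different from zero, the errors show ARCH(1) effects and the variance forecast changes with the size of the latest residual.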
  • 62. Analysis of Time-Series Variables Prior To Linear Regression Two time series are said to be cointegrated if a long-term financial or economic relationship between them keeps them from diverging from each other without bound in the long run. The (Engle-Granger) Dickey-Fuller test is used to determine whether time series are cointegrated. Study Session 3, Reading 13
  • 63. Analysis of the Appropriate Time-Series Model Given an Investment Problem Regression models or time-series models can be used in the analysis of investment problems. In a regression model, the future value of a variable is predicted on the basis of a hypothesized causal relationship with other variables. In a time-series model, the future behavior of a variable is predicted on the basis of the past behavior of that variable. Study Session 3, Reading 13
  • 64. Explanation of the Dependent Variable by Analysing the Regression Equation and ANOVA Table Analysis of variance (ANOVA) - used to provide information about a regression model's explanatory power. The F-statistic is used to test whether the independent variables explain the dependent variable. If the independent variables do not explain the dependent variable at all, the value of the F-statistic is 0. Variability in the values of the dependent variable can be divided into two parts: Total Sum of Squares = Regression Sum of Squares + Residual Sum of Squares Study Session 3, Reading 13
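The decomposition and the F-statistic can be sketched as below, assuming the usual form F = (regression sum of squares / k) / (residual sum of squares / (n - k - 1)) with k independent variables and n observations. The data are invented, and the fitted values come from the least-squares line y = 2.2 + 0.6x for those data.

```python
def anova_f_statistic(y, fitted, k):
    """Return (total SS, regression SS, residual SS, F) for an OLS fit.

    Regression SS is taken as total SS minus residual SS, which holds
    for a least-squares fit that includes an intercept.
    """
    n = len(y)
    y_bar = sum(y) / n
    total_ss = sum((v - y_bar) ** 2 for v in y)
    residual_ss = sum((v - f) ** 2 for v, f in zip(y, fitted))
    regression_ss = total_ss - residual_ss
    f_stat = (regression_ss / k) / (residual_ss / (n - k - 1))
    return total_ss, regression_ss, residual_ss, f_stat

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted = [2.2 + 0.6 * xi for xi in x]   # least-squares line for these data

total_ss, regression_ss, residual_ss, f_stat = anova_f_statistic(y, fitted, k=1)
```

For this example the total sum of squares of 6.0 splits into 3.6 explained by the regression and 2.4 left in the residuals, giving F = 4.5.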
  • 65. Uses of Multiple Regression Analysis in Financial Analysis Multiple regression is used in a variety of finance and investment decisions. It can measure the effect of various parameters on investment decisions. It can be used to predict the expected return of a fund or portfolio. Dummy variables can be used in various financial analysis models. If any regression assumptions are violated, analysts should correct for the violations before making any decisions. Study Session 3, Reading 13