3. POOLED OLS
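. regress ln_Firm_Employment ln_Wage ln_Capital ln_Industry_Output

      Source |       SS           df       MS      Number of obs   =     1,031
-------------+----------------------------------   F(3, 1027)      =   1740.12
       Model |  1548.91136         3  516.303787   Prob > F        =    0.0000
    Residual |  304.717448     1,027  .296706376   R-squared       =    0.8356
-------------+----------------------------------   Adj R-squared   =    0.8351
       Total |  1853.62881     1,030  1.79963962   Root MSE        =    .54471

------------------------------------------------------------------------------------
ln_Firm_Employment |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
           ln_Wage |  -.3669498   .0646708    -5.67   0.000    -.4938518   -.2400478
        ln_Capital |   .8090177   .0112526    71.90   0.000      .786937    .8310984
ln_Industry_Output |   .4791146   .1810233     2.65   0.008     .1238969    .8343324
             _cons |   .3444242    .860552     0.40   0.689    -1.344217    2.033065
------------------------------------------------------------------------------------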
Source: Kaggle
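The summary statistics in the pooled OLS output can be reproduced from its sums of squares. The following is a minimal sketch in Python (not part of the original Stata session); the variable names are illustrative and the numbers are copied from the output above.

```python
import math

# Values copied from the ANOVA block of the pooled OLS output
ss_model, df_model = 1548.91136, 3
ss_resid, df_resid = 304.717448, 1027
ss_total = ss_model + ss_resid          # 1853.62881 in the output
df_total = df_model + df_resid          # 1,030

ms_model = ss_model / df_model          # mean square, Model
ms_resid = ss_resid / df_resid          # mean square, Residual

r_squared = ss_model / ss_total
adj_r_squared = 1 - ms_resid / (ss_total / df_total)
f_stat = ms_model / ms_resid            # F(3, 1027)
root_mse = math.sqrt(ms_resid)

# t statistic = coefficient / standard error (here for ln_Wage)
t_ln_wage = -0.3669498 / 0.0646708

print(f"R-squared     = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
print(f"F(3, 1027)    = {f_stat:.2f}")
print(f"Root MSE      = {root_mse:.5f}")
print(f"t(ln_Wage)    = {t_ln_wage:.2f}")
```

The printed values should agree with the Stata output up to rounding.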
4. MULTICOLLINEARITY
. vif

    Variable |       VIF       1/VIF
-------------+----------------------
  ln_Capital |      1.01    0.992333
ln_Industr~t |      1.00    0.995690
     ln_Wage |      1.00    0.995715
-------------+----------------------
    Mean VIF |      1.01
Source: Kaggle
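/* Note: type vif immediately after running the pooled OLS regression. */
A VIF above 10, or equivalently a tolerance (1/VIF) below 0.10, indicates troublesome multicollinearity; if so, you may consider dropping the offending independent variable. Here every VIF is close to 1, so multicollinearity is not a concern.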
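As a rough illustration of the VIF rule, the sketch below (Python, not part of the original Stata session) recomputes each VIF as the reciprocal of the tolerance column and applies the VIF > 10 threshold. The dictionary keys spell out the abbreviated name ln_Industr~t as ln_Industry_Output; the tolerances are copied from the vif output.

```python
# Tolerances (1/VIF) copied from the vif output
tolerance = {
    "ln_Capital": 0.992333,
    "ln_Industry_Output": 0.995690,
    "ln_Wage": 0.995715,
}

# VIF is the reciprocal of the tolerance
vif = {name: 1.0 / tol for name, tol in tolerance.items()}
mean_vif = sum(vif.values()) / len(vif)

# Rule of thumb from the text: VIF > 10 (tolerance < 0.10) signals trouble
flagged = [name for name, value in vif.items() if value > 10]

for name, value in vif.items():
    print(f"{name:<20} VIF = {value:.2f}")
print(f"Mean VIF = {mean_vif:.2f}")
print("Flagged variables:", flagged)
```

No variable is flagged, which is consistent with a mean VIF of about 1.01.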
5. SPECIFICATION ERROR
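. linktest

      Source |       SS           df       MS      Number of obs   =     1,031
-------------+----------------------------------   F(2, 1028)      =   2614.29
       Model |  1549.06504         2   774.53252   Prob > F        =    0.0000
    Residual |   304.56377     1,028  .296268259   R-squared       =    0.8357
-------------+----------------------------------   Adj R-squared   =    0.8354
       Total |  1853.62881     1,030  1.79963962   Root MSE        =    .54431

------------------------------------------------------------------------------
ln_Firm_Em~t |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        _hat |   1.016876   .0272088    37.37   0.000     .9634848    1.070267
      _hatsq |  -.0059652   .0082826    -0.72   0.472    -.0222179    .0102874
       _cons |   -.002207   .0225843    -0.10   0.922    -.0465236    .0421095
------------------------------------------------------------------------------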
/* Note: type linktest immediately after running the pooled OLS regression. */
H0: There is no specification error.
Since the p-value of _hatsq (0.472) is greater than 0.05, at the 95% level of confidence (α = 0.05) we do not reject the null. There is no evidence of specification error, so we take the model to be correctly specified.
Source: Kaggle
6. TESTING FOR NORMALITY
predict e, resid [then] kdensity e, normal [then] pnorm e [then] qnorm e
[Figure: Kernel density estimate of the residuals with the normal density overlaid (kernel = epanechnikov, bandwidth = 0.1085)]
[Figure: Standardized normal probability plot from pnorm: Normal F[(e-m)/s] against Empirical P[i] = i/(N+1)]
[Figure: Quantile-normal plot from qnorm: Residuals against Inverse Normal]
. swilk e

                   Shapiro-Wilk W test for normal data

    Variable |        Obs       W           V         z       Prob>z
-------------+------------------------------------------------------
           e |      1,031    0.97020     19.323     7.343    0.00000
H0: The distribution of the residuals is normal.
Since p <= 0.05 (Prob>z = 0.00000), at the 95% level of confidence (α = 0.05) we reject the null and conclude that the residuals are not normally distributed. In practice, with a large sample such as this one (1,031 observations), non-normality of the residuals is not much of a problem.
Source: Kaggle
7. PANEL REGRESSION
/* Declare the panel structure first: xtset Panel_ID year, yearly */ Here the Panel_ID variable is Industry_Code.
. xtreg ln_Firm_Employment ln_Industry_Output ln_Wage ln_Capital, fe

Fixed-effects (within) regression               Number of obs     =      1,031
Group variable: Industry_C~e                    Number of groups  =          9

R-sq:                                           Obs per group:
     within  = 0.8795                                         min =         36
     between = 0.5707                                         avg =      114.6
     overall = 0.8332                                         max =        206

                                                F(3,1019)         =    2480.08
corr(u_i, Xb)  = -0.2603                        Prob > F          =     0.0000

------------------------------------------------------------------------------------
ln_Firm_Employment |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
ln_Industry_Output |   .1132458   .1658368     0.68   0.495    -.2121748    .4386665
           ln_Wage |  -.6147252   .0765928    -8.03   0.000    -.7650228   -.4644275
        ln_Capital |   .8632881   .0100616    85.80   0.000     .8435443    .8830319
             _cons |   2.844049    .826946     3.44   0.001     1.221337    4.466761
-------------------+----------------------------------------------------------------
           sigma_u |  .33572551
           sigma_e |  .45520323
               rho |  .35231013   (fraction of variance due to u_i)
------------------------------------------------------------------------------------
F test that all u_i=0: F(8, 1019) = 56.45                    Prob > F = 0.0000
Source: Kaggle
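Two quantities at the foot of the fixed-effects output can be checked by hand: rho, and the F test that all u_i = 0. The sketch below (Python, not part of the original Stata session) uses the standard formulas rho = sigma_u^2 / (sigma_u^2 + sigma_e^2) and F = [(RSS_pooled - RSS_FE) / (groups - 1)] / [RSS_FE / df_FE]. It assumes that sigma_e^2 equals RSS_FE / df_FE with df_FE = N - groups - regressors, and it takes the pooled OLS residual sum of squares from section 3 as the restricted model. Variable names are illustrative.

```python
# Values copied from the fixed-effects output
sigma_u = 0.33572551   # std. dev. of the industry effects u_i
sigma_e = 0.45520323   # std. dev. of the idiosyncratic error
n_obs, n_groups, n_regressors = 1031, 9, 3

# rho: fraction of the error variance due to u_i
rho = sigma_u**2 / (sigma_u**2 + sigma_e**2)

# F test that all u_i = 0
# Restricted model: pooled OLS (Residual SS from section 3)
# Unrestricted model: fixed effects
rss_pooled = 304.717448
df_fe = n_obs - n_groups - n_regressors     # residual df of the FE model
rss_fe = sigma_e**2 * df_fe                 # assumes sigma_e^2 = RSS_FE / df_FE
f_ui = ((rss_pooled - rss_fe) / (n_groups - 1)) / (rss_fe / df_fe)

print(f"rho = {rho:.4f}")
print(f"F({n_groups - 1}, {df_fe}) = {f_ui:.2f}")
```

Both results should match the reported rho (.35231013) and F(8, 1019) = 56.45 up to rounding. A large F with Prob > F = 0.0000 is evidence against the hypothesis that all industry effects are zero.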
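8. PANEL REGRESSION (RANDOM EFFECTS)
. xtreg ln_Firm_Employment ln_Industry_Output ln_Wage ln_Capital, re

Random-effects GLS regression                   Number of obs     =      1,031
Group variable: Industry_C~e                    Number of groups  =          9

R-sq:                                           Obs per group:
     within  = 0.8795                                         min =         36
     between = 0.5708                                         avg =      114.6
     overall = 0.8333                                         max =        206

                                                Wald chi2(3)      =    7443.42
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------------
ln_Firm_Employment |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
ln_Industry_Output |   .1196021   .1655413     0.72   0.470     -.204853    .4440572
           ln_Wage |  -.6107086   .0759802    -8.04   0.000    -.7596271     -.46179
        ln_Capital |   .8623775   .0100485    85.82   0.000     .8426828    .8820721
             _cons |   2.776657   .8313633     3.34   0.001     1.147215    4.406099
-------------------+----------------------------------------------------------------
           sigma_u |  .33132218
           sigma_e |  .45520323
               rho |  .34630852   (fraction of variance due to u_i)
------------------------------------------------------------------------------------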
Source: Kaggle
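9. HAUSMAN TEST
The Hausman test below is not conclusive: the fixed and random columns are identical (both equal the random-effects estimates), so every difference is 0 and Stata reports chi2(0) with a missing p-value. This indicates that both stored results came from the same estimation, so the test should be rerun with the fixed-effects and random-effects results stored separately. If there were neither significant industry nor significant temporal effects, we could pool all the data and run an ordinary least squares (OLS) regression model (pooled regression model, or constant coefficients model). Note, however, that the F test that all u_i = 0 in the fixed-effects output (F(8, 1019) = 56.45, Prob > F = 0.0000) points to significant industry effects.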
. hausman fixed random

Note: the rank of the differenced variance matrix (0) does not equal the number of coefficients being tested (3); be sure
      this is what you expect, or there may be problems computing the test. Examine the output of your estimators for
      anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar scale.

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference          S.E.
-------------+----------------------------------------------------------------
ln_Industr~t |    .1196021     .1196021               0               0
     ln_Wage |   -.6107086    -.6107086               0               0
  ln_Capital |    .8623775     .8623775               0               0
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(0) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =        0.00
                Prob>chi2 =           .
                (V_b-V_B is not positive definite)
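/* Type as below. Each estimates store must directly follow its own xtreg run;
storing both results after the same model makes the two columns identical. */
xtreg ln_Firm_Employment ln_Industry_Output ln_Wage ln_Capital, fe
estimates store fixed
xtreg ln_Firm_Employment ln_Industry_Output ln_Wage ln_Capital, re
estimates store random
hausman fixed random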
Source: Kaggle
10. HETEROSKEDASTICITY
/* xtgls depvar indepvars, igls panels(heteroskedastic)
. estimates store hetero
. xtgls depvar indepvars, igls
. local df = e(N_g) - 1
. lrtest hetero . , df(`df')
H0: Homoskedasticity */
Note: If the p-value is <= 0.05, reject H0; otherwise do not reject H0.
11. AUTOCORRELATION
/* findit xtserial [press Enter]
Click on st0039, then click on "click here to install" */
xtserial dependent indep1 indep2…
H0: No serial correlation
If the p-value is <= 0.05, reject H0; otherwise do not reject H0.