This document provides an introduction and overview of stochastic frontier analysis, which models a production frontier as a stochastic function to account for noise in production. It discusses estimating the parameters of a stochastic frontier model using maximum likelihood, predicting technical efficiency at the firm and industry level, and hypothesis testing using likelihood ratio tests. The key steps are estimating the stochastic frontier model, predicting technical efficiencies based on the estimates, and testing hypotheses about inefficiency effects.
3. Outline
Introduction; The stochastic production frontier; Estimating the parameters; Predicting technical efficiency; Hypothesis testing; Conclusion
ECON377/477 Topic 4.1
4. Introduction
Assume cross-sectional data on $I$ firms. A simple method to estimate a production frontier using such data is to envelop the data points using an arbitrarily chosen function. Consider a Cobb-Douglas production frontier: $\ln q_i = x_i'\beta - u_i$, $i = 1, \dots, I$, where $q_i$ is the output of the $i$-th firm; $x_i$ is a $K \times 1$ vector containing the logarithms of inputs; $\beta$ is a vector of unknown parameters; and $u_i$ is a non-negative random variable associated with technical inefficiency.
5. Introduction
This production frontier is deterministic insofar as $q_i$ is bounded from above by the non-stochastic (deterministic) quantity $\exp(x_i'\beta)$. A problem with frontiers of this type (and with the DEA frontier studied in Topic 3) is that no account is taken of measurement errors and other sources of statistical noise. All deviations from the frontier are assumed to be the result of technical inefficiency.
8. The stochastic production frontier
The stochastic frontier production function model is of the form: $\ln q_i = x_i'\beta + v_i - u_i$, where $v_i$ is a symmetric random error to account for statistical noise. The model is called a stochastic frontier production function because the output values are bounded from above by the stochastic (random) variable $\exp(x_i'\beta + v_i)$.
9. The stochastic production frontier
The random error $v_i$ can be positive or negative. Therefore, the stochastic frontier outputs vary about the deterministic part of the model, $\exp(x_i'\beta)$. These features of the stochastic frontier model can be illustrated graphically. It is convenient to restrict attention to firms that produce the output $q_i$ using only one input, $x_i$.
10. The stochastic production frontier
In this case, a Cobb-Douglas stochastic frontier model takes the form: $\ln q_i = \beta_0 + \beta_1 \ln x_i + v_i - u_i$. Alternatively, $q_i = \exp(\beta_0 + \beta_1 \ln x_i) \times \exp(v_i) \times \exp(-u_i)$, where the three factors are, in order, the deterministic component, the noise and the inefficiency.
11. The stochastic production frontier
Such a frontier is depicted on the next slide, where we plot the inputs and outputs of two firms, A and B. The deterministic component of the frontier model has been drawn to reflect the existence of diminishing returns to scale. Values of the input are measured along the horizontal axis and outputs are measured on the vertical axis. Firm A uses the input level $x_A$ to produce the output $q_A$, while Firm B uses the input level $x_B$ to produce the output $q_B$.
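12. The stochastic production frontier
[Figure: output $q_i$ (vertical axis) plotted against input $x_i$ (horizontal axis), showing the deterministic frontier $q_i = \exp(\beta_0 + \beta_1 \ln x_i)$ and Firms A and B at input levels $x_A$ and $x_B$. The frontier outputs $q_A^* \equiv \exp(\beta_0 + \beta_1 \ln x_A + v_A)$ and $q_B^* \equiv \exp(\beta_0 + \beta_1 \ln x_B + v_B)$ are the outputs with no inefficiency effects ($u_A = u_B = 0$). Each frontier output differs from the deterministic frontier by a noise effect, and each observed output, $q_A$ or $q_B$, lies below its frontier output by an inefficiency effect.]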
13. The stochastic production frontier
The frontier output for Firm A lies above the deterministic part of the production frontier only because the noise effect is positive ($v_A > 0$). The frontier output for Firm B lies below the deterministic part of the frontier because the noise effect is negative ($v_B < 0$). The observed output of Firm A lies below the deterministic part of the frontier because the sum of the noise and inefficiency effects is negative ($v_A - u_A < 0$).
14. The stochastic production frontier
The (unobserved) frontier outputs tend to be evenly distributed above and below the deterministic part of the frontier. But observed outputs tend to lie below the deterministic part of the frontier. Indeed, they can only lie above the deterministic part of the frontier when the noise effect is positive and larger than the inefficiency effect. Much of stochastic frontier analysis is directed towards the prediction of the inefficiency effects.
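The pattern described on the preceding slides can be checked with a small simulation. The sketch below is illustrative only: the parameter values, the sample size, the uniform input distribution and the half-normal choice for $u_i$ are all assumptions made for the example, not values from the unit.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

# Hypothetical parameter values, chosen only for illustration
beta0, beta1 = 1.0, 0.5
sigma_v, sigma_u = 0.2, 0.3
n_firms = 500

x = rng.uniform(1.0, 10.0, size=n_firms)             # single input
v = rng.normal(0.0, sigma_v, size=n_firms)           # symmetric noise, either sign
u = np.abs(rng.normal(0.0, sigma_u, size=n_firms))   # half-normal inefficiency, u >= 0

deterministic = np.exp(beta0 + beta1 * np.log(x))    # deterministic component
frontier_output = deterministic * np.exp(v)          # stochastic frontier output q*
q = frontier_output * np.exp(-u)                     # observed output

# Frontier outputs scatter on both sides of the deterministic part;
# observed outputs exceed it only when v > u.
share_frontier_above = np.mean(frontier_output > deterministic)
share_above = np.mean(q > deterministic)
print(f"frontier outputs above deterministic part: {share_frontier_above:.2f}")
print(f"observed outputs above deterministic part: {share_above:.2f}")
```

Observed output never exceeds the stochastic frontier output, because $\exp(-u_i) \le 1$.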
15. The stochastic production frontier
The most common output-oriented measure of technical efficiency is the ratio of observed output to the corresponding stochastic frontier output: $TE_i = \dfrac{q_i}{\exp(x_i'\beta + v_i)} = \dfrac{\exp(x_i'\beta + v_i - u_i)}{\exp(x_i'\beta + v_i)} = \exp(-u_i)$. This measure of technical efficiency (TE) takes a value between zero and one.
16. The stochastic production frontier
$TE_i$ measures the output of the $i$-th firm relative to the output that could be produced by a fully efficient firm using the same input vector. The first step in predicting the technical efficiency, $TE_i$, is to estimate the parameters of the stochastic production frontier model. Because $TE_i$ is a random variable, and not a parameter, we use the term ‘predict’ instead of ‘estimate’.
17. Estimating the parameters
It is common to assume that each $v_i$ is distributed independently of each $u_i$, and that both errors are uncorrelated with the explanatory variables in $x_i$. In addition, we assume: $E(v_i) = 0$ (zero mean); $E(v_i^2) = \sigma_v^2$ (homoskedastic); $E(v_i v_j) = 0$ for all $i \neq j$ (uncorrelated); $E(u_i^2) = $ constant (homoskedastic); $E(u_i u_j) = 0$ for all $i \neq j$ (uncorrelated).
18. Estimating the parameters
We cannot use the OLS estimates to compute measures of technical efficiency, because the OLS estimate of the intercept is biased downwards: the composite error has mean $E(v_i - u_i) = -E(u_i) < 0$. One solution to this problem is to correct for the bias in the intercept term using an estimator known as the corrected ordinary least squares (COLS) estimator. A better solution is to make some distributional assumptions concerning the two error terms and estimate the model using the method of maximum likelihood (ML).
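A minimal sketch of one common COLS variant follows: fit OLS, then shift the intercept up by the largest residual so that the fitted frontier envelops the data. Which variant of the correction is intended here is an assumption, and the function name `cols_frontier` is hypothetical.

```python
import numpy as np

def cols_frontier(log_q, X):
    """Corrected OLS sketch: OLS fit, then shift the intercept by the largest residual.

    log_q : (n,) array of log-outputs
    X     : (n, k) array of log-inputs (without a constant column)
    """
    Z = np.column_stack([np.ones(len(log_q)), X])      # add the intercept column
    coef, *_ = np.linalg.lstsq(Z, log_q, rcond=None)   # ordinary least squares
    resid = log_q - Z @ coef
    shift = resid.max()                                # largest OLS residual
    coef_cols = coef.copy()
    coef_cols[0] += shift                              # shifted frontier bounds every firm
    te = np.exp(resid - shift)                         # efficiency relative to that frontier
    return coef_cols, te
```

Under this variant all deviations from the shifted frontier are treated as inefficiency, so noise is ignored; that is the weakness the ML approach addresses.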
19. Estimating the parameters: the half-normal distribution
We assume the $v_i$s are independently and identically distributed normal random variables with zero means and variances $\sigma_v^2$. We also assume the $u_i$s are independently and identically distributed half-normal random variables with scale parameter $\sigma_u^2$. That is, the pdf of each $u_i$ is a truncated version of a normal random variable having zero mean and variance $\sigma_u^2$.
20. Estimating the parameters: the half-normal distribution
We parameterise the log-likelihood function for this so-called half-normal model in terms of $\sigma^2 = \sigma_v^2 + \sigma_u^2$ and $\lambda^2 = \sigma_u^2/\sigma_v^2 \ge 0$. If $\lambda = 0$, there are no technical inefficiency effects and all deviations from the frontier are due to noise.
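The two definitions can be inverted to recover the individual variances: $\sigma_u^2 = \lambda^2\sigma_v^2$ gives $\sigma^2 = \sigma_v^2(1 + \lambda^2)$. The helper below is a small sketch of that algebra; the function name is hypothetical.

```python
def variances_from_sigma2_lambda(sigma2, lam):
    """Recover (sigma_v^2, sigma_u^2) from sigma^2 = sigma_v^2 + sigma_u^2
    and lambda^2 = sigma_u^2 / sigma_v^2."""
    sigma_v2 = sigma2 / (1.0 + lam ** 2)
    sigma_u2 = sigma2 - sigma_v2
    return sigma_v2, sigma_u2
```

With $\lambda = 0$ the function returns $\sigma_u^2 = 0$, the no-inefficiency case noted above.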
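21. Estimating the parameters: the log-likelihood function
The log-likelihood function is:
$$\ln L(y \mid \beta, \sigma, \lambda) = -\frac{I}{2}\ln\left(\frac{\pi\sigma^2}{2}\right) + \sum_{i=1}^{I}\ln\Phi\left(-\frac{\varepsilon_i\lambda}{\sigma}\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{I}\varepsilon_i^2$$
where $y$ is a vector of log-outputs; $\varepsilon_i = v_i - u_i = \ln q_i - x_i'\beta$ is a composite error term; and $\Phi(x)$ is the cdf of the standard normal random variable evaluated at $x$. The likelihood function is maximised using an iterative optimisation procedure.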
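As an illustration of the iterative maximisation, the sketch below codes the standard half-normal log-likelihood in the $(\sigma^2, \lambda)$ parameterisation and hands it to a general-purpose optimiser. Several choices are assumptions of this example rather than anything prescribed by the unit: `X` is taken to include a column of ones, $\sigma^2$ and $\lambda$ are optimised on the log scale to keep them positive, OLS provides the starting values, and SciPy's L-BFGS-B routine is used with arbitrary bounds. It is not a description of how SHAZAM, FRONTIER or LIMDEP proceed.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_loglik(theta, y, X):
    """Negative half-normal log-likelihood; theta = (beta, ln sigma^2, ln lambda)."""
    k = X.shape[1]
    beta = theta[:k]
    sigma2 = np.exp(theta[k])          # log scale keeps sigma^2 > 0
    lam = np.exp(theta[k + 1])         # log scale keeps lambda > 0
    sigma = np.sqrt(sigma2)
    eps = y - X @ beta                 # composite error v - u
    n = len(y)
    ll = (-0.5 * n * np.log(np.pi * sigma2 / 2.0)
          + norm.logcdf(-eps * lam / sigma).sum()
          - (eps ** 2).sum() / (2.0 * sigma2))
    return -ll

def fit_half_normal(y, X):
    """Return (beta_hat, sigma2_hat, lambda_hat, maximised log-likelihood)."""
    k = X.shape[1]
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS starting values
    resid = y - X @ beta_ols
    start = np.concatenate([beta_ols, [np.log(resid.var()), 0.0]])
    # Bounds on the two log-scale parameters are arbitrary guards against overflow
    bounds = [(None, None)] * k + [(-10.0, 5.0), (-10.0, 5.0)]
    res = minimize(neg_loglik, start, args=(y, X), method="L-BFGS-B", bounds=bounds)
    return res.x[:k], np.exp(res.x[k]), np.exp(res.x[k + 1]), -res.fun
```

The returned log-likelihood value is the quantity that enters the likelihood ratio tests discussed later in the topic.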
22. Estimating the parameters: ML
The ML estimation of the half-normal stochastic frontier model is illustrated in CROB, page 248, by presenting annotated SHAZAM output from the estimation of a translog production frontier.
23. Estimating the parameters: ML
It is easier to use purpose-built software packages such as FRONTIER and LIMDEP. The FRONTIER instruction and data files used for estimating the half-normal model are presented in CROB, Tables 9.2 and 9.3. The instruction file should be self-explanatory (see the comments on the right-hand side of the file on the next slide). The FRONTIER output file is presented in CROB, Table 9.4.
24. Estimating the parameters: FRONTIER instruction file
    1              1=ERROR COMPONENTS MODEL, 2=TE EFFECTS MODEL
    chap9.txt      DATA FILE NAME
    chap9_2.out    OUTPUT FILE NAME
    1              1=PRODUCTION FUNCTION, 2=COST FUNCTION
    y              LOGGED DEPENDENT VARIABLE (Y/N)
    344            NUMBER OF CROSS-SECTIONS
    1              NUMBER OF TIME PERIODS
    344            NUMBER OF OBSERVATIONS IN TOTAL
    10             NUMBER OF REGRESSOR VARIABLES (Xs)
    n              MU (Y/N) [OR DELTA0 (Y/N) IF USING TE EFFECTS MODEL]
    n              ETA (Y/N) [OR NUMBER OF TE EFFECTS REGRESSORS (Zs)]
    n              STARTING VALUES (Y/N)
28. Gamma with mean $\lambda$ and degrees of freedom $m$
Theoretical considerations and computational complexity may influence the choice of distributional specification. Nevertheless, estimated elasticities and technological change effects are fairly robust to this change in the distributional assumption.
29. Estimating the parameters: alternative distributional specifications
Different distributional assumptions may give rise to different predictions of technical efficiency. But when we rank firms on the basis of predicted technical efficiencies, the rankings are often quite robust to distributional choice. In such cases, the principle of parsimony favours the simpler half-normal and exponential models.
30. Predicting technical efficiency: firms
The technical efficiency of the $i$-th firm is defined by $TE_i = \exp(-u_i)$. This result provides a basis for the prediction of both firm and industry technical efficiency. Firm technical efficiency refers to the individual TE scores of firms within an industry. Industry efficiency can be viewed as the average of the TEs of all the firms in the industry.
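31. Predicting technical efficiency: firms
We can summarise information about $u_i$ in the form of the truncated normal pdf as:
$$p(u_i \mid q_i) = \frac{1}{\sqrt{2\pi}\,\sigma_*\,\Phi(u_i^*/\sigma_*)}\exp\left\{-\frac{(u_i - u_i^*)^2}{2\sigma_*^2}\right\}, \quad u_i \ge 0$$
where $u_i^* = -(\ln q_i - x_i'\beta)\,\sigma_u^2/\sigma^2$ and $\sigma_*^2 = \sigma_v^2\sigma_u^2/\sigma^2$.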
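32. Predicting technical efficiency: firms
This conditional pdf gives information about likely and unlikely values of $u_i$ after firm $i$ has been selected in our sample and after we have observed its output, $q_i$. In most situations, we are interested in the efficiency of the $i$-th firm, $TE_i = \exp(-u_i)$. We use $p(u_i \mid q_i)$ to derive the predictor that minimises the mean square prediction error:
$$\widehat{TE}_i = E[\exp(-u_i) \mid q_i] = \frac{\Phi(u_i^*/\sigma_* - \sigma_*)}{\Phi(u_i^*/\sigma_*)}\exp\left\{\frac{\sigma_*^2}{2} - u_i^*\right\}$$
where $u_i^*$ and $\sigma_*^2$ are the location and variance parameters of the truncated normal pdf $p(u_i \mid q_i)$.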
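The conditional-mean predictor can be sketched in a few lines. The code assumes the half-normal model, with the truncated normal pdf located at $u_i^* = -\varepsilon_i\sigma_u^2/\sigma^2$ and having variance parameter $\sigma_*^2 = \sigma_v^2\sigma_u^2/\sigma^2$; in practice the residuals and variances would be replaced by their ML estimates. The function name `predict_te` is hypothetical.

```python
import numpy as np
from scipy.stats import norm

def predict_te(eps, sigma_v2, sigma_u2):
    """E[exp(-u_i) | q_i] under the half-normal model.

    eps : composite residuals ln q_i - x_i'beta (scalar or array)
    """
    eps = np.asarray(eps, dtype=float)
    sigma2 = sigma_v2 + sigma_u2
    u_star = -eps * sigma_u2 / sigma2                  # location of the truncated normal
    s_star = np.sqrt(sigma_v2 * sigma_u2 / sigma2)     # its scale parameter
    ratio = norm.cdf(u_star / s_star - s_star) / norm.cdf(u_star / s_star)
    return ratio * np.exp(0.5 * s_star ** 2 - u_star)
```

A firm with a larger (less negative) residual receives a higher predicted efficiency, and every prediction lies strictly between zero and one.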
33. Predicting technical efficiency: industry
A natural predictor of industry efficiency is the average of the predicted efficiencies of the firms in the sample: $\overline{TE} = \dfrac{1}{I}\sum_{i=1}^{I}\widehat{TE}_i$. Industry efficiency can also be viewed as the expected value of the efficiency of the $i$-th firm before any firms have been selected in the sample.
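34. Predicting technical efficiency: industry
Before we have collected the sample, our knowledge of $u_i$ can be summarised in the form of the half-normal pdf: $p(u_i) = \dfrac{2}{\sqrt{2\pi\sigma_u^2}}\exp\left\{-\dfrac{u_i^2}{2\sigma_u^2}\right\}$, $u_i \ge 0$. We can use this unconditional pdf to derive results similar to the above firm-specific results. An optimal estimator of industry efficiency is: $E[\exp(-u_i)] = 2\Phi(-\sigma_u)\exp\left(\sigma_u^2/2\right)$, evaluated at the estimate of $\sigma_u$.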
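For a half-normal $u_i$ the expectation $E[\exp(-u_i)]$ has the closed form $2\Phi(-\sigma_u)\exp(\sigma_u^2/2)$. The sketch below evaluates it and compares it with a Monte Carlo average; the value $\sigma_u = 0.3$, the seed and the number of draws are assumptions made for the check.

```python
import numpy as np
from scipy.stats import norm

def industry_te(sigma_u):
    """E[exp(-u)] when u is half-normal with scale parameter sigma_u."""
    return 2.0 * norm.cdf(-sigma_u) * np.exp(0.5 * sigma_u ** 2)

# Monte Carlo check with an illustrative value of sigma_u
rng = np.random.default_rng(0)
sigma_u = 0.3
u_draws = np.abs(rng.normal(0.0, sigma_u, size=200_000))
simulated = np.exp(-u_draws).mean()
print(f"closed form: {industry_te(sigma_u):.4f}, simulated: {simulated:.4f}")
```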
35. Hypothesis testing
The t- and F-tests are no longer justified in small samples because the composed error in the stochastic frontier model is not normally distributed. In addition to testing hypotheses concerning $\beta$, stochastic frontier researchers are often interested in testing for the absence of inefficiency effects. Stochastic frontier researchers normally use the Wald and likelihood ratio (LR) tests.
36. Hypothesis testing
The one-sided nature of the alternative hypothesis implies these tests are difficult to interpret. Moreover, they do not have asymptotic chi-square distributions. We will use the LR test statistic in this part of the unit. This statistic is asymptotically distributed as a mixture of chi-square distributions.
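To illustrate what a mixture distribution does to critical values, consider the simplest case of a single restriction on the boundary of the parameter space (for example, testing for no inefficiency effects in the half-normal model). The sketch assumes the standard result that the LR statistic is then an equal-weight mixture of $\chi^2(0)$ and $\chi^2(1)$; that this case is the one intended here is an assumption, and the function names are hypothetical.

```python
from scipy.stats import chi2

def mixed_chi2_critical_value(alpha):
    """Size-alpha critical value for a 50:50 mixture of chi2(0) and chi2(1).

    Valid for alpha < 0.5: the chi2(0) component puts all its mass at zero,
    so the upper tail comes from half of a chi2(1) distribution.
    """
    return chi2.ppf(1.0 - 2.0 * alpha, df=1)

def mixed_chi2_pvalue(lr):
    """P-value of an LR statistic under the same 50:50 mixture."""
    if lr <= 0.0:
        return 1.0
    return 0.5 * chi2.sf(lr, df=1)

crit_mixture = mixed_chi2_critical_value(0.05)   # one-sided 5 per cent critical value
crit_standard = chi2.ppf(0.95, df=1)             # usual chi2(1) critical value
print(f"mixture: {crit_mixture:.3f}, standard: {crit_standard:.3f}")
```

The mixture critical value is smaller than the standard one, so using the usual chi-square table would make the test too conservative.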
37. Hypothesis testing
In the case of the truncated-normal model, the null hypothesis should be rejected at the 5 per cent level of significance if the LR test statistic exceeds 5.138. This value is taken from Table 1 in Kodde and Palm (1986) and is smaller than the 5 per cent critical value, $\chi^2_{0.95}(2) = 5.99$. Table 9.9 in CROB presents FRONTIER output from the estimation of a truncated-normal model.
38. Hypothesis testing
From the results reported in this table, we compute: $LR = -2[-88.8451 + 71.6403] = 34.41$. This value, which is also reported in Table 9.9, exceeds 5.138, so we reject the null hypothesis. We can also use estimates from the truncated-normal model to test the null hypothesis that the simpler half-normal model is adequate. The relevant null and alternative hypotheses are $H_0\colon \mu = 0$ and $H_1\colon \mu \neq 0$.
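The arithmetic of the LR test can be reproduced directly from the two log-likelihood values quoted above. In this sketch the restricted (null) log-likelihood is taken to be -88.8451 and the unrestricted one -71.6403, following the order in which they appear in the slide's formula; the critical value 5.138 is the Kodde and Palm (1986) figure cited in the text, entered by hand rather than computed.

```python
from scipy.stats import chi2

def lr_statistic(loglik_restricted, loglik_unrestricted):
    """LR = -2 [ln L(H0) - ln L(H1)]."""
    return -2.0 * (loglik_restricted - loglik_unrestricted)

lr = lr_statistic(-88.8451, -71.6403)

KODDE_PALM_5PCT_2_RESTRICTIONS = 5.138      # from the text, not computed here
chi2_5pct_2df = chi2.ppf(0.95, df=2)        # standard critical value, for comparison

reject_null = lr > KODDE_PALM_5PCT_2_RESTRICTIONS
print(f"LR = {lr:.2f}, reject H0: {reject_null}")
print(f"standard chi-square(2) critical value: {chi2_5pct_2df:.2f}")
```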
39. Conclusion
Unfortunately, the simple production frontier model does not permit the prediction of the technical efficiencies of firms that produce multiple outputs. Moreover, the ML method does not allow us to assess the reliability of our inferences in small samples. These are two of the issues to be addressed in Part 2 of Topic 4, together with how the parameters of multiple-output technologies can be estimated using distance and cost functions.