LU p230 onwards…
Multilevel modeling: A brief Introduction
Many kinds of data have a hierarchical or clustered structure. Multilevel models
recognise the existence of such data hierarchies by allowing for residual
components at each level in the hierarchy.
For example, a two-level model which allows for grouping of child outcomes
within schools would include residuals at the child and school level. Thus the
residual variance is partitioned into a between-school component (the variance of
the school-level residuals) and a within-school component (the variance of the
child-level residuals).
The school residuals, often called ‘school effects’, represent unobserved school
characteristics that affect child outcomes. It is these unobserved variables which
lead to correlation between outcomes for children from the same school.
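The partition described above can be written out for the simplest case. What follows is a sketch of a two-level random-intercept model with no explanatory variables, for child i in school j. The symbols y, β0, u, e, σ²u, σ²e and ρ are notation assumed here for illustration and do not appear in the original text; the two residuals are assumed to be independent.

```latex
y_{ij} = \beta_0 + u_j + e_{ij}, \qquad
u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)

\operatorname{Var}(y_{ij}) = \sigma_u^2 + \sigma_e^2, \qquad
\rho = \operatorname{Corr}(y_{ij}, y_{i'j}) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
\quad (i \neq i')
```

Under these assumptions ρ is the share of residual variance lying between schools, and it is also the correlation between two children from the same school.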
Traditional multiple regression techniques treat the units of analysis as
independent observations. One consequence of failing to recognise hierarchical
structures is that standard errors of regression coefficients will be
underestimated, leading to an overstatement of statistical significance.
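A rough sense of how much the standard errors are understated can be had from the Kish design effect, 1 + (m - 1)ρ, for clusters of equal size m and intraclass correlation ρ. The sketch below is only an approximation (it applies strictly to a sample mean, not to every regression coefficient), and the cluster size, correlation and naive standard error used are hypothetical values, not figures from the study.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1.0) * icc


def corrected_se(naive_se: float, cluster_size: float, icc: float) -> float:
    """Inflate a standard error computed under independence by sqrt(design effect)."""
    return naive_se * math.sqrt(design_effect(cluster_size, icc))


# Hypothetical numbers: 25 children per school, intraclass correlation 0.10,
# and a naive standard error of 0.05.
deff = design_effect(25, 0.10)
se = corrected_se(0.05, 25, 0.10)
print(f"design effect = {deff:.2f}, corrected SE = {se:.4f}")
```

Even a modest intraclass correlation inflates the variance several times over when clusters are large, which is why ignoring the hierarchy overstates significance.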
Statistical Analysis of our data
Our statistical analysis of the survey data was carried out in several steps:
(A) Conventional Analysis of SBDC Responses using the Logit Model
Before analyzing the DBDC data using MLM, we first carried out a
conventional analysis of the Single Bound Dichotomous Choice (SBDC) data
using logistic regression.
Those who gave a negative response to the payment principle question
(Q39) were treated as protest bids and excluded from the analysis.
The dependent variable in this model is the binary ‘yes’/‘no’ response
to whether a respondent will pay the initial bid as a monthly WWTF.
In the logistic regression of the SBDC responses we found that, apart from
PCY and BID, none of the explanatory variables was significant.
The equation we get is:
$$L_i = \ln\left(\frac{Y_i}{1 - Y_i}\right) = 2.007 - 0.039\,\mathrm{BID} + 0.001\,\mathrm{PCY}$$
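A fitted logit equation of this form implies a median WTP: the bid at which the predicted probability of a ‘yes’ is 0.5, i.e. where the linear predictor equals zero. The sketch below uses the three coefficients reported above; the function names and the income value `pcy_example` are hypothetical and chosen only for illustration, since the income level at which the study evaluated WTP is not stated here.

```python
import math


def prob_yes(bid: float, pcy: float, a: float, b: float, c: float) -> float:
    """Logit probability of a 'yes': 1 / (1 + exp(-(a + b*bid + c*pcy)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * bid + c * pcy)))


def median_wtp(pcy: float, a: float, b: float, c: float) -> float:
    """Bid at which P(yes) = 0.5, i.e. where a + b*bid + c*pcy = 0."""
    return -(a + c * pcy) / b


# Coefficients as reported in the text for the SBDC logit model.
A, B, C = 2.007, -0.039, 0.001

# Hypothetical per capita income, NOT a figure taken from the survey.
pcy_example = 1000.0

m = median_wtp(pcy_example, A, B, C)
print(f"median WTP at PCY = {pcy_example:.0f}: Rs {m:.2f}")
print(f"P(yes) at that bid: {prob_yes(m, pcy_example, A, B, C):.3f}")
```

Because the coefficient on the bid is negative, the probability of acceptance falls as the bid rises, and the median shifts upward with income.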
(B) Using MLM to estimate WTP from DBDC data
The random effects at Level 2 may contain up to six terms: a variance term
associated with the intercepts, one with the slope of BID, one with the slope of
PCY, and three covariance terms (between the slope of BID and the intercept,
between the slope of PCY and the intercept, and between the slope of BID and the
slope of PCY). The MLM analysis was carried out in LISREL 8.30. On running the
program we found no significant estimated variance between the response variable
and BID, and the insignificant terms were omitted. The best model is given as
equation (7.38):
$$L_i = \ln\left(\frac{Y_i}{1 - Y_i}\right) = 1.20881 - 0.01946\,\mathrm{BID} + 0.00013\,\mathrm{PCY} \tag{7.38}$$
Figures in parentheses in the source, in the order printed: (0.38446), (0.00154),
(0.00021).
The final model includes PCY and BID as explanatory variables and significant
first-order interactions. The resulting median WTP is Rs 93.74.
Note that the variance in the model has been divided between individual effects
and effects due to different mean responses to each bid amount offered. The
multilevel model is therefore a correct representation of the information
gathered in the CV study, as it models the natural hierarchy present in the data.
This also shows that the multilevel approach requires more thoughtful
interpretation than OLS equivalents (Bateman and Langford 1999).
Our Use of MLM method
MLM is the appropriate approach for analyzing DBDC data since it provides the
opportunity to study variation at different levels of the hierarchy.
The DBDC data generated by the CV survey are essentially hierarchical in
character: a three-level structure with responses at level 1, individuals at
level 2, and the initial bid level presented at level 3.
However, in carrying out the MLM exercise we are principally concerned with
estimating E(WTP) using DBDC data and separating out effects due to the
design of the response structure from those due to individuals.
Consequently, in this study we have considered only a two-level hierarchy, with
responses nested within individuals, defining the former as level-1 variation and
the latter as level-2.
Moreover, modeling the three-level hierarchy requires a large number of initial
bid levels, which we did not have.
Since PCY and BID were the only significant variables in the analysis of the
SBDC data, these were included in the estimation of WTP.
The multilevel model we have estimated is:
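$$\mathrm{LOGIT}_{ki} = a + b\,\mathrm{BID}_{ki} + c\,\mathrm{PCY}_{ki} + v_i\,\mathrm{BID}_{ki} + w_i\,\mathrm{PCY}_{ki} + u_i + e_{ki}$$
where k indexes responses (level 1) and i indexes individuals (level 2);
PCY = per capita income of the household of the respondent;
v_i and w_i allow random slopes on BID and PCY, with v_i ~ N(0, σ²_v);
u_i is the individual-level random intercept and e_ki the response-level residual.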
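Written out, the Level-2 random effects in the model above form a vector with a symmetric 3 × 3 covariance matrix, which has six distinct elements: three variances and three covariances. The following is a sketch only. Joint normality is assumed, and the symbols Ω, σ²u, σ²w, σuv, σuw and σvw are notation introduced here for illustration; only σ²v is named in the text.

```latex
\begin{pmatrix} u_i \\ v_i \\ w_i \end{pmatrix} \sim N(\mathbf{0}, \Omega), \qquad
\Omega =
\begin{pmatrix}
\sigma_u^2  &             &            \\
\sigma_{uv} & \sigma_v^2  &            \\
\sigma_{uw} & \sigma_{vw} & \sigma_w^2
\end{pmatrix}
```

Only the lower triangle is shown because the matrix is symmetric. Dropping a random slope removes its variance together with its covariances.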
(C) Estimating the incidence of the benefits from WQIYD
For this we used a linear regression model in which the dependent variable is
SWTP, i.e. the maximum WTP figure stated by the respondents in response to the
open-ended question at the end of the DBDC valuation questions.
The figures are treated as continuous. After the protest bids and outliers
were excluded, the final count of observations stood at 440, and the OLS
regression was run on these using the software package SPSS.
The variable bid was included to test for any anchoring effect or starting
point bias. Taking bid C as the baseline, bid B was not significant, but bid A
was positive and significant, indicating that those who were offered a bid
of Rs 100 did state a higher WTP. The effect of the bid variable was removed by
setting bid_a = 0.
The mean WTP for WQIYD comes down from Rs 79.46 to Rs 63.23 after
correcting for the starting point bias. The median WTP is Rs. 60.41.
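The correction for starting point bias can be illustrated with a deliberately simplified sketch. It assumes a regression of stated WTP on questionnaire dummies only (the study's regression may have included other covariates), in which case the OLS intercept is the baseline group's mean and each dummy coefficient is that group's mean minus the baseline mean. The function names and the data are hypothetical toy values, not survey figures.

```python
from statistics import mean


def fit_dummy_ols(swtp, questionnaire, baseline="C"):
    """OLS of stated WTP on questionnaire dummies only.

    With dummies as the only regressors, the intercept is the baseline group's
    mean and each coefficient is that group's mean minus the baseline mean.
    """
    groups = sorted(set(questionnaire))
    means = {
        g: mean(y for y, q in zip(swtp, questionnaire) if q == g) for g in groups
    }
    intercept = means[baseline]
    coefs = {g: means[g] - intercept for g in groups if g != baseline}
    return intercept, coefs


def mean_predicted_wtp(intercept, coefs, questionnaire, zeroed=()):
    """Average fitted WTP, with the dummies named in `zeroed` set to 0."""
    return mean(
        intercept + (0.0 if q in zeroed else coefs.get(q, 0.0))
        for q in questionnaire
    )


# Toy data, NOT survey figures: stated WTP and the questionnaire received.
swtp = [100, 120, 60, 50, 40, 60]
questionnaire = ["A", "A", "B", "B", "C", "C"]

intercept, coefs = fit_dummy_ols(swtp, questionnaire)
unadjusted = mean_predicted_wtp(intercept, coefs, questionnaire)
adjusted = mean_predicted_wtp(intercept, coefs, questionnaire, zeroed=("A",))
print(f"unadjusted mean WTP = {unadjusted:.2f}, with bid_a = 0: {adjusted:.2f}")
```

Setting the bid A dummy to zero predicts each Questionnaire A respondent as if they had been offered the baseline bid, which lowers the mean whenever the bid A coefficient is positive.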
We see that the WTP estimates obtained from the OLS regression are the most
conservative figures. DBDC data have yielded much higher WTP estimates.
Moreover, the OLS regression method yields the mean WTP, whereas the other two
give us estimates of the median WTP.
Since the median cannot be aggregated over the population (Duffield and Patterson
1991), the OLS estimates have been used for determining the incidence of these
benefits between different income groups.
In this study we have used the income groups defined by the Market Information
Survey of Households (MISH) 2001-02 conducted by NCAER.
The NCAER survey divides the population into five income groups: lower income
group (annual household income of up to Rs 45,000), lower-middle income group
(Rs 45,001 to Rs 90,000), middle income group (Rs 90,001 to Rs 135,000),
upper-middle income group (Rs 135,001 to Rs 180,000) and higher income group
(annual household income of more than Rs 180,000).
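The grouping above amounts to a simple threshold lookup. The sketch below is one way to express it; the function name and group labels are hypothetical, and an income exactly on a boundary (for example Rs 135,000) is assumed to fall in the lower of the two adjacent groups.

```python
def income_group(annual_income_rs: float) -> str:
    """Map annual household income (Rs) to the five groups listed in the text."""
    if annual_income_rs <= 45_000:
        return "lower"
    if annual_income_rs <= 90_000:
        return "lower-middle"
    if annual_income_rs <= 135_000:
        return "middle"
    if annual_income_rs <= 180_000:
        return "upper-middle"
    return "higher"


# Hypothetical household incomes, for illustration only.
for income in (30_000, 90_000, 90_001, 200_000):
    print(income, income_group(income))
```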
Using the SWTP data collected in the survey, we were able to estimate the average
WTP for a monthly WWTF of households belonging to different income groups,
tabulated below.
Average WTP for Different Income Classes
The average WTP per month per household (for the user and non-user values
associated with WQIYD) has been estimated by taking the average of the SWTP of
the individuals belonging to each income class.
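The averaging step is a group-wise mean. A minimal sketch follows, assuming each respondent is represented as an (income class, SWTP) pair; the function name and the records are hypothetical toy values, not survey data.

```python
from collections import defaultdict
from statistics import mean


def average_wtp_by_class(records):
    """records: iterable of (income_class, swtp) pairs -> {income_class: mean SWTP}."""
    by_class = defaultdict(list)
    for income_class, swtp in records:
        by_class[income_class].append(swtp)
    return {cls: mean(values) for cls, values in by_class.items()}


# Toy records, NOT survey data.
records = [
    ("lower", 20),
    ("lower", 30),
    ("middle", 60),
    ("higher", 100),
    ("higher", 140),
]
averages = average_wtp_by_class(records)
print(averages)
```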
In our study we have taken this WTP figure as the measure of the benefits derived
by different income groups (Ebert 2003).
The above figure clearly shows that water quality improvement is a normal good:
as expected, the demand for water quality improvement increases overall with
income.
The willingness of the higher income groups to pay a higher monthly WWTF is of
course a very positive result, since they are the ones who have the ability to
fund environmental protection programs.
Some reconsideration of the meaning of ‘regression’ and ‘progression’ is
needed here, since the incidence of benefits as well as costs is estimated. Using
the terms in parallel fashion, this study assumes that a benefit schedule is
regressive when the gain as a percentage of income declines as the level of income
rises, and that it is progressive when the opposite occurs.
It follows, however, that the implications for equality differ depending on whether
the regressive schedule is applied to benefits or to costs. Whereas a regressive
tax/costs schedule is “against the poor” and “pro-rich”, a regressive benefits/
expenditure schedule is “pro-poor” and “against the rich”.
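The definition above can be checked mechanically: compute the benefit as a percentage of income for each group and see whether that share falls or rises as income rises. The sketch below does this; the function names and the income and benefit figures are hypothetical and serve only to illustrate the definition.

```python
def benefit_shares(incomes, benefits):
    """Benefit as a percentage of income, ordered by ascending income."""
    pairs = sorted(zip(incomes, benefits))
    return [100.0 * benefit / income for income, benefit in pairs]


def classify_schedule(incomes, benefits):
    """'regressive' if the share falls at every step up in income,
    'progressive' if it rises at every step, otherwise 'mixed or proportional'."""
    shares = benefit_shares(incomes, benefits)
    if len(shares) < 2:
        raise ValueError("need at least two income groups to compare")
    steps = list(zip(shares, shares[1:]))
    if all(later < earlier for earlier, later in steps):
        return "regressive"
    if all(later > earlier for earlier, later in steps):
        return "progressive"
    return "mixed or proportional"


# Hypothetical annual incomes and annual benefits (Rs), not survey figures.
incomes = [40_000, 80_000, 120_000, 160_000]
benefits = [400, 600, 720, 800]
result = classify_schedule(incomes, benefits)
print(benefit_shares(incomes, benefits), result)
```

In this toy example the absolute benefit rises with income while its share of income falls, so the schedule is regressive in the sense used here, which for a benefit is “pro-poor”.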
A crucial issue in this method is the choice of the bids offered to respondents,
since it can affect the estimated mean WTP. Three questionnaires were designed
with different double-bounded dichotomous choice bid sets to measure the
respondents’ willingness to pay: Questionnaire A: 80-100-120, Questionnaire B:
35-50-65 and Questionnaire C: 10-20-30.
These were based on the WTP figures obtained in a pilot study in which an
open-ended format was used.
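The bid sets can be read as a double-bounded design: a ‘yes’ to the initial bid is followed by the higher bid and a ‘no’ by the lower one, and the pair of answers brackets the respondent's WTP. The sketch below assumes that the middle figure in each triple is the initial bid (consistent with the Rs 100 initial bid mentioned for Questionnaire A) and that WTP cannot be negative; the function names are hypothetical.

```python
# (lower follow-up, initial bid, upper follow-up) in Rs, as listed in the text.
BID_SETS = {"A": (80, 100, 120), "B": (35, 50, 65), "C": (10, 20, 30)}


def follow_up_bid(questionnaire: str, first_answer_yes: bool) -> int:
    """Second bid offered: the higher bid after a 'yes', the lower after a 'no'."""
    lower, _initial, upper = BID_SETS[questionnaire]
    return upper if first_answer_yes else lower


def wtp_interval(questionnaire: str, first_yes: bool, second_yes: bool):
    """Interval [low, high) implied for WTP; high is None when unbounded above."""
    lower, initial, upper = BID_SETS[questionnaire]
    if first_yes:
        return (upper, None) if second_yes else (initial, upper)
    return (lower, initial) if second_yes else (0, lower)


print(follow_up_bid("A", True))        # bid offered after a 'yes' to Rs 100
print(wtp_interval("B", False, True))  # 'no' to the initial bid, 'yes' to the lower bid
```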
This is, in fact, supported by the literature. Most examinations of elicitation
effects have compared WTP measures between open-ended (OE) and dichotomous choice
(DC) studies, with the majority reporting that DC mean WTP exceeds that from OE
experiments (Bateman et al. 1999).