Lecture 15b:
CVM Data Analysis
Trương Đăng Thụy
Sampling techniques
Non-Probabilistic
Convenience sample: the sample is assembled at the
convenience of the researcher.
Judgement sample: a panel of respondents
judged to be representative of the target
population is assembled.
Quota sample: selection is controlled by the
interviewer, ensuring that the sample contains given
proportions of various types of respondents.
Sampling techniques
Probabilistic
Simple random sampling: every respondent in the
sample frame has the same chance of being selected.
Systematic sampling: select every kth respondent from a
randomly ordered population frame.
Stratified sampling: the sampling frame is divided into
sub-populations (strata), and random sampling is used within
each stratum.
Cluster sampling: the population is divided into a set of
groups (clusters), and clusters are randomly selected. All
elements in the chosen clusters are included.
Multi-stage sampling: a random sample of elements is drawn
within the randomly chosen clusters.
Sample size
Coefficient of variation: $V = \dfrac{\sigma}{TWTP}$
Necessary sample size: $N = \left(\dfrac{Z V}{\delta}\right)^2$
If $V = 1$, $\alpha = 0.05$ (so $Z = 1.96$) and $\delta = 0.1$, then the sample
size must be 385.
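The sample-size rule N = (Z·V/δ)² can be checked with a few lines of Python. This is a minimal sketch; the function name `required_sample_size` is hypothetical, and rounding up to the next whole respondent is an assumption that matches the figure of 385 quoted above.

```python
import math

def required_sample_size(v, delta, z=1.96):
    """Smallest whole number N with N >= (z * v / delta) ** 2.

    v     : coefficient of variation (sigma / TWTP)
    delta : accepted relative deviation
    z     : critical value for the chosen alpha (1.96 for alpha = 0.05)
    """
    return math.ceil((z * v / delta) ** 2)

# Values from the slide: V = 1, Z = 1.96, delta = 0.1
print(required_sample_size(v=1.0, delta=0.1))  # 385
```

Because N grows with the square of V/δ, doubling the coefficient of variation roughly quadruples the required sample.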
In this session
Data of WTP
Estimating mean and median WTP
Non-parametric
Parametric
Testing validity of WTP values
Exercise
Data of WTP
Three types of CV data:
Continuous data (results from open-ended or
bidding game questions)
Binary data (response “yes” or “no” to a bid level)
Interval data (payment card or double-bounded
choice)
Estimating mean and median WTP:
non-parametric
Continuous data
Imagine a dataset of the maximum WTP of households/individuals (HH/ind).
The total number of HH/ind is N.
There are J different values of WTP. J might be smaller than
N, because several HH/ind could report the same WTP.
Order the values of WTP $C_j$ from lowest to highest ($j = 0, \dots, J$).
$C_0$ is always zero and $C_J$ is the largest value in the sample.
Let $h_j$ be the number of HH/ind in the sample with a WTP of $C_j$.
The total number of HH/ind with a WTP greater than $C_j$ is
$n_j = \sum_{k=j+1}^{J} h_k$
The survivor function is
$S(C_j) = \dfrac{n_j}{N}$
Mean WTP is
$\bar{C} = \sum_{j=0}^{J} S(C_j)\,[C_{j+1} - C_j]$
(the $j = J$ term is zero because $S(C_J) = 0$).
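The continuous-data steps (count the respondents above each distinct value, divide by N, then sum the survivor function times the step widths) can be sketched in Python as follows. The function name and the sample values are illustrative only. Reading the median as the smallest value at which the survivor function has dropped to 0.5 or below is an assumption, since the slide does not spell out a median rule.

```python
from collections import Counter

def continuous_survivor_mean_median(wtp):
    """Non-parametric survivor function, mean and median for stated WTP values."""
    n_total = len(wtp)
    counts = Counter(wtp)                      # h_j: respondents at each value
    values = sorted(set(counts) | {0})         # C_0 = 0 is always included
    # S(C_j) = share of respondents with WTP strictly greater than C_j
    survivor = [sum(h for v, h in counts.items() if v > c) / n_total
                for c in values]
    # Mean = sum of S(C_j) * (C_{j+1} - C_j); the last term vanishes
    mean = sum(survivor[j] * (values[j + 1] - values[j])
               for j in range(len(values) - 1))
    # Assumed median rule: first value where S(C_j) <= 0.5
    median = next(c for c, s in zip(values, survivor) if s <= 0.5)
    return values, survivor, mean, median

# Hypothetical open-ended answers from five respondents
values, survivor, mean, median = continuous_survivor_mean_median([0, 5, 5, 10, 20])
print(values, survivor, mean, median)
```

For this kind of data the result coincides with the ordinary sample mean, which is a convenient check on the survivor function.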
Estimating mean and median WTP:
non-parametric
Binary data
The total number of respondents is N.
The sub-sample facing bid $B_j$ has $N_j$ respondents.
The number of respondents saying “Yes” to
amount $B_j$ is $n_j$.
Survivor function:
$S(B_j) = \dfrac{n_j}{N_j}$
Mean WTP is
$\bar{C} = \sum_{j=1}^{J} S(B_j)\,[B_j - B_{j-1}]$, with $B_0 = 0$
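For binary data, the survivor function is the share of "Yes" answers at each bid, and the mean is the sum of each share times the distance to the previous bid (starting from a bid of zero). A minimal sketch, with a hypothetical function name and made-up bid data:

```python
def binary_survivor_and_mean(bids, n_yes, n_asked):
    """Survivor function S(B_j) = n_j / N_j and the resulting mean WTP.

    bids must be sorted from lowest to highest; B_0 = 0 is assumed.
    """
    survivor = [yes / asked for yes, asked in zip(n_yes, n_asked)]
    mean = 0.0
    previous_bid = 0.0
    for bid, share in zip(bids, survivor):
        mean += share * (bid - previous_bid)   # S(B_j) * (B_j - B_{j-1})
        previous_bid = bid
    return survivor, mean

# Illustrative data: three bid levels, 50 respondents each
survivor, mean = binary_survivor_and_mean(
    bids=[10, 20, 50], n_yes=[40, 25, 10], n_asked=[50, 50, 50])
print(survivor, mean)
```

Each "Yes" is valued only at the bid that was accepted, so the figure is conservative: it never credits a respondent with more than the amount they agreed to.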
Estimating mean and median WTP:
non-parametric
Binary data – increasing survivor function
Calculate
$S(B_j) = \dfrac{n_j}{N_j}$
Beginning with the first bid level, compare $S(B_j)$ with
$S(B_{j+1})$.
If $S(B_{j+1})$ is less than or equal to $S(B_j)$, continue.
If $S(B_{j+1}) > S(B_j)$, pool the observations of the two bid
levels and recalculate the survivor function:
$S(B_j) = \dfrac{n_j + n_{j+1}}{N_j + N_{j+1}}$
Continue until the survivor function is non-increasing.
Mean WTP is
$\bar{C} = \sum_{j=1}^{J} S(B_j)\,[B_j - B_{j-1}]$, with $B_0 = 0$
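The pooling rule can be sketched as a single pass that merges a bid level into the one below it whenever its "Yes" share is higher. The function name and data are hypothetical. Keeping the lower bid as the label of a pooled group is an assumption taken from the pooled formula, which assigns the combined share to B_j.

```python
def pool_bid_levels(bids, n_yes, n_asked):
    """Pool adjacent bid levels until the share of Yes answers is non-increasing.

    Returns a list of [bid, yes, asked] groups; a pooled group keeps the lower bid.
    """
    groups = []
    for bid, yes, asked in zip(bids, n_yes, n_asked):
        groups.append([bid, yes, asked])
        # Merge backwards while the newest group has a strictly higher Yes share
        # than the group before it (shares compared by cross-multiplication).
        while (len(groups) > 1 and
               groups[-1][1] * groups[-2][2] > groups[-2][1] * groups[-1][2]):
            _, yes_hi, asked_hi = groups.pop()
            groups[-1][1] += yes_hi
            groups[-1][2] += asked_hi
    return groups

# Illustrative data: the share rises from 0.4 at bid 20 to 0.5 at bid 50
pooled = pool_bid_levels([10, 20, 50, 100], [40, 20, 25, 5], [50, 50, 50, 50])
for bid, yes, asked in pooled:
    print(bid, yes / asked)
```

After pooling, the mean is computed from the remaining bid levels exactly as for binary data without violations.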
Estimating mean and median WTP:
non-parametric
Interval data: WTP lies in a range between a lower bound $B_L$ and an upper
bound $B_H$
Example intervals – non-overlapping
resp lower upper
1 0.5 1
2 0 0.5
3 1 4
4 4 10
5 1 4
Estimating mean and median WTP:
non-parametric
Interval data:
Non-overlapping: use the lower bound and
calculate as continuous data
Overlapping: may occur in double-bounded
dichotomous choice
The respondent is offered an initial bid
If yes, follow up with a higher amount
If no, follow up with a lower amount
Estimating mean and median WTP:
non-parametric
Interval data:
WTP will fall in ranges:
Yes to B and Yes to BH: WTP lies in [BH, ∞)
Yes to B and No to BH: WTP lies in [B, BH]
No to B and Yes to BL: WTP lies in [BL, B]
No to B and No to BL: WTP lies in [0, BL]
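The four response patterns translate directly into interval bounds. A small sketch, where the function name and argument names are hypothetical:

```python
import math

def wtp_interval(initial_bid, lower_bid, higher_bid, first_yes, second_yes):
    """Return (lower, upper) bounds on WTP from a double-bounded response.

    first_yes  : answer to the initial bid B
    second_yes : answer to the follow-up (BH after a Yes, BL after a No)
    """
    if first_yes:
        return (higher_bid, math.inf) if second_yes else (initial_bid, higher_bid)
    return (lower_bid, initial_bid) if second_yes else (0.0, lower_bid)

# Example: initial bid 4, follow-ups 1 (lower) and 10 (higher)
print(wtp_interval(4, 1, 10, first_yes=True, second_yes=False))   # (4, 10)
print(wtp_interval(4, 1, 10, first_yes=False, second_yes=False))  # (0.0, 1)
```

Because different respondents start from different initial bids, the resulting intervals can overlap.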
Example intervals and data
lower upper No. of resp in interval
0 0.5 10
0.5 1 14
1 4 12
4 10 4
10 ∞ 1
0 1 13
0 4 7
0 10 8
0.5 4 4
0.5 10 5
0.5 ∞ 7
1 10 4
1 ∞ 3
4 ∞ 4
Estimating mean and median WTP:
non-parametric
Break overlapping intervals into basic intervals
Starting from:
$1 = S(B_0) \ge S(B_1) \ge S(B_2) \ge \dots \ge S(B_J) \ge S(B_{J+1}) = 0$
Using basic intervals only, the probability of lying
in basic interval j from $B_{j-1}$ to $B_j$ is:
$S(B_{j-1}) - S(B_j)$
Consider an overlapping interval from $B_i$ to $B_k$ that
spans the basic interval j
Estimating mean and median WTP:
non-parametric
Break overlapping intervals into basic intervals
Calculate the conditional probability that a respondent whose
WTP is in the interval $B_i$ to $B_k$ has a WTP that lies
in basic interval j:
$\dfrac{S(B_{j-1}) - S(B_j)}{S(B_i) - S(B_k)}$
Multiply this probability by the number of respondents whose
WTP is in the interval $B_i$ to $B_k$ to obtain the estimated
number of respondents falling in basic interval j.
Estimating mean and median WTP:
non-parametric
Break overlapping intervals into basic intervals
Continue the process for all overlapping intervals to
obtain the survivor function for basic intervals
only.
Then estimate WTP:
The total number of HH/ind with a WTP greater than the
boundary value $B_j$ is
$n_j = \sum_{k=j+1}^{J} h_k$
where $h_k$ is the estimated number of HH/ind in basic interval k.
The survivor function is
$S(B_j) = \dfrac{n_j}{N}$
Mean WTP is
$\bar{C} = \sum_{j=1}^{J} S(B_j)\,[B_j - B_{j-1}]$, with $B_0 = 0$
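The allocation of overlapping intervals to basic intervals can be sketched in Python using the example interval counts given earlier. This is a single-pass sketch under stated assumptions: probabilities of the basic intervals are taken from the respondents observed in basic intervals only, the function names are hypothetical, and the list index convention (basic interval t runs from `boundaries[t]` to `boundaries[t + 1]`) is chosen for the code and differs from the slide's subscripts.

```python
import math

boundaries = [0.0, 0.5, 1.0, 4.0, 10.0, math.inf]
basic_counts = [10, 14, 12, 4, 1]        # respondents observed in basic intervals
overlapping = [                          # (lower, upper, respondents)
    (0.0, 1.0, 13), (0.0, 4.0, 7), (0.0, 10.0, 8),
    (0.5, 4.0, 4), (0.5, 10.0, 5), (0.5, math.inf, 7),
    (1.0, 10.0, 4), (1.0, math.inf, 3), (4.0, math.inf, 4),
]

def allocate_to_basic(boundaries, basic_counts, overlapping):
    """Spread each overlapping interval's respondents over the basic intervals it spans."""
    total_basic = sum(basic_counts)
    prob = [c / total_basic for c in basic_counts]   # probability of each basic interval
    estimated = [float(c) for c in basic_counts]
    for lower, upper, respondents in overlapping:
        spanned = range(boundaries.index(lower), boundaries.index(upper))
        span_prob = sum(prob[t] for t in spanned)    # probability of the whole span
        for t in spanned:
            # conditional probability times number of respondents in the span
            estimated[t] += respondents * prob[t] / span_prob
    return estimated

def survivor_at_boundaries(estimated):
    """Share of respondents placed at or above each finite boundary."""
    n_total = sum(estimated)
    return [sum(estimated[j:]) / n_total for j in range(len(estimated))]

estimated = allocate_to_basic(boundaries, basic_counts, overlapping)
survivor = survivor_at_boundaries(estimated)         # at 0, 0.5, 1, 4, 10
# Mean with each respondent valued at the lower boundary of its basic interval
mean_wtp = sum(survivor[j] * (boundaries[j] - boundaries[j - 1])
               for j in range(1, len(survivor)))
print([round(e, 2) for e in estimated])
print([round(s, 3) for s in survivor], round(mean_wtp, 3))
```

The allocation preserves the total number of respondents, so the estimated counts can be checked against the sample size before computing the mean.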
Confidence intervals from non-
parametric estimation
Continuous data
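Variance of population WTP
$\mathrm{var}(C) = \dfrac{\sum_{i=1}^{N} C_i^2 - N\bar{C}^2}{N - 1}$
Estimate of variance of mean WTP
$\mathrm{var}(\bar{C}) = \dfrac{\mathrm{var}(C)}{N}$
Confidence interval (95%)
$\bar{C} - 1.96\sqrt{\mathrm{var}(\bar{C})}$ to $\bar{C} + 1.96\sqrt{\mathrm{var}(\bar{C})}$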
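For continuous data the interval follows from three steps: the sample variance of WTP, that variance divided by N, and 1.96 standard errors on either side of the mean. A minimal sketch, with a hypothetical function name and made-up data:

```python
import math

def mean_wtp_confidence_interval(wtp, z=1.96):
    """Mean WTP with a confidence interval of mean +/- z * sqrt(var(mean))."""
    n = len(wtp)
    mean = sum(wtp) / n
    # var(C) = (sum of C_i^2 - N * mean^2) / (N - 1)
    var_wtp = (sum(c * c for c in wtp) - n * mean ** 2) / (n - 1)
    var_mean = var_wtp / n                 # variance of the mean
    half_width = z * math.sqrt(var_mean)
    return mean, mean - half_width, mean + half_width

# Hypothetical open-ended answers from five respondents
mean, lower, upper = mean_wtp_confidence_interval([0, 5, 5, 10, 20])
print(mean, lower, upper)
```

With so few observations the interval is very wide; the width shrinks in proportion to the square root of N.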
Confidence intervals from non-
parametric estimation
Binary data
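Estimate of variance of mean WTP
$\mathrm{var}(\bar{C}) = \sum_{j=0}^{J} (B_j - \bar{C})^2\,[S(B_j) - S(B_{j+1})]$
with $B_0 = 0$, $S(B_0) = 1$ and $S(B_{J+1}) = 0$
Confidence interval (95%)
$\bar{C} - 1.96\sqrt{\mathrm{var}(\bar{C})}$ to $\bar{C} + 1.96\sqrt{\mathrm{var}(\bar{C})}$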
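The binary-data variance expression, the sum over bid levels of (B_j − mean)² times the drop in the survivor function S(B_j) − S(B_{j+1}), can be evaluated as below. This sketch applies the expression exactly as it is written here, with B_0 = 0, S(B_0) = 1 and S(B_{J+1}) = 0 as assumed end points. No further scaling by sample size is applied, since none appears in the text. The function name and data are illustrative.

```python
import math

def binary_variance_and_interval(bids, survivor, mean, z=1.96):
    """Evaluate sum_j (B_j - mean)^2 * [S(B_j) - S(B_{j+1})] and mean +/- z * sqrt(var)."""
    points = [0.0] + list(bids)                 # B_0 = 0, then the bid levels
    surv = [1.0] + list(survivor) + [0.0]       # S(B_0) = 1 and S(B_{J+1}) = 0
    variance = sum((b - mean) ** 2 * (surv[j] - surv[j + 1])
                   for j, b in enumerate(points))
    half_width = z * math.sqrt(variance)
    return variance, mean - half_width, mean + half_width

# Illustrative survivor function at bids 10, 20, 50 and its mean of 19
variance, lower, upper = binary_variance_and_interval(
    bids=[10, 20, 50], survivor=[0.8, 0.5, 0.2], mean=19.0)
print(variance, lower, upper)
```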
Parametric estimation of mean
and median WTP
The mean and median WTP are:
$EWTP = -\dfrac{\sum \alpha_i z_i}{\beta}$
Restricted mean/median WTP:
$EWTP = -\dfrac{1}{\beta}\ln\left(1 + e^{\sum \alpha_i z_i}\right)$
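The two parametric expressions, −(Σ α_i z_i)/β and −(1/β)·ln(1 + e^{Σ α_i z_i}), can be evaluated once coefficients are available. The sketch below assumes, beyond what the slide states, that the coefficients come from a logit model of the Yes/No response, that β is the (negative) coefficient on the bid, and that the α_i multiply the other covariates z_i, including a constant. The function name and the numbers are hypothetical.

```python
import math

def parametric_wtp(alpha, z, beta):
    """Return (mean/median WTP, restricted mean WTP) from estimated coefficients.

    alpha : coefficients on the non-bid covariates (assumed to include a constant)
    z     : covariate values at which WTP is evaluated, e.g. sample means
    beta  : coefficient on the bid amount, assumed negative
    """
    index = sum(a * x for a, x in zip(alpha, z))          # sum of alpha_i * z_i
    unrestricted = -index / beta
    restricted = -math.log(1.0 + math.exp(index)) / beta  # WTP kept non-negative
    return unrestricted, restricted

# Hypothetical estimates: constant 2.0, income coefficient 0.5, bid coefficient -0.05
unrestricted, restricted = parametric_wtp(alpha=[2.0, 0.5], z=[1.0, 1.2], beta=-0.05)
print(unrestricted, restricted)
```

The restricted value is always larger than the unrestricted one, because it rules out negative WTP.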
Testing validity of WTP values
Test whether the WTP values provided follow
distinguishable patterns, conforming to prior
expectations and economic theory
Regress WTP on a number of variables:
Income
Socio-economic characteristics
Attitudinal variables
Attitude toward CV program design
Knowledge of the good provided
Proximity to the site of provision
Testing validity of WTP values
To test:
Regress WTP on variables
Test for significance of coefficients (t-test can be
used)
Examine the sign of coefficients. Are they
consistent with economic theory?
Look at the pseudo-R². It should not be less than 0.1.
Exercise
Use the provided data sets (binary and interval data)
to calculate:
Mean/Median WTP
95% confidence interval
for each data set
Exercise: Binary data
Bid Yes Total Pr(Yes) WTP
1,000 27 31
20,000 17 46
200,000 5 24
Total
Exercise: Interval data
Interval Lower Upper No. of resp
A 0 1,000 36
B 1,000 20,000 7
C 20,000 200,000 9
D 200,000 ∞ 17
E 0 20,000 7
G 1,000 200,000 15
H 1,000 ∞ 3
I 20,000 ∞ 7
Data of WTP
Data collected from CV survey:
HH characteristics
Attitude, knowledge, use
Program characteristics
Design characteristics