<span class='text_page_counter'>(1)</span><div class='page_container' data-page=1>
ENDOGENEITY AND
INSTRUMENTAL VARIABLE
REGRESSION
</div>
<span class='text_page_counter'>(2)</span><div class='page_container' data-page=2>
Endogeneity
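OLS assumption: \( E(\varepsilon \mid X) = 0 \) and \( V(\varepsilon \mid X) = \sigma^2 \)
When \( E(\varepsilon \mid X) \neq 0 \) or \( \mathrm{Cov}(X, \varepsilon) \neq 0 \): Endogeneity problem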
</div>
<span class='text_page_counter'>(3)</span><div class='page_container' data-page=3>
Reasons for Endogeneity
errors in variables
endogenous variables (simultaneity)
omitted variables
</div>
<span class='text_page_counter'>(4)</span><div class='page_container' data-page=4>
Consequences of endogeneity
If we use OLS in a regression with endogeneity:
BIASED AND INCONSISTENT ESTIMATES
[Figure: diagram of the relationships among x, y, and ε]
</div>
<span class='text_page_counter'>(5)</span><div class='page_container' data-page=5>
Endogeneity: errors in variables
Consider a regression \( y^* = \beta x^* + \varepsilon \)
We can’t observe \( (y^*, x^*) \), but we observe \( y = y^* + v \) and \( x = x^* + u \)
Then the regression becomes \( y = \beta x + \hat{\varepsilon} \), where \( \hat{\varepsilon} = \varepsilon + v - \beta u \)
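The size of the resulting bias can be worked out under the classical errors-in-variables assumptions. The sketch below takes the model \( y^* = \beta x^* + \varepsilon \) with observed \( y = y^* + v \) and \( x = x^* + u \), and additionally assumes (these conditions are not stated on the slide) that \( x^* \) is uncorrelated with \( \varepsilon \) and \( v \), and that \( u \) is uncorrelated with \( x^* \), \( \varepsilon \), and \( v \). The symbols \( \sigma^2_{x^*} \) and \( \sigma^2_u \) denote the variances of \( x^* \) and \( u \).

```latex
\[
\operatorname{plim} \hat{\beta}_{OLS}
  = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)}
  = \frac{\mathrm{Cov}(x^* + u,\; \beta x^* + \varepsilon + v)}{\mathrm{Var}(x^* + u)}
  = \beta \, \frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + \sigma^2_u}
\]
```

Under these assumptions the OLS slope is shrunk toward zero (attenuation bias). Measurement error in \( y \) alone (that is, \( u = 0 \)) leaves the slope consistent and only inflates the error variance.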
</div>
<span class='text_page_counter'>(6)</span><div class='page_container' data-page=6>
Endogeneity: Endogenous variables
Consider a (market) demand equation
\[ q_d = \alpha_1 p + \alpha_2 y + u_d \]
\( p \) is not exogenous by theory
Instead, it should be the supply-demand system
\[ q_d = \alpha_1 p + \alpha_2 y + u_d, \qquad q_s = \beta_1 p + u_s, \qquad q_s = q_d \]
Solving for the equilibrium price gives
\[ p = \frac{\alpha_2}{\beta_1 - \alpha_1}\, y + \frac{u_d - u_s}{\beta_1 - \alpha_1} \]
so that, if \( u_d \) is uncorrelated with \( u_s \) and \( y \),
\[ \mathrm{Cov}(p, u_d) = \frac{\sigma^2_{u_d}}{\beta_1 - \alpha_1} \neq 0 \]
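A small simulation can make the simultaneity problem concrete. The sketch below is not from the slides: it assumes a demand equation q = a1*p + a2*y + ud and a supply equation q = b1*p + us, with illustrative parameter values (a1 = -1, a2 = 0.5, b1 = 1), independent standard normal shocks, and made-up variable names. It is meant to be run as a do-file.

```stata
* Simulated supply-demand system; all names and values are illustrative assumptions
clear
set seed 12345
set obs 5000
scalar a1 = -1                        // demand slope on price
scalar a2 = 0.5                       // demand slope on income
scalar b1 = 1                         // supply slope on price
generate y  = rnormal(10, 2)          // income (exogenous demand shifter)
generate ud = rnormal()               // demand shock
generate us = rnormal()               // supply shock
generate p  = (a2*y + ud - us)/(b1 - a1)   // equilibrium price
generate q  = b1*p + us                    // equilibrium quantity

* OLS on the demand equation: p is correlated with ud, so the
* coefficient on p is not a consistent estimate of a1
regress q p y

* Supply equation: income y is excluded from supply, so it can serve
* as an instrument for p
ivregress 2sls q (p = y)
```

In this particular system only the supply equation can be estimated by IV, because income is the only excluded exogenous variable. Estimating the demand slope would require a supply shifter that does not enter demand, which the system as written does not contain.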
</div>
<span class='text_page_counter'>(7)</span><div class='page_container' data-page=7>
Endogeneity: Omitted variables
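Suppose the true model is
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon \]
If we regress
\[ y = \beta_0 + \beta_1 x_1 + \eta \qquad \text{(omitted variable: } x_2 \text{)} \]
then \( \eta = \beta_2 x_2 + \varepsilon \), and \( x_1 \) is correlated with \( \eta \) whenever \( x_1 \) and \( x_2 \) are correlated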
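The direction and size of the bias follow from a short calculation. The sketch assumes the true model \( y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon \) with \( \varepsilon \) uncorrelated with \( x_1 \) and \( x_2 \), and a short regression of \( y \) on \( x_1 \) only. The notation \( \hat{\beta}_1^{short} \) for the slope of that short regression is introduced here for illustration.

```latex
\[
\operatorname{plim} \hat{\beta}_1^{short}
  = \frac{\mathrm{Cov}(x_1, y)}{\mathrm{Var}(x_1)}
  = \beta_1 + \beta_2 \, \frac{\mathrm{Cov}(x_1, x_2)}{\mathrm{Var}(x_1)}
\]
```

The bias vanishes only if \( \beta_2 = 0 \) or \( x_1 \) and \( x_2 \) are uncorrelated. Its sign is the sign of \( \beta_2 \) times the sign of the correlation between \( x_1 \) and \( x_2 \).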
</div>
<span class='text_page_counter'>(8)</span><div class='page_container' data-page=8>
Solution to Endogeneity:
Instruments
Instrumental variables (instruments) Z must satisfy
exogeneity (uncorrelated with the error term, \( \varepsilon \) or \( u \))
relevance (correlated with the endogenous regressor)
</div>
<span class='text_page_counter'>(9)</span><div class='page_container' data-page=9>
Identification problem
If \( k \) is the number of endogenous variables, and
\( h \) is the number of instruments, then
If \( h < k \)
the model is unidentified
If \( h = k \)
the model is just-identified
If \( h > k \)
the model is over-identified
</div>
<span class='text_page_counter'>(10)</span><div class='page_container' data-page=10>
IV Estimation
If \( k \) is the number of endogenous variables, and
\( h \) is the number of instruments, then
If \( h < k \)
find the instrument!!!
If \( h = k \)
use IV estimator
If \( h > k \)
use
2SLS
or GMM
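For reference, the IV and 2SLS estimators named above can be written in matrix form. This is standard textbook notation rather than anything shown on the slide: \( X \) collects all regressors (exogenous and endogenous), \( Z \) collects all exogenous variables (the exogenous regressors together with the outside instruments), and \( P_Z \) is the projection onto the columns of \( Z \).

```latex
\[
\hat{\beta}_{IV} = (Z'X)^{-1} Z'y
\qquad \text{(as many instruments as endogenous variables)}
\]
\[
\hat{\beta}_{2SLS} = (X' P_Z X)^{-1} X' P_Z \, y,
\qquad P_Z = Z (Z'Z)^{-1} Z'
\]
```

When the number of instruments equals the number of endogenous variables, \( Z'X \) is square and, if invertible, the 2SLS formula reduces to the IV formula.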
</div>
<span class='text_page_counter'>(11)</span><div class='page_container' data-page=11>
Two-Stage Least Squares (2SLS)
Consider a regression
\[ y = \beta_1 X_1 + \beta_2 X_2 + \varepsilon \]
where \( X_2 \) is endogenous and \( Z \) is used as instruments. Then the procedure is
Step 1: Regress each endogenous variable \( x_2 \) in \( X_2 \) on \( X_1 \) and \( Z \)
\[ x_2 = \gamma_0 + \gamma_1 X_1 + \gamma_2 Z + v \]
Step 2: Compute the fitted values
\[ \hat{x}_2 = \hat{\gamma}_0 + \hat{\gamma}_1 X_1 + \hat{\gamma}_2 Z \]
Step 3: Regress \( y \) on \( X_1 \) and the fitted values
\[ y = \beta_1 X_1 + \beta_2 \hat{x}_2 + \text{error} \]
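The three steps can be tried on simulated data. The sketch below is an illustration, not the slides' example: the variable names (z, x1, x2, a, y), the coefficients, and the data-generating process are all assumptions chosen so that x2 is endogenous through the unobserved factor a. It is meant to be run as a do-file.

```stata
* Simulated data; all names and parameter values are illustrative assumptions
clear
set seed 12345
set obs 5000
generate z  = rnormal()               // instrument: shifts x2, unrelated to the error
generate x1 = rnormal()               // exogenous control
generate a  = rnormal()               // unobserved factor that makes x2 endogenous
generate x2 = 0.5*x1 + 0.8*z + a + rnormal()
generate y  = 1 + 2*x1 + 1.5*x2 - a + rnormal()

* Steps 1 and 2: first-stage regression and fitted values
regress x2 x1 z
predict x2_hat, xb

* Step 3: second stage using the fitted values
regress y x1 x2_hat

* One-step 2SLS for comparison
ivregress 2sls y x1 (x2 = z)
```

The manual second stage reproduces the 2SLS point estimates, but its reported standard errors are not valid because they are computed from residuals that use x2_hat instead of x2. The one-step command applies the correct formula.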
</div>
<span class='text_page_counter'>(12)</span><div class='page_container' data-page=12></div>
<span class='text_page_counter'>(13)</span><div class='page_container' data-page=13>
The wage equation
<i>ed: education</i>
<i>X: other control variables</i>
Endogeneity: an important variable, <i>ability</i>, is omitted;
<i>ability</i> is believed to be correlated with <i>ed</i>.
</div>
<span class='text_page_counter'>(14)</span><div class='page_container' data-page=14>
Summary statistics
<pre>
Variable       Obs        Mean   Std. Dev.     Min     Max
year         20306    2001.088     1.61576    1999    2003
h            20306    2022.203    706.4409       1    5508
married      20306     .660002    .4737198       0       1
nch          20306    .9591746    1.137898       0       8
race         20306    1.410618    .6499018       1       3
mo_ed        20306    1.844726    .6290755       1       3
fa_ed        20306     1.83857    .6961686       1       3
ed           20306     13.4512    2.488962       0      17
union        20306    .1518763    .3589098       0       1
tenure       20306    6.359746    7.725706       0      42
wage         20306    20.08589    19.17634       5     491
age          20306    39.01532    9.901983      21      59
</pre>
</div>
<span class='text_page_counter'>(15)</span><div class='page_container' data-page=15>
OLS Regression
</div>
<span class='text_page_counter'>(16)</span><div class='page_container' data-page=16>
Testing for endogeneity
<i>regress ed on X and IV variables:</i> \( ed = f(IV\ \mathrm{var}, X) \)
<i>predict error terms e</i>
regress \( \ln wage = f(ed, X) \) <i>with e included</i>
<i>endogeneity if e is statistically significant</i>
</div>
<span class='text_page_counter'>(17)</span><div class='page_container' data-page=17>
Testing for endogeneity
. quietly regress ed age age2 tenure union nch married
white black
fa_ed1 fa_ed2 mo_ed1 mo_ed2
year2001
year2003
. predict ed_hat, xb /* find the fitted value of ed*/
. predict r, resid /* residuals from the first-stage regression of ed */
. regress lnwage ed age age2 tenure union nch married
white black year2001 year2003 r
. test r
</div>
<span class='text_page_counter'>(18)</span><div class='page_container' data-page=18>
Testing for endogeneity
<pre>
lnwage          Coef.   Std. Err.       t   P>|t|  [95% Conf.   Interval]
ed           .1527935    .0045929   33.27   0.000     .143791     .161796
age          .0444652    .0032132   13.84   0.000     .038167    .0507634
age2        -.0004601    .0000409  -11.26   0.000   -.0005401     -.00038
tenure        .011755    .0005423   21.68   0.000    .0106921    .0128179
union        .1061971    .0107717    9.86   0.000    .0850837    .1273104
nch          .0253419    .0038428    6.59   0.000    .0178097    .0328742
married      .0142878     .008775    1.63   0.103    -.002912    .0314876
white       -.0707782    .0163623   -4.33   0.000   -.1028496   -.0387068
black       -.1986132    .0159947  -12.42   0.000    -.229964   -.1672624
year2001     -.000035    .0092487   -0.00   0.997   -.0181632    .0180932
year2003    -.0092245    .0092467   -1.00   0.318   -.0273487    .0088997
r           -.0745455    .0048828  -15.27   0.000   -.0841163   -.0649748
_cons       -.2893795    .0781816   -3.70   0.000   -.4426218   -.1361372

( 1) r = 0
F( 1, 20293) = 233.08
Prob > F = 0.0000
</pre>
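Stata 10 and later also provide a built-in version of this check as a postestimation command after `ivregress`. The sketch below assumes the same dataset and variable names used in these slides are in memory, with the parents' education dummies as instruments. It is an alternative route to the same question, not the procedure shown above, and it is meant to be run as a do-file.

```stata
* Assumes the wage dataset from the slides is already in memory
ivregress 2sls lnwage age age2 tenure union nch married white black ///
    year2001 year2003 (ed = fa_ed1 fa_ed2 mo_ed1 mo_ed2)

* Durbin and Wu-Hausman tests; H0: ed is exogenous
estat endogenous
```

A small p-value rejects exogeneity of ed, which corresponds to a statistically significant coefficient on the first-stage residual r in the regression-based test.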
</div>
<span class='text_page_counter'>(19)</span><div class='page_container' data-page=19>
2SLS IV Regression [Manually]
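A minimal sketch of what the manual procedure could look like for the wage equation, assuming the dataset and variable names used elsewhere in these slides are in memory. The original slide content is not available, so this is a reconstruction of the general technique and not necessarily the commands the slide showed. It is meant to be run as a do-file.

```stata
* First stage: regress ed on the exogenous regressors and the instruments
regress ed age age2 tenure union nch married white black ///
    fa_ed1 fa_ed2 mo_ed1 mo_ed2 year2001 year2003

* Fitted values of ed (drop any earlier copy first)
capture drop ed_hat
predict ed_hat, xb

* Second stage: replace ed with its fitted values
regress lnwage ed_hat age age2 tenure union nch married white black ///
    year2001 year2003
```

The coefficient on ed_hat is the 2SLS point estimate, but the standard errors printed by the second regression are not the correct 2SLS standard errors. A dedicated IV command should be used for inference.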
</div>
<span class='text_page_counter'>(20)</span><div class='page_container' data-page=20>
Testing for good instruments
. quietly regress ed age age2 tenure union nch
married white black
fa_ed1 fa_ed2 mo_ed1
mo_ed2
year2001 year2003
. test fa_ed1 fa_ed2 mo_ed1 mo_ed2
( 1) fa_ed1 = 0
( 2) fa_ed2 = 0
( 3) mo_ed1 = 0
( 4) mo_ed2 = 0
F( 4, 20291) = 660.56
Prob > F = 0.0000
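The joint F test above addresses instrument relevance. In Stata 10 and later, the same diagnostic, along with a test of the over-identifying restrictions, is available after `ivregress`. The sketch assumes the slides' dataset and variable names are in memory and is meant to be run as a do-file.

```stata
* Assumes the wage dataset from the slides is already in memory
ivregress 2sls lnwage age age2 tenure union nch married white black ///
    year2001 year2003 (ed = fa_ed1 fa_ed2 mo_ed1 mo_ed2)

* Relevance: first-stage F statistic and partial R-squared for the instruments
estat firststage

* Exogeneity: Sargan and Basmann tests of the over-identifying restrictions
estat overid
```

The over-identification test is only available here because there are more instruments (four) than endogenous regressors (one). It cannot be computed in the just-identified case.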
</div>
<span class='text_page_counter'>(21)</span><div class='page_container' data-page=21>
Implement IV reg in Stata
. ivreg lnwage age age2 tenure union nch married
white black year2001 year2003 (ed = fa_ed1
fa_ed2 mo_ed1 mo_ed2), first
</div>
<span class='text_page_counter'>(22)</span><div class='page_container' data-page=22>
Implement IV reg in Stata
<pre>
ed              Coef.   Std. Err.       t   P>|t|  [95% Conf.   Interval]
age           .053069    .0135771    3.91   0.000    .0264568    .0796812
age2        -.0004542    .0001726   -2.63   0.009   -.0007926   -.0001158
tenure       .0054745    .0022971    2.38   0.017    .0009721     .009977
union         .074779    .0456544    1.64   0.101   -.0147073    .1642652
nch         -.2159402    .0156491  -13.80   0.000   -.2466137   -.1852667
married      .3649581    .0366104    9.97   0.000    .2931988    .4367174
white        1.072611    .0633436   16.93   0.000    .9484524     1.19677
black        .8421189    .0659016   12.78   0.000    .7129464    .9712914
year2001    -.0107218    .0391791   -0.27   0.784   -.0875159    .0660724
year2003    -.0024599    .0391737   -0.06   0.950   -.0792435    .0743237
fa_ed1       .6310663    .0429161   14.70   0.000    .5469473    .7151854
fa_ed2       1.833566    .0582525   31.48   0.000    1.719386    1.947746
mo_ed1       .5029502    .0450191   11.17   0.000    .4147092    .5911912
mo_ed2       1.221048    .0654893   18.65   0.000    1.092684    1.349412
_cons        10.03607    .2575517   38.97   0.000    9.531245    10.54089
</pre>
</div>
<span class='text_page_counter'>(23)</span><div class='page_container' data-page=23>
Implement IV reg in Stata
SECOND STAGE
</div>
<span class='text_page_counter'>(24)</span><div class='page_container' data-page=24>
Hausman test: OLS against IV
regress lnwage ed age age2 tenure union nch
married white black year2001 year2003
est store OLS
ivreg lnwage age age2 tenure union nch married
white black year2001 year2003 (ed = fa_ed1
fa_ed2 mo_ed1 mo_ed2), first
est store IV
</div>
<span class='text_page_counter'>(25)</span><div class='page_container' data-page=25>
Hausman test: OLS against IV
<pre>
. hausman IV OLS /*note the order of IV and OLS*/

                         Coefficients
                   (b)          (B)        (b-B)  sqrt(diag(V_b-V_B))
                    IV          OLS   Difference                 S.E.
ed            .1527935     .0868367     .0659568              .004554
age           .0444652      .047922    -.0034568             .0009811
age2         -.0004601    -.0005048     .0000447             .0000125
tenure         .011755     .0120222    -.0002671              .000162
union         .1061971     .1102531     -.004056             .0032101
nch           .0253419      .010029     .0153129             .0015232
married       .0142878     .0362749    -.0219871             .0029815
white        -.0707782     .0655964    -.1363746             .0102137
black        -.1986132    -.1106727    -.0879405             .0074915
year2001      -.000035     .0009202    -.0009552             .0027474
year2003     -.0092245     -.006866    -.0023585             .0027505

chi2 = 209.77
Prob>chi2 = 0.0000
</pre>
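The chi-squared value reported by `hausman` is built from the columns of the table. In the notation of the output, \( b \) is the IV coefficient vector, \( B \) the OLS coefficient vector, and \( V_b \), \( V_B \) their estimated covariance matrices. The symbol \( q \) for the degrees of freedom is introduced here for illustration.

```latex
\[
H = (b - B)' \left[ V_b - V_B \right]^{-1} (b - B) \;\sim\; \chi^2(q)
\quad \text{under } H_0
\]
```

Under the null hypothesis both estimators are consistent and OLS is efficient, so the difference \( b - B \) should be small. A large \( H \) (small p-value) indicates that OLS and IV differ systematically, which is evidence against the exogeneity of ed.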
</div>
<span class='text_page_counter'>(26)</span><div class='page_container' data-page=26>
OLS vs. IV – the contribution of ed
</div>
<span class='text_page_counter'>(27)</span><div class='page_container' data-page=27></div>
<span class='text_page_counter'>(28)</span><div class='page_container' data-page=28>
Relationship and credit limit
Chakraborty et al. (2010) The Importance of Being
Known: Relationship Banking and Credit Limits.
<i>Quarterly J of Finance and Accounting 49(2) 27-48.</i>
Objective: investigate the effect of relationship on
credit limits given to firms
</div>
<span class='text_page_counter'>(29)</span><div class='page_container' data-page=29>
Relationship and credit limit
Chakraborty et al. (2010)
Indep var:
contract’s characteristics (prices, collateral, loan terms)
relationship (bank-firm years of relationship)
bank’s characteristics
Endogeneity: credit limit (dep var) and contract’s
characteristics are determined simultaneously
Instrumented vars: contract’s characteristics (interest rate
and collateral)
</div>
<span class='text_page_counter'>(30)</span><div class='page_container' data-page=30>
Bank loan and trade credit
Du et al. (2012) Bank Loan vs. Trade Credit –
Evidence from China. <i>Economics of Transition</i>
20(3): 457-80
Objective: effects of bank loan and trade credit on
firm performance and growth
</div>
<span class='text_page_counter'>(31)</span><div class='page_container' data-page=31>
Bank loan and trade credit
Dep var:
labor productivity: output per worker [in log]
ROA
change in employment [in log]
reinvestment rate [share of profit reinvested]
Indep var
bank loan [ratio of bank loan to total asset]
trade credit [% purchased with credit of two main
inputs]
</div>
<span class='text_page_counter'>(32)</span><div class='page_container' data-page=32>
Bank loan and trade credit
Instrumented variables: bank loan and trade credit
Endogeneity:
reverse causality
spurious correlation
Instrumental variables:
for trade credit: relationship [dummy, 1 if the two main inputs
are supplied by relatives or friends]
previous studies showed that suppliers are more likely to offer trade
credit when customers are in the same network
for bank loan: British administration [dummy, 1 if the firm’s
city was administered by Great Britain during the Qing dynasty]
<sub>reason: during their administration, the British developed their own banks</sub>
</div>
<span class='text_page_counter'>(33)</span><div class='page_container' data-page=33>
Incentive Contracts and Bank
Performance
Li et al. (2007) Incentive Contracts and Bank
Performance – Evidence from Rural China.
<i>Economics of Transition 15(1): 109-24.</i>
Objective: estimate the effect of bank managers’
incentives on bank performance.
Data: bank branches in rural China
Dep var:
deposit growth
</div>
<span class='text_page_counter'>(34)</span><div class='page_container' data-page=34>
Incentive Contracts and Bank
Performance
Indep var:
the amount of money given to manager per
performance point
branch size [asset value]
town’s industrial development [per capita industrial
output]
</div>
<span class='text_page_counter'>(35)</span><div class='page_container' data-page=35>
Incentive Contracts and Bank
Performance
Endogeneity: omitted variables, such as manager’s
ability
Instrumented variable: incentive
</div>