
Lesson 1: Functional forms of regression models


<span class='text_page_counter'>(1)</span><div class='page_container' data-page=1>

FUNCTIONAL FORMS



Truong Dang Thuy



</div>
<span class='text_page_counter'>(2)</span><div class='page_container' data-page=2>

Linear model



Consider a linear regression function:

$Y = \beta_0 + \beta_1 X$

$\beta_1$: the change in Y when X increases by 1 unit.

Sometimes the relationship is not linear.

Common functional forms:

Log-linear

Log-lin

Lin-log

Reciprocal

Polynomial



</div>
<span class='text_page_counter'>(3)</span><div class='page_container' data-page=3>

Functional forms



[Figure: four panels, one curve per functional form]

<b>Linear model:</b> $Y = \beta_0 + \beta_1 X$

<b>Log-linear:</b> $\ln Y = \beta_0 + \beta_1 \ln X$

<b>Lin-log:</b> $Y = \beta_0 + \beta_1 \ln X$

<b>Log-lin:</b> $\ln Y = \beta_0 + \beta_1 X$

</div>
<span class='text_page_counter'>(4)</span><div class='page_container' data-page=4>

Functional forms



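[Figure: two panels plotting $Y = \beta_0 + \beta_1 (1/X)$, with the horizontal asymptote at $\beta_0$]

<b>Reciprocal (negative beta):</b> $\beta_1 < 0$, so Y rises toward $\beta_0$ as X increases.

<b>Reciprocal (positive beta):</b> $\beta_1 > 0$, so Y falls toward $\beta_0$ as X increases.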

</div>
<span class='text_page_counter'>(5)</span><div class='page_container' data-page=5>

Example dataset



Viet Nam provincial data (file ‘gdpprov.xlsx’):

gdp: provincial GDP (mil. VND)

labfo: number of laborers in each province (1000 persons)



</div>
<span class='text_page_counter'>(6)</span><div class='page_container' data-page=6>

[Screenshot of the Stata main window with its panes labelled: <b>Record of commands</b>, <b>Record of results</b>, <b>Variables (data)</b>, <b>Commands</b>, <b>Taskbar</b>]



</div>
<span class='text_page_counter'>(7)</span><div class='page_container' data-page=7>

Import data



<b>Copy from Excel </b>



</div>
<span class='text_page_counter'>(8)</span><div class='page_container' data-page=8>

Data description




</div>
<span class='text_page_counter'>(9)</span><div class='page_container' data-page=9>

Linear function



</div>
<span class='text_page_counter'>(10)</span><div class='page_container' data-page=10>

LOG-LINEAR MODEL



The Cobb-Douglas production function:

$Q_i = B_1 L_i^{B_2} K_i^{B_3}$

can be transformed into a linear model by taking natural logs of both sides:

$\ln Q_i = \ln B_1 + B_2 \ln L_i + B_3 \ln K_i$

The slope coefficients can be interpreted as elasticities.

If $(B_2 + B_3) = 1$, we have constant returns to scale.

If $(B_2 + B_3) > 1$, we have increasing returns to scale.

If $(B_2 + B_3) < 1$, we have decreasing returns to scale.
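The link between $B_2 + B_3$ and returns to scale follows from scaling both inputs by a common factor. The short derivation below is a sketch; the factor $\lambda > 0$ is notation introduced here and does not appear in the slides.

```latex
\begin{aligned}
Q(\lambda L, \lambda K) &= B_1 (\lambda L)^{B_2} (\lambda K)^{B_3} \\
                        &= \lambda^{B_2 + B_3} \, B_1 L^{B_2} K^{B_3} \\
                        &= \lambda^{B_2 + B_3} \, Q(L, K)
\end{aligned}
```

With $\lambda = 2$, doubling both inputs exactly doubles output when $B_2 + B_3 = 1$, more than doubles it when the sum exceeds 1, and less than doubles it when the sum is below 1.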



</div>
<span class='text_page_counter'>(11)</span><div class='page_container' data-page=11>

Log-linear model




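. gen lgdp = ln(rgdp)
(10 missing values generated)

. gen llabor = ln(labfo)
. gen linvest = ln(rinvest)
(17 missing values generated)

. reg lgdp llabor linvest

      Source |       SS       df       MS              Number of obs =     271
-------------+------------------------------           F(  2,   268) =  477.42
       Model |  175.619057     2  87.8095284           Prob > F      =  0.0000
    Residual |  49.2915017   268  .183923514           R-squared     =  0.7808
-------------+------------------------------           Adj R-squared =  0.7792
       Total |  224.910559   270  .833002069           Root MSE      =  .42886

------------------------------------------------------------------------------
        lgdp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      llabor |    .508612   .0643267     7.91   0.000      .381962     .635262
     linvest |    .644785   .0405325    15.91   0.000     .5649824    .7245876
       _cons |    3.06333   .4515804     6.78   0.000     2.174233    3.952426
------------------------------------------------------------------------------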
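The summary statistics in this output can be rebuilt from the sums of squares, which is a useful check on how they are defined. The sketch below uses Python rather than the Stata used in the slides, and only the numbers printed in the output above; the variable names are illustrative.

```python
# Figures copied from the Stata output above (lgdp regressed on llabor and linvest).
ss_model, df_model = 175.619057, 2
ss_resid, df_resid = 49.2915017, 268
n_obs = 271
k_params = df_model + 1  # two slopes plus the intercept

ss_total = ss_model + ss_resid
r_squared = ss_model / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - k_params)
f_stat = (ss_model / df_model) / (ss_resid / df_resid)
root_mse = (ss_resid / df_resid) ** 0.5

# Estimated elasticities of GDP with respect to llabor and linvest.
b_labor, b_invest = 0.508612, 0.644785
returns_to_scale = b_labor + b_invest

print(f"R-squared        = {r_squared:.4f}")
print(f"Adj R-squared    = {adj_r_squared:.4f}")
print(f"F statistic      = {f_stat:.2f}")
print(f"Root MSE         = {root_mse:.5f}")
print(f"Sum of elasticities = {returns_to_scale:.6f}")
```

The point estimates sum to more than 1, which points toward increasing returns to scale. Whether the sum differs significantly from 1 needs a formal test that uses the covariance between the two estimates, which the output above does not report.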


</div>
<span class='text_page_counter'>(12)</span><div class='page_container' data-page=12>

LOG-LIN OR GROWTH MODELS



The rate of growth of real GDP:

$RGDP_t = RGDP_0 (1 + r)^t$

can be transformed into a linear model by taking natural logs of both sides:

$\ln RGDP_t = \ln RGDP_0 + t \ln(1 + r)$

Letting $B_1 = \ln RGDP_0$ and $B_2 = \ln(1 + r)$, this can be rewritten as:

$\ln RGDP_t = B_1 + B_2 t$

$B_2$ is considered a semi-elasticity or an instantaneous growth rate.

The compound growth rate (r) is equal to $(e^{B_2} - 1)$.
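The conversion between the instantaneous rate $B_2$ and the compound rate r is easy to get backwards, so a small numerical sketch may help. It is written in Python rather than Stata, and the slope value is hypothetical, not an estimate from the slides.

```python
import math


def compound_growth_rate(b2: float) -> float:
    """Convert the log-lin slope B2 (instantaneous rate) into the compound rate r."""
    return math.exp(b2) - 1.0


def instantaneous_rate(r: float) -> float:
    """Inverse conversion: B2 = ln(1 + r)."""
    return math.log(1.0 + r)


b2_example = 0.05  # hypothetical slope on t, chosen only for illustration
r_example = compound_growth_rate(b2_example)

print(f"instantaneous rate B2 = {b2_example:.4f}")
print(f"compound rate r       = {r_example:.6f}")
print(f"back to B2            = {instantaneous_rate(r_example):.4f}")
```

The compound rate is always slightly larger than the instantaneous rate when growth is positive, and the gap widens as $B_2$ grows.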



</div>
<span class='text_page_counter'>(13)</span><div class='page_container' data-page=13>

LOG-LIN MODEL



. sum t

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           t |       290           3    1.416658          1          5



</div>
<span class='text_page_counter'>(14)</span><div class='page_container' data-page=14>

LOG-LIN MODEL



</div>
<span class='text_page_counter'>(15)</span><div class='page_container' data-page=15>

LIN-LOG MODELS



Lin-log models follow this general form:

$Y_i = B_1 + B_2 \ln X_i + u_i$

Note that $B_2$ is the absolute change in Y in response to a percentage (or relative) change in X.

If X increases by 1%, predicted Y changes by approximately $B_2 / 100$ units. For a large change such as a doubling of X (a 100% increase), the exact change is $B_2 \ln 2 \approx 0.69 B_2$ units.



</div>
<span class='text_page_counter'>(16)</span><div class='page_container' data-page=16>

Exercise – lin-log model




Data: from VHLSS 2010

income: individual annual income (1000 VND)

healthcost: individual annual cost for health care (1000 VND)

Use the data in ‘healthcost.dta’ to run the regression

$hcshare = \beta_0 + \beta_1 \ln(income)$

where hcshare is the share of health cost in income.



</div>
<span class='text_page_counter'>(17)</span><div class='page_container' data-page=17>

Health cost with Lin-log model




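. gen lincome = ln(income)

. reg hcshare lincome

      Source |       SS       df       MS              Number of obs =    3475
-------------+------------------------------           F(  1,  3473) =  135.35
       Model |  2.84335206     1  2.84335206           Prob > F      =  0.0000
    Residual |  72.9563097  3473  .021006712           R-squared     =  0.0375
-------------+------------------------------           Adj R-squared =  0.0372
       Total |  75.7996618  3474  .021819131           Root MSE      =  .14494

------------------------------------------------------------------------------
     hcshare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lincome |  -.0341629   .0029364   -11.63   0.000    -.0399202   -.0284056
       _cons |    .421608   .0322026    13.09   0.000       .35847     .484746
------------------------------------------------------------------------------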
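To read the lincome coefficient, it helps to turn it into changes in the health-cost share for specific income changes. The sketch below is in Python rather than Stata, uses the two coefficients from the output above, and picks an income level of 20000 (in 1000 VND) purely as a hypothetical example.

```python
import math

b_cons = 0.421608      # _cons from the output above
b_lincome = -0.0341629 # coefficient on lincome from the output above


def predicted_share(income: float) -> float:
    """Predicted hcshare for a given income (1000 VND)."""
    return b_cons + b_lincome * math.log(income)


def change_in_share(pct_increase: float) -> float:
    """Exact change in predicted hcshare when income rises by pct_increase percent."""
    return b_lincome * math.log(1.0 + pct_increase / 100.0)


approx_one_pct = b_lincome / 100.0       # rule-of-thumb change for a 1% rise
exact_one_pct = change_in_share(1.0)     # exact change for a 1% rise
exact_doubling = change_in_share(100.0)  # exact change when income doubles

income_example = 20000.0  # hypothetical income level, for illustration only
print(f"approx. change for +1% income : {approx_one_pct:.8f}")
print(f"exact change for +1% income   : {exact_one_pct:.8f}")
print(f"exact change for +100% income : {exact_doubling:.6f}")
print(f"predicted share at {income_example:.0f}: {predicted_share(income_example):.4f}")
```

The rule of thumb and the exact figure agree closely for a 1% change, while a doubling of income lowers the predicted share by about 0.024, less in magnitude than the coefficient itself.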


</div>
<span class='text_page_counter'>(18)</span><div class='page_container' data-page=18>

RECIPROCAL MODELS



Reciprocal models follow this general form:

$Y_i = B_1 + B_2 \left(\frac{1}{X_i}\right) + u_i$

Note that:

As X increases indefinitely, the term $B_2 \left(\frac{1}{X_i}\right)$ approaches zero and Y approaches the limiting or asymptotic value $B_1$.

The slope is:

$\frac{dY}{dX} = -B_2 \left(\frac{1}{X^2}\right)$

Therefore, if $B_2$ is positive, the slope is negative throughout, and if $B_2$ is negative, the slope is positive throughout.


</div>
<span class='text_page_counter'>(19)</span><div class='page_container' data-page=19>

Exercise – Reciprocal model



Use the data in ‘healthcost.dta’ to run the regression

$hcshare = \beta_0 + \beta_1 \left(\frac{1}{income}\right)$






</div>
<span class='text_page_counter'>(20)</span><div class='page_container' data-page=20>

Exercise – Reciprocal model




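. reg hcshare invincome

      Source |       SS       df       MS              Number of obs =    3475
-------------+------------------------------           F(  1,  3473) =  133.21
       Model |  2.79994649     1  2.79994649           Prob > F      =  0.0000
    Residual |  72.9997153  3473   .02101921           R-squared     =  0.0369
-------------+------------------------------           Adj R-squared =  0.0367
       Total |  75.7996618  3474  .021819131           Root MSE      =  .14498

------------------------------------------------------------------------------
     hcshare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   invincome |   942.4843   81.65964    11.54   0.000     782.3786     1102.59
       _cons |    .023971   .0032251     7.43   0.000     .0176478    .0302943
------------------------------------------------------------------------------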
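The reciprocal fit is easiest to interpret through its asymptote and its slope at a few income levels. The sketch below is in Python rather than Stata, assumes invincome is 1/income (its construction is not shown in the slides), and uses hypothetical income levels in 1000 VND.

```python
b_cons = 0.023971        # _cons: the asymptotic share as income grows without bound
b_invincome = 942.4843   # coefficient on invincome (assumed to be 1/income)


def predicted_share(income: float) -> float:
    """Predicted hcshare under the reciprocal model."""
    return b_cons + b_invincome / income


def slope(income: float) -> float:
    """dY/dX = -B2 / X^2, negative everywhere because B2 is positive."""
    return -b_invincome / income ** 2


incomes = [5000.0, 10000.0, 50000.0, 200000.0]  # hypothetical income levels
for income in incomes:
    print(f"income={income:>9.0f}  share={predicted_share(income):.5f}  "
          f"slope={slope(income):.3e}")
```

Because the coefficient on invincome is positive, the predicted share falls as income rises and flattens out near the constant, about 2.4% of income.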



</div>
<span class='text_page_counter'>(21)</span><div class='page_container' data-page=21>

POLYNOMIAL REGRESSION MODELS



The following regression predicting GDP is an example of a quadratic function, or more generally, a second-degree polynomial in the variable time:

$RGDP_t = A_1 + A_2\, time + A_3\, time^2 + u_t$

The slope is not constant; it changes with time and is equal to:

$\frac{dRGDP}{dtime} = A_2 + 2 A_3\, time$

Exercise: run the above model with ‘gdpprov.dta’
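A quick way to see that the slope of a quadratic depends on time is to evaluate it at several points and compare it with a numerical derivative. The Python sketch below uses made-up coefficients; they are not estimates from ‘gdpprov.dta’.

```python
def rgdp_hat(time: float, a1: float, a2: float, a3: float) -> float:
    """Fitted value of the quadratic: A1 + A2*time + A3*time^2."""
    return a1 + a2 * time + a3 * time ** 2


def slope(time: float, a2: float, a3: float) -> float:
    """Analytical slope: A2 + 2*A3*time."""
    return a2 + 2.0 * a3 * time


# Hypothetical coefficients, chosen only to illustrate the formulas.
a1, a2, a3 = 100.0, 8.0, -0.5

for time in range(1, 6):
    print(f"time={time}  fitted={rgdp_hat(time, a1, a2, a3):.2f}  "
          f"slope={slope(time, a2, a3):.2f}")

# Central-difference check of the slope at time = 3.
h = 1e-6
numeric_slope = (rgdp_hat(3 + h, a1, a2, a3) - rgdp_hat(3 - h, a1, a2, a3)) / (2 * h)

# The fitted curve turns where the slope is zero.
turning_point = -a2 / (2.0 * a3)
print(f"numeric slope at 3 = {numeric_slope:.4f}, turning point = {turning_point:.1f}")
```

With a negative coefficient on the squared term, the slope shrinks as time increases and changes sign at the turning point $-A_2 / (2 A_3)$.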



</div>
<span class='text_page_counter'>(22)</span><div class='page_container' data-page=22>

SUMMARY OF FUNCTIONAL FORMS



<table>
<tr><td><b>MODEL</b></td><td><b>FORM</b></td><td><b>SLOPE</b> ($\frac{dY}{dX}$)</td><td><b>ELASTICITY</b> ($\frac{dY}{dX} \cdot \frac{X}{Y}$)</td></tr>
<tr><td>Linear</td><td>$Y = B_1 + B_2 X$</td><td>$B_2$</td><td>$B_2 \left(\frac{X}{Y}\right)$</td></tr>
<tr><td>Log-linear</td><td>$\ln Y = B_1 + B_2 \ln X$</td><td>$B_2 \left(\frac{Y}{X}\right)$</td><td>$B_2$</td></tr>
<tr><td>Log-lin</td><td>$\ln Y = B_1 + B_2 X$</td><td>$B_2 (Y)$</td><td>$B_2 (X)$</td></tr>
<tr><td>Lin-log</td><td>$Y = B_1 + B_2 \ln X$</td><td>$B_2 \left(\frac{1}{X}\right)$</td><td>$B_2 \left(\frac{1}{Y}\right)$</td></tr>
<tr><td>Reciprocal</td><td>$Y = B_1 + B_2 \left(\frac{1}{X}\right)$</td><td>$-B_2 \left(\frac{1}{X^2}\right)$</td><td>$-B_2 \left(\frac{1}{XY}\right)$</td></tr>
</table>
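The slope and elasticity columns of the summary table can be checked numerically by differentiating each functional form with a finite difference. The Python sketch below does this for hypothetical values of $B_1$, $B_2$ and X; it ignores the error term and is only a sanity check of the formulas, not an estimation routine.

```python
import math

B1, B2 = 2.0, 0.5  # hypothetical parameter values, for illustration only

# Each entry: (Y as a function of X, slope formula, elasticity formula).
MODELS = {
    "linear":     (lambda x: B1 + B2 * x,
                   lambda x, y: B2,             lambda x, y: B2 * x / y),
    "log-linear": (lambda x: math.exp(B1 + B2 * math.log(x)),
                   lambda x, y: B2 * y / x,     lambda x, y: B2),
    "log-lin":    (lambda x: math.exp(B1 + B2 * x),
                   lambda x, y: B2 * y,         lambda x, y: B2 * x),
    "lin-log":    (lambda x: B1 + B2 * math.log(x),
                   lambda x, y: B2 / x,         lambda x, y: B2 / y),
    "reciprocal": (lambda x: B1 + B2 / x,
                   lambda x, y: -B2 / x ** 2,   lambda x, y: -B2 / (x * y)),
}


def numeric_slope(f, x, h=1e-6):
    """Central-difference approximation of dY/dX."""
    return (f(x + h) - f(x - h)) / (2 * h)


def check_table(x=3.0):
    """Return formula and numerical values of slope and elasticity per model."""
    results = {}
    for name, (f, slope_formula, elasticity_formula) in MODELS.items():
        y = f(x)
        num = numeric_slope(f, x)
        results[name] = (slope_formula(x, y), num,
                         elasticity_formula(x, y), num * x / y)
    return results


results = check_table()
for name, (s_formula, s_num, e_formula, e_num) in results.items():
    print(f"{name:<11} slope {s_formula:9.5f} vs {s_num:9.5f}   "
          f"elasticity {e_formula:8.5f} vs {e_num:8.5f}")
```

Only the log-linear form has an elasticity that does not depend on where X and Y are evaluated; for the other forms the elasticity is usually reported at the sample means.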



</div>
<span class='text_page_counter'>(23)</span><div class='page_container' data-page=23>

COMPARING ON BASIS OF $R^2$

We cannot directly compare the $R^2$ of two models that have different dependent variables.

We can transform the models as follows and compare RSS:

Step 1: Compute the geometric mean (GM) of the dependent variable, call it $Y^*$.

Step 2: Divide $Y_i$ by $Y^*$ to obtain: $\tilde{Y}_i = \frac{Y_i}{Y^*}$

Step 3: Estimate the equation with $\ln Y_i$ as the dependent variable, using $\tilde{Y}_i$ in lieu of $Y_i$ (i.e., use $\ln \tilde{Y}_i$ as the dependent variable).

Step 4: Estimate the equation with $Y_i$ as the dependent variable, using $\tilde{Y}_i$ as the dependent variable instead of $Y_i$.

The RSS values from Steps 3 and 4 are then directly comparable.
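Steps 1 and 2 amount to a single rescaling of the dependent variable. The Python sketch below shows that rescaling on a few made-up values; the regressions in Steps 3 and 4 are left out, and the function names are illustrative.

```python
import math


def geometric_mean(values):
    """Geometric mean, computed as exp(mean(ln Y)); requires positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def scale_by_geometric_mean(values):
    """Divide each Y_i by the geometric mean Y* (Step 2)."""
    y_star = geometric_mean(values)
    return [v / y_star for v in values]


y = [2.0, 8.0, 4.0, 16.0]              # made-up positive values of Y
y_star = geometric_mean(y)             # Step 1
y_tilde = scale_by_geometric_mean(y)   # Step 2

mean_log_y_tilde = sum(math.log(v) for v in y_tilde) / len(y_tilde)
print(f"geometric mean Y* = {y_star:.6f}")
print("scaled values     =", [round(v, 6) for v in y_tilde])
print(f"mean of ln(Y~)    = {mean_log_y_tilde:.2e}")
```

After the rescaling, the geometric mean of $\tilde{Y}_i$ is 1, so its logs average to zero. Step 3 then uses the log of `y_tilde` as the dependent variable and Step 4 uses `y_tilde` itself.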



</div>
<span class='text_page_counter'>(24)</span><div class='page_container' data-page=24>

MEASURES OF GOODNESS OF FIT



$R^2$: Measures the proportion of the variation in the regressand explained by the regressors.

Adjusted $R^2$: Denoted as $\bar{R}^2$, it takes degrees of freedom into account:

$\bar{R}^2 = 1 - (1 - R^2) \frac{n - 1}{n - k}$

where n is the number of observations and k is the number of estimated parameters, including the intercept.

Akaike’s Information Criterion (AIC): Adds a harsher penalty than adjusted $R^2$ for adding more variables to the model, defined as:

$\ln AIC = \frac{2k}{n} + \ln\left(\frac{RSS}{n}\right)$

The model with the lowest AIC is usually chosen.

Schwarz’s Information Criterion (SIC): Alternative to the AIC criterion, expressed as:

$\ln SIC = \frac{k}{n} \ln n + \ln\left(\frac{RSS}{n}\right)$

The penalty factor here is harsher than that of AIC whenever $\ln n > 2$, that is, for $n \geq 8$.
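These criteria can be applied to the two health-cost regressions, which share the dependent variable hcshare and can therefore be compared directly. The Python sketch below (not Stata) uses the RSS, total sum of squares and sample size from those two outputs; the function names are illustrative.

```python
import math


def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared: 1 - (1 - R^2)(n - 1)/(n - k)."""
    return 1 - (1 - r2) * (n - 1) / (n - k)


def ln_aic(rss: float, n: int, k: int) -> float:
    """ln AIC = 2k/n + ln(RSS/n)."""
    return 2 * k / n + math.log(rss / n)


def ln_sic(rss: float, n: int, k: int) -> float:
    """ln SIC = (k/n) ln n + ln(RSS/n)."""
    return k / n * math.log(n) + math.log(rss / n)


n_obs, k_params = 3475, 2   # both models: one slope plus the intercept
tss = 75.7996618            # total sum of squares of hcshare
rss_by_model = {"lin-log": 72.9563097, "reciprocal": 72.9997153}

summary = {}
for name, rss in rss_by_model.items():
    r2 = 1 - rss / tss
    summary[name] = (adjusted_r2(r2, n_obs, k_params),
                     ln_aic(rss, n_obs, k_params),
                     ln_sic(rss, n_obs, k_params))
    print(f"{name:<10} adjR2={summary[name][0]:.4f}  "
          f"lnAIC={summary[name][1]:.5f}  lnSIC={summary[name][2]:.5f}")
```

With the same n and k, the ranking is driven by RSS alone, so the lin-log model comes out marginally ahead on all three measures, although the differences are very small.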



</div>
