
Advanced Econometrics

Chapter 2: Finite Sample Properties Of The OLS Estimator

Chapter 2

FINITE SAMPLE PROPERTIES OF
THE OLS ESTIMATOR

$$Y = X\beta + \varepsilon \quad \text{with} \quad \varepsilon \sim N[0, \sigma^2 I]$$

rank(X) = k, X non-stochastic.

ε random → Y random.


$\hat{\beta} = (X'X)^{-1}X'Y$; $\hat{\beta}$ is a statistic computed from a sample, and it is random because Y is random. Being random:
- $\hat{\beta}$ has a probability distribution, called the sampling distribution.
- Conceptually, we repeatedly draw all possible random samples of size n and calculate $\hat{\beta}$ each time.
Let us explore some statistical properties of the OLS estimator $\hat{\beta}$ and build up its sampling distribution.

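I. UNBIASED:

$$
\begin{aligned}
\hat{\beta} &= (X'X)^{-1}X'Y \\
&= (X'X)^{-1}X'(X\beta + \varepsilon) \\
&= \underbrace{(X'X)^{-1}X'X}_{I}\,\beta + (X'X)^{-1}X'\varepsilon \\
&= \beta + (X'X)^{-1}X'\varepsilon
\end{aligned}
$$

$$E(\hat{\beta}) = E[\beta + (X'X)^{-1}X'\varepsilon]$$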

Nam T. Hoang
University of New England - Australia

1

University of Economics - HCMC - Vietnam



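$$
\begin{aligned}
E(\hat{\beta}) &= \beta + E[(X'X)^{-1}X'\varepsilon] \\
&= \beta + (X'X)^{-1}X'\underbrace{E(\varepsilon)}_{0}
\end{aligned}
$$

$$E(\hat{\beta}) = \beta$$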



$\hat{\beta}$ is an estimator of β; it is a function of the random sample (the elements of Y).
Note: when we talk about the sample, we mean Y only, because X is a constant (fixed) matrix. "Repeatedly draw all possible random samples of size n" therefore means "draw Y".
The least squares estimator is unbiased for β (given E(ε) = 0 and X non-stochastic).


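$$\text{VarCov}(\hat{\beta}) = E[(\hat{\beta} - E(\hat{\beta}))(\hat{\beta} - E(\hat{\beta}))']$$

Since $E(\hat{\beta}) = \beta$ and $\hat{\beta} - \beta = (X'X)^{-1}X'\varepsilon$:

$$
\begin{aligned}
\text{VarCov}(\hat{\beta}) &= E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'] \\
&= E[((X'X)^{-1}X'\varepsilon)((X'X)^{-1}X'\varepsilon)'] \\
&= E[(X'X)^{-1}X'\varepsilon\varepsilon'X(X'X)^{-1}] \\
&= (X'X)^{-1}X'E(\varepsilon\varepsilon')X(X'X)^{-1} \\
&= (X'X)^{-1}X'\,\sigma_\varepsilon^2 I\,X(X'X)^{-1} \\
&= \sigma_\varepsilon^2 \underbrace{(X'X)^{-1}X'X}_{I}(X'X)^{-1} \\
&= \sigma_\varepsilon^2 (X'X)^{-1}
\end{aligned}
$$

So: $\text{VarCov}(\hat{\beta}) = \sigma_\varepsilon^2 (X'X)^{-1}$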
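The two results above (unbiasedness and the variance-covariance formula) can be checked by simulation. The following is only a sketch under assumed values: the sample size, the coefficient vector `beta`, the error standard deviation `sigma`, and the number of replications are all illustrative choices, not part of the notes. X is drawn once and then held fixed, mirroring the non-stochastic X assumption.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is reproducible

n, k, sigma = 50, 3, 1.0                # illustrative sizes (assumptions)
beta = np.array([1.0, 0.5, -2.0])       # "true" coefficients (assumption)

# X is generated once and then treated as fixed across all replications.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T                       # beta_hat = A @ Y

reps = 20000
eps = rng.normal(scale=sigma, size=(reps, n))   # one row of errors per sample
Y = X @ beta + eps                               # shape (reps, n)
beta_hats = Y @ A.T                              # shape (reps, k)

mc_mean = beta_hats.mean(axis=0)                 # should be close to beta
mc_cov = np.cov(beta_hats, rowvar=False)         # should be close to theory_cov
theory_cov = sigma**2 * XtX_inv

print("Monte Carlo mean of beta_hat:", mc_mean)
print("max |MC cov - theory cov|:", np.abs(mc_cov - theory_cov).max())
```

Only Y is redrawn in each replication, which is exactly the "repeatedly draw all possible random samples" thought experiment described earlier.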
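For the model in deviation form (a tilde denotes deviation from the sample mean):

$$\tilde{Y}_i = \hat{\beta}_2 \tilde{X}_{i2} + \hat{\beta}_3 \tilde{X}_{i3} + e_i, \qquad \hat{\beta} = \begin{bmatrix} \hat{\beta}_2 \\ \hat{\beta}_3 \end{bmatrix}$$

$$
\sigma_\varepsilon^2 (X'X)^{-1}
= \frac{\sigma_\varepsilon^2}{\sum \tilde{X}_{i2}^2 \sum \tilde{X}_{i3}^2 - \left(\sum \tilde{X}_{i2}\tilde{X}_{i3}\right)^2}
\begin{bmatrix}
\sum \tilde{X}_{i3}^2 & -\sum \tilde{X}_{i2}\tilde{X}_{i3} \\
-\sum \tilde{X}_{i2}\tilde{X}_{i3} & \sum \tilde{X}_{i2}^2
\end{bmatrix}
= \text{VarCov}\begin{bmatrix} \hat{\beta}_2 \\ \hat{\beta}_3 \end{bmatrix}
$$

$$
\rightarrow \text{Var}(\hat{\beta}_2)
= \frac{\sigma_\varepsilon^2 \sum \tilde{X}_{i3}^2}{\sum \tilde{X}_{i2}^2 \sum \tilde{X}_{i3}^2 - \left(\sum \tilde{X}_{i2}\tilde{X}_{i3}\right)^2}
= \frac{\sigma_\varepsilon^2 / \sum \tilde{X}_{i2}^2}{1 - \dfrac{\left(\sum \tilde{X}_{i2}\tilde{X}_{i3}\right)^2}{\sum \tilde{X}_{i2}^2 \sum \tilde{X}_{i3}^2}}
$$

where the ratio in the denominator is $r_{23}^2$, the squared sample correlation between $X_{i2}$ and $X_{i3}$.

$$\rightarrow \text{Var}(\hat{\beta}_2) = \frac{\sigma_\varepsilon^2}{\sum \tilde{X}_{i2}^2\,(1 - r_{23}^2)}$$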

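$\text{Var}(\hat{\beta}_2)$ is determined by:
i. $\sigma_\varepsilon^2$ ↑ → $\text{Var}(\hat{\beta}_2)$ ↑
ii. $r_{23}^2$ ↑ → $\text{Var}(\hat{\beta}_2)$ ↑
iii. Variation in $X_{i2}$: $\sum \tilde{X}_{i2}^2$ ↑ → $\text{Var}(\hat{\beta}_2)$ ↓
iv. Sample size n ↑ → $\text{Var}(\hat{\beta}_2)$ ↓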
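As a quick numerical illustration, the scalar formula for the variance of the first slope can be compared with the corresponding diagonal element of σ²(X'X)⁻¹ computed on mean-deviated regressors. This is a sketch with made-up data; `sigma2`, the sample size and the way `x3` is made correlated with `x2` are assumptions chosen only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma2 = 40, 2.0                     # illustrative values (assumptions)

x2 = rng.normal(size=n)
x3 = 0.6 * x2 + rng.normal(size=n)      # deliberately correlated with x2

# Deviations from sample means (the "tilde" variables).
x2t, x3t = x2 - x2.mean(), x3 - x3.mean()
Xt = np.column_stack([x2t, x3t])

# Matrix route: first diagonal element of sigma^2 (X'X)^{-1}.
var_matrix = sigma2 * np.linalg.inv(Xt.T @ Xt)[0, 0]

# Scalar route: sigma^2 / (sum of squared deviations * (1 - r23^2)).
r23 = np.corrcoef(x2, x3)[0, 1]
var_formula = sigma2 / (np.sum(x2t**2) * (1.0 - r23**2))

print(var_matrix, var_formula)          # the two routes should agree
```

Raising the coefficient that links `x3` to `x2` pushes `r23` toward 1 and inflates both numbers, which is point ii in the list above.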

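$\text{VarCov}(\hat{\beta}) = \sigma_\varepsilon^2 (X'X)^{-1}$, but $\sigma_\varepsilon^2$ is unknown, so we need an estimator for $\sigma_\varepsilon^2$.

Define: $$\hat{\sigma}_\varepsilon^2 = \frac{e'e}{n-k}$$

n: number of observations.
k: number of estimated coefficients.
$e'e = \sum e_i^2$ = sum of squared residuals.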



Show that $\hat{\sigma}_\varepsilon^2$ is an unbiased estimator of $\sigma_\varepsilon^2$.
With $M = I_n - X(X'X)^{-1}X'$ (symmetric, idempotent, and $MX = 0$): $e = MY = M\varepsilon$ → $e'e = \varepsilon'M'M\varepsilon = \varepsilon'M\varepsilon$

Note: the trace of a square matrix A (n×n) is the sum of its principal diagonal elements, $\text{tr}(A) = \sum_{i=1}^{n} a_{ii}$.


Rules: A, B nxn matrix
tr(A+B) = tr(A) + tr(B)
tr(A.B) = tr(B.A)
tr(λA) = λtr(A)
Trace is a linear operation → sum of certain elements.
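$$
\begin{aligned}
E(e'e) &= E(\varepsilon'M\varepsilon) \\
&= E[\text{tr}(\varepsilon'M\varepsilon)] = E[\text{tr}(\varepsilon\varepsilon'M)] \qquad (\varepsilon'M\varepsilon \text{ is a scalar}) \\
&= \text{tr}[E(\varepsilon\varepsilon')M] = \text{tr}[\sigma_\varepsilon^2 I M] \\
&= \sigma_\varepsilon^2\,\text{tr}(M) = \sigma_\varepsilon^2[\text{tr}(I_n) - \text{tr}(X(X'X)^{-1}X')] \\
&= \sigma_\varepsilon^2[n - \text{tr}(\underbrace{(X'X)^{-1}X'X}_{I_{k \times k}})] = \sigma_\varepsilon^2(n-k)
\end{aligned}
$$

And:

$$\frac{E(e'e)}{n-k} = \frac{\sigma_\varepsilon^2(n-k)}{n-k} = \sigma_\varepsilon^2$$

So: $E(\hat{\sigma}_\varepsilon^2) = \sigma_\varepsilon^2$ → $\hat{\sigma}_\varepsilon^2$ is an unbiased estimator of $\sigma_\varepsilon^2$.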
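The key step in this argument is tr(M) = n − k. A small numerical sketch can confirm it and show how the variance estimator is computed from the residuals. The design matrix, the coefficient vector `beta` and the sample size below are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 4                                        # illustrative sizes
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

# Residual-maker matrix M = I - X (X'X)^{-1} X'.
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

trace_M = np.trace(M)                               # should equal n - k
idempotent = np.allclose(M @ M, M)                  # M M = M

beta = np.array([1.0, 2.0, 0.0, -1.0])              # assumed true coefficients
Y = X @ beta + rng.normal(size=n)

e = M @ Y                                           # residuals e = M Y = M eps
sigma2_hat = (e @ e) / (n - k)                      # unbiased estimator of sigma^2

# Cross-check the residuals against a least-squares fit.
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(trace_M, idempotent, sigma2_hat)
```

Dividing by n − k rather than n is what removes the downward bias that comes from fitting k coefficients.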

II. LINEARITY:


Any estimator that is a linear function of the random sample data is called a linear estimator. Here the random sample data are the $Y_i$.

$$\underset{k \times 1}{\hat{\beta}} = \underbrace{(X'X)^{-1}X'}_{A}\,Y = \underset{k \times n}{A}\;\underset{n \times 1}{Y}$$

where A is non-random:

$$
\begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \vdots \\ \hat{\beta}_k \end{bmatrix}
=
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{k1} & a_{k2} & \cdots & a_{kn}
\end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
$$

→ $\hat{\beta}_1 = a_{11}Y_1 + a_{12}Y_2 + \dots + a_{1n}Y_n$
→ $\hat{\beta}$, the OLS estimator, is linear and unbiased for β.
Because $\hat{\beta}$ is a linear function of Y and Y is a linear function of ε, if ε is normal then $\hat{\beta}$ is normal. So the sampling distribution of the OLS estimator of β is:

$$\hat{\beta} \sim N[\beta, \sigma_\varepsilon^2 (X'X)^{-1}]$$

III. EFFICIENCY:

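Suppose we have two unbiased estimators, $\hat{\theta}_1$ and $\hat{\theta}_2$, for θ. Then we say $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$ if $\text{Var}(\hat{\theta}_1) \le \text{Var}(\hat{\theta}_2)$.
If $\hat{\theta}_1$ and $\hat{\theta}_2$ are (k×1) vector unbiased estimators of the (k×1) vector θ, then $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$ if

$$\Delta = V(\hat{\theta}_2) - V(\hat{\theta}_1) \text{ is positive semi-definite.}$$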

IV. GAUSS - MARKOV THEOREM:

"Under the assumptions of the classical regression model, the least squares estimator of β, $\hat{\beta} = (X'X)^{-1}X'Y$, is the best linear unbiased estimator" (BLUE).
Linear: in Y.
Best: it has the smallest variance among all alternative linear unbiased estimators b:
$\text{Var}(\hat{\beta}_j) \le \text{Var}(b_j)\ \forall j$.
Proof: Let b be any other linear unbiased estimator of β:


b = 
A . Y

k ×1

Unbiased:

k × n n ×1

E(b) =
E(b) = E(AY) =E(AX + Aε)
E(b) = AX + 0 = AX =

→ AX =I
Let A = (X'X)-1X' + C where C is any non-stochastic (k×n) matrix.
I = AX = [( X ' X ) −1 X '+C ] X = ( X ' X ) −1 X ' X + CX = CX = 0


I

b = AY = [( X ' X ) −1 X '+C ][ Xβ + ε ]
= (
X
'

X
) −1
X
'
X β + ( X ' X ) −1 X ' ε + CXβ + Cε
I

= β + ( X ' X ) −1 X ' ε + Cε
VarCov(b) = E[(b − β )(b − β )' ]

= E{[( X ' X ) −1 X ' ε + Cε ][( X ' X ) −1 X ' ε + Cε ]' }
= E[( X ' X ) −1 X ' (εε ' ) X ( X ' X ) −1 + ( X ' X ) −1 (εε ' )C '+Cεε ' X ( X ' X ) −1 + Cεε ' C ' ]
= σ ε2 (
X
'
X
) −1
X
'
X ( X ' X ) −1 + σ ε2 ( X ' X ) −1 X ' C '+σ ε2 CX ( X ' X ) −1 + σ ε2 CC '
I

= σ ε2 ( X ' X ) −1 + σ ε2 CC '

VarCov ( βˆ )

The jth diagonal element:
n

Var (b j ) = Var ( βˆ j ) + σ ε2 ∑ c 2ji ≥ Var ( βˆ j )


∀j = 1, k

i =1

→ Var (b j ) ≥ Var ( βˆ j )

∀j = 1, k

→ βˆ j is the best linear unbiased estimator (BLUE).
→ βˆ j is efficient estimator (smallest variance).
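The proof can be illustrated numerically by constructing one alternative linear unbiased estimator and comparing variances. In the sketch below, C is built as D·M with M = I − X(X'X)⁻¹X', which is just one convenient way (an assumption of this example, not part of the proof) to obtain a matrix satisfying CX = 0. The data and `sigma2` are made up.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, sigma2 = 25, 3, 1.5                       # illustrative values
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
XtX_inv = np.linalg.inv(X.T @ X)
M = np.eye(n) - X @ XtX_inv @ X.T               # M X = 0

# Any C of the form D @ M satisfies C X = 0, so b = A Y stays unbiased.
D = rng.normal(size=(k, n))
C = D @ M
A = XtX_inv @ X.T + C

var_ols = sigma2 * XtX_inv                      # VarCov(beta_hat)
var_b = sigma2 * (A @ A.T)                      # VarCov(A Y) = sigma^2 A A'
diff = var_b - var_ols                          # should equal sigma^2 C C'

# Smallest eigenvalue of the (symmetrised) difference; >= 0 means PSD.
min_eig = np.linalg.eigvalsh((diff + diff.T) / 2).min()
print(np.diag(var_ols), np.diag(var_b), min_eig)
```

Every diagonal element of `var_b` is at least as large as the matching element of `var_ols`, and the difference matrix is positive semi-definite, which matches the efficiency definition in section III.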

V. REVIEW: STATISTICAL INFERENCE:

1. Linear functions of normal random variables are also normal:

If $u \sim N(\mu, \Sigma)$, with u and µ of dimension n×1 and Σ of dimension n×n, then $Z = Pu$ (Z is m×1, P is a non-random m×n matrix) is normally distributed.

$$E(Z) = E(Pu) = PE(u) = P\mu$$

$$
\begin{aligned}
\text{VarCov}(Z) &= E[(Z - E(Z))(Z - E(Z))'] \\
&= E[(Pu - P\mu)(Pu - P\mu)'] \\
&= P\,\underbrace{E[(u - \mu)(u - \mu)']}_{\Sigma}\,P' = P\Sigma P'
\end{aligned}
$$

Then $Z \sim N(P\mu, P\Sigma P')$.

2. Chi-squared distribution:
If Z (r×1) $\sim N(0, I)$, then $Z'Z$ has the chi-squared distribution with r degrees of freedom:

$$Z'Z \sim \chi^2_{[r]}$$

r: number of independent standard normal variables in the sum of squares.

Theorem: If Z (n×1) $\sim N(0, I)$ and A (n×n) is idempotent with rank equal to r, then:
i. $Z'AZ \sim \chi^2_{[r]}$
ii. $r = \text{tr}(A) = \text{rank}(A)$

3. Eigenvalue - eigenvector problem:
For a symmetric square matrix A (n×n), we can find n pairs $(\lambda_j, c_j)$ such that:

$$\underset{n \times n}{A}\;\underset{n \times 1}{c_j} = \underset{1 \times 1}{\lambda_j}\;\underset{n \times 1}{c_j} \qquad j = 1, 2, \dots, n$$

Normalizing: $c_j'c_j = 1$ (that is, $\sum_{i=1}^{n} c_{ij}^2 = 1$).

The eigenvectors are orthogonal to each other:

$$c_i'c_j = 0 \qquad (\forall i \ne j)$$

Let $C = [c_1\ c_2\ \cdots\ c_n]$ (n×n), with $c_j = (c_{1j}, c_{2j}, \dots, c_{nj})'$. Then $C'C = I$, so $C' = C^{-1}$: C is an orthogonal matrix.

$$
AC = A[c_1\ c_2\ \cdots\ c_n] = [Ac_1\ Ac_2\ \cdots\ Ac_n] = [c_1\lambda_1\ c_2\lambda_2\ \cdots\ c_n\lambda_n]
= [c_1\ c_2\ \cdots\ c_n]
\underbrace{\begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{bmatrix}}_{\Lambda}
= C\Lambda
$$

where Λ is a diagonal matrix: $C'AC = C'C\Lambda = \Lambda$
and also rank(A) = rank(Λ) = number of nonzero $\lambda_j$'s.
Note: $C'AC = \Lambda$ → $(C')^{-1}C'ACC^{-1} = (C')^{-1}\Lambda C^{-1}$ → $A = C\Lambda C'$
Remember: $A = C\Lambda C'$ and $C'AC = \Lambda$; $C'C = I$, $C' = C^{-1}$
Theorem:

Let A (n×n) be a symmetric idempotent matrix with rank = r and let Z (n×1) $\sim N(0, I)$. Then:

$$Z'AZ \sim \chi^2_{[r]} \quad \text{and} \quad \text{rank}(A) = \text{tr}(A)$$

Proof: $C'AC = \Lambda$, Z (n×1) $\sim N(0, I)$.

For A idempotent, $\lambda_j = 0$ or 1.
Because: $Ac_j = c_j\lambda_j$ → $AAc_j = Ac_j\lambda_j = c_j\lambda_j^2$, and $AA = A$.
So: $c_j\lambda_j^2 = c_j\lambda_j$
→ $c_j(\lambda_j^2 - \lambda_j) = 0$
→ $c_j\lambda_j(\lambda_j - 1) = 0$ → $\lambda_j = 0$ or $\lambda_j = 1$

Write:

$$C'AC = \Lambda = \text{diag}(\underbrace{1, \dots, 1}_{r}, \underbrace{0, \dots, 0}_{n-r})$$

There must be r nonzero elements of Λ, because rank(A) = r = rank(Λ) = tr(Λ), since all diagonal elements are 0 or 1.

Also $\text{tr}(\Lambda) = \text{tr}(C'AC) = \text{tr}(ACC') = \text{tr}(A)$ (Rule: tr(A.B) = tr(B.A))

so rank(A) = tr(A) = r.

Let $u = C'Z$ (u is n×1, C' is n×n, Z is n×1), with $Z \sim N(0, I)$. Then $E(u) = 0$ and

$$E(uu') = E(C'ZZ'C) = C'\underbrace{E(ZZ')}_{I}C = C'C = I$$

so $u \sim N(0, I)$.

Construct the quadratic form:

$$u'\Lambda u = Z'C(C'AC)C'Z = Z'AZ = \sum_{i=1}^{r} u_i^2 \sim \chi^2_{[r]}$$

So if $Z \sim N(0, I)$ and A (n×n) is idempotent with rank equal to r, then $Z'AZ \sim \chi^2_{[r]}$.

Extension: if $Z \sim N(0, \sigma^2 I)$, then

$$\frac{Z'AZ}{\sigma^2} \sim \chi^2_{[r]}$$
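The ingredients of this proof can be checked numerically on a concrete idempotent matrix. The sketch below uses A = I − X(X'X)⁻¹X' for a randomly generated X (an assumption made only to obtain a symmetric idempotent matrix of known rank n − k) and then simulates Z'AZ to compare its average with r, the mean of a chi-squared variable with r degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 10, 4                                         # illustrative sizes
X = rng.normal(size=(n, k))
A = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T     # symmetric idempotent

# Eigen-decomposition of a symmetric matrix: columns of C are eigenvectors.
lam, C = np.linalg.eigh(A)                           # eigenvalues ascending

r = int(round(np.trace(A)))                          # trace of A
rank_A = int(np.sum(lam > 0.5))                      # number of unit eigenvalues

# Simulate the quadratic form Z'AZ for many standard normal vectors Z.
Z = rng.normal(size=(20000, n))
q = np.einsum("ij,jk,ik->i", Z, A, Z)                # one value of Z'AZ per row

print("eigenvalues:", np.round(lam, 6))
print("trace:", r, "rank:", rank_A, "mean of Z'AZ:", q.mean())
```

The eigenvalues come out as k zeros and n − k ones, the trace equals the rank, and the simulated mean of the quadratic form sits close to r.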


4. Other distribution:
Let Z be N(0,1), let W be $\chi^2_{[r]}$, and let Z and W be independently distributed. Then:

$$\frac{Z}{\sqrt{W/r}} \sim t_{[r]}$$

has the t-distribution with r degrees of freedom.
Let W be $\chi^2_{[r]}$, let v be $\chi^2_{[s]}$, and let W and v be independently distributed. Then:

$$\frac{W/r}{v/s} \sim F^{r}_{s}$$

has the F-distribution with r (numerator) and s (denominator) degrees of freedom.
VI. TESTING HYPOTHESIS ON INDIVIDUAL COEFFICIENT:
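$$Y = X\beta + \varepsilon \quad \text{with} \quad \varepsilon \sim N[0, \sigma^2 I]$$

Recall: $\hat{\beta} \sim N[\beta, \sigma^2 (X'X)^{-1}]$
So $\hat{\beta}_j \sim N[\beta_j, \sigma^2 (X'X)^{-1}_{jj}]$

$$\rightarrow \frac{\hat{\beta}_j - \beta_j}{\sqrt{\sigma^2 (X'X)^{-1}_{jj}}} \sim N[0, 1]$$

but $\sigma^2$ is unknown, so this can't be used directly for constructing tests or confidence intervals.

$e'e = \varepsilon'M'M\varepsilon = \varepsilon'M\varepsilon$, where M is idempotent with rank(M) = its trace = n − k.

$\varepsilon \sim N[0, \sigma^2 I]$ → $\varepsilon/\sigma \sim N[0, I]$ (n×1)

$$\rightarrow \frac{e'e}{\sigma^2} = \frac{\varepsilon'M\varepsilon}{\sigma^2} \sim \chi^2_{[n-k]}$$

So, following the theorem (under normality $\hat{\beta}$ and e are independent):

$$\frac{\dfrac{\hat{\beta}_j - \beta_j}{\sqrt{\sigma^2 (X'X)^{-1}_{jj}}}}{\sqrt{\dfrac{e'e}{\sigma^2}\Big/(n-k)}} \sim t_{n-k}$$

$$\rightarrow \frac{\hat{\beta}_j - \beta_j}{\sqrt{\dfrac{e'e}{n-k}\,(X'X)^{-1}_{jj}}} \sim t_{n-k}, \qquad \text{where } \frac{e'e}{n-k} = \hat{\sigma}^2$$

$$\rightarrow \frac{\hat{\beta}_j - \beta_j}{\sqrt{\hat{\sigma}^2 (X'X)^{-1}_{jj}}} \sim t_{n-k}$$

$\sqrt{\hat{\sigma}^2 (X'X)^{-1}_{jj}} = \hat{\sigma}_{\hat{\beta}_j}$ = standard error of $\hat{\beta}_j$.

Finally:

$$\frac{\hat{\beta}_j - \beta_j}{\hat{\sigma}_{\hat{\beta}_j}} \sim t_{n-k}$$

This basic result enables us to test hypotheses about elements of β and to construct confidence intervals for them (note that we need the assumption of normality of the ε's).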
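EX: (standard errors in parentheses)

$$\hat{y}_i = \underset{(0.7)}{1.4} + \underset{(0.05)}{0.2}\,x_{i2} + \underset{(1.4)}{0.6}\,x_{i3}$$

H0: $\beta_2 = 0$
H1: $\beta_2 > 0$

$$t = \frac{\hat{\beta}_2 - \beta_2}{SE(\hat{\beta}_2)} = \frac{0.2 - 0}{0.05} = 4$$

d.o.f = n − k = 17.
$t_\alpha$ (5%) = 1.74
$t_\alpha$ (1%) = 2.567
$t > t_\alpha$ → reject H0.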
EX: H0: $\beta_1 = 1.5$
H1: $\beta_1 \ne 1.5$ (or $\beta_1 > 1.5$, or $\beta_1 < 1.5$, for one-sided alternatives)

$$t = \frac{\hat{\beta}_1 - \beta_1}{SE(\hat{\beta}_1)} = \frac{1.4 - 1.5}{0.7} = -0.1429 \qquad \text{d.o.f} = n - k = 17$$

For the two-sided test at the 5% level, the rejection region puts 2.5% in each tail.

$|t| < t_{\alpha/2}$ ⇒ cannot reject H0 at 5%.
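The arithmetic of the two examples can be reproduced in a few lines. This is a minimal sketch: the helper name `t_statistic` is made up for illustration, the one-sided 5% critical value 1.74 is the one quoted in the example, and the two-sided 5% critical value 2.110 for 17 degrees of freedom is taken from a standard t table rather than from the notes.

```python
def t_statistic(estimate, hypothesized, std_error):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


# Example 1: H0: beta_2 = 0 against H1: beta_2 > 0.
t_slope = t_statistic(0.2, 0.0, 0.05)

# Example 2: H0: beta_1 = 1.5 against H1: beta_1 != 1.5.
t_intercept = t_statistic(1.4, 1.5, 0.7)

crit_one_sided_5 = 1.74      # t_alpha(5%), 17 d.o.f., as quoted in the text
crit_two_sided_5 = 2.110     # t_{alpha/2}, 17 d.o.f., from a standard t table

reject_slope = t_slope > crit_one_sided_5                 # one-sided rule
reject_intercept = abs(t_intercept) > crit_two_sided_5    # two-sided rule

print(round(t_slope, 4), reject_slope)
print(round(t_intercept, 4), reject_intercept)
```

The one-sided test compares t itself with the critical value, while the two-sided test compares |t| with the α/2 critical value.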

VII. CONFIDENCE INTERVALS:

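Recall:

$$t_i = \frac{\hat{\beta}_i - \beta_i}{SE(\hat{\beta}_i)} \sim t_{n-k}$$

so

$$\Pr[-t_{\alpha/2} \le t_i \le t_{\alpha/2}] = 1 - \alpha$$

$$\Pr\left[-t_{\alpha/2} \le \frac{\hat{\beta}_i - \beta_i}{SE(\hat{\beta}_i)} \le t_{\alpha/2}\right] = 1 - \alpha$$

$$\Pr[\hat{\beta}_i - t_{\alpha/2}\,SE(\hat{\beta}_i) \le \beta_i \le \hat{\beta}_i + t_{\alpha/2}\,SE(\hat{\beta}_i)] = 1 - \alpha$$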
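As a worked illustration of this interval, take the slope estimate 0.2 with standard error 0.05 and 17 degrees of freedom from the earlier example. The critical value 2.110 for a 95% interval comes from a standard t table and is not stated in the notes; the function name `confidence_interval` is likewise a hypothetical helper.

```python
def confidence_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


# 95% interval for the slope: estimate 0.2, SE 0.05, t_{0.025, 17} = 2.110.
lower, upper = confidence_interval(0.2, 0.05, 2.110)
print(round(lower, 4), round(upper, 4))
```

Zero lies outside this interval, which agrees with rejecting a zero slope in the corresponding two-sided test at the 5% level.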
• If we were to take a sample of size n, construct this interval, and repeat many times, then 100(1−α)% of such intervals would cover the true value of $\beta_i$.
• If we construct the interval only once, there is no guarantee that the interval will cover the true $\beta_i$.
• Types of errors: size & power of tests.
Type I: Reject H0 when it is true.
Type II: Accept H0 when it is false.
Assume:
Prob(type I error) = α
Prob(type II error) = β (here β denotes a probability, not the coefficient vector)
If the sample size is fixed: α↓ ⇒ β↑
α is called the significance level or size of the test.
→ Fix α and try to design the test so as to minimize β.
• Definition: The power of a test is 1 − β.
Power = 1 − Pr(accept H0 | H0 false)
= Pr(reject H0 | H0 false)

• A test is "uniformly most powerful" if its power is at least as great as that of any other test (for the same choice of α) over all possible alternative hypotheses.
• A test is "consistent" if its power → 1 as n → ∞ for any false hypothesis.
• A test is unbiased if its power never falls below α.

VIII. FAMILY OF F-TEST:
For general linear restrictions, the unrestricted model (U-model) is the original model.
H0: some restrictions on β (k×1). These define the restricted model (R-model):

$$F^{r}_{df_u} = \frac{(ESS_R - ESS_U)/r}{ESS_U/df_u}$$

$ESS_R$ = error sum of squares from R-model: $e_R'e_R$
$ESS_U$ = error sum of squares from U-model: $e_U'e_U$
r: number of restrictions in H0.
$df_u$: degrees of freedom in U-model = n − k.

$$\frac{ESS_U}{\sigma^2} = \frac{e_U'e_U}{\sigma^2} = \frac{\varepsilon'M\varepsilon}{\sigma^2} = \frac{\varepsilon'}{\sigma}M\frac{\varepsilon}{\sigma} \sim \chi^2_{[n-k]}$$

Under H0:

$$\frac{ESS_R}{\sigma^2} \sim \chi^2_{[n-(k-r)]}, \qquad \frac{ESS_U}{\sigma^2} \sim \chi^2_{[n-k]} \;\;\Rightarrow\;\; \frac{ESS_R}{\sigma^2} - \frac{ESS_U}{\sigma^2} \sim \chi^2_{[r]}$$

$$\frac{(ESS_R - ESS_U)/(\sigma^2 r)}{ESS_U/((n-k)\sigma^2)} = \frac{(ESS_R - ESS_U)/r}{ESS_U/(n-k)}$$

$$\rightarrow \frac{(ESS_R - ESS_U)/r}{ESS_U/(n-k)} \sim F^{r}_{n-k}$$

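Case 1: Joint significance of all slopes:

$$\underset{k \times 1}{\beta} = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \qquad \beta_1: 1 \times 1 \text{ (intercept)}, \quad \beta_2: (k-1) \times 1 \text{ (slopes)}$$

H0: $\beta_2 = 0$ → r = k − 1

U-model: $Y = X\beta + \varepsilon$ → $ESS_U = e'e$, $df_u$ = n − k

R-model: $Y_i = \beta_1 + \varepsilon_i$ → $\hat{\beta}_1 = \bar{Y}$ → $Y_i = \bar{Y} + e_i$

$$ESS_R = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$$

$$F^{k-1}_{n-k} = \frac{\left(\sum_{i=1}^{n} (Y_i - \bar{Y})^2 - e'e\right)/(k-1)}{e'e/(n-k)} = \frac{R^2/(k-1)}{(1 - R^2)/(n-k)}$$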
Case 2: Significance of a subset of coefficients:

$$\underset{k \times 1}{\beta} = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \qquad \beta_1: (k-r) \times 1, \quad \beta_2: r \times 1$$

H0: $\beta_2 = 0$ (r×1)

U-model: $Y = X\beta + \varepsilon$ → $ESS_U = e_U'e_U$

R-model: $Y = X_1\beta_1 + \varepsilon$, with $\beta_1$ of dimension (k−r)×1 → $ESS_R = e_R'e_R$

$$F^{r}_{n-k} = \frac{(ESS_R - ESS_U)/r}{ESS_U/(n-k)}$$

EX: Translog production function:

$$\log Y = \beta_1 + \beta_2 \log K + \beta_3 \log L + \beta_4 (\log K)^2/2 + \beta_5 (\log L)^2/2 + \beta_6 (\log K \log L) + \varepsilon$$

H0: $\beta_4 = \beta_5 = \beta_6 = 0$ (Cobb-Douglas restrictions).
n = 27, r = 3, n − k = 21
$ESS_U$ = 0.67993
$ESS_R$ = 0.85163

→ $F^{r}_{n-k}$ = 1.768. Critical value: $F^{3}_{21,\,5\%}$ = 3.1
→ $F^{r}_{n-k}$ < critical value
⇒ So do not reject H0 and conclude that the data are consistent with the Cobb-Douglas model.
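The F statistic in the translog example can be reproduced directly from the reported sums of squares. The sketch below also includes the R² form from Case 1, checked on invented numbers (total sum of squares 10, residual sum of squares 4, k = 3, n = 23) that are not from the notes. Both function names are hypothetical helpers.

```python
def f_statistic(ess_r, ess_u, r, df_u):
    """F = ((ESS_R - ESS_U) / r) / (ESS_U / df_u)."""
    return ((ess_r - ess_u) / r) / (ess_u / df_u)


def f_from_r2(r2, k, n):
    """Joint test of all slopes: F = (R^2/(k-1)) / ((1-R^2)/(n-k))."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


# Translog example: ESS_R = 0.85163, ESS_U = 0.67993, r = 3, n - k = 21.
F_translog = f_statistic(0.85163, 0.67993, 3, 21)
reject_cobb_douglas = F_translog > 3.1      # critical value quoted in the text

# Case 1 equivalence on made-up numbers: ESS_R = TSS = 10, ESS_U = 4.
F_ess = f_statistic(10.0, 4.0, 2, 20)
F_r2 = f_from_r2(1.0 - 4.0 / 10.0, k=3, n=23)

print(round(F_translog, 3), reject_cobb_douglas)
print(F_ess, F_r2)
```

The last two values coincide because, when the restricted model contains only an intercept, ESS_R equals the total sum of squares and R² = 1 − ESS_U/ESS_R.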
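Case 3: General restrictions.

$$\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix}, \qquad \underset{r \times k}{R}\;\underset{k \times 1}{\beta} = \underset{r \times 1}{C}$$

Restriction: $\beta_2 + \beta_3 = 1$

$$\rightarrow \underbrace{[0\ \ 1\ \ 1]}_{R}\,\beta = 1 \qquad (r = 1)$$

If the restrictions are $\beta_2 + \beta_3 = 1$ and $\beta_1 = 0$ (r = 2):

$$\rightarrow \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix} \beta = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$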

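Jarque-Bera statistic:
H0: $\varepsilon_i$ are normally distributed.
H1: $\varepsilon_i$ are not normally distributed.

Under H0 (asymptotically): $JB \sim \chi^2_{[2]}$

$$JB = \frac{n}{6}\left[SK^2 + \frac{(Kur - 3)^2}{4}\right]$$

where SK is the sample skewness and Kur the sample kurtosis of the residuals.

Reject H0 for large JB.
Reject H0 if JB exceeds the critical value (5.99 at the 5% level for $\chi^2_{[2]}$) or if p-value < 0.05.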
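For concreteness, the statistic can be computed from a residual series using the standard form JB = (n/6)[SK² + (Kur − 3)²/4], with skewness and kurtosis based on central moments divided by n. This is a plain-Python sketch; the function name `jarque_bera` and the toy residuals are made up for illustration.

```python
def jarque_bera(residuals):
    """Return (JB, skewness, kurtosis) using moments that divide by n."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n   # second central moment
    m3 = sum((e - mean) ** 3 for e in residuals) / n   # third central moment
    m4 = sum((e - mean) ** 4 for e in residuals) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, skew, kurt


# Toy residuals: symmetric (skewness 0) but far too thin-tailed (kurtosis 1).
jb, skew, kurt = jarque_bera([-1.0, 1.0, -1.0, 1.0, -1.0, 1.0])
print(jb, skew, kurt)
```

With only six observations the statistic is small despite the clearly non-normal shape, a reminder that the chi-squared approximation is a large-sample result.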
