
CHAPTER 11. GENERALIZED METHOD OF MOMENTS


Exercise 11.6 In the linear model $y = X\beta + e$ with $E(x_i e_i) = 0$, a Generalized Method of Moments (GMM) criterion function for $\beta$ is defined as
$$J_n(\beta) = \frac{1}{n}\,(y - X\beta)' X \hat{\Omega}^{-1} X' (y - X\beta) \qquad (11.5)$$
where $\hat{\Omega} = n^{-1}\sum_{i=1}^{n} x_i x_i' \hat{e}_i^2$, $\hat{e}_i = y_i - x_i'\hat{\beta}$ are the OLS residuals, and $\hat{\beta} = (X'X)^{-1}X'y$ is LS. The GMM estimator of $\beta$, subject to the restriction $h(\beta) = 0$, is defined as
$$\tilde{\beta} = \operatorname*{argmin}_{h(\beta)=0} J_n(\beta).$$
The GMM test statistic (the distance statistic) of the hypothesis $h(\beta) = 0$ is
$$D = J_n(\tilde{\beta}) = \min_{h(\beta)=0} J_n(\beta). \qquad (11.6)$$

(a) Show that you can rewrite $J_n(\beta)$ in (11.5) as
$$J_n(\beta) = n\,\bigl(\beta - \hat{\beta}\bigr)' \hat{V}^{-1} \bigl(\beta - \hat{\beta}\bigr),$$
thus $\tilde{\beta}$ is the same as the minimum distance estimator.

(b) Show that in this setting, the distance statistic $D$ in (11.6) equals the Wald statistic.
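The algebra behind part (a) can be checked numerically. Below is a minimal sketch, assuming simulated data and taking $\hat{V} = (X'X/n)^{-1}\hat{\Omega}(X'X/n)^{-1}$ (the usual estimate of the asymptotic variance of $\sqrt{n}(\hat{\beta} - \beta_0)$); this scaling of $\hat{V}$ is an assumption of the sketch, and all names are illustrative rather than the book's notation.

```python
# Numerical check of Exercise 11.6(a): the GMM criterion (11.5) equals the
# minimum-distance form n (b - bhat)' Vhat^{-1} (b - bhat) when
# Vhat = (X'X/n)^{-1} Omegahat (X'X/n)^{-1}  (an assumption of this sketch).
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
beta0 = np.array([1.0, -0.5, 2.0])
e = rng.normal(size=n) * (1 + 0.5 * np.abs(X[:, 0]))   # heteroskedastic errors
y = X @ beta0 + e

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)            # OLS
e_hat = y - X @ beta_hat
Omega_hat = (X * e_hat[:, None]**2).T @ X / n           # n^{-1} sum x_i x_i' e_i^2

def J_criterion(b):
    g = X.T @ (y - X @ b)                               # X'(y - Xb)
    return (g @ np.linalg.solve(Omega_hat, g)) / n

Q = X.T @ X / n
V_hat = np.linalg.solve(Q, Omega_hat) @ np.linalg.inv(Q)

def J_min_distance(b):
    d = b - beta_hat
    return n * d @ np.linalg.solve(V_hat, d)

b_test = np.array([0.8, -0.3, 1.5])
print(J_criterion(b_test), J_min_distance(b_test))      # the two values agree
```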

Exercise 11.7 Take the linear model
$$y_i = x_i'\beta + e_i$$
$$E(z_i e_i) = 0,$$
and consider the GMM estimator $\hat{\beta}$ of $\beta$. Let
$$J_n = n\,\bar{g}_n(\hat{\beta})'\hat{\Omega}^{-1}\bar{g}_n(\hat{\beta})$$
denote the test of overidentifying restrictions. Show that $J_n \xrightarrow{d} \chi^2_{\ell-k}$ as $n \to \infty$ by demonstrating each of the following:

(a) Since $\Omega > 0$, we can write $\Omega^{-1} = CC'$ and $\Omega = C'^{-1}C^{-1}$.

(b) $J_n = n\,\bigl(C'\bar{g}_n(\hat{\beta})\bigr)'\bigl(C'\hat{\Omega}C\bigr)^{-1}C'\bar{g}_n(\hat{\beta})$.

(c) $C'\bar{g}_n(\hat{\beta}) = D_n C'\bar{g}_n(\beta_0)$ where
$$D_n = I_\ell - C'\left(\frac{1}{n}Z'X\right)\left(\left(\frac{1}{n}X'Z\right)\hat{\Omega}^{-1}\left(\frac{1}{n}Z'X\right)\right)^{-1}\left(\frac{1}{n}X'Z\right)\hat{\Omega}^{-1}C'^{-1}$$
and $\bar{g}_n(\beta_0) = \dfrac{1}{n}Z'e$.

(d) $D_n \xrightarrow{p} I_\ell - R\,(R'R)^{-1}R'$ where $R = C'E(z_i x_i')$.

(e) $n^{1/2}\,C'\bar{g}_n(\beta_0) \xrightarrow{d} X \sim N(0, I_\ell)$.

(f) $J_n \xrightarrow{d} X'\bigl(I_\ell - R\,(R'R)^{-1}R'\bigr)X$.

(g) $X'\bigl(I_\ell - R\,(R'R)^{-1}R'\bigr)X \sim \chi^2_{\ell-k}$.

Hint: $I_\ell - R\,(R'R)^{-1}R'$ is a projection matrix.
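The $\chi^2_{\ell-k}$ approximation in Exercise 11.7 can also be checked by simulation. The following is a minimal sketch assuming a simple over-identified design with one endogenous regressor and two instruments ($\ell = 2$, $k = 1$) and a two-step efficient GMM estimator; the design and variable names are illustrative, not taken from the book.

```python
# Monte Carlo sketch of the overidentification statistic J_n in Exercise 11.7.
# Under correct specification J_n should be approximately chi^2 with
# ell - k = 1 degree of freedom.
import numpy as np

rng = np.random.default_rng(1)
n, reps, beta0 = 500, 2000, 1.0
J_stats = []
for _ in range(reps):
    Z = rng.normal(size=(n, 2))
    u = rng.normal(size=n)
    x = Z @ np.array([1.0, 1.0]) + u                   # first stage
    e = 0.5 * u + rng.normal(size=n)                   # endogenous error
    y = beta0 * x + e
    X = x[:, None]
    ZX, Zy = Z.T @ X, Z.T @ y
    W1 = np.linalg.inv(Z.T @ Z)                        # first step: 2SLS weight
    b1 = np.linalg.solve(ZX.T @ W1 @ ZX, ZX.T @ W1 @ Zy)
    ehat = y - X @ b1
    Omega = (Z * ehat[:, None]**2).T @ Z / n           # efficient weight matrix
    W2 = np.linalg.inv(Omega)
    b2 = np.linalg.solve(ZX.T @ W2 @ ZX, ZX.T @ W2 @ Zy)
    gbar = Z.T @ (y - X @ b2) / n
    J_stats.append(n * gbar @ W2 @ gbar)

J_stats = np.array(J_stats)
print("mean of J (should be near 1):", J_stats.mean())
print("rejection rate at the chi2(1) 95% cut-off 3.84:", (J_stats > 3.84).mean())
```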

Chapter 12

Empirical Likelihood

12.1 Non-Parametric Likelihood

An alternative to GMM is empirical likelihood. The idea is due to Art Owen (1988, 2001) and has been extended to moment condition models by Qin and Lawless (1994). It is a non-parametric analog of likelihood estimation.

The idea is to construct a multinomial distribution $F(p_1, \ldots, p_n)$ which places probability $p_i$ at each observation. To be a valid multinomial distribution, these probabilities must satisfy the requirements that $p_i \geq 0$ and
$$\sum_{i=1}^{n} p_i = 1. \qquad (12.1)$$
Since each observation is observed once in the sample, the log-likelihood function for this multinomial distribution is
$$\log L(p_1, \ldots, p_n) = \sum_{i=1}^{n} \log(p_i). \qquad (12.2)$$

First let us consider a just-identified model. In this case the moment condition places no additional restrictions on the multinomial distribution. The maximum likelihood estimators of the probabilities $(p_1, \ldots, p_n)$ are those which maximize the log-likelihood subject to the constraint (12.1). This is equivalent to maximizing
$$\sum_{i=1}^{n} \log(p_i) - \mu\left(\sum_{i=1}^{n} p_i - 1\right)$$
where $\mu$ is a Lagrange multiplier. The $n$ first-order conditions are $0 = p_i^{-1} - \mu$. Combined with the constraint (12.1) we find that the MLE is $p_i = n^{-1}$, yielding the log-likelihood $-n \log(n)$.

Now consider the case of an overidentified model with moment condition
$$E\,g_i(\beta_0) = 0$$
where $g$ is $\ell \times 1$ and $\beta$ is $k \times 1$, and for simplicity we write $g_i(\beta) = g(y_i, z_i, x_i, \beta)$. The multinomial distribution which places probability $p_i$ at each observation $(y_i, x_i, z_i)$ will satisfy this condition if and only if
$$\sum_{i=1}^{n} p_i g_i(\beta) = 0. \qquad (12.3)$$
The empirical likelihood estimator is the value of $\beta$ which maximizes the multinomial log-likelihood (12.2) subject to the restrictions (12.1) and (12.3).


The Lagrangian for this maximization problem is
$$\mathcal{L}(\beta, p_1, \ldots, p_n, \lambda, \mu) = \sum_{i=1}^{n} \log(p_i) - \mu\left(\sum_{i=1}^{n} p_i - 1\right) - n\lambda' \sum_{i=1}^{n} p_i g_i(\beta)$$
where $\lambda$ and $\mu$ are Lagrange multipliers. The first-order conditions of $\mathcal{L}$ with respect to $p_i$, $\mu$, and $\lambda$ are
$$\frac{1}{p_i} = \mu + n\lambda' g_i(\beta)$$
$$\sum_{i=1}^{n} p_i = 1$$
$$\sum_{i=1}^{n} p_i g_i(\beta) = 0.$$
Multiplying the first equation by $p_i$, summing over $i$, and using the second and third equations, we find $\mu = n$ and
$$p_i = \frac{1}{n\left(1 + \lambda' g_i(\beta)\right)}.$$
Substituting into $\mathcal{L}$ we find
$$R(\beta, \lambda) = -n \log(n) - \sum_{i=1}^{n} \log\left(1 + \lambda' g_i(\beta)\right). \qquad (12.4)$$
For given $\beta$, the Lagrange multiplier $\lambda(\beta)$ minimizes $R(\beta, \lambda)$:
$$\lambda(\beta) = \operatorname*{argmin}_{\lambda} R(\beta, \lambda). \qquad (12.5)$$
This minimization problem is the dual of the constrained maximization problem. The solution (when it exists) is well defined since $R(\beta, \lambda)$ is a convex function of $\lambda$. The solution cannot be obtained explicitly, but must be obtained numerically (see Section 6.5). This yields the (profile) empirical log-likelihood function for $\beta$,
$$R(\beta) = R(\beta, \lambda(\beta)) = -n \log(n) - \sum_{i=1}^{n} \log\left(1 + \lambda(\beta)' g_i(\beta)\right).$$
The EL estimate $\hat{\beta}$ is the value which maximizes $R(\beta)$, or equivalently minimizes its negative,
$$\hat{\beta} = \operatorname*{argmin}_{\beta} \left[-R(\beta)\right]. \qquad (12.6)$$
Numerical methods are required for calculation of $\hat{\beta}$ (see Section 12.5).

As a by-product of estimation, we also obtain the Lagrange multiplier $\hat{\lambda} = \lambda(\hat{\beta})$, probabilities
$$\hat{p}_i = \frac{1}{n\left(1 + \hat{\lambda}' g_i(\hat{\beta})\right)},$$
and maximized empirical likelihood
$$R(\hat{\beta}) = \sum_{i=1}^{n} \log(\hat{p}_i). \qquad (12.7)$$
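To make the dual construction (12.4)-(12.6) concrete, here is a minimal numerical sketch for the linear IV moment $g_i(\beta) = z_i(y_i - x_i'\beta)$, assuming simulated data, a generic optimizer for the inner problem (12.5), and a bounded scalar search for the outer problem (12.6). The names are illustrative and this is not the book's Gauss implementation (Section 12.5 describes the analytic Newton approach actually used).

```python
# Minimal sketch of the profile empirical likelihood (12.4)-(12.6) for the
# linear IV moment g_i(beta) = z_i (y_i - x_i'beta), with two instruments
# and a scalar beta.
import numpy as np
from scipy.optimize import minimize, minimize_scalar

rng = np.random.default_rng(0)
n = 300
z = rng.normal(size=(n, 2))                       # instruments (ell = 2)
u = rng.normal(size=n)
x = z @ np.array([1.0, 0.5]) + u                  # endogenous regressor (k = 1)
y = 1.0 * x + 0.7 * u + rng.normal(size=n)        # true beta = 1

def neg_R(beta):
    """-R(beta) = n log n + max over lambda of sum_i log(1 + lambda'g_i(beta))."""
    G = z * (y - x * beta)[:, None]                # rows g_i(beta)
    def neg_inner(lam):                            # minimize -sum log(1 + lam'g_i)
        a = 1.0 + G @ lam
        return -np.sum(np.log(a)) if np.all(a > 1e-8) else 1e10
    inner = minimize(neg_inner, np.zeros(2), method="Nelder-Mead")
    return n * np.log(n) - inner.fun               # -inner.fun = the inner maximum

beta_el = minimize_scalar(neg_R, bounds=(0.0, 2.0), method="bounded").x
print("EL estimate of beta:", beta_el)             # should be close to 1
```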

12.2 Asymptotic Distribution of EL Estimator

Define
$$G_i(\beta) = \frac{\partial}{\partial \beta'} g_i(\beta)$$
$$G = E\,G_i(\beta_0)$$
$$\Omega = E\left(g_i(\beta_0)\,g_i(\beta_0)'\right) \qquad (12.8)$$
and
$$V = \left(G'\Omega^{-1}G\right)^{-1} \qquad (12.9)$$
$$V_\lambda = \Omega - G\left(G'\Omega^{-1}G\right)^{-1}G'. \qquad (12.10)$$
For example, in the linear model, $G_i(\beta) = -z_i x_i'$, $G = -E(z_i x_i')$, and $\Omega = E\left(z_i z_i' e_i^2\right)$.

Theorem 12.2.1 Under regularity conditions,
$$\sqrt{n}\left(\hat{\beta} - \beta_0\right) \xrightarrow{d} N(0, V)$$
$$\sqrt{n}\,\hat{\lambda} \xrightarrow{d} \Omega^{-1} N(0, V_\lambda)$$
where $V$ and $V_\lambda$ are defined in (12.9) and (12.10), and $\sqrt{n}\left(\hat{\beta} - \beta_0\right)$ and $\sqrt{n}\,\hat{\lambda}$ are asymptotically independent.

The theorem shows that the asymptotic variance $V$ for $\hat{\beta}$ is the same as for efficient GMM. Thus the EL estimator is asymptotically efficient.

Chamberlain (1987) showed that $V$ is the semiparametric efficiency bound for $\beta$ in the overidentified moment condition model. This means that no consistent estimator for this class of models can have a lower asymptotic variance than $V$. Since the EL estimator achieves this bound, it is an asymptotically efficient estimator for $\beta$.

 

 

 

Proof of Theorem 12.2.1. $(\hat{\beta}, \hat{\lambda})$ jointly solve
$$0 = \frac{\partial}{\partial \lambda} R(\beta, \lambda)\Big|_{(\hat{\beta},\hat{\lambda})} = -\sum_{i=1}^{n} \frac{g_i(\hat{\beta})}{1 + \hat{\lambda}' g_i(\hat{\beta})} \qquad (12.11)$$
$$0 = \frac{\partial}{\partial \beta} R(\beta, \lambda)\Big|_{(\hat{\beta},\hat{\lambda})} = -\sum_{i=1}^{n} \frac{G_i(\hat{\beta})' \hat{\lambda}}{1 + \hat{\lambda}' g_i(\hat{\beta})}. \qquad (12.12)$$
Let $G_n = \dfrac{1}{n}\sum_{i=1}^{n} G_i(\beta_0)$, $\bar{g}_n = \dfrac{1}{n}\sum_{i=1}^{n} g_i(\beta_0)$ and $\Omega_n = \dfrac{1}{n}\sum_{i=1}^{n} g_i(\beta_0)\,g_i(\beta_0)'$.

Expanding (12.12) around $\beta = \beta_0$ and $\lambda = \lambda_0 = 0$ yields
$$0 \simeq G_n' \hat{\lambda}. \qquad (12.13)$$
Expanding (12.11) around $\beta = \beta_0$ and $\lambda = \lambda_0 = 0$ yields
$$0 \simeq -\bar{g}_n - G_n\left(\hat{\beta} - \beta_0\right) + \Omega_n \hat{\lambda}. \qquad (12.14)$$
Premultiplying by $G_n' \Omega_n^{-1}$ and using (12.13) yields
$$0 \simeq -G_n'\Omega_n^{-1}\bar{g}_n - G_n'\Omega_n^{-1}G_n\left(\hat{\beta} - \beta_0\right) + G_n'\Omega_n^{-1}\Omega_n\hat{\lambda}$$
$$= -G_n'\Omega_n^{-1}\bar{g}_n - G_n'\Omega_n^{-1}G_n\left(\hat{\beta} - \beta_0\right).$$
Solving for $\hat{\beta}$ and using the WLLN and CLT yields
$$\sqrt{n}\left(\hat{\beta} - \beta_0\right) \simeq -\left(G_n'\Omega_n^{-1}G_n\right)^{-1}G_n'\Omega_n^{-1}\sqrt{n}\,\bar{g}_n \qquad (12.15)$$
$$\xrightarrow{d} \left(G'\Omega^{-1}G\right)^{-1}G'\Omega^{-1}N(0, \Omega) = N(0, V).$$
Solving (12.14) for $\hat{\lambda}$ and using (12.15) yields
$$\sqrt{n}\,\hat{\lambda} \simeq \Omega_n^{-1}\left(I - G_n\left(G_n'\Omega_n^{-1}G_n\right)^{-1}G_n'\Omega_n^{-1}\right)\sqrt{n}\,\bar{g}_n \qquad (12.16)$$
$$\xrightarrow{d} \Omega^{-1}\left(I - G\left(G'\Omega^{-1}G\right)^{-1}G'\Omega^{-1}\right)N(0, \Omega) = \Omega^{-1}N(0, V_\lambda).$$
Furthermore, since
$$G'\left(I - \Omega^{-1}G\left(G'\Omega^{-1}G\right)^{-1}G'\right) = 0,$$
$\sqrt{n}\left(\hat{\beta} - \beta_0\right)$ and $\sqrt{n}\,\hat{\lambda}$ are asymptotically uncorrelated and hence independent.

12.3 Overidentifying Restrictions

In a parametric likelihood context, tests are based on the difference in the log likelihood functions. The same statistic can be constructed for empirical likelihood. Twice the difference between the unrestricted empirical log-likelihood $-n \log(n)$ and the maximized empirical log-likelihood for the model (12.7) is
$$LR_n = \sum_{i=1}^{n} 2 \log\left(1 + \hat{\lambda}' g_i(\hat{\beta})\right). \qquad (12.17)$$

Theorem 12.3.1 If $E g_i(\beta_0) = 0$ then $LR_n \xrightarrow{d} \chi^2_{\ell-k}$.

The EL overidentification test is similar to the GMM overidentification test. They are asymptotically first-order equivalent, and have the same interpretation. The overidentification test is a very useful by-product of EL estimation, and it is advisable to report the statistic $LR_n$ whenever EL is the estimation method.

Proof of Theorem 12.3.1. First, by a Taylor expansion, (12.15), and (12.16),
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n} g_i(\hat{\beta}) \simeq \sqrt{n}\left(\bar{g}_n + G_n\left(\hat{\beta} - \beta_0\right)\right)$$
$$\simeq \left(I - G_n\left(G_n'\Omega_n^{-1}G_n\right)^{-1}G_n'\Omega_n^{-1}\right)\sqrt{n}\,\bar{g}_n$$
$$\simeq \Omega_n \sqrt{n}\,\hat{\lambda}.$$


Second, since $\log(1 + u) \simeq u - u^2/2$ for $u$ small,
$$LR_n = \sum_{i=1}^{n} 2\log\left(1 + \hat{\lambda}' g_i(\hat{\beta})\right)$$
$$\simeq 2\hat{\lambda}' \sum_{i=1}^{n} g_i(\hat{\beta}) - \hat{\lambda}' \sum_{i=1}^{n} g_i(\hat{\beta})\,g_i(\hat{\beta})' \hat{\lambda}$$
$$\simeq n\,\hat{\lambda}' \Omega_n \hat{\lambda}$$
$$\xrightarrow{d} N(0, V_\lambda)'\,\Omega^{-1}\,N(0, V_\lambda) = \chi^2_{\ell-k}$$
where the proof of the final equality is left as an exercise.

12.4 Testing

Let the maintained model be
$$E g_i(\beta) = 0 \qquad (12.18)$$
where $g$ is $\ell \times 1$ and $\beta$ is $k \times 1$. By "maintained" we mean that the overidentifying restrictions contained in (12.18) are assumed to hold and are not being challenged (at least for the test discussed in this section). The hypothesis of interest is
$$h(\beta) = 0$$
where $h : \mathbb{R}^k \to \mathbb{R}^a$. The restricted EL estimator and likelihood are the values which solve
$$\tilde{\beta} = \operatorname*{argmax}_{h(\beta)=0} R(\beta)$$
$$R(\tilde{\beta}) = \max_{h(\beta)=0} R(\beta).$$
Fundamentally, the restricted EL estimator $\tilde{\beta}$ is simply an EL estimator with $\ell - k + a$ overidentifying restrictions, so there is no fundamental change in the distribution theory for $\tilde{\beta}$ relative to $\hat{\beta}$. To test the hypothesis $h(\beta)$ while maintaining (12.18), the simple overidentifying restrictions test (12.17) is not appropriate. Instead we use the difference in log-likelihoods:
$$LR_n = 2\left(R(\hat{\beta}) - R(\tilde{\beta})\right).$$
This test statistic is a natural analog of the GMM distance statistic.

Theorem 12.4.1 Under (12.18) and $H_0 : h(\beta) = 0$, $LR_n \xrightarrow{d} \chi^2_a$.

The proof of this result is more challenging and is omitted.


12.5 Numerical Computation

Gauss code which implements the methods discussed below can be found at

http://www.ssc.wisc.edu/~bhansen/progs/elike.prc

Derivatives

The numerical calculations depend on derivatives of the dual likelihood function (12.4). Define
$$g_i^*(\beta, \lambda) = \frac{g_i(\beta)}{1 + \lambda' g_i(\beta)}$$
$$G_i^*(\beta, \lambda) = \frac{G_i(\beta)' \lambda}{1 + \lambda' g_i(\beta)}.$$
The first derivatives of (12.4) are
$$R_\lambda = \frac{\partial}{\partial \lambda} R(\beta, \lambda) = -\sum_{i=1}^{n} g_i^*(\beta, \lambda)$$
$$R_\beta = \frac{\partial}{\partial \beta} R(\beta, \lambda) = -\sum_{i=1}^{n} G_i^*(\beta, \lambda).$$
The second derivatives are
$$R_{\lambda\lambda} = \frac{\partial^2}{\partial \lambda\, \partial \lambda'} R(\beta, \lambda) = \sum_{i=1}^{n} g_i^*(\beta, \lambda)\, g_i^*(\beta, \lambda)'$$
$$R_{\lambda\beta} = \frac{\partial^2}{\partial \lambda\, \partial \beta'} R(\beta, \lambda) = \sum_{i=1}^{n} \left(g_i^*(\beta, \lambda)\, G_i^*(\beta, \lambda)' - \frac{G_i(\beta)}{1 + \lambda' g_i(\beta)}\right)$$
$$R_{\beta\beta} = \frac{\partial^2}{\partial \beta\, \partial \beta'} R(\beta, \lambda) = \sum_{i=1}^{n} \left(G_i^*(\beta, \lambda)\, G_i^*(\beta, \lambda)' - \frac{\dfrac{\partial^2}{\partial \beta\, \partial \beta'}\left(\lambda' g_i(\beta)\right)}{1 + \lambda' g_i(\beta)}\right).$$

Inner Loop

The so-called "inner loop" solves (12.5) for given $\beta$. The modified Newton method takes a quadratic approximation to $R_n(\beta, \lambda)$ yielding the iteration rule
$$\lambda_{j+1} = \lambda_j - \delta\left(R_{\lambda\lambda}(\beta, \lambda_j)\right)^{-1} R_\lambda(\beta, \lambda_j) \qquad (12.19)$$
where $\delta > 0$ is a scalar steplength (to be discussed next). The starting value $\lambda_1$ can be set to the zero vector. The iteration (12.19) is continued until the gradient $R_\lambda(\beta, \lambda_j)$ is smaller than some prespecified tolerance.

Efficient convergence requires a good choice of steplength $\delta$. One method uses the following quadratic approximation. Set $\delta_0 = 0$, $\delta_1 = \tfrac{1}{2}$ and $\delta_2 = 1$. For $p = 0, 1, 2$, set
$$\lambda_p = \lambda_j - \delta_p\left(R_{\lambda\lambda}(\beta, \lambda_j)\right)^{-1} R_\lambda(\beta, \lambda_j)$$
$$R_p = R(\beta, \lambda_p).$$
A quadratic function can be fit exactly through these three points. The value of $\delta$ which minimizes this quadratic is
$$\hat{\delta} = \frac{R_2 + 3R_0 - 4R_1}{4R_2 + 4R_0 - 8R_1},$$
yielding the steplength to be plugged into (12.19).

A complication is that $\lambda$ must be constrained so that $0 \leq p_i \leq 1$, which holds if
$$n\left(1 + \lambda' g_i(\beta)\right) \geq 1 \qquad (12.20)$$
for all $i$. If (12.20) fails, the stepsize $\delta$ needs to be decreased.
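Below is a minimal Python sketch of this inner loop for a generic moment matrix, assuming the moments $g_i(\beta)$ are supplied as the rows of an $n \times \ell$ array. The Newton step, the quadratic steplength rule, and the constraint check mirror (12.19) and (12.20); the function and variable names are illustrative and this is not the book's Gauss code.

```python
# Sketch of the inner loop: solve (12.5) for lambda at fixed beta by the
# modified Newton iteration (12.19) with the quadratic steplength rule,
# checking the constraint (12.20) n(1 + lambda'g_i) >= 1 at each step.
import numpy as np

def R_dual(lam, G):
    """R(beta, lambda) + n log n = -sum_i log(1 + lambda'g_i(beta))."""
    a = 1.0 + G @ lam
    return -np.sum(np.log(a)) if np.all(a > 0) else np.inf

def inner_loop(G, tol=1e-10, max_iter=100):
    """G is the n x ell matrix with rows g_i(beta); returns lambda(beta)."""
    n, ell = G.shape
    lam = np.zeros(ell)                      # starting value lambda_1 = 0
    for _ in range(max_iter):
        a = 1.0 + G @ lam
        gstar = G / a[:, None]               # rows g_i^*(beta, lambda)
        R_lam = -gstar.sum(axis=0)           # gradient R_lambda
        if np.max(np.abs(R_lam)) < tol:
            break
        R_ll = gstar.T @ gstar               # Hessian R_lambda lambda
        step = np.linalg.solve(R_ll, R_lam)  # Newton direction
        # quadratic steplength: evaluate R at delta = 0, 1/2, 1 and minimize the fit
        R0, R1, R2 = (R_dual(lam - d * step, G) for d in (0.0, 0.5, 1.0))
        denom = 4 * R2 + 4 * R0 - 8 * R1
        delta = (R2 + 3 * R0 - 4 * R1) / denom if np.isfinite(denom) and denom != 0 else 1.0
        # enforce (12.20): shrink delta until n(1 + lambda'g_i) >= 1 for all i
        while np.any(n * (1.0 + G @ (lam - delta * step)) < 1.0) and delta > 1e-8:
            delta /= 2.0
        lam = lam - delta * step
    return lam
```

With the rows of `G` built from, say, the linear IV moments $z_i(y_i - x_i'\beta)$, `inner_loop(G)` returns $\lambda(\beta)$, from which the profile likelihood and the probabilities $\hat{p}_i$ in (12.7) follow directly.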

Outer Loop

The outer loop is the minimization (12.6). This can be done by the modified Newton method described in the previous section. The gradient for (12.6) is
$$R_\beta = \frac{\partial}{\partial \beta} R(\beta) = \frac{\partial}{\partial \beta} R(\beta, \lambda(\beta)) = R_\beta + \lambda_\beta' R_\lambda = R_\beta$$
since $R_\lambda(\beta, \lambda) = 0$ at $\lambda = \lambda(\beta)$, where
$$\lambda_\beta = \frac{\partial}{\partial \beta'} \lambda(\beta) = -R_{\lambda\lambda}^{-1} R_{\lambda\beta},$$
the second equality following from the implicit function theorem applied to $R_\lambda(\beta, \lambda(\beta)) = 0$. The Hessian for (12.6) is
$$R_{\beta\beta} = -\frac{\partial^2}{\partial \beta\, \partial \beta'} R(\beta)$$
$$= -\frac{\partial}{\partial \beta'}\left(R_\beta(\beta, \lambda(\beta)) + \lambda_\beta' R_\lambda(\beta, \lambda(\beta))\right)$$
$$= -\left(R_{\beta\beta}(\beta, \lambda(\beta)) + R_{\lambda\beta}' \lambda_\beta + \lambda_\beta' R_{\lambda\beta} + \lambda_\beta' R_{\lambda\lambda} \lambda_\beta\right)$$
$$= R_{\lambda\beta}' R_{\lambda\lambda}^{-1} R_{\lambda\beta} - R_{\beta\beta}.$$
It is not guaranteed that $R_{\beta\beta} > 0$. If not, the eigenvalues of $R_{\beta\beta}$ should be adjusted so that all are positive. The Newton iteration rule is
$$\beta_{j+1} = \beta_j - \delta\, R_{\beta\beta}^{-1} R_\beta$$
where $\delta$ is a scalar stepsize, and the rule is iterated until convergence.
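As a complement, here is a minimal sketch of the outer Newton iteration for the linear IV moment $g_i(\beta) = z_i(y_i - x_i'\beta)$, where the second term of $R_{\beta\beta}$ vanishes because $g_i$ is linear in $\beta$. It reuses `inner_loop` from the sketch above and assumes data arrays `y` (n,), `X` (n, k), `Z` (n, ell) are available; all names are illustrative and this is not the book's Gauss implementation.

```python
# Sketch of the outer loop for g_i(beta) = z_i(y_i - x_i'beta), using the
# profile Hessian R_lb' R_ll^{-1} R_lb - R_bb derived above.
# Assumes inner_loop() from the previous sketch is in scope.
import numpy as np

def el_linear_iv(y, X, Z, beta_init, tol=1e-8, max_iter=50, delta=1.0):
    beta = np.asarray(beta_init, dtype=float)
    for _ in range(max_iter):
        G = Z * (y - X @ beta)[:, None]             # rows g_i(beta), n x ell
        lam = inner_loop(G)                         # lambda(beta) from (12.5)
        a = 1.0 + G @ lam                           # 1 + lambda'g_i
        gstar = G / a[:, None]                      # g_i^*
        Gi = -Z[:, :, None] * X[:, None, :]         # G_i(beta) = -z_i x_i', n x ell x k
        Gstar = (Gi * lam[None, :, None]).sum(axis=1) / a[:, None]   # G_i^*, n x k
        grad = Gstar.sum(axis=0)                    # gradient of the objective -R(beta)
        if np.max(np.abs(grad)) < tol:
            break
        R_ll = gstar.T @ gstar
        R_lb = gstar.T @ Gstar - (Gi / a[:, None, None]).sum(axis=0)  # sum g* G*' - G_i/(1+..)
        R_bb = Gstar.T @ Gstar                      # second term zero: g_i linear in beta
        H = R_lb.T @ np.linalg.solve(R_ll, R_lb) - R_bb               # profile Hessian
        w, V = np.linalg.eigh(H)
        H = (V * np.maximum(w, 1e-8)) @ V.T         # force positive eigenvalues
        beta = beta - delta * np.linalg.solve(H, grad)
    return beta
```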

Chapter 13

Endogeneity

We say that there is endogeneity in the linear model $y_i = x_i'\beta + e_i$ if $\beta$ is the parameter of interest and $E(x_i e_i) \neq 0$. This cannot happen if $\beta$ is defined by linear projection, so requires a structural interpretation. The coefficient $\beta$ must have meaning separately from the definition of a conditional mean or linear projection.

Example: Measurement error in the regressor. Suppose that $(y_i, x_i^*)$ are joint random variables, $E(y_i \mid x_i^*) = x_i^{*\prime}\beta$ is linear, $\beta$ is the parameter of interest, and $x_i^*$ is not observed. Instead we observe $x_i = x_i^* + u_i$ where $u_i$ is a $k \times 1$ measurement error, independent of $y_i$ and $x_i^*$. Then
$$y_i = x_i^{*\prime}\beta + e_i = (x_i - u_i)'\beta + e_i = x_i'\beta + v_i$$
where
$$v_i = e_i - u_i'\beta.$$
The problem is that
$$E(x_i v_i) = E\left(\left(x_i^* + u_i\right)\left(e_i - u_i'\beta\right)\right) = -E\left(u_i u_i'\right)\beta \neq 0$$
if $\beta \neq 0$ and $E(u_i u_i') \neq 0$. It follows that if $\hat{\beta}$ is the OLS estimator, then
$$\hat{\beta} \xrightarrow{p} \beta^* = \beta - E\left(x_i x_i'\right)^{-1} E\left(u_i u_i'\right)\beta \neq \beta.$$
This is called measurement error bias.
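A quick simulation makes the attenuation visible; this is a minimal sketch with made-up parameter values, not part of the text.

```python
# Minimal simulation of measurement error bias: OLS of y on the mismeasured
# regressor x = x* + u converges to beta - E(xx')^{-1} E(uu') beta, not beta.
import numpy as np

rng = np.random.default_rng(0)
n, beta = 100_000, 2.0
x_star = rng.normal(size=n)                  # true regressor
u = rng.normal(size=n)                       # measurement error
e = rng.normal(size=n)
y = beta * x_star + e
x = x_star + u                               # observed regressor

beta_ols = (x @ y) / (x @ x)
beta_limit = beta - beta * u.var() / x.var() # plim of OLS
print(beta_ols, beta_limit)                  # both near 1.0, well below beta = 2.0
```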

Example: Supply and Demand. The variables $q_i$ and $p_i$ (quantity and price) are determined jointly by the demand equation
$$q_i = -\beta_1 p_i + e_{1i}$$
and the supply equation
$$q_i = \beta_2 p_i + e_{2i}.$$
Assume that $e_i = \begin{pmatrix} e_{1i} \\ e_{2i} \end{pmatrix}$ is iid, $E e_i = 0$, $\beta_1 + \beta_2 = 1$ and $E e_i e_i' = I_2$ (the latter for simplicity).

 

It is helpful to solve for qi and pi in terms of the errors. In matrix notation,

 

1

2

pi

= e2i

 

 

1

1

qi

e1i

 

211

CHAPTER 13. ENDOGENEITY

 

 

 

 

 

 

 

 

 

 

 

 

 

 

212

so

 

=

1 2

 

e2i

 

 

 

 

pi

 

 

 

qi

 

 

 

 

1

1

1

 

e1i

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

=

12

1

e2i

 

 

 

 

 

 

 

 

 

 

 

1

 

e1i

 

 

 

 

 

 

 

 

 

2e1i + 1e2i

 

 

 

 

 

The projection of qi on pi yields

 

=

(e1i e2i)

:

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

qi

=

pi + "i

 

 

 

 

 

 

E(pi"i) = 0

 

 

 

 

 

 

 

 

where

 

 

E(piqi)

 

2 1

 

 

 

 

 

 

 

=

=

 

 

 

 

 

 

 

E pi2

 

 

 

 

 

 

 

 

 

 

 

2

 

 

 

 

 

 

 

^

p

 

 

which does not equal either

 

or

: This is called

Hence if it is estimated by OLS, ! ;

1

 

 

 

 

 

 

 

 

 

2

 

simultaneous equations bias.
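A short simulation of this design (with illustrative values $\beta_1 = 0.3$, $\beta_2 = 0.7$, so $\beta_1 + \beta_2 = 1$) shows OLS converging to $(\beta_2 - \beta_1)/2$ rather than to either structural slope; a minimal sketch, not part of the text.

```python
# Minimal simulation of simultaneous equations bias: regressing q on p
# recovers beta* = (beta2 - beta1)/2, not the demand or supply slope.
import numpy as np

rng = np.random.default_rng(0)
n, b1, b2 = 200_000, 0.3, 0.7            # beta1 + beta2 = 1 as in the text
e1 = rng.normal(size=n)
e2 = rng.normal(size=n)
q = b2 * e1 + b1 * e2                    # reduced form for quantity
p = e1 - e2                              # reduced form for price

beta_ols = (p @ q) / (p @ p)
print(beta_ols, (b2 - b1) / 2)           # both near 0.2
```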

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

13.1 Instrumental Variables

Let the equation of interest be
$$y_i = x_i'\beta + e_i \qquad (13.1)$$
where $x_i$ is $k \times 1$, and assume that $E(x_i e_i) \neq 0$ so there is endogeneity. We call (13.1) the structural equation. In matrix notation, this can be written as
$$y = X\beta + e. \qquad (13.2)$$
Any solution to the problem of endogeneity requires additional information which we call instruments.

Definition 13.1.1 The $\ell \times 1$ random vector $z_i$ is an instrumental variable for (13.1) if $E(z_i e_i) = 0$.

In a typical set-up, some regressors in $x_i$ will be uncorrelated with $e_i$ (for example, at least the intercept). Thus we make the partition
$$x_i = \begin{pmatrix} x_{1i} \\ x_{2i} \end{pmatrix} \begin{matrix} k_1 \\ k_2 \end{matrix} \qquad (13.3)$$
where $E(x_{1i} e_i) = 0$ yet $E(x_{2i} e_i) \neq 0$. We call $x_{1i}$ exogenous and $x_{2i}$ endogenous. By the above definition, $x_{1i}$ is an instrumental variable for (13.1), so should be included in $z_i$. So we have the partition
$$z_i = \begin{pmatrix} x_{1i} \\ z_{2i} \end{pmatrix} \begin{matrix} k_1 \\ \ell_2 \end{matrix} \qquad (13.4)$$
where $x_{1i} = z_{1i}$ are the included exogenous variables, and $z_{2i}$ are the excluded exogenous variables. That is, $z_{2i}$ are variables which could be included in the equation for $y_i$ (in the sense that they are uncorrelated with $e_i$) yet can be excluded, as they would have true zero coefficients in the equation.

The model is just-identified if $\ell = k$ (i.e., if $\ell_2 = k_2$) and over-identified if $\ell > k$ (i.e., if $\ell_2 > k_2$).

We have noted that any solution to the problem of endogeneity requires instruments. This does not mean that valid instruments actually exist.
