## The data set in VOUCHER, which is a subset of

The data set in VOUCHER, which is a subset of the data used in Rouse (1998), can be used to estimate the effect of school choice on academic achievement. Attendance at a choice school was paid for by a voucher, which was determined by a lottery among those who applied. The data subset was chosen so that any student in the sample has a valid 1994 math test score (the last year available in Rouse’s sample). Unfortunately, as pointed out by Rouse, many students have missing test scores, possibly due to attrition (that is, leaving the Milwaukee public school district). These data include students who applied to the voucher program and were accepted, students who applied and were not accepted, and students who did not apply. Therefore, even though the vouchers were chosen by lottery among those who applied, we do not necessarily have a random sample from a population where being selected for a voucher has been randomly determined. (An important consideration is that students who never applied to the program may be systematically different from those who did—and in ways that we cannot know based on the data.)

Rouse (1998) uses panel data methods of the kind we discussed in Chapter 14 to allow student fixed effects; she also uses instrumental variables methods. This problem asks you to do a cross-sectional analysis in which winning the lottery for a voucher acts as an instrumental variable for attending a choice school. Actually, because we have multiple years of data on each student, we construct two variables. The first, choiceyrs, is the number of years from 1991 to 1994 that a student attended a choice school; this variable ranges from zero to four. The second, selectyrs, is the number of years a student was selected for a voucher. If the student applied for the program in 1990 and received a voucher, then selectyrs = 4; if he or she applied in 1991 and received a voucher, then selectyrs = 3; and so on. The outcome of interest is mnce, the student’s percentile score on a math test administered in 1994.

(i) Of the 990 students in the sample, how many were never awarded a voucher? How many had a voucher available for four years? How many students actually attended a choice school for four years?

(ii) Run a simple regression of choiceyrs on selectyrs. Are these variables related in the direction you expected? How strong is the relationship? Is selectyrs a sensible IV candidate for choiceyrs?

(iii) Run a simple regression of mnce on choiceyrs. What do you find? Is this what you expected? What happens if you add the variables black, hispanic, and female?

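(iv) Why might choiceyrs be endogenous in an equation such as

$$\text{mnce} = \beta_0 + \beta_1\,\text{choiceyrs} + \beta_2\,\text{black} + \beta_3\,\text{hispanic} + \beta_4\,\text{female} + u_1?$$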

(v) Estimate the equation in part (iv) by instrumental variables, using selectyrs as the IV for choiceyrs. Does using IV produce a positive effect of attending a choice school? What do you make of the coefficients on the other explanatory variables?

(vi) To control for the possibility that prior achievement affects participating in the lottery (as well as predicting attrition), add mnce90—the math score in 1990—to the equation in part (iv). Estimate the equation by OLS and IV, and compare the results for β1. For the IV estimate, how much is each year in a choice school worth on the math percentile score? Is this a practically large effect?

(vii) Why is the analysis from part (vi) not entirely convincing? [Hint: Compared with part (v), what happens to the number of observations, and why?]

(viii) The variables choiceyrs1, choiceyrs2, and so on are dummy variables indicating the different number of years a student could have been in a choice school (from 1991 to 1994). The dummy variables selectyrs1, selectyrs2, and so on have a similar definition, but for being selected from the lottery. Estimate the equation

by IV, using as instruments the four selectyrs dummy variables. (As before, the variables black, hispanic, and female act as their own IVs.) Describe your findings. Do they make sense?
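The mechanics behind parts (ii) through (v) can be illustrated with a minimal sketch, assuming one endogenous regressor and one instrument. The numbers below are hypothetical toy values, not the VOUCHER data, and the names `z`, `u`, `x`, and `y` are illustrative only. The instrument is balanced against the unobserved factor, so the simple IV ratio recovers the slope used to generate the data, while OLS does not.

```python
# Toy illustration of simple IV versus OLS (hypothetical numbers, not VOUCHER data).
# z: instrument, u: unobserved factor, x: endogenous regressor, y: outcome.
z = [0.0, 0.0, 1.0, 1.0]
u = [-1.0, 1.0, -1.0, 1.0]   # balanced against z, so the sample cov(z, u) is zero
x = [zi + ui for zi, ui in zip(z, u)]                     # x depends on u: endogenous
y = [1.0 + 2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]   # slope on x is 2 by construction


def cov(a, b):
    """Sample covariance; the divisor cancels in the ratios below."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


beta_ols = cov(x, y) / cov(x, x)      # picks up the correlation between x and u
beta_iv = cov(z, y) / cov(z, x)       # simple IV estimator with a single instrument
first_stage = cov(z, x) / cov(z, z)   # slope from regressing x on z (relevance)

print(beta_ols, beta_iv, first_stage)
```

On these toy numbers OLS gives 4.4 while IV gives 2, the slope built into the data. With real data the instrument is never exactly uncorrelated with the error in the sample, so the IV estimate carries sampling error that grows as the first-stage relationship weakens.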

## Use the data in CATHOLIC to answer this question. The

Use the data in CATHOLIC to answer this question. The model of interest is

where cathhs is a binary indicator for whether a student attends a Catholic high school.

(i) How many students are in the sample? What percentage of these students attend a Catholic high school?

(ii) Estimate the above equation by OLS. What is the estimate of β1? What is its 95% confidence interval?

(iii) Using parcath as an instrument for cathhs, estimate the reduced form for cathhs. What is the t statistic for parcath? Is there evidence of a weak instrument problem?

(iv) Estimate the above equation by IV, using parcath as an IV for cathhs. How do the estimate and 95% CI compare with the OLS quantities?

(v) Test the null hypothesis that cathhs is exogenous. What is the p-value of the test?

(vi) Suppose you add the interaction cathhs · motheduc to the above model. Why is it generally endogenous? Why is parcath · motheduc a good IV candidate for cathhs · motheduc?

(vii) Before you create the interactions in part (vi), first find the sample average of motheduc, denoted $\overline{\text{motheduc}}$, and create cathhs · (motheduc − $\overline{\text{motheduc}}$) and parcath · (motheduc − $\overline{\text{motheduc}}$). Add the first interaction to the model and use the second as an IV. Of course, cathhs is also instrumented. Is the interaction term statistically significant?

(viii) Compare the coefficient on cathhs in (vii) to that in part (iv). Is including the interaction important for estimating the average partial effect?

## Use the data in LABSUP to answer the following questions.

Use the data in LABSUP to answer the following questions. These are data on almost 32,000 black or Hispanic women. Every woman in the sample is married. It is a subset of the data used in Angrist and Evans (1998). Our interest here is in determining how weekly hours worked, hours, changes with number of children (kids). All women in the sample have at least two children. The two potential instrumental variables for kids, which is suspected of being endogenous, work to generate exogenous variation starting with two children. See the original article for further discussion.

(i) Estimate the equation

by OLS and obtain the heteroskedasticity-robust standard errors. Interpret the coefficient on kids. Discuss its statistical significance.

(ii) A variable that Angrist and Evans propose as an instrument is samesex, a binary variable equal to one if the first two children are the same biological sex. What do you think is the argument for why it is a relevant instrument for kids?

(iii) Run the regression

and see if the story from part (ii) holds up. In particular, interpret the coefficient on samesex. How statistically significant is samesex?

(iv) Can you think of mechanisms by which samesex is correlated with u in the equation in part (i)? (It is fine to assume that biological sex is randomly determined.) [Hint: How might a family’s finances be affected based on whether they have two children of the same sex or two children of opposite sex?]

(v) Is it legitimate to check for exogeneity of samesex by adding it to the regression in part (i) and testing its significance? Explain.

(vi) Using samesex as an IV for kids, obtain the IV estimates of the equation in part (i). How does the kids coefficient compare with the OLS estimate? Is the IV estimate precise?

(vii) Now add multi2nd as an instrument. Obtain the F statistic from the first stage regression and determine whether samesex and multi2nd are sufficiently strong.

(viii) Using samesex and multi2nd both as instruments for kids, how does the 2SLS estimate compare with the OLS and IV estimates from the previous parts?

(ix) Using the estimation from part (viii), is there strong evidence that kids is endogenous in the hours equation?

(x) In part (viii), how many overidentification restrictions are there? Does the overidentification test pass?

## The data in CENSUS2000 is a random sample of individuals

The data in CENSUS2000 is a random sample of individuals from the United States. Here we are interested in estimating a simple regression model relating the log of weekly income, lweekinc, to schooling, educ. There are 29,501 observations. Associated with each individual is a state identifier (state) for the 50 states plus the District of Columbia. A less coarse geographic identifier is puma, which takes on 610 different values indicating geographic regions smaller than a state.

Running the simple regression of lweekinc on educ gives a slope coefficient equal to .1083 (to four decimal places). The heteroskedasticity-robust standard error is about .0024. The standard error clustered at the puma level is about .0027, and the standard error clustered at the state level is about .0033. For computing a confidence interval, which of these standard errors is the most reliable? Explain.
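To make concrete how the heteroskedasticity-robust and cluster-robust standard errors differ, here is a minimal sketch for a simple regression. It uses hypothetical toy data rather than CENSUS2000 and omits the finite-sample adjustments that statistical packages typically apply. The only difference between the two formulas is whether the products of the demeaned regressor and the residual are summed within clusters before squaring.

```python
from math import sqrt

# Hypothetical toy data: 4 clusters of 2 observations, x constant within cluster.
cluster = [0, 0, 1, 1, 2, 2, 3, 3]
x = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
y = [1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 4.0, 4.0]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
d = [xi - xbar for xi in x]                       # demeaned regressor
sst = sum(di * di for di in d)
slope = sum(di * yi for di, yi in zip(d, y)) / sst
intercept = ybar - slope * xbar
e = [yi - intercept - slope * xi for xi, yi in zip(x, y)]   # OLS residuals

# Heteroskedasticity-robust (HC0) variance: every observation is its own cluster.
var_robust = sum((di * ei) ** 2 for di, ei in zip(d, e)) / sst ** 2

# Cluster-robust variance: sum d*e within each cluster, then square the sums.
score = {}
for g, di, ei in zip(cluster, d, e):
    score[g] = score.get(g, 0.0) + di * ei
var_cluster = sum(s ** 2 for s in score.values()) / sst ** 2

se_robust, se_cluster = sqrt(var_robust), sqrt(var_cluster)
print(slope, se_robust, se_cluster)
```

In this toy example the residuals are identical within each cluster, so the clustered variance (0.2) is twice the robust variance (0.1). When residuals are uncorrelated within clusters, the two formulas give similar answers.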

## The data set HAPPINESS contains independently pooled cross sections for

The data set HAPPINESS contains independently pooled cross sections for the even years from 1994 through 2006, obtained from the General Social Survey. The dependent variable for this problem is a measure of “happiness,” vhappy, which is a binary variable equal to one if the person reports being “very happy” (as opposed to just “pretty happy” or “not too happy”).

(i) Which year has the largest number of observations? Which has the smallest? What is the percentage of people in the sample reporting they are “very happy”?

(ii) Regress vhappy on all of the year dummies, leaving out y94 so that 1994 is the base year. Compute a heteroskedasticity-robust test statistic for the null hypothesis that the proportion of very happy people has not changed over time. What is the p-value of the test?

(iii) To the regression in part (ii), add the dummy variables occattend and regattend. Interpret their coefficients. (Remember, the coefficients are interpreted relative to a base group.) How would you summarize the effects of church attendance on happiness?

(iv) Define a variable, say highinc, equal to one if family income is above \$25,000. (Unfortunately, the same threshold is used in each year, and so inflation is not accounted for. Also, \$25,000 is hardly what one would consider “high income.”) Include highinc, unem10, educ, and teens in the regression in part (iii). Is the coefficient on regattend affected much? What about its statistical significance?

(v) Discuss the signs, magnitudes, and statistical significance of the four new variables in part (iv). Do the estimates make sense?

(vi) Controlling for the factors in part (iv), do there appear to be differences in happiness by gender or race? Justify your answer.

## Using the “cluster” option in the econometrics package Stata® 11,

Using the “cluster” option in the econometrics package Stata® 11, the fully robust standard errors for the pooled OLS estimates in Table 14.2—that is, robust to serial correlation and heteroskedasticity in the composite errors,

(i) How do these standard errors generally compare with the nonrobust ones, and why?

(ii) How do the robust standard errors for pooled OLS compare with the standard errors for RE? Does it seem to matter whether the explanatory variable is time-constant or time-varying?

(iii) When the fully robust standard errors for the RE estimates are computed, Stata® 11 reports the following (where we look at only the coefficients on the time-varying variables):

[These are robust to any kind of serial correlation or heteroskedasticity in the idiosyncratic errors {uit: t = 1, ..., T} as well as heteroskedasticity in αi.] How do the robust standard errors generally compare with the usual RE standard errors reported in Table 14.2? What conclusion might you draw?

(iv) Comparing the four standard errors in part (iii) with their pooled OLS counterparts, what do you make of the fact that the robust RE standard errors are all below the robust pooled OLS standard errors?

## The data set DRIVING includes state-level panel data (for the

The data set DRIVING includes state-level panel data (for the 48 continental U.S. states) from 1980 through 2004, for a total of 25 years. Various driving laws are indicated in the data set, including the alcohol level at which drivers are considered legally intoxicated. There are also indicators for “per se” laws (where licenses can be revoked without a trial) and seat belt laws. Some economic and demographic variables are also included.

(i) How is the variable totfatrte defined? What is the average of this variable in the years 1980, 1992, and 2004? Run a regression of totfatrte on dummy variables for the years 1981 through 2004, and describe what you find. Did driving become safer over this period? Explain.

(ii) Add the variables bac08, bac10, perse, sbprim, sbsecon, sl70plus, gdl, perc14_24, unem, and vehicmilespc to the regression from part (i). Interpret the coefficients on bac08 and bac10. Do per se laws have a negative effect on the fatality rate? What about having a primary seat belt law? (Note that if a law was enacted sometime within a year, the fraction of the year is recorded in place of the zero-one indicator.)

(iii) Reestimate the model from part (ii) using fixed effects (at the state level). How do the coefficients on bac08, bac10, perse, and sbprim compare with the pooled OLS estimates? Which set of estimates do you think is more reliable?

(iv) Suppose that vehicmilespc, the number of miles driven per capita, increases by 1,000. Using the FE estimates, what is the estimated effect on totfatrte? Be sure to interpret the estimate as if explaining to a layperson.

(v) If there is serial correlation or heteroskedasticity in the idiosyncratic errors of the model then the standard errors in part (iii) are invalid. If possible, use “cluster” robust standard errors for the fixed effects estimates. What happens to the statistical significance of the policy variables in part (iii)?

## Use the data in COUNTYMURDERS to answer this question. The

Use the data in COUNTYMURDERS to answer this question. The data set covers murders and executions (capital punishment) for 2,197 counties in the United States. See also Computer Exercise C16 in Chapter 13.

(i) Consider the model

where θt represents a different intercept for each time period, αi is the county fixed effect, and uit is the idiosyncratic error. Why does it make sense to include lags of the key variable, execs, in the equation?

(ii) Apply OLS to the equation from part (i) and report the estimates of δ0, δ1, δ2, and δ3, along with the usual pooled OLS standard errors. Do you estimate that executions have a deterrent effect on murders? Provide an explanation that involves αi.

(iii) Now estimate the equation in part (i) using fixed effects to remove αi. What are the new estimates of the δj? Are they very different from the estimates from part (ii)?

(iv) Obtain the long-run propensity from estimates in part (iii). Using the usual FE standard errors, is the LRP statistically different from zero?

(v) If possible, obtain the standard errors for the FE estimates that are robust to arbitrary heteroskedasticity and serial correlation in the {uit}. What happens to the statistical significance of the δ̂j? What about the estimated LRP?
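As a reminder of the algebra behind parts (iv) and (v), and assuming the model contains execs and three of its lags with coefficients δ0 through δ3 as part (ii) indicates, the long-run propensity is the sum of the lag coefficients. Its variance involves the covariances among the estimates, not just the individual standard errors:

$$\widehat{LRP} = \hat\delta_0 + \hat\delta_1 + \hat\delta_2 + \hat\delta_3, \qquad \operatorname{Var}\big(\widehat{LRP}\big) = \sum_{j=0}^{3}\sum_{k=0}^{3} \operatorname{Cov}\big(\hat\delta_j, \hat\delta_k\big).$$

The same formula applies with either the usual or the robust variance matrix of the FE estimates; only the covariance terms change.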

## Use the data in WAGEPAN.DTA to answer the following questions.

Use the data in WAGEPAN.DTA to answer the following questions.

(i) Using lwage as the dependent variable, estimate a model that only contains an intercept and the year dummies d81 through d87. Use pooled OLS, RE, FE, and FD (where in the latter case you difference the year dummies, along with lwage, and omit an overall constant in the FD regression). What do you conclude about the coefficients on the year dummies?

(ii) Add the time-constant variables educ, black, and hisp to the model, and estimate it by OLS and RE. How do the coefficients compare? What happens if you estimate the equation by FE?

(iii) What do you conclude about the four estimation methods when the model includes only variables that change just across t or just across i?

(iv) Now estimate the equation

by random effects. Do the coefficients seem reasonable? How do the nonrobust and cluster-robust standard errors compare?

(v) Now estimate the equation

by fixed effects, being sure to include the full set of time dummies to reflect the different intercepts. How do the estimates of β1 and β2 compare with those in part (iv)? Compute the usual FE standard errors and the cluster-robust standard errors. How do they compare?

(vi) Obtain the time averages $\overline{\text{union}}_i$ and $\overline{\text{married}}_i$. Along with educ, black, and hisp, add these to the equation from part (iv). Verify that the CRE estimates of β1 and β2 are identical to the FE estimates.

(vii) Obtain the robust, variable addition Hausman test. What do you conclude about RE versus FE?

(viii) Let educ have an interactive effect with both union and married and estimate the model by fixed effects. Are the interactions individually or jointly significant? Why are the coefficients on union and married now imprecisely estimated?

(ix) Estimate the average partial effects of union and married for the model estimated in part (viii). How do these compare with the FE estimates from part (v)?

(x) Verify that for the model in part (viii) the CRE estimates are the same as the FE estimates when they should be.
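The equivalence checked in parts (vi) and (x) can be seen in a minimal sketch on a hypothetical balanced toy panel (three units, two periods, one time-varying regressor), not the WAGEPAN data. As a simplification, the sketch adds the time average to a pooled OLS regression rather than to an RE regression; the helper `ols` and all variable names are illustrative assumptions.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))   # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical toy panel: unit identifier, regressor, outcome.
ids = [0, 0, 1, 1, 2, 2]
x = [1.0, 3.0, 2.0, 6.0, 0.0, 2.0]
y = [3.5, 6.5, 8.0, 18.0, 0.5, 3.5]


def unit_means(v):
    """Return, for each observation, the time average of v for its unit."""
    tot, cnt = {}, {}
    for i, vi in zip(ids, v):
        tot[i] = tot.get(i, 0.0) + vi
        cnt[i] = cnt.get(i, 0) + 1
    return [tot[i] / cnt[i] for i in ids]


xbar, ybar = unit_means(x), unit_means(y)

# FE (within) estimator: OLS on time-demeaned data.
xd = [a - b for a, b in zip(x, xbar)]
yd = [a - b for a, b in zip(y, ybar)]
beta_fe = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# Mundlak-style regression: y on a constant, x, and the time average of x.
X = [[1.0, xi, xb] for xi, xb in zip(x, xbar)]
_, beta_cre, gamma = ols(X, y)

print(beta_fe, beta_cre, gamma)
```

The coefficient on `x` matches the within estimate because partialling the time average out of `x` leaves exactly the time-demeaned regressor. The coefficient on the time average, `gamma` here, is the quantity a variable addition test examines.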

## Use the data set in AIRFARE to answer this question.

Use the data set in AIRFARE to answer this question. The estimates can be compared with those in Computer Exercise 10 in this chapter.

(i) Compute the time averages of the variable concen; call these concenbar. How many different time averages can there be? Report the smallest and the largest.

(ii) Estimate the equation

(iii) If you drop ldist and ldistsq from the estimation in part (ii) but still include concenbari, what happens to the estimate of β1? What happens to the estimate of γ1?

(iv) Using the equation in part (ii) and the usual RE standard error, test H0: γ1 = 0 against the two-sided alternative. Report the p-value. What do you conclude about RE versus FE for estimating β1 in this application?

(v) If possible, for the test in part (iv) obtain a t-statistic (and, therefore, p-value) that is robust to arbitrary serial correlation and heteroskedasticity. Does this change the conclusion reached in part (iv)?