Evaluating endogenous treatment effects when the decision to treat is made at an aggregate level and outcomes are observed at the individual level

Dr. Lung-Fei Lee, Department of Economics
Rank at time of award: Professor
and
Dr. Patricia Reagan, Department of Economics
Rank at time of award: Professor

Background

There is a large literature in economics and public policy that seeks to evaluate the effect of various treatments and programs on the outcomes of individuals. Attention has been given to two general econometric issues in estimating treatment effects: (i) how to obtain consistent estimates of treatment effects when treatment is endogenous because the individuals or groups who have the most to gain from the treatment are the most likely to seek it; and (ii) how to correct standard errors when treatment occurs at an aggregate level, such as a state policy regarding Medicaid generosity, while outcomes, such as take-up rates, occur at the individual level. The purpose of this research proposal is to develop an estimator that both accounts for the endogeneity of treatment and provides appropriate standard errors when the decision to treat occurs at an aggregate level and outcomes occur at the individual level.

The classic example of endogeneity of treatment involves evaluating the effect of job training on individual employment and earnings. In this case the decision to obtain treatment (job training) occurs at the same level as the outcome (employment and earnings). Imbens and Angrist (1994) and Heckman (1997) discuss the identification of treatment effects when treatment is endogenous and treatment and outcomes occur at the same level. Other evaluation studies look at the effect of government programs, such as the generosity of welfare benefits, on individual outcomes. It has become quite popular in the economics literature to use difference-in-difference (DD) techniques to estimate the impact of these programs. Many of these studies exploit state-level differences in treatment, for example the passage of a law: they compute the change in outcomes before and after treatment for the treated groups and then compare this difference to the intertemporal difference in outcomes for the groups that did not receive treatment (for example, Currie and Gruber (1996)). DD estimators are appropriate only when interventions, for example state laws, are random conditional on time and group fixed effects. Thus, one concern about DD estimators is the endogeneity of the interventions.
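
For reference, the textbook two-group, two-period DD estimator can be written as follows. The notation is ours and is introduced only for illustration: \bar{y}_{T,t} and \bar{y}_{C,t} denote mean outcomes in period t for the treated and untreated groups.

```latex
\hat{\delta}_{DD}
  = \left( \bar{y}_{T,\mathrm{post}} - \bar{y}_{T,\mathrm{pre}} \right)
  - \left( \bar{y}_{C,\mathrm{post}} - \bar{y}_{C,\mathrm{pre}} \right)
```

The second difference removes time effects common to both groups, which is why the estimator recovers the treatment effect only if treatment is random conditional on time and group fixed effects.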

Another problem with DD estimators is that they do not explicitly recognize that the decision-making unit that makes choices about public policy is different from the level at which the outcomes are observed. For example, a state legislature can set the generosity of state Medicaid benefits, but outcomes on health care utilization are observed at the individual level. Thus, in an analysis involving the passage of state laws, there are 51 decisions made regarding treatment (one decision for each state and the District of Columbia), but we observe outcomes on thousands of individuals. Conventional DD estimators understate the true standard errors of estimated treatment effects, because they do not recognize that the effective sample size of the decision to treat is much smaller than the sample size of the outcomes. Bertrand, Duflo and Mullainathan (2004) describe methods to correct the standard errors of DD estimators, but their correction method is applicable only to interventions that are random conditional on time and group fixed effects.
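
The understatement of standard errors can be illustrated with a small simulation. The sketch below is not the correction of Bertrand, Duflo and Mullainathan (2004). It is a minimal cross-sectional illustration under assumptions of our own: 51 states, 200 individuals per state, a true treatment effect of zero, randomly assigned state-level treatment, and a common state-level shock. All function names and parameter values are hypothetical.

```python
import numpy as np


def simulate_once(rng, n_states=51, n_per_state=200, sigma_state=1.0, sigma_ind=1.0):
    """One draw with a true treatment effect of zero and state-level shocks."""
    # Treatment is assigned to whole states (randomly here, to isolate the
    # standard-error problem from the endogeneity problem).
    treated = np.zeros(n_states, dtype=bool)
    treated[rng.permutation(n_states)[: n_states // 2]] = True

    # Outcome = common state shock + individual noise.
    state_shock = rng.normal(0.0, sigma_state, size=n_states)
    y = state_shock[:, None] + rng.normal(0.0, sigma_ind, size=(n_states, n_per_state))

    # Individual-level difference in means with the usual i.i.d. standard error.
    y1, y0 = y[treated].ravel(), y[~treated].ravel()
    estimate = y1.mean() - y0.mean()
    naive_se = np.sqrt(y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size)

    # Same comparison on state means: one observation per treatment decision.
    m1, m0 = y[treated].mean(axis=1), y[~treated].mean(axis=1)
    state_se = np.sqrt(m1.var(ddof=1) / m1.size + m0.var(ddof=1) / m0.size)
    return estimate, naive_se, state_se


def run_monte_carlo(n_reps=200, seed=0):
    """Compare reported standard errors with the actual spread of estimates."""
    rng = np.random.default_rng(seed)
    draws = np.array([simulate_once(rng) for _ in range(n_reps)])
    return {
        "sd_of_estimates": draws[:, 0].std(ddof=1),
        "mean_naive_se": draws[:, 1].mean(),
        "mean_state_se": draws[:, 2].mean(),
    }


if __name__ == "__main__":
    for name, value in run_monte_carlo().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the individual-level standard error should come out far below the actual sampling variability of the estimate, while the standard error computed from state means should track it closely, because the state-mean calculation uses one observation per treatment decision.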

Objectives

The purpose of this research is to develop a method to estimate treatment effects when treatment is endogenous, i.e. it is not random conditional on time and group fixed effects, and the sample size of the decision to treat is smaller than the sample size of outcomes. The decision to treat occurs at what is referred to, for ease of exposition, as the state level, but the unit could be any aggregate decision-making unit such as an Indian tribe, a county government or a firm. Outcomes are observed at the individual level both before and after state decisions to treat. The simplest case involves a two-by-two design. There is a pre and a post observation period. There are some groups who receive treatment, but not at random, and some who do not receive treatment. There is serial correlation in outcomes within a state. Furthermore, the populations of the states can change as a result of policies, so that DD estimators are not consistent.

The decision to treat is modeled as a function of pre-treatment aggregate characteristics. States perceive different benefits from introducing the treatment based on differences in their populations in the pre-treatment period. These pre-treatment aggregate factors, such as state average poverty rates, are used as instruments to identify the state decision to treat. Post-treatment outcomes are observed at the individual level.
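
One way to write down such a structure, purely as an illustrative sketch for a continuous outcome, is given below. The notation is ours and is not fixed by the proposal: s indexes states, i indexes individuals, z_s collects pre-treatment aggregate characteristics, D_s is the state treatment indicator, and x_{is} are individual covariates.

```latex
\begin{align}
D_s    &= \mathbf{1}\{ z_s'\pi + v_s > 0 \}, \\
y_{is} &= x_{is}'\beta + \delta D_s + \eta_s + \varepsilon_{is},
\end{align}
```

Treatment is endogenous when the state-level unobservables v_s and \eta_s are correlated. The instruments z_s help identify \delta only if they shift the decision to treat (\pi \neq 0) and are excluded from the outcome equation, which is an assumption of this sketch.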

We shall develop models that accommodate post-treatment outcomes that are continuous, discrete, or censored variables, while the treatment is likely to be an endogenous discrete variable. The possible models will be simultaneous equation systems in a group setting. We shall develop estimation methods for such models. The method of maximum likelihood is feasible but, because the likelihood functions of such models might involve multiple integrals, computationally tractable and efficient methods of simulation estimation need to be carefully developed. An alternative is a two-step method, applicable when the number of sample observations in each group is large enough. In a group setting, because the treatment variable is invariant across individuals in a group, it may become a component of an overall group dummy variable. Other components of the overall group dummy variable may capture observed and unobserved group-specific variables. In the first step, the individual outcome equation is treated as a model with fixed effects, and all the fixed effects and common parameters can be estimated by the method of maximum likelihood. In the second step, the treatment effect can be identified and estimated by the method of instrumental variables applied to the estimated fixed effects. One may compare the efficiency of these methods and their computational simplicity in empirical studies. The statistical properties of the estimators shall also be rigorously studied.
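
The two-step idea can be sketched in the simplest linear case. The code below is an illustration under assumptions of our own, not the estimator to be developed. It assumes a continuous outcome with Gaussian errors (so that least squares with group dummies coincides with maximum likelihood), a single individual covariate, a single instrument, and many more groups than the 51 states mentioned above so that the simulated estimates are stable. All names and parameter values are hypothetical.

```python
import numpy as np


def simulate(rng, n_groups=2000, n_per_group=30, delta=1.0, beta=0.5, rho=0.8):
    """Simulated data with a group-level treatment that is endogenous."""
    z = rng.normal(size=n_groups)  # pre-treatment aggregate used as instrument
    v = rng.normal(size=n_groups)  # unobservable driving the decision to treat
    # Group effect in the outcome is correlated with v, so treatment is endogenous.
    eta = rho * v + np.sqrt(1.0 - rho**2) * rng.normal(size=n_groups)
    d = (z + v > 0).astype(float)  # group-level treatment decision
    alpha = delta * d + eta        # overall group effect (treatment + unobserved)
    x = rng.normal(size=(n_groups, n_per_group))
    y = beta * x + alpha[:, None] + rng.normal(size=(n_groups, n_per_group))
    return y, x, d, z


def two_step(y, x, d, z):
    """Step 1: fixed-effects regression. Step 2: IV on estimated group effects."""
    # Step 1: within-group demeaning gives the common slope; the group effects
    # are recovered as group-mean residuals.
    x_dev = x - x.mean(axis=1, keepdims=True)
    y_dev = y - y.mean(axis=1, keepdims=True)
    beta_hat = (x_dev * y_dev).sum() / (x_dev**2).sum()
    alpha_hat = y.mean(axis=1) - beta_hat * x.mean(axis=1)

    # Step 2: simple IV estimate, cov(z, alpha_hat) / cov(z, d).
    z_c = z - z.mean()
    delta_iv = (z_c @ alpha_hat) / (z_c @ d)

    # For comparison: least squares of alpha_hat on d, ignoring endogeneity.
    d_c = d - d.mean()
    delta_ols = (d_c @ alpha_hat) / (d_c @ d)
    return beta_hat, delta_iv, delta_ols


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    beta_hat, delta_iv, delta_ols = two_step(*simulate(rng))
    print(f"beta_hat={beta_hat:.3f} delta_iv={delta_iv:.3f} delta_ols={delta_ols:.3f}")
```

In this design the least-squares estimate of the treatment effect from the second step should be biased upward, because states with favorable unobservables are more likely to treat, while the instrumental variables estimate should be close to the true value. Extending the first step to discrete or censored outcomes, and deriving standard errors that account for the estimated fixed effects, is beyond this sketch.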