Doing Economics What You Should Have Learned in Grad School But
2.5.3 Treatment Heterogeneity
It is rare that the treatment effect we are interested in estimating is
homogeneous across the population of interest. After assessing the
robustness of your results, you may be interested in looking at whether the
treatment effect varies across subgroups (e.g., men vs. women, rural vs.
urban, black vs. white, by income quintile, etc.). This section is where you
assess that heterogeneity. Keeping with the contract farming example, suppose you were
interested in whether the impacts of contract farming differ between male
and female respondents. This alone would bring the number of estimated
specifications up to 28 (i.e., seven measures of welfare, two treatment
measures, and male vs. female respondents). From this, it is rather easy to
see why the average applied paper is now typically 50 pages—if not longer.
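To see where the count of 28 comes from, here is a minimal sketch that enumerates the specifications as the product of the three dimensions. The labels (welfare_1, treatment_a, and so on) are hypothetical placeholders of my own; the example only requires seven welfare measures, two treatment measures, and two subsamples.

```python
from itertools import product

# Hypothetical labels: only the counts (7, 2, and 2) come from the example.
welfare_measures = [f"welfare_{i}" for i in range(1, 8)]
treatment_measures = ["treatment_a", "treatment_b"]
subsamples = ["male", "female"]

# One specification per (outcome, treatment, subsample) combination.
specifications = list(product(welfare_measures, treatment_measures, subsamples))

print(len(specifications))  # 7 * 2 * 2 = 28
```

Every additional dimension of heterogeneity multiplies the count again, which is why results sections grow so quickly.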
One good thing about exploring treatment heterogeneity is this: doing so
can salvage a null finding (i.e., an effect that is statistically insignificant)
because average effects can mask a tremendous amount of heterogeneity.
So before calling it quits, saying that an intervention or treatment has had
“no effect” and abandoning an entire research project, it is well worth
thinking about whether the treatment effect might be heterogeneous, and
whether said heterogeneity is of interest for policy or business.
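A small simulation can illustrate how an average effect near zero can hide large subgroup effects. This is only a sketch under assumptions I am making up for illustration: the data are simulated, treatment is randomly assigned, and the effect is set to +1 for women and -1 for men. None of this comes from the contract farming data.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible


def simulate(n=4000):
    """Simulate (female, treated, outcome) rows with opposite-signed effects."""
    rows = []
    for _ in range(n):
        female = random.random() < 0.5
        treated = random.random() < 0.5  # random assignment (assumption)
        effect = 1.0 if female else -1.0  # assumed heterogeneous effect
        outcome = random.gauss(0.0, 1.0) + (effect if treated else 0.0)
        rows.append((female, treated, outcome))
    return rows


def diff_in_means(rows):
    """Treated-minus-control difference in mean outcomes."""
    treated = [y for _, d, y in rows if d]
    control = [y for _, d, y in rows if not d]
    return mean(treated) - mean(control)


rows = simulate()
pooled = diff_in_means(rows)
women = diff_in_means([r for r in rows if r[0]])
men = diff_in_means([r for r in rows if not r[0]])

print(f"pooled: {pooled:+.2f}, women: {women:+.2f}, men: {men:+.2f}")
```

In this setup the pooled estimate should be close to zero while the subgroup estimates should be close to +1 and -1. In a regression framework, the analogous approach would be to interact the treatment with the subgroup indicator.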
When I write that exploring treatment heterogeneity can salvage a null
finding, you should not conclude that a null finding is, by itself, a reason
to explore treatment heterogeneity. If you wish to
explore treatment heterogeneity, you need to have a good reason (usually
stemming from your theoretical framework) for doing so. Anything else
will reek of p-hacking (i.e., the phenomenon whereby researchers slice their
data until they find something significant to report in their paper), which
leads to plain bad science.
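To get a rough sense of why slicing the data invites false positives, the following sketch computes the chance of at least one statistically significant result when every true effect is zero. It assumes the tests are independent, which is a simplification: specifications estimated on the same data are typically correlated, so treat the numbers as illustrative only. The function name is hypothetical.

```python
def prob_any_false_positive(n_tests, alpha=0.05):
    """Chance of at least one rejection at level alpha when every null is true.

    Assumes independent tests, which is a simplification.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


for n_tests in (1, 2, 10, 28):
    print(n_tests, round(prob_any_false_positive(n_tests), 3))
```

Under these assumptions, 28 tests at the 5 percent level yield at least one spurious significant result roughly three times out of four.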