know whether you condition on colliders or include as a control a
variable that lies on the causal path between the treatment and
outcome variables.
11
• The last lines of the table should list the number of observations,
the R² (I prefer the usual R² to the adjusted one, because this tells
me how much of the variation in y is explained by the variables
on the right-hand side, without any arbitrary correction for the
number of observations and parameters), maybe the results of a
test of joint significance of the variables on the right-hand side,
and various lines indicating which controls are included (e.g.,
state fixed effects, a linear time trend, year fixed effects,
state-specific linear trends, state-specific quadratic trends,
region–year fixed effects, and so on).
• Finally, the notes to the table should present all symbols for
statistical significance (typically, * for statistical significance at
less than the 10 percent level, ** at less than the 5 percent level,
and *** at less than the 1 percent level; for completeness and
transparency, none should be omitted), and additional symbols if
necessary.
12
For instance, you may have adjusted your p-values for multiple
comparisons, bootstrapped your standard errors, or done some
randomization inference, all of which would lead to different
inferences and critical levels of statistical significance, in which
case you might use the symbols †, ††, and ††† to denote significance
at less than the 10, 5, and 1 percent level for this additional
version of the standard errors.
• Present estimation results for the same estimation sample. That is,
as the number of control variables increases, the sample size is
nonincreasing due to missing values. If the sample size
decreases as you throw controls on the right-hand side, this
involves an apples-to-oranges comparison (different estimation
samples are representative of different populations). Instead, take
your smallest sample size (as dictated by missing observations)
and use that sample for all specifications.
• For variable names, use plain English words like “Years of
education,” “Age squared,” and “Female” and not Stata or R
codenames like “Edu,” “AGE_2,” or “SEX.”
• Ultimately, it always helps to put yourself in your reader’s shoes,
and the right question to ask yourself (or a friend who owes you a
favor) is this: When given only the tables, can one write down the
exact regression that was estimated? Or is one left with more
questions than one has answered after looking at the tables?
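
Three of the points above (a common estimation sample, significance symbols, and plain-English variable names) are mechanical enough to sketch in code. What follows is only a minimal illustration in plain Python, not a recommended workflow: the toy data, the outcome name "wage," the list of specifications, and the helper names complete_rows, common_sample, and stars are all hypothetical, and the labels simply reuse the examples given above.

```python
# Hypothetical toy data; None marks a missing value.
rows = [
    {"wage": 12.0, "Edu": 12, "AGE_2": 900, "SEX": 1},
    {"wage": 15.5, "Edu": 16, "AGE_2": None, "SEX": 0},
    {"wage": 9.8, "Edu": 10, "AGE_2": 2025, "SEX": None},
    {"wage": 20.1, "Edu": 18, "AGE_2": 1444, "SEX": 1},
]

# Hypothetical nested specifications, from fewest to most controls.
specs = [
    ["wage", "Edu"],
    ["wage", "Edu", "AGE_2"],
    ["wage", "Edu", "AGE_2", "SEX"],
]


def complete_rows(rows, variables):
    """Return the rows with no missing value in any of the given variables."""
    return [r for r in rows if all(r.get(v) is not None for v in variables)]


def common_sample(rows, specs):
    """Return the rows usable in every specification (the smallest sample)."""
    needed = sorted({v for spec in specs for v in spec})
    return complete_rows(rows, needed)


def stars(p_value):
    """Map a p-value to the conventional *, **, *** symbols."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


# Plain-English labels for the table, keyed by software code name.
LABELS = {"Edu": "Years of education", "AGE_2": "Age squared", "SEX": "Female"}

# Sample size per specification if each one uses whatever rows it can.
naive_n = [len(complete_rows(rows, spec)) for spec in specs]

# Sample size per specification once all are restricted to the common sample.
sample = common_sample(rows, specs)
common_n = [len(complete_rows(sample, spec)) for spec in specs]

print(naive_n)   # [4, 3, 2]
print(common_n)  # [2, 2, 2]
print([LABELS[name] for name in specs[-1][1:]])
```

In the first list the sample shrinks as controls are added, which is the apples-to-oranges comparison; in the second, every column is estimated on the same two observations.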