Thursday, 30 March 2017

Loken and Gelman

In a recent Science paper, Eric Loken and Andrew Gelman discuss an interesting statistical nuance relevant to measurement error. Loken and Gelman begin by pointing out that statisticians/econometricians/people with data and computers tend to share the mindset that measurement error in a covariate biases point estimates toward zero: the infamous attenuation bias. Pushing this attenuation-based mindset further, researchers infer that a statistically significant point estimate would be even larger in the absence of measurement error. Loken and Gelman argue that this second logical leap is not, in fact, logical. Why? While attenuation bias is a real thing, once we begin conditioning on statistically significant results, we add a second dimension that we must consider: power. In settings with low power (e.g., small \(N\) and substantial variance), the point estimate must be quite large in order to “achieve” a statistically significant result. Thus, the attenuation effect may be reversed by a low-power effect. The big idea here is that conditioning on significant results (a very real thing in the presence of p-hacking and publication bias) changes the behavior of something we all thought we understood: measurement error and attenuation bias.

Loken and Gelman’s takeaway?

A key point for practitioners is that surprising results from small studies should not be defended by saying that they would have been even better with improved measurement.


I thought it would be fun to replicate the results in a simple simulation.

Function: Generating data

One of Loken and Gelman’s big points is that sample size (specifically, power) matters for attenuation bias once we start conditioning on statistically significant results. So let’s write a function that generates a sample of size n for a simple linear regression. The function will also take as inputs the true intercept alpha and the true slope coefficient beta. For simplicity, we will generate each variable from a standard normal distribution. Finally, the function will accept a third parameter gamma that dictates the degree of measurement error in our covariate.

Formally, the population data-generating process is

\[ y_i = \alpha + \beta x_i + \varepsilon_i \]

but instead of observing \(x_i\), the researcher observes

\[ w_i = x_i + \gamma u_i \]

where \(\gamma u_i\) is the “noise” that we add to \(x_i\) to generate measurement error. Again, we will assume \(u_i\) comes from a standard normal distribution.
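The data-generating step described above might look roughly like the following in base R. This is a minimal sketch, not the original code: the function name sim_data() and the arguments n, alpha, beta, and gamma come from the text, while the default values and the returned column names are assumptions made for illustration.

```r
# Generate one sample from the data-generating process described above.
# Default parameter values are illustrative assumptions, not the post's values.
sim_data <- function(n, alpha = 0, beta = 1, gamma = 0.5) {
  x   <- rnorm(n)                 # true covariate (standard normal)
  eps <- rnorm(n)                 # disturbance (standard normal)
  u   <- rnorm(n)                 # measurement noise (standard normal)
  y   <- alpha + beta * x + eps   # outcome depends on the true covariate
  w   <- x + gamma * u            # what the researcher actually observes
  data.frame(y = y, x = x, w = w)
}

set.seed(12345)
head(sim_data(n = 5))
```

Keeping the true x in the returned data frame is a design choice of this sketch: it makes it possible to compare the regression on w with the “ideal” regression on x later.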

Function: Running the simulation

In the simulation, we will regress \(\mathbf{y}\) on \(\mathbf{w}\) (and a column of ones) and then calculate the standard errors, t statistics, and p-values. The base installation’s lm() function works just fine in this context. We’ll ignore inference/estimates for the intercept.
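A regression step along these lines could be sketched as follows. The name sim_reg() appears in the text, but its signature, the returned column names, and the toy data used to exercise it are assumptions of this sketch.

```r
# Regress y on w with lm() and keep only the slope's estimate,
# standard error, t statistic, and p-value (the intercept is ignored).
sim_reg <- function(data) {
  fit <- lm(y ~ w, data = data)
  est <- summary(fit)$coefficients["w", ]
  data.frame(
    coef    = est[["Estimate"]],
    se      = est[["Std. Error"]],
    t_stat  = est[["t value"]],
    p_value = est[["Pr(>|t|)"]]
  )
}

# Toy data for illustration only (parameter values are arbitrary).
set.seed(12345)
x   <- rnorm(100)
toy <- data.frame(y = 1 + 2 * x + rnorm(100), w = x + 0.5 * rnorm(100))
sim_reg(toy)
```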

Now we’ll make a wrapper function that applies sim_data() and sim_reg(), effectively running a single iteration of the simulation.

And now a function to run the simulation n_iter times for the given set of parameters (n_iter times for each sample size n).
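Here is one way the wrapper and the iteration function could fit together. The names sim_one() and sim_run() are hypothetical (the text names only sim_data() and sim_reg()), compact versions of those two functions are repeated so the block runs on its own, and the parameter values in the example call are assumptions.

```r
# Compact stand-ins for the two functions described earlier.
sim_data <- function(n, alpha, beta, gamma) {
  x <- rnorm(n)
  data.frame(y = alpha + beta * x + rnorm(n), x = x, w = x + gamma * rnorm(n))
}
sim_reg <- function(data) {
  est <- summary(lm(y ~ w, data = data))$coefficients["w", ]
  data.frame(coef = est[["Estimate"]], p_value = est[["Pr(>|t|)"]])
}

# One iteration: generate a sample, run the regression, record n.
sim_one <- function(n, alpha, beta, gamma) {
  cbind(n = n, sim_reg(sim_data(n, alpha, beta, gamma)))
}

# n_iter iterations for each sample size in n_vec, stacked into one data frame.
sim_run <- function(n_iter, n_vec, alpha, beta, gamma) {
  runs <- lapply(rep(n_vec, each = n_iter), sim_one,
                 alpha = alpha, beta = beta, gamma = gamma)
  do.call(rbind, runs)
}

set.seed(12345)
res <- sim_run(n_iter = 200, n_vec = c(30, 100, 1000),
               alpha = 0, beta = 1, gamma = 0.5)
head(res)
```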

Run the simulation, three sample sizes

Now let’s actually run the simulation. Let’s start with three sample sizes: 30, 100, 1000.

Examine results, three sample sizes

A few quick changes for plotting/description:

Examine the results:

##       n mean_coef pct_sample
## 1:   30 0.9376291          1
## 2:  100 0.9406927          1
## 3: 1000 0.9405517          1

##       n   sig mean_coef pct_sample
## 1:   30  TRUE 0.9415805     0.9926
## 2:   30 FALSE 0.4075973     0.0074
## 3:  100  TRUE 0.9406927     1.0000
## 4: 1000  TRUE 0.9405517     1.0000

##       n   sig pct_exceed pct_sample
## 1:   30  TRUE  0.3693331     0.9926
## 2:   30 FALSE  0.0000000     0.0074
## 3:  100  TRUE  0.2791000     1.0000
## 4: 1000  TRUE  0.0293000     1.0000
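Summaries by sample size and significance, like those printed above, could be built roughly as follows. The output above looks like it came from data.table; this sketch uses base R's aggregate() instead to avoid a dependency, and the helper one_study(), the 5% significance threshold, and all parameter values are assumptions, so the numbers will not match the tables.

```r
# Compact stand-in for the simulation: one row per simulated study.
one_study <- function(n, beta = 1, gamma = 0.5) {
  x <- rnorm(n)
  y <- beta * x + rnorm(n)
  w <- x + gamma * rnorm(n)
  est <- summary(lm(y ~ w))$coefficients["w", ]
  data.frame(n = n, coef = est[["Estimate"]], p_value = est[["Pr(>|t|)"]])
}

set.seed(12345)
n_iter <- 200
res <- do.call(rbind, lapply(rep(c(30, 100, 1000), each = n_iter), one_study))

# Flag significant slopes (5% level assumed here).
res$sig <- res$p_value < 0.05

# Mean coefficient within each (n, sig) cell.
by_sig <- aggregate(coef ~ n + sig, data = res, FUN = mean)
names(by_sig)[names(by_sig) == "coef"] <- "mean_coef"

# Share of each sample size's iterations that falls in the cell.
counts <- aggregate(coef ~ n + sig, data = res, FUN = length)
by_sig$pct_sample <- counts$coef / n_iter

by_sig[order(by_sig$n), ]
```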

Let’s plot the distribution of significant coefficient estimates.

We can also make a nice comparison of the estimated coefficient compared with the “ideal” estimate (without measurement error), following Loken and Gelman’s figure.

We again see Loken and Gelman’s point: for smaller sample sizes (conditional on a significant result), we frequently see point estimates from the “measured with error” regression that exceed the corresponding point estimates from the regression measured without error. This is not exactly the attenuation story we typically have in mind.
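The paired comparison can be sketched by running both regressions on the same simulated sample and then looking only at the significant “with error” results. The helper paired_study() is hypothetical, and the small slope (beta = 0.15), gamma = 0.5, n = 30, and the 5% threshold are assumptions chosen to put the example in a low-power setting.

```r
# For one simulated sample, estimate the slope twice:
# once with the error-laden w and once with the true x.
paired_study <- function(n, beta, gamma) {
  x <- rnorm(n)
  y <- beta * x + rnorm(n)
  w <- x + gamma * rnorm(n)
  with_err <- summary(lm(y ~ w))$coefficients["w", ]
  no_err   <- summary(lm(y ~ x))$coefficients["x", ]
  c(coef_w = with_err[["Estimate"]],
    p_w    = with_err[["Pr(>|t|)"]],
    coef_x = no_err[["Estimate"]])
}

set.seed(12345)
draws <- t(replicate(500, paired_study(n = 30, beta = 0.15, gamma = 0.5)))

# Condition on a significant slope in the regression measured with error.
sig <- draws[, "p_w"] < 0.05

# Share of those studies where the noisy estimate exceeds the "ideal" one.
share <- mean(draws[sig, "coef_w"] > draws[sig, "coef_x"])
share
```

Rerunning this with a larger n or a larger beta (more power) is a quick way to see how the share changes.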

Run the simulation, many sample sizes

Now let’s run the simulation for a bunch of sample sizes.

Note: You may want to drop the number of iterations per sample size to 1,000 (from 10,000) so it finishes in a reasonable amount of time.

To replicate Loken and Gelman’s final figure, we need to generate a quick summary table (after pairing the results). In this table, for each sample size, we will calculate the percent of studies where the estimate measured with error exceeds the paired estimate measured without error.

Now we can create the figure.
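A figure of this kind could be produced along the following lines. To keep the run time short, this sketch swaps lm() for the closed-form simple-regression slope and standard error; that substitution, the helper names fast_slope() and share_exceed(), the grid of sample sizes, and the parameter values are all assumptions rather than the original setup.

```r
# Closed-form OLS slope and two-sided p-value for a one-regressor model.
fast_slope <- function(y, z) {
  n  <- length(y)
  b  <- cov(z, y) / var(z)
  se <- sqrt((1 - cor(z, y)^2) * var(y) / ((n - 2) * var(z)))
  c(coef = b, p = 2 * pt(-abs(b / se), df = n - 2))
}

# For one sample size: how often does the noisy estimate exceed the ideal one,
# over all iterations and over significant iterations only?
share_exceed <- function(n, n_iter, beta, gamma) {
  out <- replicate(n_iter, {
    x <- rnorm(n)
    y <- beta * x + rnorm(n)
    w <- x + gamma * rnorm(n)
    fw <- fast_slope(y, w)
    c(sig = fw[["p"]] < 0.05,
      exceed = fw[["coef"]] > fast_slope(y, x)[["coef"]])
  })
  sig <- out["sig", ]
  c(n = n, all = mean(out["exceed", ]), sig_only = mean(out["exceed", sig]))
}

set.seed(12345)
n_grid <- seq(50, 1000, by = 50)
tab <- as.data.frame(t(sapply(n_grid, share_exceed,
                              n_iter = 500, beta = 0.15, gamma = 0.5)))

# Solid line: significant results only; dashed line: all results.
plot(tab$n, tab$sig_only, type = "b", ylim = c(0, 1),
     xlab = "Sample size", ylab = "Share where noisy estimate is larger")
lines(tab$n, tab$all, lty = 2)
```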

This figure does not quite match the figure from Loken and Gelman. Why? One important observation that I don’t think Loken and Gelman really discuss is that their results vary with the parameters of the simulation: the treatment effect (\(\beta\)), the degree of measurement error (\(\gamma\), which maps to the variance of the measurement error), and the variance of the disturbance \(\varepsilon\).
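A rough large-sample sketch shows where each parameter enters. Assume, as in the setup above, that \(x_i\), \(u_i\), and \(\varepsilon_i\) are independent normal draws with \(x_i\) and \(u_i\) standard normal, and write \(\sigma^2\) for the variance of \(\varepsilon_i\) (this symbol is introduced here for illustration; the simulation above uses \(\sigma^2 = 1\)). The slope from regressing \(y_i\) on \(w_i\), call it \(\hat{\beta}_w\), then converges to

\[ \operatorname{plim} \hat{\beta}_w = \frac{\operatorname{Cov}(w_i, y_i)}{\operatorname{Var}(w_i)} = \frac{\beta}{1 + \gamma^2} \]

and its sampling variance is approximately

\[ \operatorname{Var}(\hat{\beta}_w) \approx \frac{1}{n \, (1 + \gamma^2)} \left( \sigma^2 + \frac{\beta^2 \gamma^2}{1 + \gamma^2} \right) \]

The first expression is the usual attenuation factor. Power depends on the ratio of the first quantity to the square root of the second, so \(n\), \(\beta\), \(\gamma\), and \(\sigma^2\) all influence how often a significant estimate measured with error ends up larger than its counterpart without error.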


Loken and Gelman make a pretty interesting observation: attenuation bias may not have the same effect in small samples that it has in large samples.¹ I think the big picture is that some of our econometric/statistical intuition changes in the presence of p-hacking or publication bias, because the theory behind our estimation and inference typically does not take p-hacking or publication bias into account.

  1. As discussed above, this result depends upon several parameters.