A Brief Tour of CUPED and Related Methods (Pt. 2)

Author

Demetri Pananos

Published

March 10, 2025

Introduction

In a previous post, I demonstrated that Deng et. al’s CUPED is formally equivalent to regression adjustment as justified through the Frisch Waugh Lovell theorem. As a consequence, analysts can feel justified fitting a regression model as opposed to performing the calculations outlined in Deng 2013.

Deng et. al recommend estimating \(\theta\) – the coefficient for the control variate, or equivalently a regression coefficient – from the pooled population of treatment and control, writing “The impact on variance reduction will likely be negligible”. This recommendation is akin to recommending the following model be estimated

\[ Y = \beta_0 + \beta_1 X + \Delta D + \epsilon \]

where \(D\) is a randomly assigned treatment indicator. Most statisticians will recognize this as an ANCOVA model (though the names are moot here, its all regression). In this post, we discuss potential biases problems with the model above when \(D\) is randomly assigned, as well as how the model can be improved, and how this all relates to CUPED (which again, is just regression).

Freedman 2008