Let’s say that we want to measure the effect of a phone call encouraging people to register to vote on voting. Let’s define compliance as a person taking the call. And let’s assume that the compliance rate is low. The traditional way to estimate the effect of the phone call is via an RCT: randomly split the sample into Treatment and Control, call everyone in the Treatment Group, wait till after the election, and calculate the difference in the proportion who voted. Assuming that the treatment doesn’t affect non-compliers, etc., we can also estimate the Complier Average Treatment Effect.

But one way to think about non-compliance in the example above is as follows: “Buddy, you need to reach these people using another way.” That is a super useful thing to know, but it is an observational point. You can fit a predictive model for who picks up phone calls and who doesn’t. The experiment is useful in answering how much can you persuade the people you reach on the phone. And you can learn that by randomizing conditional on compliance.

For such cases, here’s what we can do:

- Call a reasonably large random sample of people. Learn a model for who complies.
- Use it to target people who are likelier to comply and randomize post a person picking up.

More generally, Average Treatment Effect is useful for global rollouts of one policy. But when is that a good counterfactual to learn? Tautologically, when that is all you can do or when it is the optimal thing to do. If we are not in that world, why not learn about—and I am using the example to be concrete—a) what is a good way to reach me, b) what message do you want to show me. For instance, for political campaigns, the optimal strategy is to estimate the cost of reaching people by phone, mail, f2f, etc., estimate the probability of reaching each using each of the media, estimate the payoff for different messages for different kinds of people, and then target using the medium and the message that delivers the greatest benefit. (For a discussion about targeting, see here.)

But technically, a message could have the greatest payoff for the person who is least likely to comply. And the optimal strategy could still be to call everyone. To learn treatment effects among people who are unlikely to comply (using a particular method), you will need to build experiments to increase compliance. More generally, if you are thinking about multi-arm bandits or some such dynamic learning system, the insight is to have treatment arms around both compliance and message. The other general point, implicit in the essay, is that rather than be fixated on calculating ATE, we should be fixated on an optimization objective, e.g., the additional number of people persuaded to turn out to vote per dollar.