The idea for this post came from a tweet sent out by David Henderson a couple of weeks ago.
Lack of data literacy is biggest barrier to #eval. Everyone in sector should know what a counterfactual is – yet many “evaluators” don’t.
— David Henderson (@david_henderson) April 23, 2013
That sounded like a challenge. So I reached out to David to ask if he would like to collaborate on a cartoon post covering the subject; he graciously accepted.
About the Illustrations
The tips and expertise are courtesy of David. The cartoons were developed by yours truly.
Could you do me a favor? If you like the cartoons, could you connect with me on LinkedIn and endorse me for cartoons?
A few notes:
- If you like the post, write a comment and let me know.
- Share it with colleagues. Seeing people sharing my cartoons inspires me to create more cartoons.
- If you think I’m missing critical pieces to the overall discussion, let me know in the comments.
- Please feel free to use my cartoons in presentations, training materials, etc.
What is a Counterfactual?
Impact is the difference between the outcomes of an individual who participates in a program and the outcomes for that same individual, at the same point in time, had they not participated. Since the same person cannot be both enrolled and not enrolled in a program at the same time, impact evaluations are concerned with estimating the missing counterfactual: what would have happened to an individual had they not participated in the program.
In robust evaluations, counterfactuals are estimated by randomly assigning some people to a program treatment group, and others to a control group. For a brief discussion of impact evaluation and the role of counterfactuals in evaluation, see this page by The World Bank.
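For readers who like to see ideas in code, here is a minimal sketch of why random assignment can stand in for the missing counterfactual. Everything in it is an assumption made up for illustration: the function name `simulate_randomized_evaluation`, the outcome scale (mean 50, standard deviation 10), the sample size, and the "true" program effect of 5 points. It is a toy simulation, not a real evaluation.

```python
import random
import statistics


def simulate_randomized_evaluation(n_people=10_000, true_effect=5.0, seed=0):
    """Toy simulation: estimate program impact with random assignment.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    treated_outcomes = []
    control_outcomes = []

    for _ in range(n_people):
        # Each person has two potential outcomes. In real life we only
        # ever observe one of them; the other is the counterfactual.
        outcome_without_program = rng.gauss(50.0, 10.0)
        outcome_with_program = outcome_without_program + true_effect

        # A coin flip decides who gets the program.
        if rng.random() < 0.5:
            treated_outcomes.append(outcome_with_program)
        else:
            control_outcomes.append(outcome_without_program)

    # The control group's average stands in for the counterfactual.
    return statistics.mean(treated_outcomes) - statistics.mean(control_outcomes)


if __name__ == "__main__":
    estimate = simulate_randomized_evaluation()
    print(f"Estimated impact: {estimate:.2f} (simulated true effect: 5.0)")
```

Because a coin flip decides who enrolls, the two groups look alike on average before the program starts, so the difference in their average outcomes lands close to the simulated effect.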
Developing a Counterfactual
Economy
Your program is likely not to blame for a sputtering economy, nor does it deserve credit for an economic upswing. The key to a good evaluation is to try to isolate the effect of your program, irrespective of external factors.
Imperfections
If you don’t acknowledge the imperfections in your data and try to estimate a counterfactual, your program officer will. You’re better off poking holes in your own data before anyone else gets the chance.
Results Too Good to Be True
If your results sound too good to be true, you are more likely to have made a mistake in your evaluation than to have discovered you’re a genius.
Outliers
Outliers do not make for compelling client testimonials. Use your metrics to identify what the average experience in your program looks like, and get testimonials from people who fit this profile.
Comparison Group
100% of successful people succeed. Losers are a terrible comparison group for winners. Make sure your comparison group is as close to identical to your treatment group as you can get it, with the only difference being participation in your program.
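A rough sketch of how a bad comparison group can inflate results, assuming a made-up "motivation" trait that drives both who signs up and how well people do. The function name `compare_designs`, the trait, and every number below are hypothetical and exist only to illustrate the point.

```python
import random
import statistics


def compare_designs(n_people=20_000, true_effect=5.0, seed=1):
    """Toy simulation: self-selected comparison vs. random assignment.

    Returns (naive_estimate, randomized_estimate). All values are
    hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    self_selected_in, self_selected_out = [], []
    random_in, random_out = [], []

    for _ in range(n_people):
        # Hypothetical trait that affects outcomes even without the program.
        motivation = rng.gauss(0.0, 1.0)
        outcome_without_program = 50.0 + 10.0 * motivation + rng.gauss(0.0, 5.0)
        outcome_with_program = outcome_without_program + true_effect

        # Design A: motivated people enroll, everyone else is the "comparison".
        if motivation > 0:
            self_selected_in.append(outcome_with_program)
        else:
            self_selected_out.append(outcome_without_program)

        # Design B: a coin flip decides enrollment for the same people.
        if rng.random() < 0.5:
            random_in.append(outcome_with_program)
        else:
            random_out.append(outcome_without_program)

    naive = statistics.mean(self_selected_in) - statistics.mean(self_selected_out)
    randomized = statistics.mean(random_in) - statistics.mean(random_out)
    return naive, randomized


if __name__ == "__main__":
    naive, randomized = compare_designs()
    print(f"Self-selected comparison: {naive:.2f}")
    print(f"Random assignment:        {randomized:.2f}")
```

In this toy setup the self-selected comparison credits the program with differences that were there before anyone enrolled, while the coin-flip design stays near the simulated effect.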
Your Thoughts?
What thoughts do you have on the subject of counterfactuals? Is there anything you would like to add to the discussion? Let us know in the comments.
thidamony
awesome article!
Tracy Wharton
I love the one titled “What they don’t teach you in grad school!” So true!
I think that sometimes we forget that evaluation stretches across many professional orientations, and we don’t all necessarily use the same terms. For me, the term counterfactual was not familiar, but the concept certainly is, and as a social work evaluator, I am always trying to consider this issue, even if I didn’t know the term that David used. I always worry that people are ready to dismiss other evaluators if they don’t use the same terms, under the assumption that they must not know or use the concept.
*Note – This one came in via email, I posted -Chris*
Sheila B Robinson
I think it’s important to teach the concept of counterfactuals to evaluation students. It’s a tricky concept to understand, though. Here’s my example:
If 200 smokers sign up for your smoking cessation program and you only have room for 100, you can randomly assign people to the program and follow them, along with those who don’t get in. What’s important to remember is that all 200 of these smokers are qualitatively different from any other smoker who didn’t sign up for a smoking cessation program.
You can measure the difference between program participants and non-program participants, but keep in mind they all came with the desire to quit smoking. Non-program participants may seek other ways to quit (willpower, family support, other programs, medical aids, etc.) and it may be difficult to control for all of these conditions.
So, just looking at the differences between the groups may give you an estimate of the counterfactual, but doesn’t give you a true picture of program effect.
Comparing 100 randomly assigned program participants to 100 smokers who did NOT apply to be in the program will give you a better estimate of the counterfactual (what happens to smokers in the absence of the program), but you still have the challenge of understanding program effects due to the fact that program participants purportedly came with the desire to quit. Unless of course, your program conclusions (your statements about program effect) are limited to being generalized ONLY to the populations of smokers who express a desire to quit.
Is this how other evaluators understand this concept?
Sometimes thinking about evaluation makes my head explode…and THAT’s why we need evaluation cartoons to make us laugh! 😀
alam zeb
Hi Chris,
Here I am stuck ………….
The government imposed a ban on green felling of timber in 1993 across the whole country except one state (AJ&K), and that ban is still in effect. The state imposed its own ban on green felling later, in 1997. I want to assess the effectiveness of this ban on forest conservation in AJ&K state and its impact on the livelihood of forest communities. I am planning to treat the area under the 1993 general ban (everything except AJ&K) as the control group and the state as the treatment group.
your technical inputs on this………..
zeb
Chris Lysy
Hello Zeb,
Unfortunately I’m not an expert in this area so my personal technical input would not be helpful. The web, data collection, and visualization are my personal domains. For the many other parts of the evaluation world, I am still very much a student. I address them on this blog, but only through the help of those who know more than I do.