Debate Rages: Response to TFA’s Supposition that New Brief is a “Retreat from Evidence”

The debate rages. Yesterday, the National Education Policy Center (NEPC) released Teach For America: A Return to the Evidence, the sequel to the 2010 Teach For America: A Review of the Evidence. TFA has responded. Raegen Miller, TFA's VP of Research Partnerships, has asked us to keep the dialogue going, so I will respond to his comments about our NEPC brief on TFA (his remarks are linked and included in full below; we hope TFA responds in kind by posting our response on their blog). I focus on addressing the comments most relevant for members of the general public who want to truly understand the impact of TFA.

Miller raises three main concerns: how we chose which studies to review, our review of Xu, Hannaway, and Taylor's article, and our review of the Mathematica study. Let me address each in turn.

Miller implies that we cherry-picked studies to review. We make clear in the brief that we focus on peer-reviewed articles, and we review all the peer-reviewed studies published since our 2010 NEPC brief. This standard is not one we invented: peer review is widely regarded as the gold standard for research quality. Peer-reviewed articles are refereed by other scholars (usually double-blind) to ensure that the standards of the field and of the journal publishing the article are upheld. An article is published only after it has met these standards, usually after the reviewers' comments for revision have been addressed, and after the article has been approved by multiple scholars and by the journal's editor. Reports that are not peer-reviewed, by contrast, can be published by anyone. A non-refereed report can, of course, turn out to be of higher quality than a refereed article, but the far more likely scenario favors peer-reviewed publications. This is why we held to the scholarly standard of peer-reviewed articles. We acknowledge that TFA has commissioned and paid for many research studies; for example, Cloaking Inequity and NEPC previously reviewed the Edvance study mentioned by Raegen Miller (see Update: "Fatal Flaws" in Edvance Texas TFA study).

Miller calls our treatment of Xu, Hannaway, and Taylor's article "puzzling," referring to a "technical issue to do with North Carolina data." This "technical issue" is the authors' acknowledged difficulty in connecting students with their teachers. When the goal is to evaluate teachers based on how their students performed, being able to link teachers with their students is essential. Moreover, it is the US Department of Education that takes issue with this methodological flaw; we simply report the Department's critique that it could lead to "imprecise" and possibly misleading estimates of the impact of TFA.

Miller seeks to dismiss our critiques of the Mathematica study by claiming we "draw on non-scholarly blog material." (See the statistical critiques of the Mathematica study, New Mathematica TFA Study is Irrational Exuberance and "Does Not Compute": Teach For America Mathematica Study is Deceptive?, to judge the validity of the study for yourself.) This straw man argument seeks to distract the public from valid and important critiques of the study:

  • That the TFA teachers in the study do not resemble TFA teachers in general, so it is unclear how applicable the results are to TFA's impact overall;
  • That the study's findings on other factors affecting student test scores run counter to what peer-reviewed education research has concluded over decades of study, which calls into question the study's findings on TFA; and
  • That the study's findings even run counter to what TFA's own model touts.

Despite all this, we note that the impact Mathematica found is quite small (0.07 of a standard deviation), much smaller than the effects of other educational reforms with a far stronger and more consistent evidence base (pre-K has an effect size of 0.85, roughly twelve times as large; class-size reduction has an effect size of 0.20, nearly three times as large). We don't understand Miller's complaint about our calling readers' attention to small effect sizes; this is extremely important information. As an example Miller may agree with: in an NEPC review of the much-ballyhooed CREDO study, which found charters doing less well than non-charters, Professor Gary Miron pointed out that the actual effect-size differences were very small, meaning the charters were not really doing much worse. This is the sort of information we should always heed, whether or not it fits a given policy agenda.
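For readers who want to check the arithmetic behind these comparisons, they are simple ratios of the cited effect sizes. A minimal sketch, using only the numbers quoted in the text:

```python
# Effect sizes cited above, in standard deviations of student achievement
tfa_effect = 0.07         # Mathematica's estimated TFA effect
pre_k_effect = 0.85       # high-quality pre-K
class_size_effect = 0.20  # class-size reduction

# Ratio of each reform's effect to the TFA effect
pre_k_ratio = pre_k_effect / tfa_effect            # about 12.14x
class_size_ratio = class_size_effect / tfa_effect  # about 2.86x

print(f"Pre-K effect is {pre_k_ratio:.1f}x the TFA effect")
print(f"Class-size effect is {class_size_ratio:.1f}x the TFA effect")
```

Expressing the gap as a ratio ("about twelve times as large") rather than a percentage keeps the comparison transparent, since "X% more" and "X% as much" are easily conflated.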

Finally, we note that Mathematica's findings on TFA echo the findings of other studies reviewed in our 2010 NEPC brief; in other words, nothing new here, folks. I believe Miller and I share the same goal: improving the educational outcomes of our most underserved students. While Miller believes that TFA is part of improving these outcomes, I believe we should instead focus our resources on educational interventions that we know have a large impact on student achievement, such as high-quality early childhood education, or on new innovations in education that show significant promise. TFA is currently neither. It has been shown to have zero or only a small impact on the relatively few students who have such a teacher. If high-quality teaching is important, and we all agree that it is, then why invest so much in a program that relies on transient, cursorily prepared teachers and that has not shown itself to be particularly effective? TFA must reform its reform.

Here are the full remarks of Raegen Miller, TFA's VP of Research Partnerships, from TFA's blog. (We promised we would publish TFA's response here on Cloaking Inequity; we hope they respond in kind by posting our response on their blog.)

The National Education Policy Center recently published a policy brief written by scholars Julian Vasquez Heilig and Su Jin Jez. The brief updates the authors' 2010 review of research on Teach For America. I did not expect a terribly fair and balanced treatment of new evidence, given the prior effort's distortions and the authors' signals in the non-scholarly blog, Cloaking Inequity. Nevertheless, the new review offers grist for a healthy dialogue about important subjects and serious empirical challenges. This post represents my top-line views about the new review, especially with respect to its handling of research on the instructional impact of Teach For America teachers.

Let me start with the good news. The 2010 piece began with an overtly biased framing of the matter in question: “whether Teach For America is a panacea or a problem for low-income communities.” Since there are no panaceas for complex social problems, the authors left only one possible conclusion for their review of evidence: Teach For America is a problem. Happily, the new piece, “A Return to the Evidence,” employs a more reasonable framing: the evidence of instructional impact is “mixed.” This is fine in the sense that the relevant literature sports material drawing on data from a variety of data sources, employing a variety of analytic strategies, making a variety of specific comparisons, and offering a variety of impact estimates.

Yet unpacking this literature to characterize objectively the story it tells is a little tricky to do. Instead of “returning to the evidence,” the authors retreat to previously published views that don’t have much business in a brief that offers policy recommendations with a bearing on educational opportunities for low-income students across the country. This retreat is a three-step routine.

The authors restrict the scope of their review in a way that excludes important evidence. Notably, this means readers learn nothing of a report by Harvard’s Strategic Data Project. This piece involves data from Los Angeles Unified School District and yields high-quality evidence that first year corps members are more effective in teaching math and reading in grades three through eight than other first year teachers in that district. The authors also shied away from a methodologically interesting report by Edvance, Inc. using data from Texas to offer evidence that Teach For America teachers produce a net boost in student achievement in that state. The criterion for dismissal is that these reports haven’t undergone formal peer-review, but it’s hard for me to separate this quintessentially academic distinction from pedantry or bias. Leaving aside the scientific merits of these reports, the retreating authors find it acceptable to introduce a range of sketchy information on program costs elsewhere in the brief. Is the authors’ analysis of new data exempt from the burden of peer review? Perhaps not formally, but bear in mind that National Education Policy Center reviewers let through the overtly biased 2010 effort. Moreover, it re-posts blog entries highly critical of Teach For America, implicitly lending credibility to many unsubstantiated assertions.

The two peer-reviewed studies that survive the cull receive the most puzzling treatment of all. The authors dismiss one, a Xu, Hannaway, and Taylor paper published in 2011 in the Journal of Policy Analysis and Management, by appealing to a technical issue to do with North Carolina data. I have trouble seeing how the issue, addressed in footnotes in numerous high-profile papers, could have eluded the journal's editors or reviewers. The profound deference to peer review shown by Vasquez Heilig and Jin Jez vanishes when it comes to handling the Xu et al. paper. It's ironic that authors genuinely concerned about equity wield a double standard with such dexterity.

After excluding and dismissing inconvenient evidence, the authors finally deign to discuss an actual study. Ignoring the 2013 Mathematica Policy Research study commissioned by the Institute for Education Sciences wouldn’t be good form for scholars, and here’s where they draw on non-scholarly blog material to beat back evidence of the highest order, namely from a randomized control study funded by the Department of Education with the explicit purpose of informing debate about the merits of highly selective alternative certification programs.

I’ll have more to say about the various attacks on this study in a future post, but I can’t close this one out without mentioning two things. First, faced with highly credible estimates of positive, differential impact of Teach For America teachers, the authors dispute the estimates’ educational significance. Statistically significant estimates considered small by the standards of psychologists are a permanent feature of the post-NCLB policy research landscape. I’m over it, and I have a lot of company in the research community. But I’m not over my curiosity about whether the authors would have made the same argument had Mathematica’s findings been negative? I’d have a better guess if they’d “returned” to a broader, more complex assortment of evidence, perhaps including reports on efficacy of teacher education programs produced annually in North Carolina and Tennessee, in which Teach For America consistently figures among the most potent, plus the Strategic Data Project and Edvance reports mentioned above.

Finally, I’d like to express my gratitude to the authors for not revisiting the nasty ad-hominem attack featured in Cloaking Inequity last fall after the release of the Mathematica study. The less said about this post the better, really. This omission augurs well for an open, honest, and civil debate about the research on Teach For America.

For all of Cloaking Inequity’s posts on TFA go here.



Twitter: @SuJinJez

Please blame Siri for any typos.



  • This report is one of many that will show careful readers the difference between spin from think tanks and PR departments parading as "objective" and well-informed scholarship. Hurray for this report, and for the excellent work exposing the huge flaws in Bill Gates' Measures of Effective Teaching project. The problem with TFA is that it has no respect for any knowledge about teaching other than tricks of the trade in managing a classroom and planning lessons, all crammed into five weeks.


  • Has anyone factored in that many new TFA recruits also have backgrounds in the subject areas they are teaching, or even education degrees? While their model holds that anyone can become a teacher with 5 weeks of training, regardless of their major and regardless of the subjects and grade levels they teach, I've found anecdotal evidence among the ones I've hosted of fewer poli sci majors and more math majors teaching math, more English majors teaching English, and actual education majors supplementing their education with 5-week TFA courses. These kids are not randomly assigned but go to job fairs where they interview with principals and accept specific assignments they are comfortable with. This is a vast improvement over how TFA operated in 2005 after Katrina, and even a few years ago, when every kid we hosted was a class president, head of a sorority, or a poli sci major. My wife is ex-TFA, so we host these kids from time to time and I informally chat with them about their backgrounds and goals. Ironically, they seem to be moving away from their stated model and beliefs. Now they are operating more like a supplemental training program than a primary one, while publicly taking the stand that those factors don't matter, just the TFA special sauce. Why do you think that is? I bet if you studied the difference between those types of TFA recruits you would find a great disparity, and TFA teachers in non-tested or less-tested subjects are also harder to evaluate. Our state agency does not collect data on who is TFA or not, and studies depend on TFA-supplied data for career longevity and test scores. The data I saw were obviously missing many of the short-timers, which skewed the average time spent in a classroom toward a longer timeframe. Just some observations I thought I'd share.


    • There is a big difference between having a math or English degree and being able to teach that subject to kids in school, that is, having a developed and in-depth pedagogy. Further, the issues with TFA extend beyond hiring. Districts contract with TFA, making the hiring process all but a shoo-in, but the real issue is TFA's effect on building a sound and sustainable platform for school improvement. High-need schools need sustainable teaching, not high turnover, and when we reduce teaching to "effectiveness" we move away from the purposes of education, for which assessment for accountability and funding has little if any real value. Data, data, data... How do you measure good citizenship or authentic, novel interpretation of learned material? You can't, but does that make it unimportant? Authentic and novel thinking is the only way societies progress, yet the only kind of success TFA encourages is thinking by replication. No society has progressed by doing only what has already been done.

