Debate Rages: Response to TFA’s Supposition that New Brief is a “Retreat from Evidence”
The debate rages. Yesterday, the National Education Policy Center (NEPC) released Teach For America: A Return to the Evidence, the sequel to the 2010 brief Teach For America: A Review of the Evidence. TFA has responded. Raegen Miller, TFA’s VP of Research Partnerships, has asked us to keep the dialogue going, so I will respond to his comments about our NEPC brief on TFA. (His comments are linked and included in full below. We hope TFA responds in kind by pasting our response on its blog.) I focus on the comments most relevant to members of the general public who want to truly understand the impact of TFA.
Miller brings up three main concerns: how we chose which studies to review, our review of Xu, Hannaway, and Taylor’s article, and our review of the Mathematica study. Let me address each in turn.
Miller implies that we cherry-picked studies to review. We make it clear in the brief that we focus on peer-reviewed articles, and we review all the peer-reviewed studies that have been published since our 2010 NEPC brief. This standard is not one we made up: peer-reviewed articles are widely seen as the gold standard for research quality. Peer-reviewed articles are refereed by other scholars (usually double blind) to ensure that the standards of the field and of the journal publishing the article are maintained. An article is published only after it has met these standards, usually after the reviewers’ comments for revision have been addressed, and after the article has been approved by multiple scholars and by the publication’s editor. This is in contrast to reports that are not peer-reviewed, which can be published by anyone. Although a report that is not peer-reviewed can be of higher quality than a peer-reviewed article, the reverse is far more likely. This is why we opted for the scholarly standard of peer-reviewed articles. We acknowledge that TFA has commissioned and paid for many research studies. For example, Cloaking Inequity and NEPC previously reviewed the Edvance study that Miller mentions (see Update: “Fatal Flaws” in Edvance Texas TFA study).
Miller calls our treatment of Xu, Hannaway, and Taylor’s article “puzzling”, referring to a “technical issue to do with North Carolina data”. This “technical issue” is the authors’ acknowledged difficulty connecting students with their teachers. When your goal is to evaluate teachers based on how their students performed, it is essential to be able to link teachers with their students. Moreover, it is the US Department of Education that takes issue with this methodological flaw. We simply report the US Department of Education’s critique that this could lead to “imprecise” and possibly misleading estimates of the impact of TFA.
Miller seeks to dismiss our critiques of the Mathematica study by claiming we “draw on non-scholarly blog material.” (To judge the validity of the study for yourself, see the statistical critiques of the Mathematica study in New Mathematica TFA Study is Irrational Exuberance and “Does Not Compute”: Teach For America Mathematica Study is Deceptive?) This straw man argument seeks to distract the public from valid and important critiques of the study:
- That the TFA teachers in the study do not look like TFA teachers in general, so it’s unclear how applicable the results are for the public trying to understand the impact of TFA in general;
- That the study’s findings on other factors that impact student test scores counter what peer-reviewed education research has concluded over decades of study, which calls into question the study’s findings on TFA; and
- That the study’s findings even run counter to what TFA’s own model touts.
Despite all this, we note that the impact Mathematica found is quite small (0.07 of a standard deviation), much smaller than that of other educational reforms with a much stronger and more consistent evidence base. Pre-K has an effect size of 0.85, about 1,214% of the TFA effect, and reducing class size has an effect size of 0.20, about 286% of the TFA effect. We don’t understand Miller’s complaint about our calling readers’ attention to small effect sizes; this is extremely important information. As an example that Miller may agree with, in an NEPC review (http://nepc.colorado.edu/files/TTR-MIRON-CREDO-FINAL.pdf) of the much-ballyhooed CREDO study that found charters to be doing less well than non-charters, Professor Gary Miron pointed out that the actual effect size differences were very small, meaning that the charters were not really doing much worse. This is the sort of information we should always heed, whether or not it fits a given policy agenda.
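For readers who want to check the arithmetic behind these comparisons, here is a minimal Python sketch. It assumes the three effect sizes quoted above (0.07, 0.85, and 0.20) can be compared directly, which is a simplification because they come from different studies. The function name relative_impact is purely illustrative.

```python
# Effect sizes as quoted in this post, in standard-deviation units.
TFA_EFFECT = 0.07
COMPARISONS = {"Pre-K": 0.85, "Class-size reduction": 0.20}


def relative_impact(effect, baseline):
    """Return (percent of baseline, percent more than baseline), rounded."""
    ratio = effect / baseline
    return round(ratio * 100), round((ratio - 1) * 100)


for name, effect in COMPARISONS.items():
    pct_of, pct_more = relative_impact(effect, TFA_EFFECT)
    print(f"{name}: {pct_of}% of the TFA effect, or {pct_more}% more")
```

Note that "X% of" and "X% more than" differ by 100 percentage points: an effect of 0.85 is about 1,214% of 0.07, which is the same as about 1,114% more.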
Finally, we note that Mathematica’s findings on TFA echo the findings of other studies from our 2010 NEPC brief. All to say, nothing new here, folks! I believe Miller and I have the same goal: to improve the educational outcomes of our most underserved students. While Miller believes that TFA is a part of improving these outcomes, I believe that we should instead focus our resources on educational interventions that we know have a large impact on student achievement, such as high-quality early childhood education, or on new innovations in education that show significant promise. TFA is currently neither of these. TFA has been shown to have zero or a small impact on the few students who have such a teacher. If high-quality teaching is important, and we all agree that it is, then why invest so much in a program that relies on transient, cursorily prepared teachers and that has not shown itself to be particularly effective? TFA must reform its reform.
Here are the full remarks of Raegen Miller, TFA’s VP of Research Partnerships, from TFA’s blog. (We promised we would publish TFA’s response here on Cloaking Inequity. We hope they respond in kind by pasting our response on their blog.)
The National Education Policy Center recently published a policy brief written by scholars Julian Vasquez Heilig and Su Jin Jez. The brief updates the authors’ 2010 review of research on Teach For America. I did not expect a terribly fair and balanced treatment of new evidence, given the prior effort’s distortions and the authors’ signals in the non-scholarly blog, Cloaking Inequity. Nevertheless, the new review offers grist for a healthy dialogue about important subjects and serious empirical challenges. This post represents my top-line views about the new review, especially with respect to its handling of research on the instructional impact of Teach For America teachers.
Let me start with the good news. The 2010 piece began with an overtly biased framing of the matter in question: “whether Teach For America is a panacea or a problem for low-income communities.” Since there are no panaceas for complex social problems, the authors left only one possible conclusion for their review of evidence: Teach For America is a problem. Happily, the new piece, “A Return to the Evidence,” employs a more reasonable framing: the evidence of instructional impact is “mixed.” This is fine in the sense that the relevant literature sports material drawing on data from a variety of data sources, employing a variety of analytic strategies, making a variety of specific comparisons, and offering a variety of impact estimates.
Yet unpacking this literature to characterize objectively the story it tells is a little tricky to do. Instead of “returning to the evidence,” the authors retreat to previously published views that don’t have much business in a brief that offers policy recommendations with a bearing on educational opportunities for low-income students across the country. This retreat is a three-step routine.
The authors restrict the scope of their review in a way that excludes important evidence. Notably, this means readers learn nothing of a report by Harvard’s Strategic Data Project. This piece involves data from Los Angeles Unified School District and yields high-quality evidence that first year corps members are more effective in teaching math and reading in grades three through eight than other first year teachers in that district. The authors also shied away from a methodologically interesting report by Edvance, Inc. using data from Texas to offer evidence that Teach For America teachers produce a net boost in student achievement in that state. The criterion for dismissal is that these reports haven’t undergone formal peer-review, but it’s hard for me to separate this quintessentially academic distinction from pedantry or bias. Leaving aside the scientific merits of these reports, the retreating authors find it acceptable to introduce a range of sketchy information on program costs elsewhere in the brief. Is the authors’ analysis of new data exempt from the burden of peer review? Perhaps not formally, but bear in mind that National Education Policy Center reviewers let through the overtly biased 2010 effort. Moreover, it re-posts blog entries highly critical of Teach For America, implicitly lending credibility to many unsubstantiated assertions.
The two peer-reviewed studies that survive the cull receive the most puzzling treatment of all. The authors dismiss one, a Xu, Hannaway, and Taylor paper published in 2011 by the Journal for Policy Analysis and Management by appealing to a technical issue to do with North Carolina data. I have trouble seeing how the issue, addressed in footnotes in numerous high-profile papers, could have eluded the journal’s editors or reviewers. The profound deference to peer review shown by Vasquez Heilig and Jin Jez vanishes when it comes to handling the Xu et al. paper. It’s ironic that authors genuinely concerned about equity wield a double standard with such dexterity.
After excluding and dismissing inconvenient evidence, the authors finally deign to discuss an actual study. Ignoring the 2013 Mathematica Policy Research study commissioned by the Institute for Education Sciences wouldn’t be good form for scholars, and here’s where they draw on non-scholarly blog material to beat back evidence of the highest order, namely from a randomized control study funded by the Department of Education with the explicit purpose of informing debate about the merits of highly selective alternative certification programs.
I’ll have more to say about the various attacks on this study in a future post, but I can’t close this one out without mentioning two things. First, faced with highly credible estimates of positive, differential impact of Teach For America teachers, the authors dispute the estimates’ educational significance. Statistically significant estimates considered small by the standards of psychologists are a permanent feature of the post-NCLB policy research landscape. I’m over it, and I have a lot of company in the research community. But I’m not over my curiosity about whether the authors would have made the same argument had Mathematica’s findings been negative? I’d have a better guess if they’d “returned” to a broader, more complex assortment of evidence, perhaps including reports on efficacy of teacher education programs produced annually in North Carolina and Tennessee, in which Teach For America consistently figures among the most potent, plus the Strategic Data Project and Edvance reports mentioned above.
Finally, I’d like to express my gratitude to the authors for not revisiting the nasty ad-hominem attack featured in Cloaking Inequity last fall after the release of the Mathematica study. The less said about this post the better, really. This omission augurs well for an open, honest, and civil debate about the research on Teach For America.
For all of Cloaking Inequity’s posts on TFA go here.
Please blame Siri for any typos.