In the medical world, an iatrogenic disease is defined as one caused, albeit accidentally, by medical treatment. Think, for example, of a prescription drug that causes unintended side effects, or of patients who undergo successful surgery but then contract a staph infection or pneumonia during their hospital stay, entirely unrelated to the purpose of the surgical intervention. Such effects are not uncommon: Barbara Starfield's notable article in the Journal of the American Medical Association in 2000 implicated our nation's medical system as the third leading cause of death in the United States, after heart disease and cancer.
In other professional fields, too, well-intended practices can have unintended consequences, some of them quite negative. In education, we are already experiencing, or will soon experience, an array of iatrogenic consequences in the current accountability drive to prove the worth of our students, teachers, schools, and teacher preparation programs.
The overreliance on questionable measurement tactics is bound to cause a wave of problems. The intent of the accountability drive, of course, is well meaning – who can truly be against the goal of determining whether all children are being served well in order to promote adjustments that level the playing field? But as numerous others have warned, there will be unintended and unfortunate consequences of such actions. Value-added measures, for example, may be useful for assessing large groups and populations, but individualizing results from such measures is highly problematic. Campbell's Law, developed decades ago in the field of evaluation research, holds that overreliance on any single measure invites distortion and dishonesty, with significant negative effects. Research documents shrinking curricula, teaching to tests, and multiple misuses of results. Tests designed for specific purposes are being used in ways their developers never intended. Professionally, we know that all tests have limits, that there is always potential for error, and that interpreting results is a tricky business. Sadly, many policy makers are unwilling, or insufficiently savvy, to heed such cautions.
In our field, we are all aware of the pending rating of our work by the National Council on Teacher Quality (NCTQ), and of the likelihood that the U.S. Department of Education will require the use of value-added techniques to assess teacher preparation by gauging the performance of our graduates once on the job. The idea of a “teacher prep scorecard” is attractive enough that U.S. News & World Report will publish the NCTQ ratings with great fanfare. Accuracy concerns aside, such a scorecard may force some institutions to make programmatic alterations to get better scores, regardless of the fact that NCTQ doesn't have an iota of evidence that meeting their standards will make program graduates better.
Similarly, if the Department of Education gets its way, all teacher preparation programs will be held accountable using assessments that may contain significant measurement error. While it is hard to argue against the idea of holding programs accountable for the quality of their product, it is appropriate to ask what the correct measures should be. Should standardized test scores be the overarching focus for schools, when their emphasis dominates even in systems that capture multiple measures? Think about it this way: We can all identify teachers who made a big impact on our lives, but typically we recall their inspirational character, their personal care and support, their love of students and their subject, their going above and beyond the call of duty to help. Research suggests that teachers who care about, respect, and praise students end up with students who like school more, which likely affects their overall school outcomes. Although standardized tests may serve as a proxy for some of these characteristics, they certainly don't capture the real essence of the most significant outcome a teacher can have: a meaningful impact on students' lives.
Of course, schools of education want information about how their graduates are performing, including information about student achievement. Other professions don't consider the impact their graduates have on their clients when gauging their preparation programs. Medicine, law, business, engineering, and others look at passage rates on licensure and similar examinations and require graduates to meet specific programmatic objectives through some style of assessment process. But should schools of education be judged predominantly on the test scores of our graduates' students? Or are there so many confounding variables, and so much potential for harm, that we should avoid reliance on such data?
To minimize the iatrogenic nature of our accountability systems in education, we must be vigilant in insisting that whatever measures are employed be used as designed – and questioned when they are not – given the potential for misuse and misunderstanding that is rampant today. We should never lose sight of the reality that schools and teaching are about more than what one-shot standardized test scores can capture. Multiple measures are required to do justice to the complex nature of teaching and learning – and will also provide meaningful feedback for program improvement. Along with capturing evidence of impact on student learning, these measures should include professional observations, portfolio examinations, satisfaction surveys, and perhaps even assessments of student well-being. Creative ideas and solutions should be sought, with the understanding that all measures have limitations to be addressed and should undergo the scrutiny of research before being tied to high-stakes consequences.
Any accountability system needs to take care that its measures don't produce unanticipated outcomes that are harmful. Blindly walking down the path of misapplied measurement responses to complex questions is a formula for disaster, with likely iatrogenic accountability “diseases” that will demand cures.