Daniel Koretz is the Henry Lee Shattuck Professor of Education at Harvard Graduate School of Education. He focuses his research primarily on educational assessment, particularly as a tool of education policy. A primary emphasis in his work has been the effects of high-stakes testing, including effects on schooling and the validity of score gains. His research has included studies of the effects of testing programs, the assessment of students with disabilities, international differences in the variability of student achievement, the application of value-added models to educational achievement, and the development of methods for validating scores under high-stakes conditions. His current work focuses on the design and evaluation of test-focused educational accountability systems. Dr. Koretz founded and chairs the International Project for the Study of Educational Accountability, an international network of scholars investigating improved approaches to educational accountability. Dr. Koretz is a member of the National Academy of Education. His doctorate is in developmental psychology from Cornell University. Before obtaining his degree, Dr. Koretz taught emotionally disturbed students in public elementary and junior high schools.
When you look at something like the SAT, for example, the question is, how would you know if it’s still biased? When people attack the SAT or other standardized tests because of racial and class differences, the question I start with is, do we know something about the quality of schools that disadvantaged kids go to in this country? Of course we do, they’re not as good. Well, do those differences matter? Of course they matter. If they matter, that means that the kids coming out of those schools will know less and will be less well prepared for college. So any test that is working properly will show differences between advantaged and disadvantaged kids. The only question is whether it’s showing the right difference. Is it too big? Is it too small? The way this question has been handled with the SAT most often is to predict how well students will do in college and to see whether that prediction is off in one direction or the other for different groups of people. And it is. It slightly underestimates the performance of white women. It slightly overestimates, very slightly, the performance of African-American and Latino men. It’s a very slight difference. Now, this is not rock-hard evidence, because what you’re using is a criterion, and it’s the grades that students get in college. Students go into different kinds of classes. Some of the faculty may themselves be biased in assigning grades. I actually trust grades less than I trust tests. But it’s a logical first step. And that research, which has been going on now for decades, consistently fails to show, if you pardon my [IB], fails to show bias against minority students. It could still be there, but that particular piece of evidence doesn’t support it. Now, on the other hand, the SAT is not a strong predictor of performance in college. It is a predictor, but just like high school grades, it’s a fairly weak predictor.
So you can find lots of kids, not just minority kids, who do poorly on the SAT and do well in college, and you can also find kids who have high grades in high school and do poorly in college. There’s a lot of slippage and a lot of error in that. In the case of tests given to younger kids, it’s much harder. There are no criteria like performance in college to use, so it’s much harder to say… to pin down whether there’s bias. But the results of different kinds of tests are really very consistent and have been for a long time, with one exception I’ll get to. And if you actually look at the content of tests, it’s not clear to me where the bias would come from. But there are two exceptions to what I just said. One is that there has been a consistent trend, with an interruption in the early 1990s, consistent for perhaps 40 years: tests show that the gap between white and black kids is narrowing. It’s not… It’s still substantial, it’s still worrisome, but it is a great deal smaller than it was in 1968. We don’t know why that’s happening. There’s some research that suggests some answers. There’s some suggestion, for example, that it may be an echo of the fact that the earlier generations of African-American parents were completing more schooling themselves than their parents. And more educated parents made for higher-scoring kids. But we don’t fully understand it. We just know that it’s very consistent across sources of data. So that’s one exception. The other exception is that there are certain categories of kids for whom bias is an extremely [pressing] problem: some kids with disabilities and some kids who don’t speak English as a native language. And that latter problem is getting bigger every single year because of immigration. We don’t have a sensible way, in my view, to deal with kids whose native language is not English. The current laws are very harsh. They basically give a short amount of time before those kids have to be tested like everyone else.
But particularly for kids who come here, say, as adolescents, it takes a very long time to be fully fluent. So we have a very difficult time. Even if we had the freedom to test them any way we wanted to, it’s very difficult to get a good estimate, really, as good an estimate of the knowledge and skills of kids who are speaking a second language. And that’s often a bias.