Recent policy interest in tying student learning to teacher evaluation has led to growing use of value-added methods (VAM) for assessing student learning gains linked to individual teachers. VAM analyses rely on complex assumptions about the roles of schools, multiple teachers, student aptitudes and efforts, and homes and families in producing measured student learning gains. This article reports on analyses that examine the stability of high school teacher effectiveness rankings across differing conditions. We find that judgments of a given teacher’s effectiveness can vary substantially across statistical models, classes taught, and years. Furthermore, student characteristics can affect teacher rankings, sometimes dramatically, even when such characteristics have already been controlled for statistically in the value-added model. A teacher who teaches less advantaged students in a given course or year typically receives lower effectiveness ratings than the same teacher teaching more advantaged students in a different course or year. Models that fail to take student demographics into account further disadvantage teachers serving large numbers of low-income, limited English proficient, or lower-tracked students. We examine a number of potential reasons for these findings, and we conclude that caution should be exercised in using student achievement gains and value-added methods to assess teachers’ effectiveness, especially when the stakes are high.
teacher evaluation, value-added modeling, teacher effectiveness