Increasingly, school districts have been mandated to adopt, or have elected to adopt, teacher evaluation systems that combine observations of teacher performance with other indicators of effectiveness. While classroom observations can provide pertinent information on teaching practices, little is known about which raters and training conditions produce better ratings, and why. Bell and her colleagues will study the Los Angeles Unified School District's (LAUSD) three-year roll-out of a consequential teacher evaluation system. LAUSD is the second-largest district in the country, with more than 800 schools and nearly 670,000 students; the district uses local personnel to conduct ratings and uses those ratings to inform high-stakes decisions. Bell and her team will examine which factors affect rating decisions and the sources of variation when two raters observe the same teacher and classroom but assign different scores.

The study sample includes 1,600 raters who will be trained as part of LAUSD's teacher evaluation system; among them, 40 raters will take part in a more intensive substudy of their practices and views. The raters include principals, assistant principals, central office staff, and lead teachers from the district's elementary, middle, and high schools. Data will be collected in two waves aligned with the beginning of the school year and will include surveys, interviews, training documents, and information on raters' backgrounds and district policy. Observations will be conducted two to three times a year by a principal or assistant principal and another staff member, and teachers' scores will be recorded electronically.
A central question guiding the study:
Under what conditions do observers most reliably distinguish between levels of teaching quality?