This study builds upon Allison Tracy’s Foundation-supported study of the Assessment of Program Practice Tool (APT), a widely used instrument for assessing the quality of after-school programs. The accuracy of such assessments is important: APT scores are associated with youth outcomes and affect funding and staffing decisions. In their initial study, Tracy and Charmaraman established the APT’s validity and reliability and developed a training protocol to improve rating accuracy. However, the findings revealed potential cultural biases within the rating tool: Black raters demonstrated consistently lower accuracy scores, and reviews of video clips revealed youth behaviors that could be interpreted differently depending on a rater’s cultural background. The team suspects that the bias resulted from having primarily White raters generate the reference ratings. In the new study, Charmaraman and her team will attempt to enhance the APT’s reliability with racially and ethnically diverse populations. First, they will hire and train a racially diverse team to generate reference scores for videos of youth program observations. Second, they will create an online training system that clarifies master scores and provides “range-finding” tools to assist raters in evaluating ambiguous situations. Finally, to check whether these steps reduce disparities in rater accuracy between racial and ethnic groups, the team will purposefully recruit, train, and assess the ratings of a diverse sample of 300 APT trainees (including 30 percent Black participants).
How can cultural bias in observations of youth program quality be minimized?