Policymakers and practitioners who believe that research evidence should inform policy and practice face several challenges. These include debates about the standards of evidence for allocating resources to programs, weak information on how to produce change at scale, and concerns that a few well-evaluated programs will drive out others that deserve support.
"Social scientists are frequently interested in assessing the qualities of social settings such as classrooms, schools, neighborhoods, or day care centers. The most common procedure requires observers to rate social interactions within these settings on multiple items and then to combine the item responses to obtain a summary measure of setting quality. A key aspect of the quality of such a summary measure is its reliability. In this paper we derive a confidence interval for reliability, a test for the hypothesis that the reliability meets a minimum standard, and the power of this test against alternative hypotheses. Next, we consider the problem of using data from a preliminary field study of the measurement procedure to inform the design of a later study that will test substantive hypotheses about the correlates of setting quality."
"From Soft Skills to Hard Data reviews ten youth outcome measurement tools that are appropriate for use in after-school and other settings. For each tool, it provides sample items and crucial information about usability, cost, and evidence of reliability and validity. A companion to the Forum’s Measuring Youth Program Quality, the guide can help providers select conceptually grounded, psychometrically sound measures appropriate for programs that serve upper-elementary- through high school-aged youth."
Published by the Russell Sage Foundation, "this pioneering volume casts a stark light on the ways rising inequality may now be compromising schools’ functioning, and with it the promise of equal opportunity in America."
Many youth development programs aim to improve youth outcomes by raising the quality of social interactions occurring in groups such as classrooms, athletic teams, therapy groups, after-school programs, or recreation centers. As a result, evaluators are increasingly interested in determining whether such programs significantly improve “group quality.” This paper considers methods for studying the reliability of measures of group quality, with implications for the design of evaluation studies, and illustrates these methods using a large-scale data set on classroom observations.
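The reliability of a group-quality score built by averaging several ratings per group can be illustrated with a standard calculation: estimate the intraclass correlation (ICC) from a one-way random-effects layout, then apply the Spearman-Brown formula to the mean of k ratings. This is a minimal sketch of that general approach, not the specific derivation in the paper; the data are invented for illustration.

```python
# Sketch: reliability of a group-quality score formed by averaging k ratings
# per group. ICC(1) comes from a one-way random-effects decomposition;
# Spearman-Brown then gives the reliability of the k-rating mean.
# The ratings below are made up purely for illustration.

def icc_one_way(groups):
    """One-way random-effects ICC(1) from a list of rating lists, one per group."""
    k = len(groups[0])            # ratings per group (assumed balanced)
    J = len(groups)               # number of groups
    grand = sum(sum(g) for g in groups) / (J * k)
    means = [sum(g) / k for g in groups]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (J - 1)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g) / (J * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def reliability_of_mean(icc, k):
    """Spearman-Brown: reliability of the average of k parallel ratings."""
    return k * icc / (1 + (k - 1) * icc)

ratings = [[3, 4, 3], [5, 5, 4], [2, 3, 2], [4, 4, 5]]  # 4 groups x 3 ratings
rho = icc_one_way(ratings)
print(round(rho, 3), round(reliability_of_mean(rho, 3), 3))  # → 0.75 0.9
```

Averaging more ratings raises reliability toward 1, which is why the design question (how many raters, how many occasions per group) matters for evaluation studies.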
This software includes a series of empirical estimates of plausible parameter values for determining the minimum effect size that can be detected for a given number, size, and treatment/control mix of randomized groups.
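The core calculation behind such software can be sketched for the simplest case, a two-level cluster-randomized design: J groups of n members each, a proportion P of groups assigned to treatment, and intraclass correlation rho. This is a hedged illustration of the standard minimum detectable effect size (MDES) formula, not the software's exact implementation; it uses a large-J normal approximation in place of the exact t multiplier.

```python
# Sketch of a minimum detectable effect size (MDES) calculation for a
# two-level cluster-randomized trial, in standardized (effect-size) units.
# Assumes balanced groups and uses a normal approximation to the t multiplier,
# which is adequate when the number of groups J is reasonably large.
from math import sqrt
from statistics import NormalDist

def mdes(J, n, P, rho, alpha=0.05, power=0.80):
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Variance of the estimated treatment effect in standardized units:
    # between-group variance rho plus within-group variance shrunk by n,
    # divided by the effective number of randomized groups.
    var = (rho + (1 - rho) / n) / (P * (1 - P) * J)
    return multiplier * sqrt(var)

# E.g., 40 groups of 20 members, balanced assignment, rho = 0.10:
print(round(mdes(J=40, n=20, P=0.5, rho=0.10), 3))
```

The formula makes the design trade-offs visible: adding groups (J) shrinks the MDES much faster than adding members per group (n), because the intraclass correlation rho caps the benefit of larger groups.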