The Institute of Education Sciences has funded more than 100 experiments to evaluate educational interventions, with the aim of generating scientific evidence of program effectiveness on which to base education policy and practice. In general, these studies are designed to have adequate statistical power to detect the average treatment effect. However, the average treatment effect may be less informative if treatment effects vary substantially from site to site or if intervention effects differ across contexts or subpopulations. This article considers the precision with which studies can detect different types of treatment effect heterogeneity. Calculations are demonstrated using a set of cluster randomized trials funded by the Institute of Education Sciences. Strategies for planning future studies with adequate precision for estimating treatment effect heterogeneity are discussed.