The purpose of this investigation was to examine quantitative methods used to determine reliability in developmental research. Procedures used to compute reliability estimates in 30 studies published in three developmental journals were examined. Four types of reliability studies were identified and analyzed. These included interrater reliability, stability (test-retest and intrarater reliability), equivalence reliability, and internal consistency. Interrater reliability investigations were the most frequently reported in the developmental literature reviewed (45%). The Pearson product moment correlation (r) was the most commonly reported reliability statistic. The findings reveal that researchers in developmental pediatrics frequently analyze reliability data using the Pearson product moment correlation and interpret the results as indicating consensus (agreement) among raters or across instruments. The Pearson product moment correlation (r) provides information on covariation among variables but does not indicate agreement. Thus, the findings suggest that developmental researchers may be misinterpreting the statistical results of reliability investigations. The argument is made that the intraclass correlation coefficient (ICC) is a more appropriate method of analysis when the purpose of the research is to examine consensus. J Dev Behav Pediatr 16:177–182, 1995. Index terms: consistency, measurement, data analysis.
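The distinction between covariation and agreement can be illustrated with a small numerical sketch. The scores below are hypothetical (they are not data from the reviewed studies): rater B scores every child exactly 5 points higher than rater A. The abstract does not say which form of the ICC the authors recommend; this sketch assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function names `pearson_r` and `icc_2_1` are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per subject, one column per rater.
    This form is an assumption; the abstract does not name a specific ICC.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (subjects, raters, residual).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical scores: rater B is always exactly 5 points above rater A.
rater_a = [10, 12, 14, 16, 18]
rater_b = [score + 5 for score in rater_a]

r = pearson_r(rater_a, rater_b)
icc = icc_2_1([list(pair) for pair in zip(rater_a, rater_b)])

print(round(r, 3))    # 1.0
print(round(icc, 3))  # 0.444
```

Under these assumptions the two raters covary perfectly (r = 1.0) yet never assign the same score, and the absolute-agreement ICC drops to about 0.44 because the constant offset between raters counts as disagreement.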