Maintaining Interrater Agreement of Core Assessment Instruments in a Multisite Randomized Controlled Clinical Trial: The Randomized Evaluation of Sedation Titration for Respiratory Failure (RESTORE) Trial




RESTORE (Randomized Evaluation of Sedation Titration for Respiratory Failure) was a cluster randomized clinical trial evaluating a sedation strategy in children 2 weeks to <18 years of age with acute respiratory failure supported on mechanical ventilation. A total of 31 U.S. pediatric intensive care units (PICUs) participated in the trial. Agreement among staff nurse raters on the measures used to assess a critical component of treatment fidelity was essential throughout the 4-year data collection period.


The purpose of the study is to describe the method of establishing and maintaining interrater agreement (IRA) of two core clinical assessment instruments over the course of the clinical trial.


IRA cycles were carried out at all control and intervention sites and included a minimum of five measurements of the State Behavioral Scale (SBS) and Withdrawal Assessment Tool-Version 1 (WAT-1). Glasgow Coma Scale scores were also obtained. PICUs demonstrating <80% agreement repeated their IRA cycle. Fleiss’s kappa coefficient was used to assess IRA.
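As a rough illustration of the two quantities named above, the sketch below computes simple percent agreement (the kind of figure that could be compared with the 80% threshold) and Fleiss's kappa for paired ratings. It is not the trial's analysis code: the function names, the example scores, and the choice to treat each paired assessment as one subject rated by two raters are all assumptions made for illustration.

```python
from collections import Counter


def percent_agreement(pairs):
    """Fraction of paired assessments in which both raters gave the same score."""
    return sum(a == b for a, b in pairs) / len(pairs)


def fleiss_kappa(ratings):
    """Fleiss's kappa for subjects that were each scored by the same number of raters.

    `ratings` is a list of tuples, one tuple of scores per subject.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every subject needs the same number (>= 2) of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for subject in ratings:
        counts = Counter(subject)
        category_totals.update(counts)
        # Proportion of rater pairs that agree on this subject.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )

    observed = agreement_sum / n_subjects
    total = n_subjects * n_raters
    # Agreement expected by chance from the overall category proportions.
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        raise ValueError("kappa is undefined when every rating is identical")
    return (observed - expected) / (1.0 - expected)


# Invented example: five paired assessments on an SBS-like scale.
example_pairs = [(-1, -1), (-1, -1), (0, 0), (0, 0), (-1, 0)]
print(percent_agreement(example_pairs))
print(fleiss_kappa(example_pairs))
```

Percent agreement ignores agreement that would occur by chance, which is why a chance-corrected statistic such as kappa is reported alongside it.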


Repeated IRA cycles were required for 8% of 226 SBS cycles and 2% of 222 WAT-1 cycles. Fleiss’s kappa coefficients from more than 1,350 paired assessments were .86 for SBS and .92 for WAT-1, demonstrating strong agreement similar to the .91 obtained for the Glasgow Coma Scale. With one exception, Fleiss’s kappa did not differ for any of the instruments by unit size or timing of assessment (earlier or later in the study): for SBS scores, Fleiss’s kappa differed significantly between larger and smaller PICUs (.82 vs. .92, p = .003), although the values for both groups indicated excellent agreement.


Monitoring measurement reliability is an essential step in ensuring treatment fidelity and, thus, the validity of study results. Standardization on the use of these core assessment instruments among participating sites was achieved and maintained throughout the trial.
