The Fragility of Statistically Significant Results in Pediatric Orthopaedic Randomized Controlled Trials as Quantified by the Fragility Index: A Systematic Review




The randomized controlled trial (RCT) is the gold standard study design, allowing critical comparison of clinical outcomes while minimizing bias. Traditionally, clinical trials are evaluated through statistical significance, expressed by P-values and confidence intervals. Until recently, however, the robustness of a study's conclusions has received little attention. A newer metric, the fragility index, quantifies the number of patients who would theoretically need to switch outcomes in order to reverse the study conclusions. The primary aim of our work was to determine the fragility index of RCTs in the pediatric orthopaedic literature. The secondary aim was to determine study factors associated with a lower fragility index.


PubMed and Embase were systematically searched for pediatric orthopaedic RCTs published between September 1, 2006 and September 1, 2016. Two independent reviewers screened titles, abstracts, and manuscripts to identify studies published in English involving 2 treatment arms. Trials without dichotomous primary or secondary outcomes, or with patients older than 18 years, were excluded. Data were extracted from each eligible article in duplicate, and the fragility index was determined using the Fisher exact test, following previously published methods. Univariate analysis was used to determine factors associated with a lower fragility index.
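The calculation can be illustrated with a minimal Python sketch. The abstract does not spell out the procedure, so the details here are assumptions: a two-sided Fisher exact test, a significance threshold of 0.05, and the commonly described approach of converting one non-event to an event at a time in the arm with fewer events until the P-value is no longer significant. The function names `fisher_exact_two_sided` and `fragility_index` are hypothetical and are not taken from the review.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalized hypergeometric weight of a table with x events in arm 1.
        return comb(row1, x) * comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Sum every table with the same margins that is no more likely than the observed one.
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return extreme / comb(row1 + row2, col1)


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Count the outcome switches needed before P >= alpha (0 if already nonsignificant)."""
    a, b = events_1, n_1 - events_1  # arm 1: events, non-events
    c, d = events_2, n_2 - events_2  # arm 2: events, non-events
    flip_arm_1 = a <= c  # assumption: modify the arm with fewer events
    flips = 0
    while fisher_exact_two_sided(a, b, c, d) < alpha:
        if flip_arm_1:
            if b == 0:
                raise ValueError("no non-events left to switch in arm 1")
            a, b = a + 1, b - 1  # one patient switches from non-event to event
        else:
            if d == 0:
                raise ValueError("no non-events left to switch in arm 2")
            c, d = c + 1, d - 1
        flips += 1
    return flips
```

With made-up numbers that do not come from any included trial, `fragility_index(1, 10, 9, 10)` evaluates to 3 under these assumptions: three patients in the first arm would need to switch from non-event to event before the Fisher exact P-value reaches 0.05 or more.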


Seventeen trials were eligible for inclusion. The median treatment arm size was 58 patients and the median overall sample size was 116 patients. The median fragility index was 3 (range, 0 to 18). A fragility index of 3 means that just 3 patients would need to switch treatment outcomes for the trial results to become statistically nonsignificant. In 1 study, the number of patients lost to follow-up exceeded the fragility index, such that the study conclusions could be completely reversed depending solely on the outcomes of the patients lost to follow-up. A lower fragility index was associated with smaller patient sample sizes and greater P-values.


The fragility index is a useful adjunct metric to the P-value and confidence intervals, allowing analysis of the robustness of study conclusions. RCTs in pediatric orthopaedics often have small sample sizes, and many have low fragility indices. Future efforts could focus on encouraging institutional collaboration and patient recruitment, with the ultimate goal of increasing RCT sample sizes and potentially improving the robustness of RCT results.

Level of Evidence:

Level I.
