The Results of Randomized Controlled Trials in Emergency Medicine Are Frequently Fragile

      Study objective

      Randomized controlled trials govern evidence-based clinical practice, and it is therefore critical that their results be robust. We aim to investigate the fragility of randomized controlled trials in emergency medicine by determining how often significance would be nullified with small changes in outcomes using the fragility index.


      Methods

      We conducted a methodological systematic review of randomized controlled trials in emergency medicine published in the top 10 general medicine journals and the top 10 emergency medicine journals. Inclusion criteria required that trials be emergency medicine studies structured with a 2-arm or 2-by-2 factorial design and report at least 1 statistically significant dichotomous outcome.
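
      The abstract does not spell out how the fragility index was computed. The sketch below follows the commonly described procedure, which is an assumption here: in the arm with fewer events, convert non-events to events one patient at a time and recompute a two-sided Fisher exact P value until it is no longer below 0.05. The function names and the example trial (1/10 versus 9/10 events) are hypothetical and are not data from the review.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are trial arms; columns are events and non-events.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first arm
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed one
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-7))


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Number of non-event-to-event conversions needed to lose significance.

    Returns None if the result is not significant to begin with, or if the
    modified arm runs out of non-events before significance is lost.
    """
    def p_value(ea, eb):
        return fisher_exact_two_sided(ea, n_a - ea, eb, n_b - eb)

    if p_value(events_a, events_b) >= alpha:
        return None

    modify_a = events_a <= events_b  # assumption: modify the arm with fewer events
    ea, eb, flips = events_a, events_b, 0
    while p_value(ea, eb) < alpha:
        if modify_a:
            if ea == n_a:
                return None
            ea += 1
        else:
            if eb == n_b:
                return None
            eb += 1
        flips += 1
    return flips


# Hypothetical trial: 1/10 events in one arm, 9/10 in the other
print(fragility_index(1, 10, 9, 10))  # 3
```

      In this hypothetical trial, changing the outcome of 3 patients in the low-event arm is enough to push the P value to 0.05 or above.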


      Results

      A total of 180 trials met inclusion criteria. The median fragility index across all trials in emergency medicine was 4 (interquartile range [IQR] 2 to 10) and the median sample size was 140 (IQR 69.5 to 286). For trials from general medicine journals (n=32), the median fragility index was 9 (IQR 4 to 16.5) and the median sample size was 415.5 (IQR 219.5 to 901); for trials from emergency medicine journals (n=148), the median fragility index was 4 (IQR 1 to 9) and the median sample size was 119 (IQR 60 to 227.25). One third of all trials (62/180) had a loss to follow-up that was greater than or equal to the fragility index. There was a modest correlation between fragility index and total number of events (r=0.36; 95% confidence interval [CI] 0.23 to 0.48) and a weak correlation between fragility index and total sample size (r=0.26; 95% CI 0.12 to 0.39). There was no correlation between fragility index and either P value (r=–0.14; 95% CI –0.28 to –0.006) or Science Citation Index (r=0.07; 95% CI –0.08 to 0.22).
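
      The abstract does not state how the confidence intervals for the correlation coefficients were obtained. As an illustration only, the sketch below uses the Fisher z-transformation, a standard method for a Pearson correlation, and assumes that all 180 trials contributed to each estimate. Both the method and the value of n are assumptions; the function name is hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def correlation_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = atanh(r)                      # transform r to the z scale
    se = 1 / sqrt(n - 3)              # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return tanh(z - crit * se), tanh(z + crit * se)


# Reported estimates for fragility index vs. events and vs. sample size
for r in (0.36, 0.26):
    lower, upper = correlation_ci(r, 180)
    print(f"r={r}: {lower:.2f} to {upper:.2f}")
```

      Under these assumptions, the intervals for r=0.36 and r=0.26 come out at roughly 0.23 to 0.48 and 0.12 to 0.39, in line with the reported values.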


      Conclusion

      The statistical significance of the results of randomized controlled trials in emergency medicine was often contingent on a small number of events. Until frequentist interpretation of clinical trials is replaced with Bayesian analysis, the fragility index may have utility as a tool to aid clinicians in assessing the robustness of randomized controlled trials in emergency medicine when considered in conjunction with the fragility quotient and other reported metrics.
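
      The fragility quotient is usually defined as the fragility index divided by the total sample size, which puts fragility on a scale relative to trial size. The sketch below assumes that definition and also shows the comparison of loss to follow-up against the fragility index described in the results. The function names and the trial figures are hypothetical.

```python
def fragility_quotient(fragility_index, sample_size):
    """Fragility index as a fraction of the total sample size."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return fragility_index / sample_size


def losses_could_overturn(fragility_index, lost_to_follow_up):
    """True when patients lost to follow-up are at least as many as the fragility index."""
    return lost_to_follow_up >= fragility_index


# Hypothetical trial: fragility index 4, 140 patients, 6 lost to follow-up
fq = fragility_quotient(4, 140)
print(f"{fq:.3f}")                   # 0.029
print(losses_could_overturn(4, 6))   # True
```

      A trial flagged by the second check is one in which the unknown outcomes of the missing patients could, in principle, have changed whether the result was statistically significant.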



      Linked Article

      • In reply:
        Annals of Emergency Medicine, Vol. 73, Issue 6
          We thank Niforatos et al for their contribution to the discussion in regard to the limitations of the fragility index and fragility quotient. In their letter, they expand on some of the limitations to the application of the fragility index and fragility quotient discussed in our original work and ultimately question their utility. Although they highlight some important considerations, we believe that the measures have clinical utility. Here, we explore further some key points about their application and provide insight into how clinicians might use these tools.
      • Fragility Measures: More Limitations Considered
        Annals of Emergency Medicine, Vol. 73, Issue 6
          We commend the work by Brown et al1 that, using the fragility index and fragility quotient, assesses the fragility of randomized controlled trials in the emergency medicine literature. Although they provide a good overview of the limitations of the 2 metrics, we would like to further question the utility of fragility measures.