The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial. Survey of 71 "negative" trials
Seventy-one "negative" randomized control trials were re-examined to determine if the investigators had studied large enough samples to give a high probability (greater than 0.90) of detecting a 25 per cent and 50 per cent therapeutic improvement in the response. Sixty-seven of the trials had a greater than 10 per cent risk of missing a true 25 per cent therapeutic improvement, and with the same risk, 50 of the trials could have missed a 50 per cent improvement. Estimates of 90 per cent confidence intervals for the true improvement in each trial showed that in 57 of these "negative" trials, a potential 25 per cent improvement was possible, and 34 of the trials showed a potential 50 per cent improvement. Many of the therapies labeled as "no different from control" in trials using inadequate samples have not received a fair test. Concern for the probability of missing an important therapeutic improvement because of small sample sizes deserves more attention in the planning of clinical trials.
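The link between sample size and the risk of missing a true improvement can be illustrated with a short calculation. The sketch below is not the survey's own method, which the abstract does not describe. It assumes a one-sided z-test on two proportions under the normal approximation, with alpha fixed at 0.05. The function names, the control event rate and the arm sizes are hypothetical values chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, relative_reduction, n_per_arm, alpha=0.05):
    """Approximate power of a one-sided z-test comparing two proportions.

    Assumptions (not taken from the abstract): normal approximation,
    equal arm sizes, and a one-sided test at level `alpha`.
    """
    # Event rate in the treated arm if the therapy truly reduces it
    # by the stated relative amount (e.g. 0.25 for 25 per cent).
    p_treat = p_control * (1.0 - relative_reduction)

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)

    # Standard error under the null (pooled rate) and under the alternative.
    p_bar = (p_control + p_treat) / 2.0
    se_null = sqrt(2.0 * p_bar * (1.0 - p_bar) / n_per_arm)
    se_alt = sqrt(
        p_control * (1.0 - p_control) / n_per_arm
        + p_treat * (1.0 - p_treat) / n_per_arm
    )

    # Probability that the observed difference exceeds the critical value
    # when the true difference is p_control - p_treat.
    z = ((p_control - p_treat) - z_alpha * se_null) / se_alt
    return std_normal.cdf(z)


def type_ii_error(p_control, relative_reduction, n_per_arm, alpha=0.05):
    """Beta: the probability of missing the stated true improvement."""
    return 1.0 - power_two_proportions(p_control, relative_reduction, n_per_arm, alpha)


if __name__ == "__main__":
    # Hypothetical scenario: 30 per cent control event rate,
    # true 25 per cent relative improvement.
    for n in (50, 200, 2000):
        beta = type_ii_error(0.30, 0.25, n)
        print(f"n per arm = {n:5d}  beta = {beta:.3f}")
```

Under these assumptions, a trial meets the abstract's criterion only when `type_ii_error` falls below 0.10. Running the loop shows how beta shrinks as the number of patients per arm grows.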