The New England Journal of Medicine
Correspondence
Volume 358:1199-1200 March 13, 2008 Number 11

Subgroup Analyses in Clinical Trials

 

To the Editor: Wang et al. (Nov. 22 issue)1 provide a well-reasoned assessment of the statistical issues related to subgroup analyses. However, one important point that should be made is that significance testing during subgroup analyses is seldom appropriate. The majority of subgroup analyses are exploratory in nature, and no significance testing should be performed unless an alpha level needed to achieve significance is attributed to the comparison of interest in advance. Therefore, although a P value may appropriately be calculated to assess the degree of imbalance during such exploratory analyses, no subsequent significance testing should be allowed. On the basis of the degree of imbalance observed and consideration of the previous probability of a given outcome, we might reasonably decide that a result is not due to chance. However, when we do this, we are on our own, and we are no longer working within the confines of the frequentist statistical model. We should not pretend otherwise.


Scott Proestel, M.D.
Food and Drug Administration
Silver Spring, MD 20993
scott.proestel{at}fda.hhs.gov

The views expressed in this letter are those of the author and do not represent those of the Food and Drug Administration.

References

  1. Wang R, Lagakos SW, Ware JH, Hunter DJ, Drazen JM. Statistics in medicine -- reporting of subgroup analyses in clinical trials. N Engl J Med 2007;357:2189-2194.

 
To the Editor: Wang et al. remind us that subgroup analyses are often underpowered to detect true differences in treatment effects and, conversely, may yield spurious false positive results from multiple comparisons. An additional fundamental limitation of subgroup analyses is that they typically consider patient characteristics one variable at a time, whereas patients have multiple characteristics simultaneously. By considering each variable separately, subgroup analyses sequentially divide patients into two groups that are more similar than dissimilar, frequently giving the (misleading) impression of a consistent treatment effect across patients.
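The multiple-comparisons point above can be illustrated with a short calculation. The sketch below is a minimal example, not part of the letter: it assumes that the subgroup tests are independent and that there is no true effect in any subgroup, and the function name is hypothetical.

```python
def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n_tests independent tests is nominally
    significant at level alpha when every null hypothesis is true.
    Independence is an assumption of this sketch."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Show how quickly the chance of a spurious finding grows with the
    # number of subgroup comparisons.
    for k in (1, 5, 10, 20):
        print(k, round(prob_at_least_one_false_positive(k), 3))
```

Under these assumptions, 10 subgroup tests at an alpha level of 0.05 give roughly a 40% chance of at least one false positive result. Real subgroup tests are usually correlated, so the exact figure will differ.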

Research has shown that important subgroups with extreme differences in the risk of the primary outcome, differing across many variables simultaneously, are often concealed within these analyses, sometimes obscuring subgroups of patients who are harmed by treatment.1,2,3,4 Since the heterogeneity in the risk of the primary outcome is ubiquitous, is typically large, can make overall trial results difficult to interpret, frequently gives rise to important differences in risk–benefit trade-offs, and can most often be adequately captured by simple risk models, multivariate risk stratification of results, with tests of interaction between treatment effect and risk strata, should become routine.1,2,3,4 Journals should strongly consider requiring such analyses.


David Kent, M.D.
Tufts–New England Medical Center
Boston, MA 02111
dkent1{at}tufts-nemc.org


Rodney Hayward, M.D.
Veterans Affairs Ann Arbor Healthcare System
Ann Arbor, MI 48105

Dr. Kent reports receiving research funding from Pfizer. No other potential conflict of interest relevant to this letter was reported.

References

  1. Ioannidis JP, Lau J. Heterogeneity of the baseline risk within patient populations of clinical trials: a proposed evaluation algorithm. Am J Epidemiol 1998;148:1117-1126.
  2. Hayward RA, Kent DM, Vijan S, Hofer TP. Multivariable risk prediction can greatly enhance the statistical power of clinical trial subgroup analysis. BMC Med Res Methodol 2006;6:18.
  3. Rothwell PM, Mehta Z, Howard SC, Gutnikov SA, Warlow CP. From subgroups to individuals: general principles and the example of carotid endarterectomy. Lancet 2005;365:256-265.
  4. Kent DM, Hayward RA. Limitations of applying summary results of clinical trials to individual patients: the need for risk stratification. JAMA 2007;298:1209-1212.

 
The authors reply: We agree with Proestel's cautions about testing for the equality of treatment groups within the individual levels (say, males and females) of a baseline factor. Claims of heterogeneity of the treatment effect across the levels of a baseline factor should not be based on such tests. Furthermore, we in general do not recommend such testing after an interaction test, regardless of whether the latter is significant. One exception is if there was a prespecified reason to assess the treatment effect within a specific subgroup of the patients. A good way to present information about plausible treatment effects within the levels of a baseline factor is by means of a forest plot.1,2 The confidence intervals in such plots should not be used to indirectly assess "statistical significance" based on whether they exclude a null effect (say, a relative risk of 1), since doing so creates the same problems noted by Proestel for significance tests.
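To make the distinction between an interaction test and separate within-subgroup tests concrete, the sketch below compares the log relative risks of two levels of a baseline factor directly. It is an illustration only: it uses a standard large-sample Wald approximation, which is an assumption of this example rather than a method prescribed in the reply, and the counts and function names are hypothetical.

```python
import math


def log_rr_and_se(events_t, n_t, events_c, n_c):
    """Log relative risk (treated vs. control) and its large-sample
    standard error within one subgroup."""
    log_rr = math.log((events_t / n_t) / (events_c / n_c))
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    return log_rr, se


def interaction_test(subgroup_a, subgroup_b):
    """Wald test of whether the log relative risk differs between two
    subgroups. Each argument is (events_t, n_t, events_c, n_c).
    Returns the z statistic and a two-sided P value."""
    log_rr_a, se_a = log_rr_and_se(*subgroup_a)
    log_rr_b, se_b = log_rr_and_se(*subgroup_b)
    z = (log_rr_a - log_rr_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


if __name__ == "__main__":
    # Made-up counts for illustration only.
    males = (30, 500, 50, 500)
    females = (40, 500, 45, 500)
    z, p = interaction_test(males, females)
    print(f"z = {z:.2f}, two-sided P = {p:.3f}")
```

The single interaction test addresses heterogeneity directly, whereas a significant result in one subgroup and a nonsignificant result in the other does not by itself show that the treatment effects differ.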

One important consideration for subgroup analysis is how subgroups are formed. Kent and Hayward recommend a specific way of forming subgroups on the basis of multiple, rather than individual, baseline characteristics. In their approach, patients are divided into separate groups according to their risks of a disease outcome, which are calculated from a prespecified, externally validated formula involving multiple baseline characteristics. The purpose of such subgroup analyses is to assess whether the treatment effect is homogeneous across patients with different risks. We agree that such an approach can provide valuable information to guide individualized patient care. Moreover, when a specific risk-score algorithm is unavailable, it still could be appropriate to assess the heterogeneity of the treatment effect with the use of a prespecified clinically meaningful categorization based on multiple baseline characteristics. In other settings, interest in the heterogeneity of treatment effects may be motivated by metabolic, physiological, anatomical, genetic, or other independently identifiable features of the patients or their disease, not by their risk of the disease outcome under study. These considerations should be the main determinants of how subgroups are formed.
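The risk-stratified approach described above can be sketched as follows. This is a simplified illustration on simulated data, not the method of any cited study: the risk formula and its coefficients are made up, the simulation assumes a constant relative risk of 0.75 for treated patients, and all function names are hypothetical.

```python
import math
import random


def predicted_risk(age, diabetic):
    """Hypothetical prespecified logistic risk formula combining two
    baseline characteristics (coefficients are invented)."""
    linear = -4.0 + 0.04 * age + 0.8 * diabetic
    return 1.0 / (1.0 + math.exp(-linear))


def stratify_by_risk(patients, n_strata=4):
    """Sort patients by predicted risk and split them into n_strata
    groups of near-equal size (lowest risk first)."""
    ordered = sorted(patients, key=lambda p: p["risk"])
    n = len(ordered)
    return [ordered[k * n // n_strata:(k + 1) * n // n_strata]
            for k in range(n_strata)]


def risk_difference(stratum):
    """Event rate in treated patients minus event rate in controls."""
    treated = [p for p in stratum if p["treated"]]
    control = [p for p in stratum if not p["treated"]]
    rate_t = sum(p["event"] for p in treated) / len(treated)
    rate_c = sum(p["event"] for p in control) / len(control)
    return rate_t - rate_c


def simulate_trial(n=2000, seed=0):
    """Simulated randomized trial; every setting here is an assumption."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        risk = predicted_risk(rng.uniform(40, 80), int(rng.random() < 0.3))
        treated = rng.random() < 0.5
        event_prob = risk * (0.75 if treated else 1.0)
        patients.append({"risk": risk, "treated": treated,
                         "event": int(rng.random() < event_prob)})
    return patients


if __name__ == "__main__":
    for i, stratum in enumerate(stratify_by_risk(simulate_trial()), start=1):
        mean_risk = sum(p["risk"] for p in stratum) / len(stratum)
        print(f"stratum {i}: n={len(stratum)}, mean predicted risk="
              f"{mean_risk:.3f}, risk difference={risk_difference(stratum):+.3f}")
```

Even with a constant relative effect, the absolute risk difference is expected to vary across risk strata, which is one reason the risk-benefit trade-off can differ between low-risk and high-risk patients. A formal analysis would add a prespecified test of interaction between treatment and risk stratum.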

Finally, we do not believe that journals should dictate the scientific questions that investigators address, including whether and how they undertake subgroup analyses of any specific type. Rather, investigators should use a well-reasoned and fully described approach to subgroup analyses and report them in accordance with the guidelines offered in our article.


Rui Wang, M.S.
Stephen W. Lagakos, Ph.D.
Harvard University
Boston, MA 02115


Jeffrey M. Drazen, M.D.

References

  1. Cuzick J. Forest plots and the interpretation of subgroups. Lancet 2005;365:1308.
  2. Wactawski-Wende J, Kotchen JM, Anderson GL, et al. Calcium plus vitamin D supplementation and the risk of colorectal cancer. N Engl J Med 2006;354:684-696.

 


The New England Journal of Medicine is owned, published, and copyrighted © 2010 Massachusetts Medical Society. All rights reserved.