Background The cumulative risk of a false positive result of a breast-cancer screening test is unknown.
Methods We performed a 10-year retrospective cohort study of breast-cancer screening and diagnostic evaluations among 2400 women who were 40 to 69 years old at study entry. Mammograms or clinical breast examinations that were interpreted as indeterminate, aroused a suspicion of cancer, or prompted recommendations for additional workup in women in whom breast cancer was not diagnosed within the next year were considered to be false positive tests.
Results A total of 9762 screening mammograms and 10,905 screening clinical breast examinations were performed, for a median of 4 mammograms and 5 clinical breast examinations per woman over the 10-year period. Of the women who were screened, 23.8 percent had at least one false positive mammogram, 13.4 percent had at least one false positive breast examination, and 31.7 percent had at least one false positive result for either test. The estimated cumulative risk of a false positive result was 49.1 percent (95 percent confidence interval, 40.3 to 64.1 percent) after 10 mammograms and 22.3 percent (95 percent confidence interval, 19.2 to 27.5 percent) after 10 clinical breast examinations. The false positive tests led to 870 outpatient appointments, 539 diagnostic mammograms, 186 ultrasound examinations, 188 biopsies, and 1 hospitalization. We estimate that among women who do not have breast cancer, 18.6 percent (95 percent confidence interval, 9.8 to 41.2 percent) will undergo a biopsy after 10 mammograms, and 6.2 percent (95 percent confidence interval, 3.7 to 11.2 percent) after 10 clinical breast examinations. For every $100 spent for screening, an additional $33 was spent to evaluate the false positive results.
Conclusions Over 10 years, one third of the women screened had abnormal test results requiring additional evaluation, even though no breast cancer was present. Techniques are needed to decrease false positive results while maintaining high sensitivity. Physicians should educate women about the risk of a false positive result of a screening test for breast cancer.
If a woman undergoes annual screening beginning at the age of 40, she will have had 60 opportunities for a false positive result by the age of 70, with 30 mammograms and 30 clinical breast examinations. The cumulative lifetime risk of her having a result from a screening test that requires further workup, even though no breast cancer is present, is not known. An estimate of 25 percent has been given for the cumulative risk of a false positive result after 10 mammograms and 10 clinical breast examinations.4 It is important to determine the cumulative risk of false positive tests, because women are advised to have breast-cancer screening every one to two years over several decades of their lifetimes, and false positive results can provoke anxiety, increase costs, and cause morbidity.5,6,7,8,9,10,11,12,13
Using the computerized clinical records of a health maintenance organization (HMO) for a group of women over a 10-year period, we determined the cumulative risk of a false positive result of breast-cancer screening, the number and type of subsequent diagnostic workups resulting from the false positive results, and the costs of the false positive results. The HMO we studied has long encouraged women who are 40 or older to undergo routine breast-cancer screening. By studying the medical records, we ascertained the 10-year cumulative rates of false positive results for both mammography and clinical breast examination. We then determined the number of diagnostic examinations generated by the false positive results and estimated their costs.
Methods
Setting
This retrospective cohort study was conducted at 11 staff-model health centers of Harvard Pilgrim Health Care, a large HMO in New England. The health centers serve nearly 300,000 adults in and around Boston. Although the majority of members belong to the HMO through an employer or a spouse's employer, approximately 5 percent are enrolled through the state Medicaid program for low-income persons. This study was approved by the Human Studies Committee of Harvard Pilgrim Health Care and the institutional review board of the University of Washington School of Medicine.
Breast-cancer screening for the members of Harvard Pilgrim Health Care is encouraged by internal guidelines and a computerized reminder system that prompts health care providers to perform clinical breast examinations and order mammograms for screening. Beginning in 1984 and throughout the study period, the HMO recommended that all women from 40 to 49 years of age be screened with mammography every two years and that women 50 years of age or older be screened annually. Most of the women were referred to sites outside the HMO for mammography, including local community and academic radiology centers. All of the radiologists who read the mammograms were board certified and worked in groups that contracted with the HMO.
Study Population
All 14,382 women who were members of the HMO and who were between 40 and 69 years of age on July 1, 1983, were potentially eligible for the study. Women were excluded for the following reasons: a lapse in enrollment in the HMO between July 1, 1983, and June 30, 1995 (8816 women); health coverage from a source other than Harvard Pilgrim Health Care or from a noncomputerized HMO center during the study period (1093 women); a history of breast cancer, a prophylactic mastectomy, or breast implants before July 1, 1983 (146 women); and a prophylactic mastectomy or breast implants during the study period (8 women). From the cohort of 4319 remaining eligible subjects, a random sample was chosen, consisting of 1200 women 40 to 49 years of age, 600 women 50 to 59 years of age, and 600 women 60 to 69 years of age, for a total sample of 2400 women.
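The cohort arithmetic in this paragraph can be checked with a short sketch. The counts are the ones reported above; the dictionary keys and variable names are illustrative labels, not terms from the study.

```python
# Counts reported in the Study Population paragraph.
potentially_eligible = 14_382

# Exclusion counts; the key names are illustrative labels only.
exclusions = {
    "lapse_in_enrollment": 8_816,
    "other_coverage_or_noncomputerized_center": 1_093,
    "cancer_mastectomy_or_implants_before_study": 146,
    "mastectomy_or_implants_during_study": 8,
}

remaining_eligible = potentially_eligible - sum(exclusions.values())

# Stratified random sample drawn from the remaining eligible women.
sample_by_age = {"40-49": 1_200, "50-59": 600, "60-69": 600}
sample_total = sum(sample_by_age.values())

print(remaining_eligible)  # 4319
print(sample_total)        # 2400
```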
Review of Medical Records
Harvard Pilgrim Health Care keeps computerized records of patients' visits for ambulatory care services.14,15 Data on demographic characteristics, risk factors for breast cancer, screening clinical breast examinations, screening mammography, diagnostic testing performed as a result of breast-cancer screening, and breast cancers diagnosed were abstracted from these records onto standardized forms. If information was missing or clarification was needed, the original test reports were reviewed. Household income was estimated by matching each patient's address on December 1, 1995, with census-tract data.16
For development and training purposes, the data on the first 581 patients were extracted independently by a research assistant and one of the authors. To ensure quality, the data on a randomly selected sample of 5 percent of the next 1443 patients were extracted by a second person who was unaware of the first research assistant's results. Inconsistencies between the reviewers were found for 175 (0.9 percent) of the 19,407 variables reviewed among the 1443 patients. Forty of these inconsistencies (0.2 percent) were considered clinically important because they concerned the specific reasons for ordering the test. In view of this low rate of discrepancy, a 5 percent sample of the remaining 376 charts was reviewed by two different persons, who were not blinded to each other's results. In this final review, 1 inconsistency was noted among 4093 variables (0.02 percent). All inconsistencies were resolved by consensus.
Information was recorded for each appointment at which screening occurred. The diagnostic impressions for both mammography and clinical breast examination were classified as normal; abnormal and probably benign; abnormal, indeterminate; or abnormal and arousing a suspicion of cancer. Recommendations for additional testing were recorded, including diagnostic mammography within the next 12 months or second-opinion review of the screening films, ultrasound examination, physical examination (by the primary care provider or a surgeon), and biopsy (including fine-needle aspiration, open, and core biopsies). Information was recorded on all diagnostic procedures and follow-up visits resulting from positive breast-cancer screening tests.
The computerized medical records, the HMO's tumor registry, and, if needed, the original paper copy of the test results were searched to identify incident cases of breast cancer. In addition, the study participants' records for two years after completion of the study (from July 1, 1993, to June 30, 1995) were searched to be certain that all breast-cancer cases were identified.
Definitions of Screening Tests and False Positive Results
Mammography or clinical breast examinations performed on asymptomatic women without previously noted abnormalities were classified as screening tests. Mammography or clinical breast examinations performed because of abnormalities previously noted by clinicians or patients were classified as diagnostic tests.
A false positive result was defined in a manner consistent with current recommendations regarding mammography audits17,18,19 and reported by other investigators.3,20,21,22 A test was classified as positive if the results were indeterminate or aroused a suspicion of cancer, or if there was a recommendation for nonroutine follow-up, including physical examination, diagnostic mammography within the next 12 months, ultrasound examination, or biopsy. A positive test was classified as true positive if breast cancer (invasive or ductal carcinoma in situ) was diagnosed in the patient on the basis of pathological findings within one year of the test, and as false positive otherwise. False positive results were independent of each other.
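The classification rule described above can be sketched as a small function. This is an illustration under stated assumptions, not the abstraction procedure the study used: the function name, the impression labels, and the reading of "within one year" as 365 days are all hypothetical choices made for the example.

```python
from datetime import date
from typing import Optional

# Impressions that count as positive on their own (labels are hypothetical).
POSITIVE_IMPRESSIONS = {"indeterminate", "suspicious"}


def classify_screen(
    impression: str,
    nonroutine_followup: bool,
    test_date: date,
    cancer_dx_date: Optional[date] = None,
) -> str:
    """Classify one screening test as negative, true positive, or false positive."""
    positive = impression in POSITIVE_IMPRESSIONS or nonroutine_followup
    if not positive:
        return "negative"
    # Assumption: "within one year" is taken to mean 0 to 365 days after the test.
    if cancer_dx_date is not None and 0 <= (cancer_dx_date - test_date).days <= 365:
        return "true positive"
    return "false positive"


# A probably benign reading with a recommended ultrasound and no cancer diagnosis.
print(classify_screen("probably benign", True, date(1990, 1, 1)))
```

Note that a positive test followed by a cancer diagnosis more than one year later is still counted as false positive under this rule, which matches the one-year cutoff described in the text.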
Analysis and Assessment of Costs
The data were double-entered and verified for computer analysis. Initial comparisons were made with use of the chi-square test for categorical data and Student's t-test or analysis of variance for continuous data. Tests for trend were made with use of the Mantel-Haenszel chi-square statistic with one degree of freedom. These analyses were performed with SAS software.23
Estimates of the cumulative risk of having a false positive test are based on a Bayesian version of a product-limit (Kaplan-Meier-type) estimator, in which the number of screening events (mammography or clinical breast examinations) is used instead of time (see Appendix 1).
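To make the idea of using screening events in place of time concrete, the sketch below implements the classical (non-Bayesian) product-limit estimator over screening number. It is a simplified stand-in for illustration, not the Bayesian model described in Appendix 1; the function name, the record layout, and the toy data are assumptions.

```python
from typing import List, Optional, Tuple

# Each record: (number of screenings, 1-based index of first false positive or None).
Record = Tuple[int, Optional[int]]


def cumulative_fp_risk(records: List[Record], max_screens: int) -> List[float]:
    """Return the estimated risk of at least one false positive after 1..max_screens."""
    survival = 1.0  # estimated probability of no false positive so far
    risks = []
    for j in range(1, max_screens + 1):
        # At risk at screen j: had a jth screen and no false positive before it.
        at_risk = sum(1 for n, fp in records if n >= j and (fp is None or fp >= j))
        events = sum(1 for n, fp in records if fp == j)
        if at_risk > 0:
            survival *= 1.0 - events / at_risk
        risks.append(1.0 - survival)
    return risks


# Toy data, invented for illustration only.
toy = [(3, None), (2, 1), (4, 2), (1, None)]
print(cumulative_fp_risk(toy, 3))
```

Women who stop being screened without a false positive contribute to the at-risk counts only for the screenings they actually had, which is the role censoring plays in the usual time-based estimator.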
A Current Procedural Terminology (CPT) code24 was assigned to all workups resulting from the breast-cancer screening. The national Medicare fee schedule25 and the average HMO payment were used to estimate the average payment. Inpatient care was assigned a payment specific to the diagnosis-related group (see Appendix 2).
Results
Characteristics of the Patients
Most of the 2400 women were white (75 percent); 11 percent were black, 3 percent were of other races, and 11 percent were of unknown race. The median household income was $47,940 (range, $13,230 to $161,710). Eighteen percent of the women had a family history of breast cancer recorded, and 28 percent used estrogen-replacement therapy at some time during the study period.
Frequency of Breast-Cancer Screening
On average, the women underwent screening mammography and a screening clinical breast examination every two years. A small number of women (88 [3.7 percent]) had no documented breast-cancer screening, either by clinical examination or by mammography, during the 10 years of the study.
Over the 10-year period, 9762 screening mammograms were obtained for 2227 women; 173 women (7.2 percent) underwent no screening mammography. For those who had at least one mammogram, the median number of mammograms obtained was four (range, one to nine). The mammograms were read by 93 radiologists at 28 radiology facilities, consisting of 17 community, 7 HMO, and 4 academic sites. Four radiologists each read more than 1000 mammograms, one radiologist read 624 mammograms, and the others each read fewer than 500 mammograms.
Also during these 10 years, 10,905 screening clinical breast examinations were performed on 2245 women; 155 women (6.5 percent) had no screening breast examination. For those who had at least 1 breast examination, the median number of examinations performed was 5 (range, 1 to 16). The breast examinations were performed by 381 health care providers; 9290 were performed by internists, 1385 by registered nurses, nurse practitioners, or physician's assistants, 160 by obstetrician-gynecologists, 50 by surgeons, and 20 by providers with unknown credentials.
Detection of Cancer
Between July 1, 1983, and June 30, 1994, breast cancer was diagnosed in 88 women. The mean age of the women was 59 years (range, 42 to 76). Local disease was present in 67 women, and regional disease in 21. Ductal carcinoma in situ was diagnosed in 15 of the 88 women. In 58 women, the breast cancer was diagnosed as a result of an abnormality first noted on a screening mammogram (50 cancers were diagnosed within 12 months after mammography and 8 after more than 12 months). In 7 women, the cancer was diagnosed as a result of a clinical breast examination (4 cancers were diagnosed within 12 months after the breast examination and 3 after more than 12 months). In the remaining 23 women, the cancers were diagnosed after the women themselves noted an abnormality and sought medical evaluation.
False Positive Results
False positive results occurred in 6.5 percent of the mammograms and 3.7 percent of the clinical breast examinations (Table 1). Among the women who were screened, 23.8 percent had at least one false positive mammogram and 13.4 percent had at least one false positive breast examination during the 10 years. A false positive test due to either type of screening was noted in 31.7 percent of the women. In the majority of these women, there was only one false positive result; 89 women had two or more false positive mammograms, 72 had two or more false positive breast examinations, and 96 had false positive results on both a mammogram and a breast examination.
The false positive rates were higher for younger women than for older women (Table 2). The percentage of mammograms that were false positive decreased from 7.8 percent for women 40 to 49 years of age to 4.4 percent for women 70 to 79 years of age (P = 0.001). The false positive rate for clinical breast examination was highest for women 40 to 49 years of age (6.0 percent) and decreased to 2.2 percent for women 70 to 79 years of age (P = 0.001).
The risk of a woman's ever having a false positive result increased as she underwent more screening. The estimated cumulative risk of having at least one false positive result after 10 screenings was 49.1 percent (95 percent confidence interval, 40.3 to 64.1 percent) for mammograms and 22.3 percent (95 percent confidence interval, 19.2 to 27.5 percent) for clinical breast examinations (Figure 1 and Figure 2).
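As a rough plausibility check, one can compare these estimates with a naive shortcut that assumes every test has the same false positive rate and that tests are independent. This shortcut is not the method used in the study; it is shown only to illustrate how per-test rates compound. With the per-test rates reported above it gives roughly 49 percent for mammography, close to the model-based estimate, but roughly 31 percent for clinical breast examination, well above the reported 22.3 percent, so the constant-rate assumption should not be relied on.

```python
def naive_cumulative_risk(per_test_rate: float, n_tests: int) -> float:
    """Risk of at least one false positive, assuming a constant, independent per-test rate."""
    return 1.0 - (1.0 - per_test_rate) ** n_tests


# Per-test false positive rates reported in the Results.
naive_mammogram = naive_cumulative_risk(0.065, 10)
naive_cbe = naive_cumulative_risk(0.037, 10)

print(f"mammography, 10 tests: {naive_mammogram:.1%}")
print(f"clinical breast examination, 10 tests: {naive_cbe:.1%}")
```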
Diagnostic Evaluations Performed
The relevant diagnostic evaluations performed within one year of the false positive result are shown in Table 3. False positive mammograms led to more outpatient visits, diagnostic imaging examinations, and biopsies than false positive clinical breast examinations. In one patient, cellulitis requiring hospitalization for surgical débridement and intravenous antibiotic therapy developed after a biopsy prompted by a false positive mammogram. In addition to the workups shown in Table 3, we found documentation of 260 telephone calls to patients, 32 letters to patients, and 64 second opinions obtained from radiologists after false positive mammograms. These additional events were less common after false positive clinical breast examinations, which resulted in 23 telephone calls, 5 letters, and 9 second opinions from radiologists.
A woman's estimated cumulative risk of having at least one biopsy (open, core, or fine-needle aspiration biopsy) as a result of a false positive test also increased with repeated screenings. The risk was 6.2 percent (95 percent confidence interval, 5.1 to 7.3 percent) after 5 screening mammograms and 18.6 percent (95 percent confidence interval, 9.8 to 41.2 percent) after 10 screening mammograms. For clinical breast examinations, the risk was 2.4 percent (95 percent confidence interval, 1.8 to 3.2 percent) after 5 examinations and 6.2 percent (95 percent confidence interval, 3.7 to 11.2 percent) after 10 examinations.
Cost of Diagnostic Evaluations
The payment allowances for the initial screenings (mammography and clinical breast examinations) were $993,870 according to HMO payment-allowance estimates and $1,042,311 according to Medicare estimates. The payment allowances for the diagnostic workups undertaken as a result of the false positive tests shown in Table 3 were $329,649 according to HMO estimates and $309,755 according to Medicare payment allowances. The payment allowances for workups after false positive mammograms were three times as high as those for workups after false positive clinical breast examinations.
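The ratio of workup payments to screening payments follows directly from these totals. The sketch below uses only the figures reported in this paragraph; the variable names are illustrative.

```python
# Payment allowances reported above, in dollars.
screening = {"hmo": 993_870, "medicare": 1_042_311}
false_positive_workups = {"hmo": 329_649, "medicare": 309_755}

# Dollars spent on false positive workups per $100 spent on screening.
per_100 = {
    payer: 100 * false_positive_workups[payer] / screening[payer]
    for payer in screening
}

for payer, dollars in per_100.items():
    print(f"{payer}: ${dollars:.2f} per $100 of screening")
```

Under the HMO estimates this comes to about $33 per $100 of screening; under the Medicare estimates it is somewhat lower, just under $30.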
Discussion
On average, the women in this community-based cohort underwent breast-cancer screening every two years. Over a 10-year period, one third of these women had a result that required additional evaluation even though no breast cancer was present. The risk of a false positive test increased with the number of breast-cancer screening tests, so that by the time a woman had undergone 10 tests, the estimated cumulative risk of at least one false positive mammogram was about 49 percent and the estimated cumulative risk of at least one false positive breast examination was about 22 percent.
The HMO setting provided complete follow-up for each woman, and the computerized medical records allowed simplified data extraction for a large, community-based cohort. Results in managed-care populations do not necessarily apply to other populations; however, our findings are likely to be applicable to other communities, since 93 different radiologists read the mammograms and 381 different health care providers performed the clinical breast examinations.
The cumulative risk of false positive results that we estimated from community-based data is twice the risk estimated by Eddy in 1989.4 In addition, our estimates of the cumulative risk of false positive results for screening mammography may be low, because the overall percentage of abnormal screening mammograms in our study was 6.5 percent, whereas the national rate is nearly twice as high.3
Our definition of a false positive result is consistent with current recommendations regarding mammography audits.17,18,19 We classified as positive results all instances in which screening mammograms were interpreted as indeterminate or as arousing a suspicion of cancer or in which additional workup was recommended. This definition, which has been used by others,3,20,21,22,26 may be considered too broad. However, we found that even with this definition of false positive results, some follow-up diagnostic procedures were not counted. For example, five women who did not have cancer had breast biopsies prompted by screening mammograms that were interpreted by the radiologist as "abnormal-benign" but with no specific recommendations made. These cases were not included in our definition of a false positive test. Some authors have calculated false positive rates by counting as positive only impressions that led to breast biopsy.27 When this definition was applied to our data, the estimated cumulative risk of having at least one biopsy as a result of a false positive test was 19 percent after 10 mammograms and 6 percent after 10 clinical breast examinations.
We advocate a broad definition of false positive tests, because in formulating policy for the delivery of health care, it is important to determine the type and cost of all follow-up procedures, not just breast biopsies. In addition, it is increasingly clear that being told of an abnormal mammogram can cause increased anxiety in women for extended periods, regardless of whether a biopsy is performed.5,6,7,8,9,10,11,12,13 In the United States, Lerman et al.10 found that three months after they had false positive results on mammography, 47 percent of women who had highly suspicious readings reported that they had substantial anxiety related to the mammogram, 41 percent reported that they had worries about breast cancer, 26 percent reported that the worry affected their daily mood, and 17 percent reported that it affected their daily function.10 In Norway, 18 months after screening mammography, 29 percent of women with false positive results reported anxiety about breast cancer, as compared with 13 percent of women with negative results.7 Two studies in Britain also found that women with false positive mammograms had more anxiety than those with normal mammograms.5,6
The standard definition of false positive tests in breast-cancer screening uses a one-year cutoff date,17,18,19,20,27 so that women without a diagnosis of breast cancer in the 12 months after a positive test are counted as having had false positive tests. However, a patient's actual clinical course does not always fit into the one-year period. In our study, eight breast cancers were diagnosed after a series of evaluations that took longer than one year after a positive mammogram. This situation can arise when a radiologist recommends follow-up mammography, which is then repeated for several six-month periods before the diagnosis is made. Similarly, three breast cancers were diagnosed more than 12 months after positive clinical breast examinations. If the cutoff date were changed to two years, the false positive rates we reported would change very little; the number of false positive mammograms would decrease from 631 to 624, and the number of false positive clinical breast examinations would decrease from 402 to 397.
Abnormal mammographic readings are more common in the United States than in other countries (for example, in the United States approximately 11 percent of mammograms are read as abnormal, as compared with 2 to 5 percent in Sweden), whereas the sensitivity is about the same.2,3 The possibility that radiologists in the United States are interpreting too many mammograms as abnormal should be investigated. Little is known about the accuracy of clinical breast examinations in a community setting, despite the fact that these examinations are recommended for all women over the age of 40.28
The cost of working up patients with false positive results in this study was approximately one third the cost of performing the screening. The costs of evaluating women with false positive mammograms in the Stockholm randomized clinical trial were about one fourth of the costs of the initial screening.26
If our rates are representative, the number of breast-cancer screenings in the United States in which abnormalities are noted that require additional testing in women who do not have cancer may be substantial. For example, if 32 million American women who are 40 to 79 years old received breast-cancer screening annually for 10 years, 16 million women would have at least one false positive mammogram and 7 million would have at least one false positive clinical breast examination.
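The projection above is a direct multiplication of the hypothetical screened population by the estimated 10-screening cumulative risks. A minimal sketch, using only the figures given in the text:

```python
# Hypothetical screened population from the text.
women_screened = 32_000_000

# Estimated cumulative risks after 10 screenings, from the Results.
risk_mammogram = 0.491
risk_cbe = 0.223

expected_fp_mammogram = women_screened * risk_mammogram
expected_fp_cbe = women_screened * risk_cbe

print(f"false positive mammogram: {expected_fp_mammogram / 1e6:.1f} million women")
print(f"false positive breast examination: {expected_fp_cbe / 1e6:.1f} million women")
```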
This study was a retrospective review, and therefore some missing data were inevitable. However, because we required all patients to be enrolled for at least two years beyond the study period and we searched for data from multiple sources, we are confident we did not misclassify the breast-cancer status of patients with abnormal test results. The sensitivity of breast-cancer screening cannot be calculated from these data, because eligibility for the study was confined to women enrolled continuously in the HMO for 12 years. Women who died during the study period (some of whom may have died of advanced breast cancer) were therefore ineligible. Exclusion of these women may have led to an artificial lowering of the stage of the breast cancers noted. It is theoretically possible that women might have quit the health plan because they had false positive breast-cancer screening results; these women would also not have been eligible for the study.
Although much is known about breast-cancer screening from randomized clinical trials and academic settings, little information is available about the effect of repeated breast-cancer screening on women in a community setting. This study indicates that we need to develop ways to reduce the false positive rates of breast-cancer screening and their associated psychological and economic costs. One possibility for reducing the psychological sequelae is to use on-site radiologists to obtain immediate workups instead of requiring women to return for follow-up. In the meantime, women should be educated about their chances of having an abnormality noted on breast-cancer screening tests, and health care providers should be trained to deal with positive results when they occur.
Supported by a Research Development Grant from the Yale Claude Pepper Aging Center (to Dr. Elmore), the American Cancer Society (to Dr. Elmore), a Robert Wood Johnson Generalist Faculty Scholar Award (to Dr. Elmore), and the Harvard Pilgrim Health Care Foundation (to Drs. Barton and Fletcher).
We are indebted to Ms. Rhoda Demiany-Pahl for assistance in data abstraction; to H. Gilbert Welch, M.D., John Grabowski, Pamela Okura, Thomas S. Inui, M.D., and Eric Larson, M.D., for their suggestions; to Heidi Lowrey and Margaret Oppenheimer for assistance in the preparation of the manuscript; and to Alan Gelfand, Ph.D., and Fei Wang, who developed the Bayesian modeling and conducted the analysis.
Source Information
From the Departments of Medicine (J.G.E.) and Epidemiology (J.G.E., V.M.M.), University of Washington School of Medicine, Seattle; and the Departments of Ambulatory Care and Prevention (M.B.B., S.P., S.W.F.) and Diagnostic Radiology (P.J.A.), Harvard Pilgrim Health Care and Harvard Medical School, Boston. Presented in part at the national meeting of the Society of General Internal Medicine, Washington, D.C., May 13, 1997.
Address reprint requests to Dr. Elmore at the Division of General Internal Medicine, University of Washington School of Medicine, 1959 N.E. Pacific, Rm. BB527E, Box 356429, Seattle, WA 98195-6429.
References
Appendix 1. The Bayesian Model Used to Estimate the Cumulative Risk of a False Positive Result
The cumulative risk of a false positive result is estimated by using a Bayesian version of a product-limit (Kaplan-Meier-type) estimator in which the number of screening events (mammography or clinical breast examinations) is used instead of time. For the ith subject in the study, we define W_i as the "time," measured by the number of screenings, until the first false positive result. That is, W_i = j if her first j - 1 tests were negative and a false positive result occurred at the jth screening. If the ith subject had a total of k_i screenings during the study, with none of them false positive, we denote this by W_i > k_i. Then
[Equation not reproduced in this copy.]
For a sample of n women, the likelihood for this model takes the form
[Equation not reproduced in this copy.]
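One standard way to write a model of this kind is sketched below. It is offered as an illustration consistent with the definitions of W_i and k_i above, not as the authors' exact formulation. The symbols p_j (the probability of a false positive at the jth screening, given that all earlier screenings were negative), w_i (the observed screening number of the first false positive), and \delta_i (1 if a false positive was observed for the ith woman, 0 otherwise) are assumptions introduced here, as is the simplification that p_j is the same for all women.

```latex
% Probability that the first false positive occurs at screening j,
% and probability of no false positive in k_i screenings.
P(W_i = j) = p_j \prod_{l=1}^{j-1} (1 - p_l),
\qquad
P(W_i > k_i) = \prod_{l=1}^{k_i} (1 - p_l)

% Likelihood for n women, combining observed and censored cases.
L(p_1, p_2, \ldots) = \prod_{i=1}^{n}
  \left[ p_{w_i} \prod_{l=1}^{w_i - 1} (1 - p_l) \right]^{\delta_i}
  \left[ \prod_{l=1}^{k_i} (1 - p_l) \right]^{1 - \delta_i}

% Cumulative risk of at least one false positive after m screenings.
1 - \prod_{j=1}^{m} (1 - p_j)
```

In a Bayesian treatment, prior distributions would be placed on the p_j and the cumulative risk summarized from their posterior distribution, which is how interval estimates such as those reported in the Results could be obtained.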
Appendix 2. The 1997 Medicare Payment Allowance and the 1995 HMO Average Payment Used to Estimate Costs
The values used to estimate the total costs of breast-cancer screening and the evaluations performed after false positive results are shown below. Current Procedural Terminology (CPT) codes and, if applicable, ambulatory-surgical-center (ASC) payment levels were assigned for each procedure. The relative-value units for each CPT code were multiplied by $36. No costs were assessed for the documented telephone calls and letters noted after the false positive tests.
[Table of payment values not reproduced in this copy.]
Related Letters:
False Positive Rate of Screening Mammography. Olivotto I. A., Kan L., Coldman A. J., Paci E., Giorgi D., del Turco M. R., Roux S., Markle L., Diamond A., Sickles E. A., Fishbein M., Gross T. L., Kopans D. B., Feig S. A., Elmore J. G., Barton M. B., Arena P. J., Sox H. C. N Engl J Med 1998;339:560-564 (Aug 20, 1998). Correspondence.
The New England Journal of Medicine is owned, published, and copyrighted © 2008 Massachusetts Medical Society. All rights reserved.