Background Approximately 90% of persons with amyotrophic lateral sclerosis (ALS) have the sporadic form, which may be caused by the interaction of multiple environmental factors and previously unknown genes.
Methods We performed a genomewide association analysis using 766,955 single-nucleotide polymorphisms (SNPs) found in 386 white patients with sporadic ALS and 542 neurologically normal white controls (the discovery series). Associations of SNPs with sporadic ALS were confirmed in two independent replication populations: replication series 1, with 766 case patients with the disease and 750 neurologically normal controls, and replication series 2, with 135 case patients and 275 controls.
Results We identified 10 genetic loci that are significantly associated (P<0.05) with sporadic ALS in three independent series of case patients and controls and an additional 41 loci that had significant associations in two of the three series. The most significant association with disease in white case patients as compared with controls was found for a SNP near an uncharacterized gene known as FLJ10986 (P=3.0x10–4; odds ratio for having the genotype in patients vs. controls, 1.35; 95% confidence interval, 1.13 to 1.62). The FLJ10986 protein was found to be expressed in the spinal cord and cerebrospinal fluid of patients and of controls. Specific SNPs seem to be associated with sex, age at onset, and site of onset of sporadic ALS.
Conclusions Variants of FLJ10986 may confer susceptibility to sporadic ALS. FLJ10986 and 50 other candidate loci warrant further investigation for their potential role in conferring susceptibility to the disease.
Little is known about the specific genes that contribute to the development of sporadic ALS. Moreover, despite extensive study of familial ALS–causing mutations in vitro and in animal models of the disease, the key events in the initiation and progression of sporadic ALS remain unclear. Pathologically, sporadic ALS is characterized by loss of motor neurons from the motor cortex, brain stem, and ventral horns of the spinal cord. Ubiquitinated inclusions (covalent bonds between ubiquitin and other proteins that mark them for degradation) have been observed in the lower motor neurons, although their role in the initiation and progression of disease is unclear.3 Numerous mechanisms have been implicated in the selective degeneration of motor neurons in patients with sporadic ALS, including oxidative damage, excitotoxicity, apoptosis, cytoskeletal dysfunction, axonal-transport defects, inflammation, protein-processing and degradation defects, and mitochondrial dysfunction.1,4
Identification of the specific genetic variants associated with sporadic ALS will improve our understanding of fundamental disease mechanisms. To this end, genomewide association studies provide a comprehensive, unbiased approach to screen groups of patients with sporadic ALS and groups of controls for genetic markers that are more common in patients with ALS and thus may reside in or near predisposition genes. To identify such markers, we carried out a genomewide case–control association study.
Methods
Acquisition of Samples
The overall study was approved at the Translational Genomics Research Institute by the Western Institutional Review Board and by the appropriate institutional review board at each participating site. Written informed consent was obtained from all participants. Between April 27 and October 6, 2006, we prospectively collected 1251 DNA samples from patients with a diagnosis of laboratory-supported probable, probable, or definite sporadic ALS, according to the El Escorial diagnostic criteria, and used the Motor Neuron Disease Clinical Data Elements form of the National Institute of Neurological Disorders and Stroke to facilitate data sharing with the community (see www.alsrg.org and Table 1 in the Supplementary Appendix, available with the full text of this article at www.nejm.org).5 Patients and controls were recruited and enrolled from all participating clinical sites.
Among the 1251 DNA samples were 231 obtained from the Coriell Cell Repositories. The 231 samples were cross-referenced with those collected from our prospectively enrolled patients, and we removed three samples from patients for whom we already had a sample. All clinical information for every enrolled patient was entered in an anonymous, coded format and tracked in a custom online database (designed by 5AM Solutions) that is fully compliant with the Health Insurance Portability and Accountability Act of 1996. A total of 1152 DNA samples were of sufficient quality to be genotyped. These were from 824 whites, 87 Hispanics, 35 blacks, 8 Asians, 3 American Indians, 3 Pacific Islanders, and 192 persons of unknown ethnic group (as self-reported in the presence of the physician; see the Supplementary Appendix for details). Of these patients, 692 were men and 460 were women, with a mean age of 59 years and a mean Amyotrophic Lateral Sclerosis Functional Rating Scale–Revised score of 30.37 (range, 4.00 to 48.00, with higher scores indicating less severe disease; see Table 1 in the Supplementary Appendix for more clinical details).
We divided these 1152 patients into a discovery series and an independent replication series (replication series 1). The discovery series was made up of 386 white patients and 542 controls who all were white, older than 65 years of age, and neurologically normal on clinical assessment. There was no evidence of population stratification in the control group.6 Replication series 1 consisted of 766 patients, 308 of whom were women and 458 of whom were men; there were 438 whites, 136 nonwhites, and 192 patients of unknown ancestral origin. The mean age was 59 years and the mean Amyotrophic Lateral Sclerosis Functional Rating Scale–Revised score was 29.94. The 750 control DNA samples for series 1 were obtained from neurologically normal elderly persons (353 white women and 397 white men, with a mean age of 66.1 years) and were purchased from the Rutgers University Cell and DNA Repository. For the second independent validation series (replication series 2), we obtained data from Schymick et al.7 (see below).
Study Design and Genotyping
We first carried out a pooled analysis of DNA samples extracted from the blood or cell lines from each of the 386 white patients with sporadic ALS and 542 white controls in the discovery series. (Details about DNA extraction are given in the Supplementary Appendix.) We divided the case subgroup of 386 patients in half and pooled the DNA samples in each, ensuring that DNA was present in equimolar amounts in each pair of pooled samples. We prepared three pairs of pooled samples, for a total of six independently created pooled samples. A pooled DNA sample for the 542 controls was also created in triplicate, to control for pipetting errors.
Each of the pooled samples was hybridized to three GeneChip Human Mapping 500K Array Sets (Affymetrix) and two Infinium II HumanHap300 Genotyping BeadChip arrays (Illumina) according to the manufacturers' protocols for genotyping individual DNA samples, yielding a total of 27 Affymetrix arrays and 18 Illumina arrays. The Illumina chip is made up of probes that query HapMap-defined tag single-nucleotide polymorphisms (SNPs), whereas the Affymetrix platform has probes that query relatively evenly spaced SNPs. Using these two platforms, we genotyped 766,955 unique SNPs, with an average intermarker distance of 3.9 kb.
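The array counts and marker spacing reported above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the pool and array numbers come from the text, whereas the genome length GENOME_BP is an assumed round figure of 3.0 billion base pairs that does not appear in the article.

```python
# Pool counts as described in the text.
case_pools = 3 * 2        # three pairs of pooled case samples
control_pools = 3         # control pool created in triplicate
total_pools = case_pools + control_pools

# Each pool was hybridized to three Affymetrix array sets and two Illumina chips.
affymetrix_arrays = total_pools * 3
illumina_arrays = total_pools * 2

# Rough spacing check. GENOME_BP is an assumption, not a value from the article.
GENOME_BP = 3.0e9
unique_snps = 766_955
mean_spacing_kb = GENOME_BP / unique_snps / 1000

print(affymetrix_arrays, illumina_arrays, round(mean_spacing_kb, 1))
```

Under that assumed genome length, the totals agree with the 27 Affymetrix arrays, 18 Illumina arrays, and average intermarker distance of roughly 3.9 kb given in the text.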
We ranked the SNPs from the Affymetrix arrays and from the Illumina arrays according to the P values for the minor allele frequency for each SNP in the case group as compared with the control group, with the most significant P value corresponding to the highest rank and a cutoff P value of 0.05.8 The top-ranked 192 SNPs, according to P value, from each genotyping platform (a total of 384 SNPs) were selected for validation in replication series 1 (Fig. 1 in the Supplementary Appendix). For each of the 384 validation SNPs, we selected and genotyped in replication series 1 an additional linked SNP — that is, one that resides in the same haplotype block defined by the HapMap CEU data (from persons of Northern and Western European ancestry), as indicated by strong linkage disequilibrium (r2>0.8) — thereby protecting against errors in genotyping or assay failure at any one locus, as well as increasing the odds of having at least one informative SNP in the ethnically diverse subgroup of replication series 1. Thus, 2 SNPs per associated locus (768 SNPs in total) from the initial genome screen were tested in replication series 1. Genotyping of these SNPs was contracted to KBiosciences, which uses a proprietary variation of primer extension referred to as KASPar.
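The per-platform ranking and selection step described above could be sketched as follows. This is a minimal illustration, not the authors' pipeline: the SnpResult record, the function name select_top_snps, and the demonstration data are all hypothetical, and only the cutoff of 0.05 and the figure of 192 SNPs per platform are taken from the text.

```python
from typing import List, NamedTuple


class SnpResult(NamedTuple):
    """Hypothetical record for one SNP's case-versus-control comparison."""
    snp_id: str
    platform: str
    p_value: float


def select_top_snps(results: List[SnpResult],
                    per_platform: int = 192,
                    cutoff: float = 0.05) -> List[SnpResult]:
    """Rank SNPs within each platform by P value and keep the top ones."""
    selected: List[SnpResult] = []
    for platform in sorted({r.platform for r in results}):
        # Keep only SNPs on this platform that pass the P-value cutoff.
        passing = [r for r in results
                   if r.platform == platform and r.p_value < cutoff]
        # Smallest P value corresponds to the highest rank.
        passing.sort(key=lambda r: r.p_value)
        selected.extend(passing[:per_platform])
    return selected


# Made-up identifiers and P values, used only to exercise the function.
demo = [
    SnpResult("snpA", "affymetrix", 0.001),
    SnpResult("snpB", "affymetrix", 0.20),
    SnpResult("snpC", "affymetrix", 0.03),
    SnpResult("snpD", "affymetrix", 0.0004),
    SnpResult("snpE", "illumina", 0.04),
    SnpResult("snpF", "illumina", 0.002),
    SnpResult("snpG", "illumina", 0.6),
]
top_ids = [r.snp_id for r in select_top_snps(demo, per_platform=2)]
```

Ranking each platform separately, rather than pooling all P values, mirrors the description of 192 SNPs being taken from each genotyping platform.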
Schymick et al.7 have made their data set (created with the use of the Infinium II 550K platform [Illumina]) publicly available at https://queue.coriell.org/Q/snp_index.asp. They genotyped DNA samples from white patients with sporadic ALS and white controls, all residing in the United States, from the Coriell Cell Repositories. There was no evidence for population stratification in this series, on the basis of analysis with the use of STRUCTURE software. We created our replication series 2 using SNP genotype data from the 135 patients with sporadic ALS who were unique to this data set (after removing Coriell samples that were already represented in our replication series 1), as well as 275 unique white controls.
To confirm the presence of the FLJ10986 protein in patients with sporadic ALS, we immunoprecipitated protein from cerebrospinal fluid, using anti-FLJ10986 antibody, and analyzed the immunoprecipitate on a sodium dodecyl sulfate polyacrylamide gel. The 45-kDa and 48-kDa bands were excised from the gel, the proteins were eluted and digested with trypsin, and the resultant fragments were sequenced. (Details about Western blotting and other assays are given in the Supplementary Appendix.)
Statistical Analysis
Bonferroni correction for multiple testing was performed, with rs6700125 being the sole SNP to retain a significant association in replication series 1. Our method of assessing significance is dependent on defining association signals that rank highest across the genome (given the high-density coverage of the genome) and replicate across two or more independent cohorts. Because both the Affymetrix and Illumina platforms were used for our discovery series and replication series 1, many of our top-ranked SNPs were not analyzed in replication series 2. We therefore used a locus-specific validation method: we identified the SNPs found by Schymick et al. on the Infinium II 550K array that were also present within 25 kb on either side of each of our top-ranked loci and then calculated P values for the minor allele frequency for each SNP in the case group as compared with the control group in replication series 2, reporting the most significant P value (Table 1). Odds ratios were calculated by means of the DeFinetti program (http://ihg.gsf.de/cgi-bin/hw/hwa1.pl), and methods were adapted from Sasieni.9 Further details are given in the Supplementary Appendix. P values less than 0.05 were considered to indicate statistical significance.
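The locus-specific validation step, in which the most significant P value among SNPs within 25 kb of a top-ranked locus is reported, might look like the sketch below. The tuple layout, the function name best_snp_in_window, and the example positions and P values are hypothetical; only the 25-kb window on either side of the locus comes from the text.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical layout: (snp_id, chromosome, position_bp, p_value)
Snp = Tuple[str, str, int, float]


def best_snp_in_window(locus_chrom: str,
                       locus_pos: int,
                       snps: Iterable[Snp],
                       window_bp: int = 25_000) -> Optional[Snp]:
    """Return the SNP with the smallest P value within the window, if any."""
    nearby = [s for s in snps
              if s[1] == locus_chrom and abs(s[2] - locus_pos) <= window_bp]
    if not nearby:
        return None
    return min(nearby, key=lambda s: s[3])


# Made-up replication data for one locus at chromosome 1, position 1,000,000.
replication_snps = [
    ("snp_1", "1", 990_000, 0.04),       # inside the window
    ("snp_2", "1", 1_020_000, 0.01),     # inside the window, smallest P
    ("snp_3", "1", 1_030_000, 0.0001),   # outside the window
    ("snp_4", "2", 1_000_000, 0.00001),  # different chromosome
]
best = best_snp_in_window("1", 1_000_000, replication_snps)
```

Because the smallest of several P values is reported for each locus, a locus covered by more SNPs has more chances to reach nominal significance, which is worth bearing in mind when reading the replication series 2 results.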
Results
Genomewide Associations
There was no population stratification in the discovery series (Fig. 2 in the Supplementary Appendix). Genotypes from a screen of the 386 patients with sporadic ALS (155 white women and 231 white men, with a mean age of 59 years and a mean Amyotrophic Lateral Sclerosis Functional Rating Scale–Revised score of 30.80) were ranked in comparison with the genotypes of the 542 controls (all white, with a mean age of 68 years and equal numbers of men and women), with the highest rank assigned to the SNP with most significant P value. The rank-ordered SNPs from both platforms are listed in Tables 2 and 3 in the Supplementary Appendix.
Validation of Significant Associations
Individual genotype data for 768 SNPs in replication series 1 showed a significant overall association of 66 SNPs with sporadic ALS, representing 51 unique loci (Table 1). Numerous loci had significant associations for both tag SNPs, suggesting that error in genotyping did not contribute to false positive associations. Of the 51 loci, 28 are intragenic or within approximately 50 kb upstream or downstream of annotated genes. The remaining 23 loci are not associated with a known gene within 50 kb upstream or downstream of the SNP. Of the 28 annotated genes, 9 have functions related to cytoskeletal regulation or neurodevelopment, suggesting that differences in these processes underlie predisposition to sporadic ALS.
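For readers unfamiliar with allele-based association testing, one common way to compare minor-allele frequency between case patients and controls from individual genotype data is a chi-square test on a 2-by-2 table of allele counts. The sketch below shows that standard test under stated assumptions; it is not necessarily the exact procedure used in this study, and the function name and counts are hypothetical.

```python
import math
from typing import Tuple


def allelic_chi_square(case_minor: int, case_major: int,
                       control_minor: int, control_major: int
                       ) -> Tuple[float, float]:
    """Pearson chi-square (1 degree of freedom) on a 2x2 allele-count table."""
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts, chosen only to illustrate the calculation.
chi2_demo, p_demo = allelic_chi_square(60, 40, 40, 60)
```

Each person contributes two alleles to the table, so the counts are allele counts rather than numbers of participants.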
We also found no population stratification in replication series 2 (Fig. 2 in the Supplementary Appendix). The most significant P value for each locus, the SNP in replication series 2 associated with that P value, and the chromosomal position of that SNP within the locus are listed in Table 1. The results show that there are 10 loci significantly associated with sporadic ALS among white members of all three independent series. An additional 41 loci are significantly associated in whites in two of the three series.
FLJ10986 was found to be the gene most significantly associated with sporadic ALS, without a specific association with any clinical subclass (see below), suggesting that it is an early and common predisposition gene for the disease, independent of clinical course. Haplotype-block structure differs between whites and persons of other ancestries in replication series 1, as shown by the two FLJ10986 SNPs. The P values for rs6700125 are the same for the white case patients and the nonwhite case patients (87 Hispanics, 35 blacks, 8 Asians, 3 American Indians, 3 Pacific Islanders, and 192 for whom ethnic group is unknown). In contrast, the second SNP in this gene, rs6690993, which is in strong linkage disequilibrium with rs6700125 (r2>0.8) and is significantly associated with disease (P=3.0x10–4) among whites in replication series 1, is not associated with disease (P=0.11) in the group of nonwhite case patients in replication series 1 (Table 1). This contrast indicates a difference in allele frequency and haplotype-block structure between the two groups and underscores the value of using more than one SNP per locus until the HapMap database is populated with data from all ancestral and admixed populations.
Complete odds-ratio calculations for the significant SNPs from replication series 1 (Table 1), as well as all other SNPs, are presented in Table 4 in the Supplementary Appendix. The genes (and specific SNPs) in white case patients that were significantly associated with sporadic ALS in both replication series were FLJ10986 (rs6700125: odds ratio for having the genotype in patients vs. controls, 1.38; 95% confidence interval [CI], 1.16 to 1.65; rs6690993: odds ratio, 1.35; 95% CI, 1.13 to 1.62), PTPRT (rs13036957: odds ratio, 1.28; 95% CI, 1.04 to 1.56), IL18RAP (rs3771150: odds ratio, 1.21; 95% CI, 1.00 to 1.46), MAGI2 (rs757863: odds ratio, 1.23; 95% CI, 1.04 to 1.46), and LOXHD1 (rs988213: odds ratio, 1.31; 95% CI, 1.10 to 1.55). An additional five chromosomal loci for which no gene has been annotated were significantly associated with the disease in both replication series: 12q12 (rs1027615: odds ratio, 1.18; 95% CI, 0.98 to 1.41), 2q33.1 (rs12473579: odds ratio, 1.29; 95% CI, 1.09 to 1.53), 2q12.1 (rs17027230: odds ratio, 1.23; 95% CI, 1.02 to 1.48), 21q22.13 (rs2836061: odds ratio, 1.41; 95% CI, 1.13 to 1.77), and 12q12 (rs905080: odds ratio, 1.18; 95% CI, 0.98 to 1.41).
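As a reminder of how an odds ratio and its 95% confidence interval relate to the underlying counts, the sketch below uses the standard log-odds (Woolf) interval for a 2-by-2 table. This is a textbook method substituted for illustration; the study itself reports using the DeFinetti program with methods adapted from Sasieni, which may differ in detail. The function name and the counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(case_with: int, case_without: int,
                       control_with: int, control_without: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Woolf (log-odds) confidence interval for a 2x2 table.

    All four counts must be greater than zero.
    """
    odds_ratio = (case_with * control_without) / (case_without * control_with)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / case_with + 1 / case_without
                          + 1 / control_with + 1 / control_without)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the calculation.
or_demo, lower_demo, upper_demo = odds_ratio_with_ci(30, 70, 20, 80)
```

An interval whose lower bound falls below 1.00, as for the two 12q12 SNPs listed above, does not by itself exclude the absence of an effect at the 5% level.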
The most significant associations in our analyses of whites were with rs6700125 (P=6.0x10–4) and rs6690993 (P=3.0x10–4), which lie approximately 60 kb upstream of the uncharacterized gene FLJ10986 (Table 1). To confirm the association of this gene with sporadic ALS, we genotyped 71 additional flanking SNPs representing haplotype blocks defined by the HapMap data for whites from replication series 1 that spanned a total of 500 kb across the locus (Figure 1). We found four additional SNPs that were significantly associated with disease (rs10493256, P=0.003; rs6587852, P=0.001; rs1470407, P=7.0x10–4; rs333662, P=9.0x10–5) and that lie in the promoter region or the first two exons or introns of the FLJ10986 gene.
The predicted molecular mass of the FLJ10986 protein is 48 kDa, and we found a protein of this approximate size in kidney, lung, and small-intestine specimens of unaffected persons, with a lower level of expression in the liver (Figure 2). A protein doublet of approximately 48 and 50 kDa was found in human fetal brain specimens, along with species of lower molecular weight. Intense FLJ10986 immunoreactivity was also apparent in cerebrospinal fluid (Figure 2A). Blotting with secondary antibody alone failed to detect these bands.
In exploratory analyses, although the amount of FLJ10986 protein in spinal cord samples from controls and patients with sporadic ALS was found to be equal when normalized to the level of actin present in each, the relative ratio of the levels of 48-kDa FLJ10986 and 45-kDa FLJ10986 was greatest in patients with sporadic ALS who carried the risk allele in the rs6700125 or rs6690993 FLJ10986 SNP (Figure 2C) (P=0.049). The ratios in controls and in patients who did not have a FLJ10986 risk genotype were similar.
Clinical Subclasses of Sporadic ALS
Sporadic ALS is clinically heterogeneous, and thus genetic heterogeneity may underpin the disease subclasses. It is therefore likely that the relevance of overall P values for the association of SNPs with sporadic ALS is reduced in analyses of a genetically and clinically diverse series of patients. In post hoc exploratory analyses, we performed association analyses of data for subgroups with early onset of sporadic ALS (at ≤45 years) as compared with late onset (at ≥60 years) (Table 2), male patients with sporadic ALS as compared with female patients (Table 3), and sporadic ALS with a bulbar onset as compared with a limb onset (Table 4).
Discussion
We calculated the odds ratios for SNPs using three independent series selected to contain only whites of European descent. We present P values for all case patients in our ancestrally diverse replication series 1, since there will not be stratification at all loci, and the data may therefore assist future efforts at replication. Our replication series 2 was considerably smaller than the other two series and was underpowered to detect subtle allelic associations. We believe that the associations in replication series 2 that were not significant should still be considered, with increased confidence placed in those that were significant.
Biologic factors implicated by our data include cytoskeletal regulation (Table 1), a finding that suggests that aspects of cytoskeletal dysfunction may be central to the initiation or progression of sporadic ALS. This idea is congruous with emerging models of neurodegeneration in sporadic ALS, Alzheimer's disease, and spinal muscular atrophy, in which the loss of synaptic efficacy, aberrant axon pruning, and concomitant "dying back" of the neurons from the synaptic sites inward toward the cell bodies are thought to be among the earliest pathologic events.10,11,12 In particular, in considering the ongoing process of reorganizing the neuromuscular junction, the associations of variants of anaplastic lymphoma kinase and nuclear factor 1 are interesting given their roles in neurite outgrowth13 and neuronal differentiation,14 respectively. Anaplastic lymphoma kinase has recently been shown to be critical for pleiotrophin-mediated axonal regeneration in motor neurons in the spinal cord, is necessary for neuroprotection in response to glutamate excitotoxicity, and is not expressed in the spinal cord of patients with sporadic ALS.15 Restoration of the function of anaplastic lymphoma kinase either directly, downstream in its intracellular signaling pathway, or by increasing the amount of its ligand pleiotrophin could result in protective effects. Retinoic acid induces the expression of pleiotrophin and may therefore provide some therapeutic benefit to patients with sporadic ALS, particularly if administered early in the disease process, although this is speculative, at best, and the association between variants of anaplastic lymphoma kinase and sporadic ALS awaits replication and refinement by others. NOX4 has been previously implicated in sporadic ALS,16 which lends support not only to our finding of association between NOX4 and disease but also, by extension, to the other genetic associations that we report (Table 1).
Little is known about the function of FLJ10986. However, about 80% of its 439 amino acids make up the FGGY family of carbohydrate kinase domains. These domains are found in a family of proteins including L-fucolokinase, gluconokinase, glycerol kinase, and xylulokinase, which phosphorylate fucolose, gluconate, glycerol, and xylulose, respectively, and have roles in energy metabolism and glycolysis. The potential substrate or substrates for FLJ10986 are unknown, and how the activity of the protein may be relevant to the pathogenesis of sporadic ALS is also unclear. We found an FLJ10986-protein doublet in tissues of the nervous system. Further work is required to determine whether FLJ10986 is alternatively spliced or is subject to post-translational modifications within the nervous system and to confirm that the higher-molecular-weight species of FLJ10986 is more abundant than the lower-molecular-weight species in patients with sporadic ALS who have a SNP associated with the disease.
Our findings suggest that there is no single, overwhelming genetic association underlying sporadic ALS, and this is consistent with a model in which sporadic ALS results from a complex interplay of environmental factors and numerous low-risk susceptibility loci. Unraveling the network of causes will probably require substantial effort once the genes involved have been identified. However, the identification of the candidate susceptibility loci in this and other studies is an essential first step.
Supported by a grant from the Muscular Dystrophy Association Augie's Quest initiative and the Dorrance Family Foundation.
Dr. Levine reports receiving speaker fees from Eli Lilly, Boehringer Ingelheim, and GlaxoSmithKline; Dr. Bertorini, consulting fees from Serono, Pfizer, Teva, and Allergen; Dr. Graves, consulting fees from Avanir, Novartis, Sanofi-Aventis, the Muscular Dystrophy Association, the Guillain–Barré Syndrome Support Group, and Pharmacia; lecture fees from Avanir; and grant support from Pharmacia; Dr. Mozzafar, consulting fees from Celgene, Genzyme, and Allergen and lecture fees from Genzyme and Cresent Healthcare; Dr. Lomen-Hoerth, consulting fees from Kinemed, Neurological Biological Institute, Rinat, Pfizer, Alta Bates, Celgene, Allergen, and Columbia and grant support from Avanir Pharmaceuticals and holding equity interests in Hewlett-Packard; Dr. Mitsumoto, consulting fees from Eisai, lecture fees from Avanir, and grant support from Aventis, Aeolus, and Avanir; Dr. Miller, consulting fees from Celgene and Novartis; and Dr. Appel, consulting fees for his position on the medical advisory board of Vasogen and holding equity interests in Cyberomics. No other potential conflict of interest relevant to this article was reported.
Source Information
From the Translational Genomics Research Inst., Phoenix, AZ (T.D., M.J.H., D.W.C., J.V.P., S.S., K.J., R.F.H., C.S., K.R.J., D.L., D.A.S.); Muscular Dystrophy Association, Tucson, AZ (S.E.H.); Washington Univ. School of Medicine, St. Louis (A.P.); Phoenix Neurological Associates, Phoenix, AZ (T.L.); Univ. of Tennessee, Memphis (T.B.); Univ. of California, Los Angeles (M.C.G.); Univ. of California, Irvine (T.M.); Univ. of Texas Health Science Center, San Antonio (C.E.J.); Mayo Clinic, Scottsdale, AZ (P.B.); Univ. of Kansas Medical Center, Kansas City (A.M., A.D., R.B.); Univ. of California, San Francisco (C.L.-H.); Carolinas Medical Center, Charlotte, NC (J.R.); Univ. of California at San Diego School of Medicine, La Jolla (D.T.O., K.Z.); Mayo Clinic College of Medicine, Jacksonville, FL (R.C., M.H.); Univ. of Pittsburgh Medical Center, Pittsburgh (H.R., R.B.); California Pacific Medical Center, San Francisco (J.K., R.G.M.); Methodist Neurological Inst., Houston (E.P.S., S.H.A.); and Columbia Univ. Medical Center, New York (H.M.).
This article (10.1056/NEJMoa070174) was published at www.nejm.org on August 1, 2007.
Address reprint requests to Dr. Stephan at the Translational Genomics Research Inst., 445 N. Fifth St., Phoenix, AZ 85004, or at dstephan{at}tgen.org.
The New England Journal of Medicine is owned, published, and copyrighted © 2008 Massachusetts Medical Society. All rights reserved.