Computer assisted cloning of human neutral ␣-glucosidase C (GANC): A new paralog in the glycosyl family 31

R. Hirschhorn*, M. L. Huie, and J. S. Kasper†

Department of Medicine, Division of Medical Genetics, New York University School of Medicine, 550 First Avenue, New York, NY 10016

Edited by Victor A. McKusick, Johns Hopkins University School of Medicine, Baltimore, MD, and approved August 13, 2002 (received for review June 27, 2002) The exponential expansion of the publicly available human DNA relative to the human lysosomal acid ␣-glucosidase and sequence database has increasingly facilitated cloning by homol- to human glucosidase II as well as to the murine ortholog of ogy of for biochemically defined, functionally similar pro- GANC, and a molecular weight similar to that of the lysosomal teins. We hypothesized that an as-yet uncloned human ␣-glucosi- enzyme acid ␣-glucosidase (4, 5, 17). The gene maps to human dase (human neutral ␣-glucosidase C or GANC) is a previously 15 (5), is designated as GANC on the human gene uncharacterized member of a paralogous human glycosyl hydro- map (ref. 17; OMIM 104180), and exhibits a biochemical genetic lase gene family 31, sharing and related, but polymorphism with four alleles, including a null allele, segre- not identical, functions with other cloned human ␣-. gating in the population (3). The known properties of GANC, We now report both the in silico and physical cloning of two alleles including 100 kDa molecular mass, its ability to degrade glyco- of human neutral ␣-glucosidase (designated GANC on the human gen, and the neutral pH optimum suggested homology to family gene map). This cloning and correct identification and annotation 31 glycosyl and particularly to lysosomal acid ␣- as GANC was successful only because of the application of the glucosidase and to the catalytic unit of glucosidase II. biochemical and genetic information we had previously developed regarding this gene to the results of the in silico method. Of note, Materials and Methods this glucosidase, a member of family 31 glycosyl hydrolases, has Computer-Based Identification of Candidate Clones and Construction multiple alleles, including a ‘‘null’’ allele and is potentially signif- and Analysis of a Computer Contig. The nonredundant sequence icant because it is involved in glycogen metabolism and localizes to database and the human EST database (http:͞͞www.ncbi.nlm. a chromosomal region (15q15) reported to confer susceptibility to nih.gov͞) were screened by using as the initial query sequence diabetes. both human lysosomal acid ␣-glucosidase, glucosidase II, and the original BLAST program with BLAST X or the gapped BLAST he rapid rise in the number of cloned genes has resulted in program. This screen identified three ESTs: GB R95789, clone Tan increasing ability to identify sequence homologies be- 1; GB AA101464͞3, clone 2; and GB AA323316, clone 3. These tween different genes, either within the same species (paralogs) clones were obtained from commercial sources (American Type or between species (orthologs). The publicly available human Culture Collection, Research Genetics, or Genome Research) draft sequence (refs. 1 and 2; http:͞͞www.ncbi.nlm.nih.gov͞) has and sequenced completely in both directions. A contig was provided the potential to identify more easily and clone members constructed on the computer and compared for homology with of gene families in silico, greatly facilitating their physical other glucosidases (see Fig. 1a and legend). cloning. We report one such case, the cloning of a previously described human ␣-glucosidase (neutral ␣-glucosidase C or Construction of cDNAs Containing the Coding Sequence in an Expres- GANC). However, physical cloning and correct annotation of sion Vector. The initial cDNA construct that was expressed full-length GANC was only made possible by applying the contained a 5Ј fragment from clone 2 ligated to a 3Ј fragment biochemical and genetic information we had previously devel- obtained by RT-PCR to eliminate two areas of nonhomology oped regarding this gene to the in silico information (3–5). found by analysis of the computer contig (Fig. 1c). In detail, a 2.1 At least six different human ␣-glucosidases hydrolyzing ͞ ␣ kb EcoRI SacI fragment from clone 2, (deleting the first 263 bp -linked glucose in simple and complex carbohydrates have been of clone 2 and starting 490 bp 5Ј of the putative ATG and described biochemically and genetically. These differ extending 1,619 bp 3Ј of the ATG) was cloned into pUC19 (clone with respect to substrate specificities, pH optima, molecular B). To eliminate the major portion of the 490 bp of 5Ј untrans- weights, sites of expression, and chromosomal localizations. Five lated sequence, an EcoRI site was introduced into clone B by of these human ␣-glucosidases have now been cloned: lyso- Ј ␣ PCR just 5 of the putative ATG (by using a sense amplification somal acid -glucosidase (GAA; refs. 6–8), intestinal - primer containing an added EcoRI site and extending from Ϫ8 isomaltase (SI; refs. 9 and 10); intestinal -glucoamylase to ϩ 10, relative to the ATG, and an antisense primer 3Ј of a (MGA; ref. 11), the catalytic unit of the endoplasmic reticulum unique EcoRV site). The PCR fragment was digested with enzyme glucosidase II (GANAB); see Online Mendelian Inheri- EcoRI and EcoRV and exchange-ligated into the EcoRI͞EcoRV tance in Man (OMIM 601862 and 104160; refs. 12 and 13); and digested clone B to give clone B-7. The remaining 3Ј end of the glucosidase I (GCSI; ref. 14). Four of the five cloned human ␣-glucosidase genes (excluding glucosidase I) share homology, based on sequence similarity and signature sequence for family This paper was submitted directly (Track II) to the PNAS office. 31 glycosyl hydrolases, suggesting that they are members of a Abbreviation: GANC, human neutral ␣-glucosidase C. gene family in a single species or human paralogs for family 31 ͞͞ Data deposition: The sequences reported in this paper have been deposited in the GenBank glycosyl hydrolases (ref. 15; reviewed in ref. 16 and http: database (accession nos. AF545044–AF545047). ͞ϳ ͞ ͞ afmb.cnrs-mrs.fr cazy CAZY index.html). *To whom correspondence should be addressed at: Department of Medicine, New York Neutral ␣-glucosidase C, the remaining human ␣-glucosidase, University School of Medicine, 550 First Avenue (C&D 6), New York, NY 10016. E-mail: had not as yet been cloned. Neutral ␣-glucosidase C (GANC) has [email protected]. activity at neutral pH, a characteristic electrophoretic mobility †Present address: Dana–Farber Institute, Harvard Medical School, Boston, MA 02115.

13642–13646 ͉ PNAS ͉ October 15, 2002 ͉ vol. 99 ͉ no. 21 www.pnas.org͞cgi͞doi͞10.1073͞pnas.202383599 Downloaded by guest on September 29, 2021 Fig. 1. Steps in the in silico and physical cloning and expression of GANC. (a) Computer-based identification of candidate clones and construction and analysis of a computer contig. Sequence for human acid ␣-glucosidase and the catalytic subunit of glucosidase II was used as ‘‘bait’’ in a TBLASTX and BLASTN search of the human EST database for ‘‘orphan’’ ESTs with homology to the two glucosidases. This procedure identified GB R95789 (clone 1) that contained a sequence tagged site (STS) WI-17962 mapping to chromosome 15q15 in radiation hybrid panels 3 and 4, which is consistent with the earlier mapping of GANC to in somatic cell hybrids (5). The search simultaneously identified GB AA101464͞3 (clone 2), a putatively ‘‘reversed’’ EST that contained sequence for the consensus catalytic site for family 31 glycosyl hydrolases (15) and an STS G31167 that also mapped to chromosome 15q15 in the radiation hybrid maps 3 and 4. Clone 2 additionally identified GB AA323316 (clone 3) that overlapped a conserved sequence in clone 2. Complete sequencing of all clones in both directions and comparison for homology against the sequence database revealed two areas that did not show homology. (b) Deletion of these two regions and ‘‘pasting together’’ the computer construct resulted in an ORF of an appropriate size. After deletion of the two areas of apparent nonhomology (confirmed by absence of these areas from sequence of cDNA obtained by RT-PCR of mRNA), the computer-edited cDNA gave an appropriate size ORF from the initiating methionine to a stop codon (2.745 kb and 914 codons) and homology throughout the ORF (see Table 2). (c) To facilitate deletion of the two areas of nonhomology from a cDNA containing the coding region, the initial construct tested for expression of neutral glucosidase was a hybrid. This hybrid contained a 5Ј fragment from clone 2, (beginning 8 bp 5Ј of the ATG with an added EcoRI site), ligated at an endogenous SacI site (2,371 bp) to a 3Ј fragment derived by RT-PCR of mRNA from cell line GM02756 and extending 15 bp 3Љ of the stop codon with an added Xba site. (The first methionine in the construct was at bp 753–755 of the original full-length cDNA). We previously had determined that GM02756 expressed two different electrophoretically distinguishable alleles for GANC and that GM02756, therefore, did not contain a null allele lacking expression of enzyme activity. (d) Two allelic cDNA clones, differing in sequence at several sites, were obtained by RT-PCR of mRNA from cell line GM02756. The differing cDNA clones were found to express electrophoretically distinguishable allozymes (see Fig. 2 and Table 1).

cDNA was obtained by RT-PCR by standard methods with mRNA homology were confirmed by sequence analysis of RT-PCR prod- from a cell line (GM02756) known to contain two different alleles ucts of mRNA from the human cell line that expressed two different expressing enzyme activity for GANC (3) and a sequence-based alleles for GANC (GM02756; National Institutes of Health Mutant sense primer 34 bp 5Ј of the unique SacI site (bp 2,337–2,358) and Repository, Camden, NJ). For cloning of two alleles of GANC, an antisense primer in the 3Ј UT (ending 15 bp 3Ј of the TGA stop mRNA was isolated (by using an Invitrogen micro fast-track kit codon), with an added XbaI site. The clone (B-7) containing the 5Ј according to the manufacturer’s directions) from the cell line end was digested at the endogenous SacIsiteandata3Ј XbaI site GM02756 previously determined to express two different alleles for GENETICS GANC (3). RT-PCR was performed by using an antisense primer in the vector cloning site. The 3Ј RT-PCR product, digested at the ending 15 bp 3Ј of the stop codon with an added XbaI site and a Sac Xba endogenous I site and the I site in the primer, was ligated sense primer 5Ј of the ATG with an added EcoRI site and cloned into the B7 clone to yield B7-7. The full-length insert was released ͞ into the PCR zero blunt vector (Invitrogen). Clones were se- with EcoRI XbaI digestion and cloned into the expression vector quenced with M13 and gene-specific primers, and the insert was pCDNA 3 (Invitrogen) to yield clone B-7-7-9. The PCR-amplified then cloned into PCDNA3 for expression (Fig. 1d). section of the 5Ј clone, all of the ligation joints, and the complete 3Ј end of several clones were sequenced to confirm the absence of Transient Expression of the Cloned cDNAs and In Situ Visualization of cloning artifacts. Differences between the consensus sequence of Neutral ␣-Glucosidase C Activity. The cDNA constructs in the the original EST clones and the absence of the areas of non- pCDNA3 expression vector were transiently expressed in three

Hirschhorn et al. PNAS ͉ October 15, 2002 ͉ vol. 99 ͉ no. 21 ͉ 13643 Downloaded by guest on September 29, 2021 Table 1. Polymorphic differences (SNPs) in sequence of GANC cDNAs Polymorphic differences in cDNA and location Computer construct Hybrid cDNA Type 3* cDNA Type 2** cDNA

883 G͞A (exon 4) CGGϾCAG; Arg44Gln G (Arg) G (Arg) A (Gln) A (Gln) 1013 C͞T (exon 5) AACϾAAT; Asn87Asn C (Asn) C (Asn) C (Asn) T (Asn) 1049 G͞A (exon 5) CCGϾCCA; Pro99Pro G (Pro) G (Pro) A (Pro) G (Pro) 1211 A͞G (exon 6) ATAϾATG; Ile153Met A (Ile) A (Ile) A (Ile) G (Met) 2081 A͞T (exon 13) GATϾGAA; Asp443Glu A (Glu) A (Glu) T (Asp) T (Asp) 3286 T͞C (exon 24) TTTϾTCT; Phe845Ser T (Phe) C (Ser) C (Ser) C (Ser) 3295 A͞G (exon 24) CAGϾCGG; Gln848Arg A (Gln) G (Arg) G (Arg) A (Gln)

different cell lines; murine 3T3 cells, a simian virus 40- classes of clones could be distinguished, initially by appearance transformed human cell line deficient for acid glucosidase of a new site for EcoRI, and then by differences in sequence (see (TR4912; derived from GM 04912, a cell line obtained from the Table 1). Transient expression of completely sequenced clones National Institutes of Health Mutant Cell Repository, Camden, representing each class resulted in the appearance of bands of NJ; ref. 18) and in monkey kidney COS cells. Previously de- neutral ␣-glucosidase enzyme activity that comigrated with scribed standard CaPO4-based methods were used for transfec- either the ‘‘type 2’’ or the ‘‘type 3’’ alleles expressed by the tion (19). Cell lysates from transfected and nontransfected cells GM02756 cell line (refs. 3–5; Fig. 2). Furthermore, transfection and lymphoid line cells were prepared and analyzed as described of 3T3 cells with a hybrid cDNA derived in part from the EST (3, 5, 17). In brief, cell lysates from transfected and nontrans- clone and in part from RT-PCR yielded ␣-glucosidase activity fected cells were electrophoresed in starch gels at 4°C to separate ͞ with biochemical properties consistent with human as opposed isozymes and or allozymes primarily by their charge, overlaid to murine GANC (5). with 3 MM filter paper saturated with substrate (4 methyl ⅐ Comparison of the sequence of the EST-derived cDNA with umbelliferyl D-glucopyranoside) at either acid pH (not shown) that of the two clones derived totally by RT-PCR (and the or neutral pH 7.5 and incubated at 37°C. Areas of enzyme additional hybrid cDNA) revealed seven polymorphic sites, two activity were detected as fluorescent bands, as described (17). of which were silent (Table 1). The two GM02756 clones differed Results from each other at four sites, two of which predicted amino acid Based on the hypothesis that GANC shares homology with other differences (Ile at codon 153 and Arg at codon 848 for type 3 vs. human family 31 paralogs, we searched the EST database for Met at codon 153 and Gln at codon 848 for type 2). Both clones candidate clones that had homology to ␣-glucosidases but did also differed from the computer-derived consensus sequence at not code for any of the four previously cloned family 31 three additional sites predicting amino acid differences (Arg- mammalian or human glycosyl hydrolases. Three candidate 44-Gln; Glu-443-Asp, Phe-845-Ser). All of these sequence vari- human EST clones were identified (Materials and Methods, and ations were true polymorphisms and were not due to cloning, Fig. 1), two of which contained sequence tagged sites (STSs) that sequencing, or PCR errors, because we could define both alleles had been mapped to human chromosome 15q by analysis of for all of these single nucleotide polymorphisms (SNPs) in radiation hybrid panels 3 and 4, the chromosome to which we had genomic DNA of additional individuals. previously mapped GANC, based on segregation of the human neutral glucosidase C isozyme in human-mouse hybrids (5). Complete sequencing and analysis of the three EST clones revealed two small areas lacking homology to other glucosidases, one in clone 2 and one in clone 3 (Fig. 1a). RT-PCR of mRNA from a diploid normal human cell line, GM02756, both of whose GANC alleles were known to encode functional, electrophoreti- cally distinguishable allozymes, revealed that neither of the areas of nonhomology were present in the mRNA. Omitting the two areas of nonhomology and ‘‘ligating’’ the remaining sequence in silico resulted in a 5,022-bp transcript containing a 2,745 bp ORF (including the ATG and stop codon) that exhibited high homol- ogy to ␣-glucosidases throughout (not shown). The ATG of the ORF was preceded by a large, 752 bp 5Ј untranslated region and was followed by a large, 1,525 bp 3Ј untranslated sequence (20) ␣ without homology to other human glucosidases (Fig. 1). The Fig. 2. Transient expression in Cos cells of cloned -glucosidase C alleles. Cos cells were transiently transfected with the various cDNA clones in an expres- derived cDNA predicted a of 914 amino acids and a Ϯ sion vector. Cell lysates were prepared from nontransfected cells and from molecular mass of 104 kDa, consistent with that of 92.2 9.8 cells transfected with the two clones containing differing alleles derived from kDa previously determined for the partially purified GANC (4). mRNA of the GM02756 cell line that is heteroallelic for type 2 and type 3 GANC This molecular mass is also similar in size to that of the paralogs alleles (see Fig. 1d and Table 1). Areas of ␣-glucosidase activity were visualized lysosomal acid ␣-glucosidase and the catalytic unit of glucosi- under long-wave UV light as fluorescent bands. GANAB is the gene name for dase II (952 and 944 aa) and half the size of the duplicated human ␣-chain (catalytic unit) of glucosidase II on chromosome 11; GANC is paralogs sucrase-isomaltase and maltase-glucoamylase. the official gene name for human neutral ␣-glucosidase C on chromosome 15. To demonstrate that the cDNA sequence we had derived in Lane 1: cloned human GANC cDNA expressed in COS cells, showing activity comigrating with the product of human GANC type 3 allele. Lane 2: cloned silico represented the in vivo GANC mRNA, we physically cloned human GANC cDNA expressed in COS cells, showing activity comigrating with and expressed three different cDNAs. Two of the expressed the human type 2 allele. Lane 3: human cell line heterozygous for type 2 and clones were totally derived by RT-PCR of mRNA from the cell type 3 GANC alleles and the source of mRNA for cloning the allelic cDNAs. Lane line GM02756, known to express two functional, electrophoreti- 4: human cell line null for GANC (fourfold greater amount of cell lysate cally distinct allelic forms of GANC (see Fig. 1d). Two different applied).

13644 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.202383599 Hirschhorn et al. Downloaded by guest on September 29, 2021 Table 2. Homology of GANC to other human ␣-glucosidase paralogs

Homology at the catalytic site Catalytic site Lysosomal acid ␣-glucosidase FHDQVPFD GMWIDMNE PSNF Maltase-glucoamylase: maltase FHNQVIFD GIWIDMNE VSNF glucoamylase ----VPFD GMWIDMNE PSSF Sucrase-isomaltase: isomaltase FHQEVQ-D GLWIDMNE VSSF sucrase KFD GLWIDMNE PSSF Glucosidase II (GANAB) YEGSAPNL FVWNDMNE PSVF Neutral ␣-glucosidase C (GANC) YQGSTDIL FLWNDMNE PSVF Consensus: GMWXDMNE* FI S VA L F

Overall homology % Identity % Similarity** Glucosidase II (GANAB) 49 67 Lysosomal acid ␣-glucosidase 31 47 Sucrase-Isomaltase: isomaltase 30 47 sucrase 25 41 Maltase-glucoamylase: maltase 28 45 glucoamylase 27 43

*Prosite: PD0C00120. Consensus based on additional orthologs and paralogs; X indicates numerous different amino acids at the indicated position. **Identity, same amino acid; similarity, conservative substitution.

GANC shows homology at the catalytic site (WXDMNE) to intron 20 (1.264 kb downstream of exon 20), predicting an exon the four previously cloned human ␣-glucosidases: lysosomal acid with two in-frame stop codons (TAA) and, therefore, a non- ␣-glucosidase (the original identified human family 31 glycosyl functional transcript. Although such events can be interpreted as hydrolase), the two duplicated ␣-glucosidases, sucrase- alternate splicing, they can also be found as artifacts during isomaltase and maltase-glucoamylase, and the catalytic unit of construction of cDNA libraries for extensively studied, known glucosidase II (synonymous with GANAB) (Table 2). The genes. Our RT-PCR results would indicate that the inclusions of greatest homology was to the catalytic unit of glucosidase II, an intronic sequence seen were artifacts. The features of the enzyme that is also active at neutral pH. Similarly, GANC intron͞exon structure that may have interfered with correct showed overall homology to the previously cloned ␣-glucosidase prediction of the mRNA based on algorithms were the relatively paralogs, with similarity ranging from 67 to 41%, but again, with small size of several of the exons, (ranging from 46 to 80 bp), the highest overall homology to the catalytic unit of glucosidase some of which were buried within large surrounding introns, II (Table 2). leading to exclusion from the predicted mRNA sequences. Alignment of the full-length cDNA sequence with human Similarly, the ATG, sitting at the terminal portion of a large, genomic DNA sequences from bacteria artificial chromosome primarily noncoding exon appears to cause difficulty for iden- (BAC) clones AC022468 and AC012651 and with the draft tification by current algorithms (ref. 21; GenBank database sequence (http:͞͞genome.ucsc.edu͞) suggested that the GANC accession nos. AF545044–AF545047). gene consists of 25 exons distributed over 80 kb. We confirmed The GANC cDNA maps to human chromosome 15q15.1-.2 on both this proposed exon–intron structure and the accuracy of our the draft sequence (http:͞͞genome.ucsc.edu͞), flanked on the 5Ј cDNA sequences by amplifying and sequencing 23 of the 25 end by a predicted gene (DKFZP564G2022). We confirmed, by exons and their flanking intronic regions from genomic DNAs. sequence analysis of PCR products from genomic DNA, that this The first exon was noncoding, whereas the second 343-bp exon putative gene begins only 168 bp upstream of the start of the was predominantly noncoding, containing only the first 10 GANC cDNA and is encoded on the reverse strand with codons, including the initiation of translation within a Kozac transcription toward the centromere. By contrast, GANC is consensus sequence. The terminal exon 25 was only sequenced encoded on the sense strand with transcription toward the from genomic DNA to just 3Ј of the stop codon, whereas only a telomere, suggesting overlap or close proximity of the two small part of the flanking intron for exon 19 could be sequenced promoter regions On the 3Ј end. GANC is flanked by calpain 3 from PCR products of genomic DNA. The additional SNPs (CAPN3), which begins 6.151 kb 3Ј of GANC, with transcription noted in the sequence of the ESTs and of the newly cloned of both genes directed toward the telomere. cDNAs (Table 1) also were found in genomic DNA from additional individuals (not shown). Discussion GENETICS Correct annotation of GANC present in the draft sequence We have cloned a neutral ␣-glucosidase that is a new member of and even earlier in the two BACs used for the draft sequence was a human paralogous gene family, belonging to the glycosyl probably inhibited by the artifactual portions of the initial EST hydrolase gene family 31, initially by in silico manipulation of sequences as well as by several features of the exonic structure. sequences of EST clones and then physically by the cloning and The area of ‘‘nonhomology’’ in the cDNA of EST clone 2 (Fig. expression of two cDNAs coding for alleles of GANC. Both 1, AA101464) was apparently caused by priming from intron 15 cDNAs express ␣-glucosidase activity at neutral pH, exhibit the on the antisense strand with inclusion of ϩ59 to ϩ1bpfrom expected differences in electrophoretic mobility for two of the intron 15. The area of ‘‘nonhomology’’ in cDNA of the EST known genetic variants, are of appropriate size, and map to clone 3 (Fig. 1, AA332316) represented inclusion of 98 bp of chromosome 15q. These biochemical and genetic properties

Hirschhorn et al. PNAS ͉ October 15, 2002 ͉ vol. 99 ͉ no. 21 ͉ 13645 Downloaded by guest on September 29, 2021 allow us to annotate correctly the cDNAs and the corresponding proposed protein named FLJ00088 was reported and assigned genomic DNA as GANC (OMIM 104180). The molecular basis GenBank accession number BAB84863. However, this cDNA for the allelic differences was identified as single-base changes encompasses 4,476 of the 5,022 bp reported here and lacks exon (SNPs), some of which resulted in amino acid differences. Also, 1 and part of exon 2. The initiating methionine was contained in we have experimentally determined the exon and flanking intron FLJ00088 but was only identified as the initiating first codon by sequence and boundaries and placed the cDNA on the genome the expression studies reported here. The initiating ATG is at sequence of chromosome 15 at 15q15. GANC is flanked on the codon 12 of FLJ00088 (bp 208–210), corresponding to bp753– 5Ј end by DKFZP564G2022, a gene transcribed toward the 755 of the GANC sequence. In the light of our studies, we centromere on the antisense strand and on the 3Ј end by CAPN3, propose that FLJ00088 be renamed GANC. transcribed toward the telomere in tandem with GANC on the The localization of GANC next to calpain 3 is of note because sense strand. linkage studies of type 2 diabetes recently provided evidence for Correct annotation is now a critical next step in analysis of the an interacting susceptibility locus in the Mexican American and, as is evident from the example reported population on chromosome 15 that includes the calpain 3 (CAPN3) locus (22, 23). CAPN3 (mutations in which result in here, is likely to require a more individualized process that will LGMD2) is a paralog of calpain 10, a putative diabetes suscep- benefit from detailed molecular and biochemical approaches. tibility gene on chromosome 2 (24, 25). We note that GANC The observations reported here demonstrate the utility of bio- sitting next to CAPN3 on chromosome 15 is also a candidate for chemical, genetic, and molecular data in the final stages of a role in diabetes. GANC is an enzyme involved in degradation genome annotation, as well as perusal and knowledge of prior of glycogen; there are known multiple alleles for this gene, at published data, annotated in several linked databases such as least one of which codes for absence of GANC activity. Alter- OMIM that, for GANC, contained all of the necessary infor- natively, the known genetic variation in GANC may be associ- mation. Full-length cDNA transcripts facilitated the verification ated with differences in severity of glycogen storage disorders of the correct annotation for GANC and had led to our initial and͞or normal muscle function. report in an abstract.‡ During final preparation of this manu- script, a putative ‘‘full length’’ cDNA (AK074037) encoding a We are grateful for helpful discussions with Drs. J. and K. Hirschhorn, G. Weissmann, S. Brown, and P. D’Eustachio. We thank C.-K. Jiang for her help in the final expression studies. This work was supported by a ‡Hirschhorn, R. & Huie, M. L. (1998) Am. J. Hum. Genet. 63, A1038 (abstr.). grant from the March of Dimes National Foundation (to R.H.).

1. International Human Genome Sequencing Consortium (2002) Nature 409, 14. Kalz-Fuller, B., Bieberich, E. & Bause, E. (1995) Eur. J. Biochem. 231, 860–921. 344–351. 2. Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., 15. Henrissat, B. (1998) Biochem. Soc. Trans. 26, 153. Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001) Science 291, 16. Hirschhorn, R. & Reuser, A. J. J. (2001) in The Metabolic and Molecular Basis 1304–1351. of Inherited Disease, Eighth Edition, eds. Scriver, C. R., Beaudet, A. L., Sly, W. S. 3. Martiniuk, F. & Hirschhorn, R. (1980) Am. J. Hum. Genet. 32, 497–507. & Valle, D. (McGraw–Hill, New York) pp. 3389–3420. 4. Martiniuk, F. & Hirschhorn, R. (1981) Biochem. Biophys. Acta 658, 248–261. 17. Swallow, D. M., Corney, G., Harris, H. & Hirschhorn, R. (1975) Ann. Hum. 5. Martiniuk, F., Hirschhorn, R. & Smith, M. (1980) Cytogenet. Cell. Genet. 27, Genet. 38, 391–406. 168–175. 18. Martiniuk, F., Bodkin, M., Tzall, S. & Hirschhorn, R. (1990) Am. J. Hum. 6. Martiniuk, F., Mehler, M., Pellicer, A., Tzall, S., LaBadie, G., Hobart, C., Genet. 47, 440–445. Ellenbogen, A. & Hirschhorn, R. (1986) Proc. Natl. Acad. Sci. USA 83, 19. Huie, M. L., Hirschhorn, R., Chen, A., Martiniuk, F. & Zhong, N. (1994) Hum. 9641–9644. Mutat. 4, 291–293. 7. Martiniuk, F., Mehler, M., Tzall, S., Meredith, G. & Hirschhorn, R. (1990) 20. Mignone, F., Gissi, C., Liuni, S. & Pesole, G. (2002) Genome Biol. 3 (Feb. 28), DNA Cell Biol. 9, 85–94. http:͞͞genomebiology.com͞search͞results.asp. 8. Hoefsloot, L. H., Hoogeveen-Westerveld, M., Kroos, M. A., van Beeumen, J., 21. Mount, S. M. (2000) Am. J. Hum. Genet. 67, 788–792. Reuser, A. J. J. & Oostra, B. A. (1988) EMBO J. 7, 1697–1704. 22. Hanis, C. L., Boerwinkle, E., Chakraborty, R., Ellsworth, D. L., Concannon, P., 9. Chantret, I., Lacasa, M., Chevalier, G., Ruf, J., Islam, I., Mantei, N., Edwards, Stirling, B., Morrison, V. A., Wapelhorst, B., Spielman, R. S., Gogolin-Ewens, Y., Swallow, D. & Rousset, M. (1992) Biochem. J. 285, 915–923. K. J., et al. (1996) Nat. Genet. 13, 161–166. 10. Wu, G. D., Wang, W. & Traber, P. G. (1992) J. Biol. Chem. 267, 7863– 23. Cox, N. J., Frigge, M., Nicolae, D. L., Concannon, P., Hanis, C. L., Bell, G. I. 7870. & Kong, A. (1999) Nat. Genet. 21, 213–215. 11. Nichols, B. L, Eldering, J., Avery, S., Hahn, D., Quaroni, A. & Sterchi, E. (1998) 24. Horikawa, Y., Oda, N., Cox, N. J., Li, X., Ohro-Melander, M., Hara, M., J. Biol. Chem. 273, 3076–3081. Hinokio, Y., Lindner, T. H., Mashima, H., Schwarz, P. E., et al. (2000) Nat. 12. Trombetta, E. S., Simons, J. F. & Helenius, A. (1996) J. Biol. Chem. 271, Genet. 26, 163–175. 27509–27516. 25. Fullerton, S. M., Bartoszewicz, A., Ybazeta, G., Horikawa, Y., Bell, G. I., Kidd, 13. Trombetta, E. S., Fleming, K. G. & Helenius, A. (2001) Biochemistry 35, K. K., Cox, N. J., Hudson, R. R. & Di Rienzo, A. (2002) Am. J. Hum. Genet. 10717–10722. 70, 1096–1106.

13646 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.202383599 Hirschhorn et al. Downloaded by guest on September 29, 2021