Cell, Vol. 81, 15-25, April 7, 1995, Copyright © 1995 by Cell Press

A Cluster of Sulfatase Genes on Xp22.3: Mutations in Chondrodysplasia Punctata (CDPX) and Implications for Warfarin Embryopathy

Brunella Franco,1,2 Germana Meroni,1,2 Giancarlo Parenti,3 Jacqueline Levilliers,4 Loris Bernard,1,2 Marinella Gebbia,1,2 Liza Cox,1,2 Pierre Maroteaux,5 Leslie Sheffield,6 Gudrun A. Rappold,7 Generoso Andria,3 Christine Petit,4 and Andrea Ballabio1,2,8

1 Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
2 Present address: Telethon Institute of Genetics and Medicine (TIGEM), San Raffaele Biomedical Science Park, Via Olgettina 58, 20132 Milano, Italy
3 Department of Pediatrics, University of Naples, 80134 Naples, Italy
4 Unité de Génétique Moléculaire Humaine, Centre National de la Recherche Scientifique, Unité A1445, Institut Pasteur, 75724 Paris Cedex 15, France
5 Hôpital des Enfants Malades, 75743 Paris Cedex 15, France
6 The Murdoch Institute, Royal Children Hospital, Melbourne, Victoria 3050, Australia
7 Institute for Humangenetik, Universität Heidelberg, 69120 Heidelberg, Federal Republic of Germany
8 Department of Molecular Biology, University of Siena, 53100 Siena, Italy

Summary

X-linked recessive chondrodysplasia punctata (CDPX) is a congenital defect of bone and cartilage development characterized by aberrant bone mineralization, severe underdevelopment of nasal cartilage, and distal phalangeal hypoplasia. A virtually identical phenotype is observed in the warfarin embryopathy, which is due to the teratogenic effects of coumarin derivatives during pregnancy. We have cloned the genomic region within Xp22.3 where the CDPX gene has been assigned and isolated three adjacent genes showing highly significant homology to the sulfatase gene family. Point mutations in one of these genes were identified in five patients with CDPX. Expression of this gene in COS cells resulted in a heat-labile arylsulfatase activity that is inhibited by warfarin. A deficiency of a heat-labile arylsulfatase activity was demonstrated in patients with deletions spanning the CDPX region. These data indicate that CDPX is caused by an inherited deficiency of a novel sulfatase and suggest that warfarin embryopathy might involve drug-induced inhibition of the same enzyme.

Introduction

The sulfatases are a group of enzymes that hydrolyze sulfate ester bonds in a wide variety of structurally different compounds, ranging from complex molecules, such as glycosaminoglycans and sulfolipids, to 3β-hydroxysteroid sulfates (Ballabio and Shapiro, 1995). Nine different human sulfatases have been described and characterized biochemically (for reviews see Ballabio and Shapiro, 1995; Kolodny and Fluharty, 1995; Neufeld and Muenzer, 1995). The importance of these proteins in human metabolism is emphasized by the presence of seven distinct inherited disorders resulting from specific sulfatase deficiencies (Kolodny and Fluharty, 1995; Neufeld and Muenzer, 1995).

The evidence that sulfatases are members of a family of related proteins, probably sharing a common ancestor, is both structural and functional. The cloning and sequencing of six human sulfatase cDNAs (Ballabio et al., 1987; Yen et al., 1987; Robertson et al., 1988; Stein et al., 1989a; Peters et al., 1990; Schuchman et al., 1990; Wilson et al., 1990; Tomatsu et al., 1991), of a sea urchin arylsulfatase cDNA (Sasaki et al., 1988), and of Escherichia coli (Daniels et al., 1992) and Klebsiella aerogenes (Murooka et al., 1990) arylsulfatases have revealed a high degree of homology along the entire length of these proteins, particularly in the N-terminal regions, suggesting that specific motifs in the amino acid sequence are important for sulfatase activity (Wilson et al., 1990). Additional evidence that all sulfatases are functionally related comes from the study of a rare and intriguing genetic disease named multiple sulfatase deficiency (MSD), in which all known sulfatases are deficient (Kolodny and Fluharty, 1995). This autosomal recessive disorder may be due to an abnormality in a posttranslational event common to all sulfatases.

Three of the previously described human sulfatases, known as arylsulfatases A, B, and C (ARSA, ARSB, and ARSC), are able to hydrolyze sulfated artificial substrates containing a phenolic ring, such as p-nitrocatechol sulfate or 4-methylumbelliferyl sulfate (4-MU sulfate) (Kolodny and Fluharty, 1995). ARSC is more specifically known as steroid sulfatase (STS) because of its ability to hydrolyze steroid sulfates. While ARSA, ARSB, and ARSC each act on a distinct natural substrate, they all have a "promiscuous" activity against artificial substrates (arylsulfatase activity). ARSA and ARSB are lysosomal in their subcellular localization and have acidic pH optima, while ARSC is microsomal and has a neutral to alkaline pH optimum (Ballabio and Shapiro, 1995).

The term "chondrodysplasia punctata" refers to a group of skeletal dysplasias characterized by abnormal calcium
deposition in regions of enchondral bone formation. This abnormality results in a peculiar radiological finding, commonly referred to as "stippling" of epiphyses or "paint-spattered" calcifications. These calcifications tend to disappear within the first few years of life as bone development progresses. Chondrodysplasia punctata is heterogeneous both at the genetic and clinical levels, and only a few distinct entities with different patterns of inheritance and variable degrees of severity appear to be fairly well characterized (Spranger et al., 1970). A specific type of chondrodysplasia punctata is associated with nullisomy of the distal short arm of the human X chromosome (Xp22.3) in males with deletions and translocations involving this region (Curry et al., 1984). These patients have contiguous gene syndromes due to the involvement of adjacent disease genes on Xp22.3 (Ballabio and Andria, 1992). Affected individuals display facial dysmorphism characterized by severe nasal hypoplasia, depressed nasal bridge, short nasal septum, and a deep groove between the nasal alae and the tip. They also have short stature and distal phalangeal hypoplasia, the latter being particularly evident by radiological examination. Punctate stippling of multiple epiphyseal centers is observed in the neonatal and early infancy periods and usually disappears with age. There is strong genetic evidence that this mild and rather distinct type of chondrodysplasia punctata, referred to as X-linked recessive chondrodysplasia punctata (CDPX; MIM302950), is due to the loss of function of a specific gene on Xp22.3 (for review see Ballabio and Andria, 1992).

A phenotype clinically indistinguishable from that of CDPX has also been described in a few patients with no visible or molecularly detectable chromosomal abnormality (Sheffield et al., 1976; Maroteaux, 1989). Although some of these patients may carry point mutations in the putative CDPX gene, as most of them lack a positive family history, the possibility of genetic heterogeneity for the CDPX phenotype should be considered.

Interestingly, CDPX shows remarkable phenotypic similarities to two well-characterized disease entities involving vitamin K metabolism and function: warfarin embryopathy and vitamin K epoxide reductase deficiency. Warfarin embryopathy is caused by the administration of warfarin, an anticoagulant drug, during a critical period of 6-9 weeks of pregnancy (for review see Hall et al., 1980). Vitamin K epoxide reductase deficiency, also known as pseudowarfarin embryopathy, is a rare autosomal recessive disorder affecting the recycling of vitamin K (Pauli et al., 1987; Pauli, 1988). Both of these disorders display abnormalities of cartilage and bone development that are strikingly similar to those observed in CDPX. The phenotypic similarities among CDPX, warfarin embryopathy, and vitamin K epoxide reductase deficiency suggest the possibility that genetic and drug-induced defects underlying these syndromes may affect the same metabolic pathway.

Over 50 patients carrying deletions and translocations involving the Xp22.3 region and displaying features of CDPX have been described (for review see Ballabio and Andria, 1992). Molecular genetic analysis of some of these patients has defined the critical genomic region that must contain the CDPX gene (Wang et al., 1995). In this study, we report the isolation from this region of a cluster of genes showing a significant homology to all members of the sulfatase gene family. Point mutations in one of these genes, named arylsulfatase E (ARSE), were identified in five patients with CDPX, indicating that this disorder is caused by a sulfatase deficiency. Transfection of ARSE cDNA into COS cells results in an arylsulfatase activity that is inhibited by warfarin, providing insight into the pathogenesis of warfarin embryopathy.

Results

Identification of Three Novel Sulfatase Genes on Xp22.3

In a recent study, we described the assignment of the CDPX gene to a specific genomic segment of 600 kb located approximately 150 kb centromeric to the pseudoautosomal boundary on Xp22.3 (Wang et al., 1995). Overlapping yeast artificial chromosome (YAC) clones spanning this entire region were isolated and partially characterized. Two YAC clones, NB6F12 and 455B10, were purified by pulsed-field gel electrophoresis (PFGE) and hybridized to an arrayed X chromosome-specific cosmid library obtained from Lawrence Livermore laboratories, as previously described (Wapenaar et al., 1994). Two groups of overlapping cosmid clones were identified and used in exon trapping experiments in order to identify candidate genes for CDPX (Figure 1). DNA from cosmids 18, 76, 75, 48, 32, 66, 58, 96, and 100 was subcloned into the pSPL3 vector (Church et al., 1994), and exon amplification products were blotted and hybridized to the corresponding cosmid DNA. Several putative exons were identified using this strategy and were characterized by reverse transcription-polymerase chain reaction (RT-PCR) using RNAs from several tissues (data not shown), by sequence analysis, and by hybridization to the DNA from patients BA169 and BA311. Five different exons were found to map back to the CDPX critical region and to be expressed based on RT-PCR. Their position with respect to YAC and cosmid clones from the region is shown in Figure 1.

The sequence of these five exons was determined, and nonredundant DNA and protein data bases were searched using the BLAST-X, BLAST-P, and TBLAST-N algorithms. Striking similarities were identified with all previously described members of the sulfatase gene family, including sulfatases from distantly related species such as bacteria. The highest homology scores for each of the five exons were with the STS (ARSC) sequence.

The relative position of the five exons in the cosmid map and the alignment with previously described sulfatase sequences (STS in particular) led us to conclude that these exons must be derived from at least two distinct genes. This was confirmed by subsequent cDNA identification and characterization. PCR products from exons 76.15, 18.12, and 48.7 were hybridized to a human kidney cDNA library. Several overlapping cDNA clones from two distinct transcription units were isolated: exons 76.15 and 18.12 identified 6 cDNA clones, while exon 48.7 identified 21 clones.

A consensus full-length sequence was determined for
both transcripts. Owing to their high degree of homology with arylsulfatases and based on the results of functional analyses (see below), these two genes were named ARSD and ARSE. The orientation of transcription of the genes is centromere to telomere, as shown in Figure 1. Both genes appear to have a single copy located in the Xp22.3 region. However, some cDNA fragments detect Y chromosome-specific bands by Southern blot analysis (data not shown). A detailed characterization of the genomic structure of the two genes will be reported elsewhere.

Figure 1. Physical Mapping of the CDPX Critical Region
The CDPX critical region is defined by the breakpoints of patients BA311 and BA169 (Wang et al., 1995). YAC clones are shown as thick bars. Closed circles show the position of single-copy probes used for YAC identification and for the assembly of the YAC contig. Thin bars indicate cosmid clones used for the exon amplification experiments. Putative exons are depicted as closed boxes. The position and orientation of transcription of the three novel sulfatase genes isolated from the region are shown at the bottom. The orientation of transcription of the ARSF gene is unknown. PABX, pseudoautosomal boundary on the X chromosome.
[Labels in the figure: CDPX region (~600 kb); PABX; Xpter; YACs 19H12 (440 kb), NB6F12 (320 kb), 455B10 (290 kb), and 481C6 (290 kb); cosmids 18, 76, 75, 48, 32, 66, 58, 96, and 100; trapped exons 18.12, 18.3, 76.15, 48.12, and 48.7; genes ARSD, ARSE, and ARSF.]

Hybridization of cDNA probes to cosmid clones 18, 76, 75, and 48 (see Figure 1) from the CDPX critical region revealed that the two genes did not cross-hybridize to each other, with the exception of a small region in the vicinity of their 3' ends. Probes spanning this region of the cDNA also hybridized to cosmids 58 and 96, suggesting the presence of a third homologous gene. To investigate this hypothesis, cosmid 96 was digested with Sau3A and subcloned into a plasmid vector. Plasmid clones hybridizing to the cDNA clones were sequenced, and a region of high homology to sulfatase cDNAs was identified. This region of homology was flanked by consensus splice sequences, indicating that it corresponded to an exon. Preliminary data obtained by both RT-PCR and Northern blot analysis indicate that this exon belongs to a transcribed gene, which we named ARSF because of the extremely high degree of homology with ARSD and ARSE (Figure 2).

Analysis of ARSD and ARSE cDNA Sequences

The ARSD consensus cDNA sequence (1945 bp) was assembled by analysis of sequences from four cDNA clones and partly from the corresponding genomic clones. The authenticity of the 5' end was validated by 5'-RACE-PCR experiments and by sequencing the corresponding genomic region. The putative initiation codon was identified at position 77 and the first in-frame stop codon (TGA) at nucleotide 1856, resulting in a predicted protein of 593 amino acids. The ARSE consensus cDNA sequence (1858 bp) was assembled from the analysis of 13 cDNA clones and of several exon-containing genomic clones. The authenticity of the 5' end was validated by comparing the sequence of several overlapping cDNA clones and by sequencing the corresponding genomic region. The putative initiation codon was identified at nucleotide 68, a polyadenylation site AATAAA at position 1833, and the first in-frame stop codon at 1835, resulting in a predicted protein of 589 amino acids. Interestingly, this stop codon (TAA) is part of the polyadenylation site, resulting in an extremely short 3' untranslated region.

Figure 2 shows the results of the alignment of the ARSD, ARSE, and ARSF predicted protein products with previously reported glucosamine-6-sulfatase (G6S), iduronate sulfatase (IDS), N-acetylgalactosamine-6-sulfatase (Gal6S), ARSA, ARSB, and ARSC/STS proteins. The homology is particularly high near the N-terminus, where a putative active site of some of these enzymes is thought to reside (Wilson et al., 1990). A region with low amino acid identity, corresponding to a transmembrane domain in ARSC/STS (Stein et al., 1989b), is found in G6S, IDS, ARSC/STS, ARSD, and ARSE and is absent in Gal6S, ARSA, and ARSB. The highest identity scores were found when comparing ARSD with ARSE and when comparing these two with ARSC/STS, indicating that these three genes are much more closely related to each other than to the other members of the family.

Figure 2. Homology among ARSD, ARSE, and ARSF Predicted Protein Products and Previously Reported Human Sulfatases
G6S, glucosamine-6-sulfatase; IDS, iduronate sulfatase; Gal6S, N-acetylgalactosamine-6-sulfatase; ARSA, arylsulfatase A; ARSB, arylsulfatase B; ARSC, arylsulfatase C/STS. Amino acids that are identical at least in ARSC, ARSD, ARSE, and ARSF are written in white on a black background. Only partial sequence information is available for ARSF. The position of the five mutations in the ARSE gene found in patients with CDPX is indicated by a star below the sequence.

ARSD and ARSE Expression Pattern

Northern blot analysis of ARSD and ARSE mRNAs revealed different expression patterns for these two genes. ARSD cDNA probes detected both a 5 kb transcript and a much less abundant 3.5 kb transcript in pancreas, kidney, liver, lung, placenta, brain, and heart (Figure 3A). Pancreas and kidney showed relatively higher levels of expression. Under the same conditions, probes from the ARSE gene detect a very abundant 2.2 kb and a less abundant 4 kb transcript (Figure 3B). ARSE mRNA is much more abundant compared with ARSD (Figures 3A and 3B have different exposure times) and appears to have a more discrete tissue distribution, being expressed exclusively in pancreas, liver, and kidney among all the tissues tested. Preliminary data indicate that the different forms of ARSD and ARSE transcripts are generated by the use of different polyadenylation sites, resulting in different lengths of the 3' untranslated region.

Detection of ARSD and ARSE mRNAs in cultured cell lines by RT-PCR allowed us to test the X-inactivation status of these two genes. Reverse-transcribed RNA samples from a male control and a hybrid cell line retaining the inactive X chromosome were amplified by PCR using oligonucleotide primers generated from the cDNA sequences. As shown in Figure 3C, both ARSD and ARSE mRNAs were detected in the male control and in the hybrid cell line, indicating that these genes are expressed by both the active and the inactive X chromosomes, thus escaping X-inactivation.

Searching for Mutations in Patients with CDPX

Twenty-seven unrelated male patients with radiological and/or clinical features of CDPX, but with no reported chromosomal abnormality involving the Xp22.3 region, were tested for small rearrangements or point mutations in the ARSD and the ARSE genes. Full-length cDNAs from these genes were used as probes for Southern blot analysis on EcoRI-, HindIII-, and TaqI-digested genomic DNA from patients. A normal restriction pattern was detected in all
patients. To test for the presence of point mutations, the sequence of most intron-exon boundaries of both ARSD and ARSE genes was determined, and oligonucleotide primers were generated to amplify single exons. Amplified exons were tested by single strand conformation polymorphism (SSCP), and several band mobility shifts were identified. Each of the exons in which a band mobility shift was identified was tested by SSCP on 60 female controls (a total of 120 normal X chromosomes) to exclude a polymorphism. A total of 4 out of 10 exons were tested for ARSD and 9 out of 11 exons for ARSE.

Figure 3. Northern Blot and RT-PCR Analyses of the ARSD and ARSE Genes
(A) Poly(A)+ RNA from multiple tissues hybridized to probe 76.15, one of the exon amplification products from the ARSD gene. A ~5 kb transcript abundant in kidney and pancreas but present also in liver, lung, placenta, brain, and heart is detected. A much less abundant transcript of ~3.5 kb is also visible in the same tissues.
(B) A different Northern blot containing RNA samples from the same tissues was hybridized to probe 48.7, an exon amplification product from the ARSE gene. A very abundant 2.2 kb transcript and a less abundant 4 kb transcript are detected in pancreas, kidney, and liver.
(C) X inactivation studies of ARSD and ARSE genes. Oligonucleotide primers from ARSD and ARSE cDNAs were used for the amplification of reverse-transcribed total RNA from a normal male fibroblast cell line (XY), from a human/hamster hybrid cell line retaining the human inactive X chromosome (Xi), and from a hamster cell line. Plus and minus indicate presence and absence of reverse transcriptase in the RT reaction. ARSD and ARSE primers amplify a 148 bp and a 450 bp product, respectively, from both the active and inactive X chromosome (left panels). As controls, oligonucleotide primers from MIC2 gene, known to escape X inactivation, and PGK1 gene that is expressed only by the active X chromosome were used to amplify the same templates (right panels).
[Labels in the figure: (A) ARSD, 10 day exposure; (B) ARSE, 20 hr exposure; size markers at 9.5, 7.5, 4.4, 2.4, and 1.4 kb; (C) panels for ARSD, ARSE, PGK1, and MIC2.]

Five unrelated patients (PAR492, PAR484, BA439, BA440, and NAN514) were found to have a band mobility shift in ARSE exons not present in any of the controls. In all five cases, missense mutations were identified by direct sequence analysis using oligonucleotides from the flanking introns. Types and positions of the mutations are listed in Table 1. The absence of these mutations in 120 normal X chromosomes suggests that they are responsible for CDPX. Figure 4 shows the sequence analysis of exon 4 of the ARSE gene in patient PAR492 in whom the G137V mutation was identified. This mutation is of particular interest, since the Gly at position 137 is conserved in all sulfatases and a mutation in the ARSB gene causing an identical amino acid substitution has been described in a patient with mucopolysaccharidosis VI (Maroteaux-Lamy syndrome) (Wicker et al., 1991).

Table 1. ARSE Gene Mutations in Patients with CDPX

Patient   Nucleotide     Amino Acid        Exon   Primers Used for SSCP Analysis      Annealing
          Substitution   Change                                                       Temperature
BA440     G → C          Arg-12 → Ser      2      F 5'-catacacaagttctgcaaat8-3'       51°C
                                                  R 5'-cagtggcttctctttcttc-3'
NAN514    G → A          Gly-117 → Arg     4      F 5'-ccaagtgttctatttttg-3'          48°C
                                                  R 5'-ccaaactctttagctgaatg-3'
BA439     G → C          Arg-111 → Pro     4      F 5'-ccaagtgttctatttttg-3'          48°C
                                                  R 5'-ccaaactctttagctgaatg-3'
PAR492    G → T          Gly-137 → Val     4      F 5'-ccaagtgttctatttttg-3'          48°C
                                                  R 5'-ccaaactctttagctgaatg-3'
PAR484    G → C          Gly-245 → Arg     5      F 5'-cgacatgtccagttatgctcag-3'      58°C
                                                  R 5'-cattccaaagctgagt@cttg-3'

Figure 4. Sequence Analysis of the G137V Mutation
Genomic DNA from patient PAR492 (P) and from a normal individual (WT) was amplified using primers flanking exon 4 (see Table 1) and sequenced. Sequencing reactions were loaded in the following order: A, C, G, T. The arrows indicate the position of the G to T transversion. A comparison between patient and wild-type sequences in the region where the G137V mutation is located is shown at the bottom.

Biochemical Characterization of ARSE Gene Product in COS Cells

Transient expression of ARSE in COS cells was obtained by subcloning the cDNA in the pcDL-SRα296 vector (Takebe et al., 1988). As negative controls, COS cells were also transfected in parallel with vector alone and with a construct carrying the ARSE cDNA in the antisense orientation. Expression of the ARSC/STS cDNA, subcloned in the same vector, was performed in parallel for comparison. The results of enzymatic determinations performed at neutral pH using COS cell extracts 48-76 hr after transfection are shown in Table 2. They indicate that the ARSE gene product is an arylsulfatase, as demonstrated by the 6- to 8-fold increase of enzymatic activity observed when 4-MU sulfate was used as a substrate. Unlike ARSC/STS, ARSE has no activity toward steroid sulfates such as dehydroepiandrosterone sulfate (DHEAS) and estrone sulfate (ES). ARSE activity can be distinguished from ARSC since it is not competitively inhibited by DHEAS (Figure 5A) and is more heat labile (Figure 5B).

Table 2. Expression of ARSC and ARSE in COS Cells

Construct                           4-MU Sulfate(a)   DHEA Sulfate(b)   Estrone Sulfate(b)
pcDL-SRα296                         3.3               65                293
pcDL-SRα296-18- (ARSE antisense)    3.0               268               378
pcDL-SRα296-TA5B (ARSC/STS)         66.0-93.5         9,758             31,847
pcDL-SRα296-12+ (ARSE sense)        19.5-25.7         172               387

(a) Activities expressed as nmoles/mg/hr.
(b) Activities expressed as pmoles/mg/hr.

ARSE Is Inhibited by Warfarin

Because of the phenotypic similarities of CDPX and warfarin embryopathy, we speculated that warfarin might interfere with ARSE activity. Figure 5C shows ARSE and ARSC activities in transfected COS cells in the presence of increasing warfarin concentrations. A significant decrease of ARSE activity was observed starting at concentrations between 10 and 50 µM and reaching a complete inhibition at 5 mM. ARSC activity remained constant even at the highest warfarin concentrations. These data indicate that ARSE is inhibited by warfarin.

Arylsulfatase Activity in CDPX

Based on the above described data, one would expect that patients with CDPX lack an arylsulfatase activity that is biochemically distinct from the other arylsulfatases. To test this hypothesis, we have developed a specific assay to measure ARSE activity in fibroblasts from these patients. A major problem in developing such a test is the interference of ARSC activity at neutral pH. Therefore, the enzymatic assay was run in the presence of DHEAS, a natural substrate of ARSC/STS protein, which competitively inhibits ARSC. Under these conditions, approximately 70%-90% of the measurable arylsulfatase activity is heat labile. Figure 6 shows the results of enzymatic assays performed in control and patient fibroblasts. The heat-labile arylsulfatase activity ranged from 350-850 pmol/mg/hr in six control cell lines and in fibroblasts from four patients with isolated X-linked ichthyosis (BA65, BA67, BA88, and BA16) who have a deletion of the ARSC/STS gene. This heat-labile enzymatic activity represents a tiny portion of the total neutral arylsulfatase activity and, therefore, can be easily overlooked under the assay conditions commonly used for ARSC measurement. We have tested fibroblasts from a patient (BA311) with a deletion of the CDPX critical region and from three patients (BA20, BA424, and BA423) with large Xp22.3 deletions spanning both the CDPX critical region and the ARSC/STS gene (Schaefer et al., 1993). In all of these patients, as well as in two patients with MSD (BA426 and BA428), we have found a profound deficiency of a heat-labile DHEAS-resistant arylsulfatase activity. Cell line BA311 has a complete deficiency of a heat-labile arylsulfatase activity, while both ARSC and STS activities are normal. Conversely, in patients affected by isolated X-linked ichthyosis, heat-labile arylsulfatase activity is normal (Figure 6).

The combination of these results points to the existence of a neutral arylsulfatase activity that is deficient in patients with deletions of the CDPX critical region and is distinct from ARSC. Genetic and biochemical data indicate that this neutral arylsulfatase is the product of the ARSE gene. However, we cannot exclude the possibility that ARSD and ARSF gene products may contribute to this activity. Biochemical characterization of ARSD and ARSF gene products and the study of patients with point mutations in the ARSE gene might help to clarify this issue. Unfortunately, fibroblasts from these patients are not available at this time.

Discussion

A Cluster of Sulfatase Genes on Xp22.3

We have identified three novel genes on the X chromosome sharing homology with all members of the sulfatase gene family: ARSD, ARSE, and ARSF. Unlike all other members of this family, these genes were isolated by positional cloning, a strategy that does not rely on the knowledge of the gene product. This approach often leads to the discovery of novel metabolic pathways that had not been predicted by classical biochemical studies.

ARSD, ARSE, and ARSF are positioned in tandem in a small region on Xp22.3. Sequence and structural similarities among these genes suggest that they were generated by ancient duplications. Duplications are not unprecedented in this region of the human genome. For example, evidence has been presented that tandem duplications in the pseudoautosomal region resulted in the presence of
homologous loci MIC2, MIC2R, and PBDX (Smith and Goodfellow, 1994). An obligatory crossing-over event occurs in this region of the X and Y chromosomes at each male meiosis, and this extremely high frequency of recombination may render this region more prone to unequal crossing-over, leading to deletions and duplications. The ARSD, ARSE, and ARSF loci are located only 150 kb from the pseudoautosomal boundary. We propose that this cluster of three genes was generated by unequal crossing-over events that occurred in an ancestral pseudoautosomal region. The repositioning of the pseudoautosomal boundary more telomeric on Xp may have allowed the X and Y copies of these genes to diverge, with subsequent loss of function of the Y copies. A similar mechanism was proposed for the STS and KAL genes (Yen et al., 1988; del Castillo et al., 1992; Incerti et al., 1992). The finding that at least two of these novel genes (ARSD and ARSE) escape X inactivation, as well as the detection of homologous loci on the Y chromosome, supports this hypothesis.

Figure 5. Biochemical Studies of ARSE Expressed in COS Cells
Homogenates from COS cells transfected with either ARSE or with ARSC/STS cDNAs were used in three different sets of experiments.
(A) Competitive inhibition by DHEAS. Neutral arylsulfatase assays were run as described in the Experimental Procedures either with, or without, adding DHEAS to the incubation mix. No competition was observed on ARSE activity (86.0% activity with DHEAS), while ARSC was almost completely inhibited (4.6% activity).
(B) Time course of heat inactivation of ARSE. COS cell extracts were incubated at 50°C for the times indicated and used for neutral arylsulfatase activity determinations. ARSE activity was almost completely inactivated within 10 min of incubation at 50°C, while ARSC activity remained constant after incubation times up to 60 min.
(C) Warfarin inhibition of ARSE. Neutral arylsulfatase assay was run in the presence of increasing concentrations of warfarin. The average values of three separate measurements and the standard deviations are shown. ARSE activity inhibition was observed starting at concentrations of 10 µM and was almost complete at 5 mM (15% activity). ARSC was not affected by the presence of warfarin in the incubation mixture.
[Axes in the figure: (A) DHEAS concentration (µM); (B) minutes at 50°C; (C) warfarin concentrations (µM).]

Figure 6. Enzymatic Analysis in Fibroblasts from Patients
Enzymatic determinations were performed in six normal individuals, four patients with XLI carrying a deletion of the ARSC/STS gene, one patient with a deletion of the CDPX critical region, two patients with MSD, and three patients with Xp22.3 deletions spanning both the ARSC/STS gene and the CDPX critical region. ARSC activity is expressed as nmol 4-MU liberated/mg protein/hr, STS as pmol DHEA liberated/mg protein/hr, and heat-labile arylsulfatase as pmol 4-MU liberated/mg protein/hr. The average values of three separate measurements and the standard deviations are shown. A profound deficiency of a heat-labile neutral arylsulfatase activity was observed in patients with deletions spanning the CDPX critical region. In patients with XLI, this activity is normal. The CDPX patient has normal ARSC and STS activities, while both activities are deficient in MSD and XLI/CDPX patients.
[Groups in the figure: controls (n=6), XLI (n=4), CDPX (n=1), MSD (n=2), XLI/CDPX (n=3).]

The identification of three genes from the CDPX critical region encoding putative sulfatases suggested the possibility that CDPX was due to a sulfatase deficiency. The observation of at least two patients with MSD who had typical radiological and facial features of CDPX, one presenting at birth (Burch et al., 1986) and the other who died in utero (J. Hopwood, personal communication), strongly supports this hypothesis. As features of CDPX are relatively mild, compared with the other features of MSD, and tend to fade with age, they may have been overlooked in other cases with MSD.

The ARSE Gene Is Mutated in CDPX

As ARSD, ARSE, and ARSF genes are all located within the CDPX critical region, it was difficult to establish which one of these three genes was specifically involved in the pathogenesis of CDPX. The presence of substantial sequence divergence among these three genes, as shown

by lack of cross-hybridization, suggests that they may have a different function as the result of a different substrate specificity. This is also supported by the presence of different expression patterns, ARSD being quite ubiquitously expressed and ARSE being expressed exclusively in liver, pancreas, and kidney.

We searched for point mutations in both ARSD and ARSE genes in a heterogeneous group of 27 unrelated male patients displaying features of CDPX. Five missense mutations in the ARSE gene were identified. One can suggest that these mutations are related to the occurrence of CDPX on the basis of the following observations. None of the mutations were found in any of 60 normal female individuals, corresponding to 120 normal X chromosomes. Some of the mutations resulted in nonconservative amino acid substitutions, such as the R111P mutation changing a basic amino acid to a (turn inducer) neutral hydrophobic one, or the G137V changing a polar amino acid to a hydrophobic one. This latter mutation, located in a highly conserved domain, substitutes a glycine residue that is found at the same position in all previously described sulfatases in human and other species, including E. coli (Daniels et al., 1992) and K. aerogenes (Murooka et al., 1990). Furthermore, a mutation in a different sulfatase gene, ARSB, resulting in a glycine to valine substitution at exactly the same position was described by Wicker et al. (1991) in a patient with mucopolysaccharidosis VI (Maroteaux-Lamy syndrome). This mutation severely impaired the processing and stability of ARSB precursor polypeptides that are not converted to mature ARSB and undergo rapid proteolytic degradation, as shown by the analysis of fibroblasts from the patient and by the expression of this mutation in LTK- cells. These authors speculated that the glycine at position 137 is of crucial importance for the secondary structure of the arylsulfatases, since it appears to be located at a junction between an α helix and a β sheet (Wicker et al., 1991).

These data implicate ARSE in the pathogenesis of CDPX. The fact that we could detect ARSE mutations in only a fraction of all patients with the CDPX phenotype referred to us for molecular analysis would be consistent with the presence of genetic heterogeneity for CDPX. Since only 9 out of 11 exons of the ARSE gene have been screened for mutations, additional mutations may be found in the remaining two exons (exons 1 and 2). Furthermore, we cannot exclude the possibility that mutations in the ARSD or ARSF genes may also cause CDPX.

CDPX Due to ARSE Deficiency

Previous studies reported data suggesting the presence of two variants of ARSC based on both electrophoresis and immunoprecipitation studies. These two variants were designated F (faster migrating) and S (slowly migrating). Only the latter of these two variants appeared to correspond to STS, the other being a separate entity (Chang et al., 1990). We have identified a gene encoding an arylsulfatase with the same biochemical features and tissue distribution as the previously reported "F form" of ARSC. Expression of ARSE cDNA in COS cells demonstrated that ARSE is heat labile, has a neutral pH optimum (data not shown), is inactive against steroid sulfates, and is not competitively inhibited by them. Enzyme analysis of fibroblasts from patients with CDPX carrying deletions spanning the ARSE gene and from patients with MSD demonstrated a deficiency of a heat-labile arylsulfatase activity that, most likely, corresponds to the product of the ARSE gene. Together, these data indicate that CDPX is an inborn error of metabolism due to deficiency of ARSE.

Physiological Role of ARSE and the Warfarin Embryopathy

Warfarin, a coumarin derivative that inhibits vitamin K reductase activity (Fasco et al., 1982), interferes with normal cartilage and bone development. Warfarin embryopathy, a well-known human teratologic syndrome that is the consequence of maternal use of warfarin between 6 and 9 postconceptual weeks (Hall et al., 1980), results in skeletal and facial abnormalities strikingly similar to those of CDPX (Pauli, 1988). Rats treated with warfarin develop skeletal changes characterized by excessive mineralization and growth plate closure of long bones, associated with snout dysmorphisms (Price, 1985; Howe and Webster, 1992). A phenocopy of CDPX and of warfarin embryopathy is the congenital deficiency of vitamin K epoxide reductase, the enzyme recycling vitamin K epoxide to vitamin K (Pauli et al., 1987). These observations indicate the importance of vitamin K in bone and cartilage development. This is further emphasized by the involvement of vitamin K in the carboxylation of bone Gla protein and of matrix Gla protein, two essential components of bone and cartilage (Price, 1985). The striking similarities among CDPX, warfarin embryopathy, and vitamin K epoxide reductase deficiency phenotypes and the evidence that warfarin inhibits ARSE both suggest that these disorders are due to abnormalities in the same metabolic pathway. However, since warfarin concentrations required for inhibition of ARSE activity are high, we cannot rule out the possibility that ARSE is not the direct target of the drug in warfarin embryopathy.

The physiological role of ARSE and the mechanisms by which its deficiency leads to the skeletal changes typical of CDPX are not understood. A major unresolved issue is the identity of the natural substrate for ARSE. The nature of the CDPX phenotype suggests that ARSE may be essential for the correct composition of cartilage and bone matrix during bone development. The importance of several lysosomal sulfatases in cartilage and bone development is evident from their involvement in the catabolic pathway of sulfated glycosaminoglycans, which are essential components of the extracellular matrix of cartilage. Accordingly, bone and cartilage abnormalities are present in other sulfatase deficiencies, such as in mucopolysaccharidoses type II, IIID, IVA, and VI (Neufeld and Muenzer, 1995). However, differences in pH optima between ARSE and lysosomal sulfatases indicate that they have different subcellular localization and, therefore, different metabolic functions. The recent finding of a sulfate transport defect in diastrophic dysplasia, an osteochondrodysplasia that is characterized by dwarfism, spinal deformation, and joint abnormalities, further emphasizes the importance of sulfate metabolism in cartilage and bone development (Hästbacka et al., 1994).

In conclusion, our data suggest that a sulfatase deficiency is responsible for bone and cartilage changes in both genetic and drug-induced human disorders. Further studies are needed to characterize the relative roles of ARSE and vitamin K in bone and cartilage development and to understand fully the metabolic pathway that is impaired in CDPX and in warfarin embryopathy.

Experimental Procedures

Patients and Cell Lines
The 27 patients (23 sporadic and 4 familial) tested for point mutations in the ARSD and ARSE genes had the typical radiological and clinical features of CDPX. Some of these patients have been described in previous reports (Sheffield et al., 1976; Maroteaux, 1989). Patient PAR492, in whom the G137V mutation was identified, corresponds to case 1 in a previous report (Maroteaux, 1989). Fibroblast cell lines BA65, BA67, BA88, and BA16 are from patients with X-linked ichthyosis (XLI) carrying a deletion of the STS gene. BA311 is from a patient with CDPX carrying an X/Y translocation involving the Xp22.3 region (Wang et al., 1995). BA428 cell line (Burch et al., 1986), a gift from Dr. Fensom, and BA426, obtained from the Human Mutant Cell Repository (cell line GM3245), are from patients with MSD. BA20, BA423, and BA424 (Schaefer et al., 1993) are from patients with terminal Xp deletions involving both the CDPX and STS loci.

Physical Mapping
We have employed a previously described strategy to identify overlapping cosmid clones using entire YACs as probes (Wapenaar et al., 1994). YAC clones spanning the CDPX region (Figure 1) are described in Wang et al. (1995). Yeast DNA preparation, screening of the X-chromosome specific cosmid library using purified DNA from YACs NB6F12 and 455B10, and construction of cosmid contigs were performed as previously described (Wapenaar et al., 1994). Two cosmid contigs, spanning 340 kb of the CDPX critical region (with a gap of 60 kb), were assembled.

Exon Amplification
Four cosmid clones belonging to the first contig (clones 16, 76, 75, and 48) and five belonging to the second (clones 32, 66, 58, 96, and 100) were used for exon amplification experiments. The cosmid clones were individually digested with BamHI-BglII, cloned in the pSPL3 vector, and used for the exon amplification experiments as previously described (Church et al., 1994).

Screening of cDNA Libraries
An adult human kidney cDNA library, oligo(dT)+ random primed (Clontech HL1123n), was plated at a density of 1 × 10^6 PFU and screened using exon amplification products as probes. Hybridization and washing conditions were as previously described (Franco et al., 1991). To validate the 5' end of ARSD transcript, 5'-RACE-PCR was performed as previously described (Frohman et al., 1988).

DNA Sequence Analysis
Consensus cDNA sequences spanning the entire coding regions of the ARSD and ARSE genes were obtained by end sequencing of overlapping cDNA clones and by primer walking. Sequencing was performed either manually, using a Sequenase Version 2.0 DNA Sequencing kit (United States Biochemical), or automatically, using an Applied Biosystem ABI 373A fluorescence sequencer. Assembly and analysis of DNA sequences were performed using the facilities of the Molecular Biology Information Resource, Department of Cell Biology, Baylor College of Medicine. Data base searches were carried out using the GCG software package and the BLAST network service from the National Center for Biotechnology Information. The multiple sequence alignment shown in Figure 2 was performed using the pattern-induced multisequence alignment (PIMA) algorithm (Smith and Smith, 1992).

mRNA Analysis
Northern blots containing human RNAs from several tissues were purchased from Clontech and hybridized to exons 76.15 and 48.7 from the ARSD and ARSE genes, respectively, using the conditions recommended by the manufacturer (Clontech). Washing conditions were 0.1× SSC, 0.2% SDS at 50°C for both probes. Fibroblasts from a normal male individual and a somatic cell hybrid (BA342) retaining the inactive X chromosome were used for the X inactivation studies. RT-PCR experiments were performed as previously described (Borsani et al., 1991). The following oligonucleotide primers were used: ARSD F, GAGGAAGGTGTGAGGCTC; ARSD R, CCTGAGCCTGCGTTCCACTG; ARSE F, GTCTCAACTGTGAGTCAG; ARSE R, AAGTTCTCCATAGTGATAA. PCR was carried out for 40 cycles. Annealing temperatures were 50°C for ARSD and ARSE primers. MIC2 and PGK1 PCR conditions and primers were as previously described (Brown et al., 1990; Franco et al., 1991).

SSCP Analysis
SSCP analysis was performed on amplified genomic DNA from patients according to a previously described method (Orita et al., 1989). Direct sequencing was carried out on PCR products purified from a 2% low melting agarose gel (Nusieve FMC), using a Sequenase 2.0 kit (United States Biochemical).

Expression of ARSC and ARSE in Eukaryotic Cells
ARSC/STS cDNA (Ballabio et al., 1987) was subcloned in EcoRI-digested plasmid pcDL-SRα296 (Takebe et al., 1988). ARSE cDNA, obtained by ligating two different cDNA clones, was blunt ended and subcloned in the same vector. COS cells were transfected with 10-20 µg of either construct by electroporation and harvested by trypsinization 48-76 hr after transfection for enzyme assays.

Enzyme Assays
Fibroblast extracts were prepared as previously described (Chang et al., 1985). COS cells were resuspended in 0.1 M Tris-HCl (pH 7.5), 1% Triton X-100, and 150 mM NaCl. Neutral arylsulfatase activity assay conditions were as follows: the incubation mixture (200 µl) contained 30-40 µg of protein, 0.2 mmol/l 4-MU sulfate, in 0.05 M Tris-HCl buffer (pH 7.5). The enzymatic assay was run at 37°C for 4 hr. The reaction was stopped by adding 1.8 ml of glycine-carbonate buffer (pH 10.7), and fluorescence was determined at 365 nm (excitation) and 460 nm (emission) on a Hoefer TKO 100 fluorometer. Warfarin inhibition analysis was performed by adding 3-(α-acetonylbenzyl)-4-hydroxycoumarin (warfarin) sodium salt (Sigma A-3430) at different concentrations to the incubation mix.

To measure heat-labile, non-STS, arylsulfatase activity, part of the fibroblast extract was inactivated for 15 min at 50°C, while the noninactivated samples were kept on ice. The assay was performed in the presence of 0.6 mmol/l DHEA sulfate (DHEAS) in the incubation mixture. The enzymatic activity was calculated as the difference between the activities of heat-inactivated and noninactivated samples. DHEA and estrone sulfatase activities were measured as previously described (Ballabio et al., 1985; Munroe and Chang, 1987).

Acknowledgments

We thank Drs. A. Beaudet, B. Casey, J. Hopwood, G. Karsenty, R. Smith, and H. Zoghbi for helpful suggestions, Drs. M. Fraccaro and G. Camerino for providing us with fibroblasts from a patient with CDPX (BA311), Dr. A. Fensom for fibroblasts from a patient with MSD (BA428), and the Baylor College of Medicine Human Genome Center for assistance. This work was supported by grants from the National Institutes of Health (NS31367-01) (A. B.), the Department of Energy (FG05-88ER60692) (A. B.), P. F. Ingegneria Genetica Consiglio Nazionale delle Ricerche (G. A.), and the Italian Telethon Foundation (A. B.).

Received December 10, 1994; revised January 27, 1995.

References

Ballabio, A., and Andria, G. (1992). Deletions and translocations involving the distal short arm of the human X chromosome: review and

hypotheses. Hum. Mol. Genet. 1, 221-227.
Ballabio, A., and Shapiro, L. J. (1995). Steroid sulfatase deficiency and X-linked ichthyosis. In The Metabolic and Molecular Bases of Inherited Disease, C. R. Scriver, A. L. Beaudet, W. S. Sly, and D. Valle, eds. (New York: McGraw-Hill), pp. 2999-3022.
Ballabio, A., Parenti, G., Napolitano, E., Di Natale, P., and Andria, G. (1985). Genetic complementation of steroid sulphatase after somatic cell hybridization of X-linked ichthyosis and multiple sulphatase deficiency. Hum. Genet. 70, 315-317.
Ballabio, A., Parenti, G., Carrozzo, R., Sebastio, G., Andria, G., Buckle, V., Fraser, N., Craig, I., Rocchi, M., Romeo, G., Jobsis, A. C., and Persico, M. G. (1987). Isolation and characterization of a steroid sulphatase cDNA clone: genomic deletions in patients with X chromosome-linked ichthyosis. Proc. Natl. Acad. Sci. USA 84, 4519-4523.
Borsani, G., Tonlorenzi, R., Simmler, M. C., Dandolo, L., Arnaud, D., Capra, V., Grompe, M., Pizzuti, A., Muzny, D., Lawrence, C., Willard, H. F., Avner, P., and Ballabio, A. (1991). Characterization of a murine gene expressed from the inactive X chromosome. Nature 351, 325-329.
Brown, C. J., Flenniken, A. M., Williams, B. R., and Willard, H. F. (1990). X-chromosome inactivation of the human TIMP gene. Nucl. Acids Res. 18, 4191-4195.
Burch, M., Fensom, A. H., Jackson, M., Pitts-Tucker, T., and Congdon, P. J. (1986). Multiple sulphatase deficiency presenting at birth. Clin. Genet. 30, 409-415.
Chang, P. L., Ameen, M., Lafferty, K. L., Varey, P. A., Davidson, A. R., and Davidson, R. G. (1985). Action of surface-active agents on arylsulfatase-C of human cultured fibroblasts. Anal. Biochem. 144, 362-370.
Chang, P. L., Mueller, O. T., Lafrenie, R. M., Varey, P. A., Rosa, N. E., Davidson, R. G., Henry, W. M., and Shows, T. B. (1990). The human arylsulfatase-C isoenzymes: two distinct genes that escape from X inactivation. Am. J. Hum. Genet. 46, 729-737.
Church, D. M., Stotler, C. J., Rutter, J. L., Murrell, J. R., Trofatter, J. A., and Buckler, A. J. (1994). Isolation of genes from complex sources of mammalian genomic DNA using exon amplification. Nature Genet. 6, 98-105.
Curry, C. J. R., Magenis, E., Brown, M., Lanman, J. T., Jr., Tsai, J., O'Lague, P., Goodfellow, P., Mohandas, T., Bergner, E. A., and Shapiro, L. J. (1984). Inherited chondrodysplasia punctata due to a deletion of the terminal short arm of an X chromosome. N. Engl. J. Med. 311, 1010-1015.
Daniels, D. L., Plunkett, G., III, Burland, V., and Blattner, F. R. (1992). Analysis of the Escherichia coli genome: DNA sequence of the region from 84.5 to 86.5 minutes. Science 257, 771-778.
del Castillo, I., Cohen-Salmon, M., Blanchard, S., Lutfalla, G., and Petit, C. (1992). Structure of the X-linked Kallmann syndrome gene and its homologous pseudogene on the Y chromosome. Nature Genet. 2, 305-310.
Fasco, M. J., Hildebrandt, E. F., and Suttie, J. W. (1982). Evidence that warfarin anticoagulant action involves two distinct reductase activities. J. Biol. Chem. 257, 11210-11212.
Franco, B., Guioli, S., Pragliola, A., Incerti, B., Bardoni, B., Tonlorenzi, R., Carrozzo, R., Maestrini, E., Pieretti, M., Taillon-Miller, P., Brown, C. J., Willard, H. F., Lawrence, C., Persico, M. G., Camerino, G., and Ballabio, A. (1991). A gene deleted in Kallmann's syndrome shares homology with neural cell adhesion and axonal path-finding molecules. Nature 353, 529-536.
Frohman, M. A., Dush, M. K., and Martin, G. R. (1988). Rapid production of full-length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer. Proc. Natl. Acad. Sci. USA 85, 8998-9002.
Hall, J. G., Pauli, R. M., and Wilson, K. M. (1980). Maternal and fetal sequelae of anticoagulation during pregnancy. Am. J. Med. 68, 122-140.
Hästbacka, J., de la Chapelle, A., Mahtani, M. M., Clines, G., Reeve-Daly, M. P., Daly, M., Hamilton, B. A., Kusumi, K., Trivedi, B., Weaver, A., Coloma, A., Lovett, M., Buckler, A., Kaitila, I., and Lander, E. S. (1994). The diastrophic dysplasia gene encodes a novel sulfate transporter: positional cloning by fine-structure linkage disequilibrium mapping. Cell 78, 1073-1087.
Howe, A. M., and Webster, W. S. (1992). The warfarin embryopathy: a rat model showing maxillonasal hypoplasia and other skeletal disturbances. Teratology 46, 379-390.
Incerti, B., Guioli, S., Pragliola, A., Zanaria, E., Borsani, G., Tonlorenzi, R., Bardoni, B., Franco, B., Wheeler, D., Ballabio, A., and Camerino, G. (1992). Kallmann syndrome gene on the X and Y chromosomes: implications for evolutionary divergence of human sex chromosomes. Nature Genet. 2, 311-314.
Kolodny, E. H., and Fluharty, A. L. (1995). Metachromatic leukodystrophy and multiple sulfatase deficiency: sulfatide lipidosis. In The Metabolic and Molecular Bases of Inherited Disease, C. R. Scriver, A. L. Beaudet, W. S. Sly, and D. Valle, eds. (New York: McGraw-Hill), pp. 2693-2741.
Maroteaux, P. (1989). Brachytelephalangic chondrodysplasia punctata: a possible X-linked recessive form. Hum. Genet. 82, 167-170.
Munroe, D. G., and Chang, P. L. (1987). Tissue-specific expression of human arylsulfatase-C isozymes and steroid sulfatase. Am. J. Hum. Genet. 40, 102-114.
Murooka, Y., Ishibashi, K., Yasumoto, M., Sasaki, M., Sugino, H., Azakami, H., and Yamashita, M. (1990). A sulfur- and tyramine-regulated Klebsiella aerogenes operon containing the arylsulfatase (atsA) gene and the atsB gene. J. Bacteriol. 172, 2131-2140.
Neufeld, E. F., and Muenzer, J. (1995). The mucopolysaccharidoses. In The Metabolic and Molecular Bases of Inherited Disease, C. R. Scriver, A. L. Beaudet, W. S. Sly, and D. Valle, eds. (New York: McGraw-Hill), pp. 2465-2494.
Orita, M., Suzuki, Y., Sekiya, T., and Hayashi, K. (1989). Rapid and sensitive detection of point mutations and DNA polymorphisms using the polymerase chain reaction. Genomics 5, 874-879.
Pauli, R. M. (1988). Mechanism of bone and cartilage maldevelopment in the warfarin embryopathy. Pathol. Immunopathol. Res. 7, 107-112.
Pauli, R. M., Lian, J. B., Mosher, D. F., and Suttie, J. W. (1987). Association of congenital deficiency of multiple vitamin K-dependent coagulation factors and the phenotype of the warfarin embryopathy: clues to the mechanism of teratogenicity of coumarin derivatives. Am. J. Hum. Genet. 41, 566-583.
Peters, C., Schmidt, B., Rommerskirch, W., Rupp, K., Zühlsdorf, M., Vingron, M., Meyer, H. E., Pohlmann, R., and von Figura, K. (1990). Phylogenetic conservation of arylsulfatases. J. Biol. Chem. 265, 3374-3381.
Price, P. A. (1985). Vitamin K-dependent formation of bone Gla protein (osteocalcin) and its function. Vitam. Horm. 42, 65-108.
Robertson, D. A., Freeman, C., Nelson, P. V., Morris, C. P., and Hopwood, J. J. (1988). Human glucosamine-6-sulfatase cDNA reveals homology with steroid sulfatase. Biochem. Biophys. Res. Commun. 157, 218-224.
Sasaki, H., Yamada, K., Akasaka, K., Kawasaki, H., Suzuki, K., Saito, A., Sato, M., and Shimada, H. (1988). cDNA cloning, nucleotide sequence and expression of the gene for arylsulfatase in the sea urchin (Hemicentrotus pulcherrimus) embryo. Eur. J. Biochem. 177, 9-13.
Schaefer, L., Ferrero, G. B., Grillo, A., Bassi, M. T., Roth, E. J., Wapenaar, M. C., van Ommen, G. J. B., Mohandas, T. K., Rocchi, M., Zoghbi, H. Y., and Ballabio, A. (1993). A high resolution deletion map of human chromosome Xp22. Nature Genet. 4, 272-279.
Schuchman, E. H., Jackson, C. E., and Desnick, R. J. (1990). Human arylsulfatase B: MOPAC cloning, nucleotide sequence of a full-length cDNA, and regions of amino acid identity with arylsulfatases A and C. Genomics 6, 149-158.
Sheffield, L. J., Danks, D. M., Mayne, V., and Hutchinson, L. A. (1976). Chondrodysplasia punctata - 23 cases of a mild and relatively common variety. J. Pediatr. 89, 916-923.
Smith, M. J., and Goodfellow, P. N. (1994). MIC2R: a transcribed MIC2-related sequence associated with a CpG island in the human pseudoautosomal region. Hum. Mol. Genet. 3, 1575-1582.
Smith, R. F., and Smith, T. F. (1992). Pattern-induced multi-sequence alignment (PIMA) algorithm employing secondary structure-dependent gap penalties for use in comparative protein modelling. Protein Eng. 5, 35-41.
Spranger, J. W., Opitz, J. M., and Bidder, U. (1970). Heterogeneity of chondrodysplasia punctata. Hum. Genet. 11, 190-212.
Stein, C., Gieselmann, V., Kreysing, J., Schmidt, B., Pohlmann, R., Waheed, A., Meyer, H. E., O'Brien, J. S., and von Figura, K. (1989a). Cloning and expression of human arylsulfatase A. J. Biol. Chem. 264, 1252-1259.
Stein, C., Hille, A., Seidel, J., Rijnbout, S., Waheed, A., Schmidt, B., Geuze, H., and von Figura, K. (1989b). Cloning and expression of human steroid-sulfatase. J. Biol. Chem. 264, 13865-13872.
Takebe, Y., Seiki, M., Fujisawa, J.-I., Hoy, P., Yokota, K., Arai, K.-I., Yoshida, M., and Arai, N. (1988). SRα promoter: an efficient and versatile mammalian cDNA expression system composed of the simian virus 40 early promoter and the R-U5 segment of human T-cell leukemia virus type 1 long terminal repeat. Mol. Cell. Biol. 8, 466-472.
Tomatsu, S., Fukuda, S., Masue, M., Sukegawa, K., Fukao, T., Yamagishi, A., Hori, T., Iwata, H., Ogawa, T., Nakashima, Y., Hanyu, Y., Hashimoto, T., Titani, K., Oyama, R., Suzuki, M., Yagi, K., Hayashi, Y., and Orii, T. (1991). Morquio disease: isolation, characterization and expression of full-length cDNA for human N-acetylgalactosamine-6-sulfate sulfatase. Biochem. Biophys. Res. Commun. 181, 677-683.
Wang, I., Franco, B., Ferrero, G. B., Chinault, A. C., Weissenbach, J., Chumakov, I., Le Paslier, D., Levilliers, J., Klink, A., Rappold, G., Ballabio, A., and Petit, C. (1995). Mapping and cloning of the critical region for X-linked recessive chondrodysplasia punctata. Genomics, in press.
Wapenaar, M. C., Schiaffino, M. V., Bassi, M. T., Schaefer, L., Chinault, A. C., Zoghbi, H. Y., and Ballabio, A. (1994). A YAC-based binning strategy facilitating the rapid assembly of cosmid contigs: 1.6 Mb of overlapping cosmids in Xp22. Hum. Mol. Genet. 3, 1155-1161.
Wicker, G., Prill, V., Brooks, D., Gibson, G., Hopwood, J., von Figura, K., and Peters, C. (1991). Mucopolysaccharidosis VI (Maroteaux-Lamy Syndrome). J. Biol. Chem. 266, 21386-21391.
Wilson, P. J., Morris, C. P., Anson, D. S., Occhiodoro, T., Bielicki, J., Clements, P. R., and Hopwood, J. J. (1990). Hunter syndrome: isolation of an iduronate-2-sulfatase cDNA clone and analysis of patient DNA. Proc. Natl. Acad. Sci. USA 87, 8531-8535.
Yen, P. H., Allen, E., Marsh, B., Mohandas, T., Wang, N., Taggart, R. T., and Shapiro, L. J. (1987). Cloning and expression of steroid sulfatase cDNA and the frequent occurrence of deletions in STS deficiency: implication for X-Y interchange. Cell 49, 443-454.
Yen, P. H., Marsh, B., Allen, E., Tsai, S. P., Ellison, J., Connolly, L., Neiswanger, K., and Shapiro, L. J. (1988). The human X-linked steroid sulfatase gene and a Y-encoded pseudogene: evidence for an inversion of the Y chromosome during primate evolution. Cell 55, 1123-1135.

GenBank Accession Numbers

The accession numbers for the ARSD and ARSE sequences are X83572 and X83573, respectively.
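
The heat-labile arylsulfatase measurement described under Enzyme Assays reduces to a subtraction of two readings from the same extract, one kept on ice and one preincubated at 50°C. The short Python sketch below illustrates that arithmetic only. The function names and the sample readings are hypothetical and do not come from the paper; the subtraction rule and the 350-850 pmol/mg/hr control range are the only elements taken from the text.

```python
"""Illustrative arithmetic for the heat-labile arylsulfatase assay (a sketch)."""

# Control range for heat-labile activity quoted in the text
# (pmol 4-MU liberated/mg protein/hr, six control cell lines).
CONTROL_RANGE = (350.0, 850.0)


def heat_labile_activity(untreated: float, heat_inactivated: float) -> float:
    """Return the activity lost on heating: untreated minus heat-inactivated.

    Both inputs are activities of the same extract in the same units,
    measured without and with the 50°C preincubation.
    """
    if untreated < 0 or heat_inactivated < 0:
        raise ValueError("activities must be non-negative")
    return untreated - heat_inactivated


def heat_labile_percent(untreated: float, heat_inactivated: float) -> float:
    """Return the heat-labile activity as a percentage of the untreated activity."""
    if untreated <= 0:
        raise ValueError("untreated activity must be positive")
    return 100.0 * heat_labile_activity(untreated, heat_inactivated) / untreated


def within_control_range(activity: float, control_range=CONTROL_RANGE) -> bool:
    """Check whether an activity falls inside the quoted control range."""
    low, high = control_range
    return low <= activity <= high


if __name__ == "__main__":
    # Hypothetical readings for one extract, not data from the paper.
    untreated, heated = 800.0, 160.0
    labile = heat_labile_activity(untreated, heated)
    print(labile, heat_labile_percent(untreated, heated), within_control_range(labile))
```

With these assumed readings, the heat-labile fraction falls inside the 70%-90% window that the text reports for this assay, and the absolute value falls inside the control range. A reading far below that range would correspond to the deficiency reported for patients with deletions spanning the CDPX critical region.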