
Proc. Natl. Acad. Sci. USA
Vol. 90, pp. 10783-10787, November 1993
Developmental Biology

The precursor region of a protein active in sperm-egg fusion contains a metalloprotease and a disintegrin domain: Structural, functional, and evolutionary implications
(PH-30/spermatogenesis/snake venom/integrin/cell adhesion)

TYRA G. WOLFSBERG†‡, J. FERNANDO BAZAN‡§, CARL P. BLOBEL†‡¶, DIANA G. MYLES‖, PAUL PRIMAKOFF‖, AND JUDITH M. WHITE†‡**

Departments of †Pharmacology and ‡Biochemistry and Biophysics, University of California, San Francisco, CA 94143; and ‖Department of Physiology, University of Connecticut Health Center, Farmington, CT 06030

Communicated by Bruce M. Alberts, August 4, 1993

§Present address: Department of Molecular Biology, DNAX, Palo Alto, CA 94304.
¶Present address: Department of Cellular Biochemistry and Biophysics, Sloan-Kettering Institute, New York, NY 10021.
**To whom reprint requests should be addressed.
††The sequences reported in this paper have been deposited in the GenBank data base (accession nos. Z11719 and Z11720).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

ABSTRACT PH-30, a sperm surface protein involved in sperm-egg fusion, is composed of two subunits, α and β, which are synthesized as precursors and processed, during sperm development, to yield the mature forms. The mature PH-30 α/β complex resembles certain viral fusion proteins in membrane topology and predicted binding and fusion functions. Furthermore, the mature subunits are similar in sequence to each other and to a family of disintegrin domain-containing snake venom proteins. We report here the sequences of the PH-30 α and β precursor regions. Their domain organizations are similar to each other and to precursors of snake venom metalloproteases and disintegrins. The α precursor region contains, from amino to carboxyl terminus, pro, metalloprotease, and disintegrin domains. The β precursor region contains pro and metalloprotease domains. Residues diagnostic of a catalytically active metalloprotease are present in the α, but not the β, precursor region. We propose that the active sites of the PH-30 α and snake venom metalloproteases are structurally similar to that of astacin. PH-30, acting through its metalloprotease and/or disintegrin domains, could be involved in sperm development as well as sperm-egg binding and fusion. Phylogenetic analysis indicates that PH-30 stems from a multidomain ancestral protein.

PH-30, a guinea pig sperm surface protein, is a candidate sperm-egg membrane binding and fusion protein (1-5). The PH-30 subunits found on fertilization-competent sperm, mature α and mature β, share membrane topologies and other characteristics with viral binding and fusion proteins. The β subunit contains a potential receptor binding domain, a disintegrin domain, related to soluble integrin ligands found in snake venom. The α subunit contains a potential fusion peptide. In addition, the two subunits share sequence similarity.

Snake venom disintegrins derive from precursors that also contain zinc-dependent metalloprotease domains (6, 7). Interestingly, PH-30 α and β are present on testicular spermatogenic cells as larger precursors, termed here pro-α and pro-β (2). Here we show that the precursor regions of PH-30 α and β (the regions amino-terminal to the mature proteins and found on developing, but not fertilization-competent, sperm) share further amino acid identity with each other as well as with this family of metalloprotease and disintegrin domain-containing snake venom proteins.††

MATERIALS AND METHODS

Cloning. A portion of the α precursor region sequence was obtained from a PCR product generated by the nested RACE (rapid amplification of cDNA ends) protocol (3). The sequences of the remainder of the α precursor region and the entire β precursor region were determined from clones of α and β isolated at high stringency (3) from a guinea pig whole-testis cDNA library (8).

Northern Analysis. RNA was isolated from adult male guinea pig tissues (9), electrophoresed in a formaldehyde/agarose gel, and transferred and cross-linked to a Hybond-N nylon membrane (Amersham). High-stringency prehybridization and hybridization with 32P-labeled PH-30 α and β DNA probes were carried out at 65°C in 5× standard saline citrate (SSC)/5× Denhardt's solution/0.1% SDS containing salmon sperm DNA at 0.2 mg/ml. The membrane was washed, 10 min per wash, in 2× SSC/0.1% SDS once at room temperature and twice at 65°C and then in 0.2× SSC/0.1% SDS twice at 65°C. Hybridization with a mouse β-actin probe was carried out in the same solution at 55°C, with identical wash conditions.

RESULTS AND DISCUSSION

The sequences of the PH-30 α and β precursor regions were deduced from cDNA sequences and are shown in Fig. 1. Following their signal sequences, the α and β precursor regions contain sequences similar to those in the prodomains of disintegrin domain-containing snake venom proteins, and then sequences which align with the snake venom zinc-dependent metalloprotease domain (Figs. 1 and 2). α contains the consensus active-site residues for a metalloprotease (see below); β does not. Following the metalloprotease domain, both proteins contain a disintegrin domain (Figs. 1 and 2). The cleavage site which generates mature α falls within the disintegrin domain (3) (arrows, Figs. 1 and 2). The cleavage site which generates mature β lies at the amino terminus of the disintegrin domain (3) (arrow before position 383, Figs. 1 and 2). The sequence alignment of mature PH-30 α and β with the snake venom proteins continues through the cysteine-rich domain (Figs. 1 and 2). No snake venom proteins include either the epidermal growth factor repeat or the transmembrane and cytoplasmic segments of α and β (Figs. 1 and 2). Additional mammalian genes encode proteins with domain organizations identical to those of PH-30 α and β (Figs. 1 and 2). EAP I, cloned from rat and monkey, is an androgen-regulated protein located on the apical surface of epididymal epithelial cells (13). Cyritestin is a mouse testis cDNA (GenBank accession no. X64227).
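The active-site comparison described above (α carries the metalloprotease consensus residues, β does not) amounts to a pattern search along each sequence. The following Python sketch is illustrative only and is not the procedure used in this work. It turns the consensus HEXGHXLGXXHD (X, any residue) into a regular expression and reports 1-based match positions. The function names and the two short test peptides are hypothetical; the peptides are invented for the example and are not PH-30 sequences.

```python
import re

# Extended snake venom/PH-30 alpha active-site consensus quoted in the text.
EXTENDED_CONSENSUS = "HEXGHXLGXXHD"
# Minimal zinc-metalloprotease signature quoted in the text.
MINIMAL_CONSENSUS = "HEXXH"


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a one-letter consensus (X = any residue) into a regex.

    A lookahead wrapper is used so that overlapping matches are reported.
    """
    body = "".join("[A-Z]" if residue == "X" else residue for residue in consensus)
    return re.compile("(?=(" + body + "))")


def find_motif(sequence: str, consensus: str) -> list:
    """Return (1-based start position, matched segment) for every match."""
    pattern = consensus_to_regex(consensus)
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical peptides (assumptions for illustration, not real sequences):
# one carries the full consensus, the other lacks the His-Glu pair.
with_site = "MKTLAHELGHNLGMNHDSRC"
without_site = "MKTLAQTLGHNLGMNHDSRC"

print(find_motif(with_site, EXTENDED_CONSENSUS))
print(find_motif(with_site, MINIMAL_CONSENSUS))
print(find_motif(without_site, MINIMAL_CONSENSUS))
```

A match to the extended consensus shows only that the diagnostic residues are present; it does not by itself establish catalytic activity.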

FIG. 1. [Sequence alignment of PH-30 α, PH-30 β, cyritestin, EAP I, jararhagin, Ht-e, and trigramin, with the metalloprotease and disintegrin domains labeled. The aligned residues are not legible in this copy, nor is the opening of the legend, which cites the Swiss-Prot (Release 24) data base and describes the prodomain and the candidate start methionines of α.]

followed by an obvious signal sequence (10). The methionine at position 7 is encoded by an AUG codon in the most optimal context for translation initiation (11). The putative start methionine of β (underlined), encoded by an AUG codon in a good context to initiate translation (11), is followed by a potential signal sequence (10). Potential sites for signal-sequence cleavage are marked with arrows. Stars mark cysteine residues which may be involved in a matrix metalloprotease-like "cysteine switch" (12) activation of the metalloprotease domain. Metalloprotease domain. PH-30 α and β are 27% identical (excluding gaps) over this region. The consensus snake venom metalloprotease active-site sequence, HEXGHXLGXXHD, is boxed. PH-30 α contains the metalloprotease consensus sequence; PH-30 β and cyritestin do not. EAP I contains all residues of the consensus sequence except for the catalytic glutamic acid, suggesting that it may be able to ligate a zinc ion but be catalytically inert. Disintegrin domain. PH-30 α and β are 47% identical (excluding gaps) over this region. Stars mark the position of the RGD tripeptide of trigramin. Arrows indicate the amino termini of mature PH-30 α and β. Cysteine-rich domain. PH-30 α and β are 29% identical (excluding gaps) over this region. EGF-like, transmembrane, and cytoplasmic domains. The epidermal growth factor (EGF) repeat is bracketed. The potential transmembrane domains are underlined.

FIG. 2. Domain organization of disintegrin and metalloprotease domain-containing proteins. SS, signal sequence; P, prodomain; M, zinc-dependent metalloprotease domain; D, disintegrin domain; C, cysteine-rich domain; E, epidermal growth factor-like repeat; TM, transmembrane domain; T, cytoplasmic tail; AS, snake venom/PH-30 α metalloprotease active-site consensus sequence HEXGHXLGXXHD. Mammalian members of this family sequenced to date include all domains. Snake venoms contain three subfamilies of this protein family, all of which begin with a signal sequence, but which terminate with either the metalloprotease, the disintegrin, or the cysteine-rich domain. The sequences of mammalian PH-30 α and β, cyritestin (GenBank accession no. X64227), and EAP I (13), as well as the sequences of snake venom jararhagin (14), trigramin (15), Ht-e (12), rhodostomin (see ref. 14), and Ht-a, -b, -c, and -d (J. Fox, personal communication) are known from cDNA cloning. The sequences of the snake venom proteins HR1B (see ref. 7), RVV-X (16), and the soluble (see ref. 7) metalloproteases and disintegrins are known from direct amino acid sequencing only. The amino termini of mature PH-30 α and β, as determined by amino acid sequencing (3), are indicated by arrows. The potential fusion peptide of PH-30 α is indicated by a black box.

Neither is the respective species homologue of PH-30 α or β (ref. 13; T.G.W., unpublished data). A Northern blot of guinea pig tissue RNA, probed at high stringency, suggests that PH-30 α and β are testis-specific in the adult (Fig. 3).

FIG. 3. Tissue distribution of PH-30 α (Upper) and β (Lower). [Northern blot image; marker positions 9.5, 7.5, 4.4, 2.4, and 1.4.] A Northern blot containing total RNA from eight adult male guinea pig tissues was probed at high stringency with PH-30 α or PH-30 β DNA. Both hybridized to separate RNA species of ≈3000 nt in testis. These sizes are in good agreement with the lengths of PH-30 cDNAs (≈3300 nt for α and ≈2900 nt for β). All samples contained about 10 μg of RNA, except for the two rightmost lanes on each gel (lung and testis), which contained about 1 μg of RNA. The exposure time for samples containing less RNA was approximately 8-fold higher. β-Actin message was intact in all samples (data not shown). Although Northern analysis revealed only one mRNA species for both α and β, we have found sequence variations in the 3' untranslated regions of PH-30 α (T.G.W., unpublished data) and PH-30 β (C.P.B. and P. D. Straight, unpublished data).

Like the mature PH-30 β subunit, the PH-30 α precursor region contains a disintegrin domain. Soluble snake venom disintegrins which contain an RGD binding sequence interact with integrins on platelets and other cells (reviewed in refs. 6 and 7), presumably through a 13-amino acid RGD-containing loop (17, 18). Disintegrins which are linked to a carboxyl-terminal cysteine-rich domain (Fig. 2) lack the RGD motif, but contain instead a unique tripeptide as well as an adjacent cysteine (Fig. 1, stars). The latter snake venom disintegrins, as well as the disintegrin domains of EAP I and mature PH-30 β, maintain a negatively charged residue in the position of the RGD aspartic acid. The disintegrin domains of PH-30 α and cyritestin lack an acidic residue in this position (Fig. 1, stars).

The PH-30 α precursor region contains a metalloprotease-like domain. The sequence HEXXH is the active-site signature of a large class of zinc-dependent metalloproteases (19). As the structure of bacterial thermolysin demonstrates, the glutamic acid probably acts as a catalytic base, and the two histidines (underlined), which are adjacent on the same face of the α-helix, help ligate the zinc (Fig. 4) (20). The PH-30 α and sequence-similar snake venom metalloprotease domains feature an extended active-site signature motif (Fig. 1, box) which is similar to the conserved active-site sequence, HEXXHXXGXXHE, of a family of zinc-dependent metalloproteases homologous to the crayfish protease astacin (22, 26). The x-ray structure of astacin indicates that a glycine (in italics) terminates a thermolysin-like HEXXH-containing helix by making a tight turn, enabling the third underlined histidine to serve as a third zinc ligand. We predict that the histidine, glutamate, and glycine residues of the PH-30 α and snake venom metalloprotease active sites play analogous structural roles. The PH-30 β and cyritestin metalloprotease domains lack the critical histidine and glutamate residues but maintain the conserved glycine and (carboxyl-terminal) aspartic acid (Fig. 1, box). Thus, these latter proteins, although probably not active metalloproteases, may fold similarly.

Outside of the active-site consensus motif, the PH-30 α and snake venom metalloprotease domains share little sequence similarity with the astacin-like proteases. To assess the possibility that HEXXH-containing metalloproteases share a core framework in which the active site is embedded, we compared the x-ray structures of thermolysin-like bacterial enzymes (20, 21) and astacin (22). Although there is meager sequence identity, both structures are globally similar with an active site sandwiched between an amino-terminal β-sheet domain and a carboxyl-terminal helix/coil-rich structure. Only one discrete section of continuous chain is topologically conserved, an ≈50-amino acid core (Fig. 4) consisting of three β-strands (sheets C, E, and D) and the following catalytic α-helix (helix 2) that carries the HEXXH sequence. The existence of this core fold, as well as the astacin-like active-site geometry, in the PH-30 α and snake venom metalloproteases is supported by the recently determined x-ray structure of a protease from the venom of the snake Crotalus adamanteus (L. Kress and W. Bode, personal communication).

FIG. 4. [Panels: Thermolysin; Astacin.] The zinc-dependent metalloproteases of the thermolysin and astacin classes share an ≈50-amino acid core structural motif of three β-strands and a catalytic α-helix. Shown are the principal β-sheets of thermolysin (20, 21) and those we have located for astacin [by inspection of the ribbon diagram of astacin (figure 1b of ref. 22) and by structural prediction algorithms (23, 24)] with distinct ABCED and BACED strand topologies [corresponding to +1, +1x, +2, -1 and -1x, +2x, +2, -1 β-strand orders in Richardson notation (25), respectively]. A shaded box delineates the shared CED (+2, -1) three-β-strand motif, and the subsequent α-helix 2 that carries the signature HEXXH catalytic groups. The positions of structurally equivalent histidine and distal glutamic acid ligands to the zinc (black circle) are noted. Chain segments amino- and carboxyl-terminal to this common ≈50-amino acid supersecondary structure exhibit different chain topologies.

Analysis of the sequences of the snake venom and PH-30 α metalloprotease domains suggests at least two activation mechanisms. In astacin, the final glutamic acid of the active-site consensus sequence interacts with the amino-terminal ammonium group of the mature, active protease (22). The conserved aspartic acid of the mammalian and snake venom metalloprotease domains (Fig. 1, box) may undergo a similar interaction. However, an exactly analogous mechanism would require that astacin and the PH-30 α and snake venom metalloproteases have a similar protein fold outside of the conserved core. But, like thermolysin, the amino- and carboxyl-terminal extensions of the PH-30 α and snake venom proteases may be structurally distinct from those of astacin. Alternatively, activation of the snake venom metalloproteases could occur by a matrix metalloprotease-like "cysteine switch" mechanism, through a conserved cysteine in the prodomain (12). The PH-30 α prodomain features a cysteine in a similar location (Fig. 1, star), although it lacks other postulated consensus residues.

To trace the history of the snake venom and mammalian PH-30-like sequences, we subjected the individual domains to phylogenetic analysis by maximum parsimony (27, 28). The trees generated from the prodomain (not shown), the metalloprotease domain (Fig. 5), and the cysteine-rich domain (not shown) contain three main branches. One branch contains PH-30 α and β and cyritestin, one branch contains EAP I, and one branch contains the snake venom proteins. That the trees are largely congruent indicates that the individual domains are evolving simultaneously and were probably all present in the progenitor gene.

FIG. 5. [Tree diagram; tip labels include monkey EAP I, rat EAP I, PH-30 α, PH-30 β, Ht-d, and HR1B.] Phylogenetic analysis of the snake venom/PH-30 zinc-dependent metalloprotease domain from a subset of available protease sequences. Numbers on the branches are proportional to the number of nucleotide changes between sequences; branch lengths are drawn to scale. Heavy lines indicate the three main branches of the tree: one to PH-30 and cyritestin, one to EAP I, and one to the snake venom proteins. EAP I is evolving separately from PH-30 and cyritestin, as it lies on a separate branch of the tree and contains a different linker module between its cysteine-rich and transmembrane domains (Fig. 1). The zinc-dependent metalloprotease domains from snake venom, Ht-e and Ht-d (12) (Crotalus atrox), HR1B (see ref. 7) (Trimeresurus flavoviridis), RVV-X (16) (Vipera russelli), rhodostomin (see ref. 14) (Calloselasma rhodostoma), and trigramin (15) (Trimeresurus gramineus), and mammalian EAP I (13), cyritestin (GenBank accession no. X64227), and PH-30 α and β were analyzed by the program PAUP (27). The tree was generated by a heuristic search algorithm, with 25 repetitions of random stepwise sequence addition. Tree validity was tested with the bootstrap approach, using 100 replications of a heuristic search, itself composed of 10 replications of random sequence addition. The tree shown is the most parsimonious, and conforms to criteria specified in ref. 28. Separate trees generated from a sequence alignment of critical residues in the conserved core domain (see Fig. 4) for bacterial thermolysin, astacin, as well as the proteins shown here, indicate that the root of this tree (the black dot) falls at the midpoint of the branch which joins PH-30 and cyritestin with EAP I and the snake venom proteins (J.F.B., unpublished data).

Snake venoms contain soluble disintegrins and metalloproteases; by analogy, PH-30 α and β might be proteolytically processed at interdomain boundaries. The first proteolytic processing event for β, the transition from pro-β (≈88 kDa) to pro-β* (75 kDa) (2), is consistent with removal of the prodomain. Removal of the β metalloprotease-like domain and subsequent exposure of its disintegrin domain may render epididymal sperm fertilization-competent (2, 29). Although we have not detected release of the α metalloprotease domain during the developmental processing of pro-α (2), there is a potential dibasic cleavage site in the short linker between the α metalloprotease and disintegrin domains.

The metalloprotease domain of the PH-30 α precursor region may be active in early spermatogenesis in the testis, either before or after proteolytic processing of pro-α. The protease could help spermatocytes cross the tight junction between adjacent Sertoli cells. It could be involved in the restructuring which occurs as sperm migrate toward the lumen of the seminiferous tubules. It might release spermatozoa into the lumen of the seminiferous tubules. It could be autocatalytic, as some snake venom proteins are autocatalytically processed at the boundary between the metalloprotease and disintegrin domains (12). Or it might process the β precursor.

The disintegrin domains of PH-30 pro-α or pro-β may also play roles in spermatogenesis. Integrins appear to be present at sites of attachment between spermatids and neighboring Sertoli cells (30). Interaction of PH-30 disintegrins with Sertoli cell integrins may be partially responsible for the adhesion of spermatogenic cells to Sertoli cells.

In summary, we have shown that the precursor regions of PH-30 α and β contain further similarity to each other as well as to a family of snake venom proteins which contain metalloprotease and disintegrin domains. PH-30 α and β are multidomain proteins which contain pro, metalloprotease, disintegrin, cysteine-rich, transmembrane, and cytoplasmic domains (Fig. 2). Their domain organization defines a family of mammalian integral membrane proteins, which now includes additional members [cyritestin, EAP I, MS2 (GenBank accession no. X13335), HUMORF09 (GenBank accession no. D14665), and T.G.W. and P. D. Straight, unpublished data]. The ancestors of these proteins may be the progenitor molecules from which the snake venom proteins evolved. These ancestors probably exist at least as far back as invertebrates, as a PH-30 homologue has been found in Caenorhabditis elegans (B. Podbilewicz and J. G. White, personal communication). Furthermore, PH-30 appears to be a multifunctional protein. The potential disintegrin and zinc-dependent metalloprotease domains in the precursor regions may play vital roles in early spermatogenesis. The disintegrin domain and the potential fusion peptide in the mature subunits may be necessary for the binding and fusion of sperm and egg plasma membranes. The putative functions of the PH-30 α precursor region suggest novel contraceptive strategies. Agents which inhibit metalloprotease or disintegrin domain activity may interfere with sperm development and be useful as male contraceptives.

We thank M. Skinner, C. Damsky, and G. Kemble for discussions; P. D. Straight for technical assistance; G. Gerton for the guinea pig testis cDNA library; and J. Fox, L. Kress, W. B. Bode, B. Podbilewicz, and J. G. White for communicating results prior to publication. This work was supported by grants to J.M.W. from the Muscular Dystrophy Association and the National Institutes of Health (GM48739). T.G.W. was a predoctoral fellow of the National Science Foundation.

1. Primakoff, P., Hyatt, H. & Tredick-Kline, J. (1987) J. Cell Biol. 104, 141-149.
2. Blobel, C. P., Myles, D. G., Primakoff, P. & White, J. M. (1990) J. Cell Biol. 111, 69-78.
3. Blobel, C. P., Wolfsberg, T. G., Turck, C. W., Myles, D. G., Primakoff, P. & White, J. M. (1992) Nature (London) 356, 248-252.
4. White, J. M. (1992) Science 258, 917-924.
5. Myles, D. G., Kimmel, L. H., Blobel, C., Turck, C., White, J. & Primakoff, P. (1992) J. Cell Biol. 3, 212a (abstr.).
6. Blobel, C. P. & White, J. M. (1992) Curr. Opin. Cell Biol. 4, 760-765.
7. Kini, R. M. & Evans, H. J. (1992) Toxicon 30, 265-293.
8. Baba, T., Hoff, H. B. I., Nemoto, H., Lee, H., Orth, J., Arai, Y. & Gerton, G. L. (1993) Mol. Reprod. Dev. 34, 233-243.
9. Shackleford, G. M. & Varmus, H. E. (1987) Cell 50, 89-95.
10. von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690.
11. Kozak, M. (1991) J. Biol. Chem. 266, 19867-19870.
12. Hite, L. A., Shannon, J. D., Bjarnason, J. B. & Fox, J. W. (1992) Biochemistry 31, 6203-6211.
13. Perry, A. C. F., Jones, R., Barker, P. J. & Hall, L. (1992) Biochem. J. 286, 671-675.
14. Paine, M. J. I., Desmond, H. P., Theakston, R. D. G. & Crampton, J. M. (1992) J. Biol. Chem. 267, 22869-22876.
15. Neeper, M. P. & Jacobsen, M. A. (1990) Nucleic Acids Res. 18, 4255.
16. Takeya, H., Nishida, S., Miyata, T., Kawada, S., Saisaka, Y., Morita, T. & Iwanaga, S. (1992) J. Biol. Chem. 267, 14109-14117.
17. Saudek, V., Atkinson, R. A. & Pelton, J. T. (1991) Biochemistry 30, 7369-7372.
18. Adler, M., Lazarus, R. A., Dennis, M. S. & Wagner, G. (1991) Science 253, 445-448.
19. Jiang, W. & Bond, J. S. (1992) FEBS Lett. 312, 110-114.
20. Holmes, M. A. & Matthews, B. W. (1982) J. Mol. Biol. 160, 623-639.
21. Holland, D. R., Tronrud, D. E., Pley, H. W., Flaherty, K. M., Stark, W., Jansonius, J. N., McKay, D. B. & Matthews, B. W. (1992) Biochemistry 31, 11310-11316.
22. Bode, W., Gomis-Rueth, F. X., Huber, R., Zwilling, R. & Stoecker, W. (1992) Nature (London) 358, 164-167.
23. Rost, B. & Sander, C. (1992) Nature (London) 360, 540.
24. Colloc'h, N. & Cohen, F. E. (1991) J. Mol. Biol. 221, 603-613.
25. Richardson, J. S. (1981) Adv. Protein Chem. 34, 167-339.
26. Gomis-Rueth, F. X., Stoecker, W., Huber, R., Zwilling, R. & Bode, W. (1993) J. Mol. Biol. 229, 945-968.
27. Swofford, D. L. (1991) PAUP: Phylogenetic Analysis Using Parsimony, Version 3.0s (Illinois Natural History Survey, Champaign).
28. Stewart, C.-B. (1993) Nature (London) 361, 603-607.
29. Hunnicutt, G. R., Beardsley, J. R. & Myles, D. G. (1990) J. Cell Biol. 111, 361a (abstr.).
30. Pfeiffer, D. C., Dedhar, S., Byers, S. W. & Vogl, A. W. (1991) J. Cell Biol. 115, 482a (abstr.).
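The percent identities quoted in the Fig. 1 legend (27%, 47%, and 29%, each "excluding gaps") can be reproduced from any pairwise alignment with a few lines of Python. The sketch below is illustrative only. It assumes that "excluding gaps" means that alignment columns in which either sequence carries a gap are left out of both the numerator and the denominator; the legend does not spell out the convention. The function name and the two aligned strings are hypothetical and are not taken from the PH-30 alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of a pairwise alignment.

    Assumption: a column counts only if neither sequence has a gap there.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Made-up aligned fragments (assumptions for illustration only):
# 10 columns, 2 of them gapped, 6 of the remaining 8 identical.
seq_a = "HEL-GHNLGM"
seq_b = "HTLAGH-LGQ"
print(percent_identity(seq_a, seq_b))
```

Counting gapped columns in the denominator instead would lower the reported value, which is why the convention is worth stating alongside any identity figure.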