<<

View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by Elsevier - Publisher Connector

Volume 289, number 1, 4-7 FEBS 10077 September 199 I G 1991 Federation of European Biochemical Societies 00145793/91/S3.S0 ADONIS 001457939100788H

Discussion Letter The origin of matrix and their familial relationships

George J.P. Murphy’, Gillian Murphy’ and John J. Reynolds’

‘instirrrre of Planr Scierxe Research. Cambridge Laboratory, Norwich, NR4 7UJ. UK and ‘Cell atld Molecular Biology Department, Strangeways Research Laboratory, Cambridge, CBI 4RN. UK

Received 4 June 1991

New computer comparisonsof the sequences cf mammalian matrix metalloproteinases have established for the first time strong links with bacterial metalloproteinas:s. We also propose that there are five groups in the family of matrix metalloproteinases. although only three are as yet well-characterized as proteins, and discuss their origin and relationships with other containing .

Matrix : Sequence alignment: Phylogeny

The family of matrix metalloproteinases (MMPs), family of MMPs, naml:d ‘stromelysin 3’. However, no which are synthesized primarily by connective tissues, data on its substrate sl jecificity is available. is of great importance in the initial events leading to Many mammalian V,MPs have now been cloned and tissue degradation, both in physiological and pathologi- sequenced and they show a highly ex:ended degree of cal situations. Three major groups of MMPs have been similarity with each other (reviewed in [ 13). Metallopro- well-characterized, each group having more than one teinases with other functions than MMPs, and from distinct gene product, which can be distinguished on many species, have been similarly characterized. At- immunological and biochemical criteria as well as se- tempts at identification of the basic features of MMP quence data [l-3]. The specific have inter- sequences that might be involved in the catalytic mecha- stitial almost uniquely as substrates; the sec- nism have been based on the detailed studies of the ond group of MMPs are often referred to as better known bacterial , notably because they degrade denatured collagens very effi- . and have preceded any analyses of the ciently, but are also known as type IV collagenases, MMPs themselves. degrading native type IV in a relatively specific In an earlier analysis [5] we suggested that the only fashion; members of the third family group, which are feature of MMPs in common with thermolysin was the now generally referred to as stromelysins. are MMPs zinc binding motif HEXXH. With currently published which have quite broad proteolytic action but which data, within all the known zinc metalloprotease sequen- were originally described as proteoglycanases. Another ces, three groups of can be delineated accord- less clearly characterized metalloproteinase, PUMP. ing to the presence of absence of identifiable catalytic has been described and its sequence indicates that it may site motifs. Proteases derived from Uacillzrs sp. and be the firs: member of a fourth group. Pseudot~tonas elastase, as well as forms of amino-pepti- Very recently Basset et al. [4] described in detail their dase N, LTAJ , enkephalinases and angioten- interesting findings of a novel putative metalloprotei- sin converting enzymes [9, and references therein], have nase gene expressed in stromal cells of human breast both the H.EXXH motif identified in thermolysin as the carcinomas. Although this putative proteinase was not two histidine Zn”’ binding ligands and the glutamic acid itself isolated, the mRNA was also found to be ex- involved in the catalytic mechanism? and a second se- pressed in the uterus, placenta and cmbryollil: limb bud, quence containing the glutamic acid corresponding to as well as phorbol ester of growth factor stimulated a third Zn’)+ ligand [9]. A second group including &rcr- fibroblasts. Based on its sequence and possible function ria proteinase, bone morphogenetic protein I and the in the progression of malignancy Basset et al. [4] pro- Etwinicr proteinases B and C contain these motifs as well posed that this gene codes for a new member of t.l,e as a third motif adjacent to the zinc , (HEXXH)XXGXXH, whose function is not known. The third group, which includes the MMPs, Asfucus C’orrc.~ftcttlrlc’t~~,~a~fdrcxs: G. Murphy. Strangcways Rcscarch Labora- proteinasc and the snake venom proteinascs. contain tory. Catnbridgc, CBI 4RN. UK. I’;Ix: (44)(223)41 1609. the zinc box and the adjacent motif of the second group

4 Volume 289. number 1 FEBS LETTERS September 1991

Table I Alignments of metalloproteinases with the new gene product [J].

Region homologous to PUMP Full length sequence

No. of residues used No. identical SD No. of residues used No. identical SD

Stromclysin 3 I68 484 Stromelysin I 172 86 20.0 378 I71 19.9 172 86 22.6 377 166 22.2 170 85 22.6 370 163 22.0 72 kDa 176 86 21.4 382 173 24.6 95 kDa gelatinase 172 73 19.6 431 174 16.5 PUMP 172 73 19.6 172 82 5.6 Erlc’iniu B 222 61 7.1 434 I21 3.7 Thermolysin 202 47 3.8 315 93 0.3 - The metalloproteinases were aligned with ‘stromelysin 3’ [4], using either the regions homologous to PUMP or with the full length sequence. SD is the number of standard deviations of the real score above the random scores. The higher the value the greater the similarity to MMPl I.

but have no identifiable glutamic acid corresponding to domain and 62.8 for the full-length sequence. In con- the third Zn” ligand in thermolysin. Consequently a trast the catalytic region of ‘stromelysin 3’ shows signi- great deal of speculation currently surrounds the iden- ficant similarity to all the other known matrix metallo- tity of the third Zn’+ ligand in these proteinases [6-lo] proteinase sub-groups, with the highest homology, as and their possible relationships. indicated by the SD value, being to collagenase and Using computer analyses we have now extended the stromelysin 2. This was also true when the full-length comparison between the MMPs and other metallo-pro- sequence was compared, with the highest homology teases in an attempt to gain further information on the being to the 72 kDa gelatinase rather than to the stro- origin and relationships of these enzymes. Thermolysin melysins. The homology to thermolysin and the Errvit~ict was included in the comparisons, together with the se- was lower, though still statistically significant. quence of the metalloproteinase I3 derived from the Since the precise biochemistry and function of the new plant bacterium Etwiniu chrysattt/zenti[ 1 I]. The cata- are unknown we suggest that it be systemati- lytic domain of the MMPs can be defined by that of the cally named 11 (MMPI l), al- punctuated (previously putative) metalloproteinase. though future work may suggest a more descriptive title. PUMP, which lacks the C-terminal domain present in The conclusion that ‘stromelysin 3’ (MMPl 1) repre- all the other known MMPs (reviewed in [1,3]). The re- sents the first member of a new subgroup was reinforced gions corresponding to the sequence of PUMP in both when the sequences of the catalytic domains were subse- stromelysin-I and have been quently used to derive a phylogenetic tree. The tree was_ shown to retain catalytic activity [I]. Since we wished to constructed using TREBALIGN [ 141, with a gap pe- compare only the catalytic domains, the fibronectin-like nalty function of 1 + (length of insertion or deletion) x3 domains of the gelatinases were not included in the sequence comparison, since these appear to have arisen by recombination of a common ancestor with fibronec- tin [12]. Comparison of the sequence of the catalytic domain of the proposed new ‘stromelysin 3’ [4] with those of the other members of the family (Table I) indi- cates that it might better be described as the first mem- ber of a new sub-group. The similarity between pairs of sequences was tested using the program ALIGN [13], employing the PAM250 matrix and with a gap penalty of three. To test the statistical significance of the align- ments the sequences were randomized and realigned 100 Pump times. The matrix bias parameter was varied until the I EC’ Prot El 1130 33 Therm number of standard deviations of the real score over the Strom 1 Strom 2 random score (SD) reached a maximum. Stromclysin 1 (MMP3) and stromclysin 2 (MMPlO) arc very similar, Fig. I. Phylogcnctic tree for nlct;llloprotcinascs. Tbc most parsimo. nious tree is shown hcrc. with the branch lcnglhs being indicated on in that when compared with the methods used to pro- each branch. fhc tree is rooted ai the midpoint of the longest branch duct Table 1 the pairwisc SD value is 45 for the catalytic and is c;~lculatcd assuming that all branches cvolvc aI tbc sxmc rate.

5 Volume 289,number I FEESLETTERS September1991

Strom 3 (MMP11) FVLSGG --RWEKTDLTYRILRFPWQLVQEQVRQT~EALKVWDI Strom 1 FRTFPGIPKWRKTHLTYRIVYTPDLPKDAVDSAVEKALKEEVTPLTFSRLYEGEADI Strom 2 FSSFPGMPKWaKTHLTYRIVYTPDLPRDAVDSAIEKALKEEVTPLTFSRLYEGEADI Collag FVLTEGNPRWEQTHLTYRIENYTPDLPRADVDHAIEKAFQSEGQADI 72kD Gelat YNFFPRKPKWDKNQITYRIIGYTPDLDPETVDDAFARAFQFSRIHDGEADI 95kD Gelat FQTFEGDLKWHHHNITYWIQNYSEDLPRAVIDDAFARAFALWSAVTPLTFTRVYSRDADI PUMP 1 YSLFPNSPKWTSKWTYRIVSYTRDLPHITVDRLVSKALNMGTADI + +* +**+* 4-c +* + + *+ +*+ +*** f ***

Strom 3 (MMP11) MIDFARYWDGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQV~H Strom 1 MISFAVREHGDFYPFDGPGNVLAHAUAPGPGINGDAHFDDH Strom 2 MISFAVKEHGDFYSFDGPGHSLRHAYPPGPGPGLYGDIHFDDDE~T-EDASGTNLFLV~H CoLlag MISFVPiGDHRDNSPFDGPGGN~~FQPGPGIGGDAHFDEHER~-NNFTEYNLHRV~H 72kD GeLat MINFGRWEHGDGYPFDGKDGLLAHAFAPGTGVGGDSHFDDDELWTLGEGQGYSLFLV~H 95kD Gelat VIQFGVAEHGDGYPFDGKDGLLAHAFPPGPGPGIQGDAHFDDDELWSLGKGVGYSLFLV~H PUMP 1 MIGFARGAHGDSYPFDGPGNTLAHAFAPGTGLGGDAHFDEDERWTDGSSLGINFLY~TH +* * +* c*** + ****+ * + ** *** +* *+ + + ++ +*+*

Stlrom 3 (MMPll) EFGHVLGLQHTTAAKALMSAFYTFRYP ----LSLSPDDCRGVQHLYGQP--WPTVT Strom 1 EIGHSLGLFHSANTEALMYPLYH-SLTDLTRFRLSQDDINGIQSLYGPPPDSPET- Strom 2 ELGHSLGLFHSANTEALMYPLYN-SFTELABFRLSQDDVNGIQSLYGPPPASTEEP CoLlag ELGHSLGLSHSTDIGALMYPSY--TFSG-- DVQLAQDDIDGIQAIYGRS-QNPVQP 72kD GeLat EFGHAMGLEHSQDPGALMAPIYT --YTK--NFRLSQDDIKGIQELYGASPD-IDLG 95kD Gelat EFGHALGLDHSSVPEALMYPMYR-- FTE--GPPLHKDDVNGIRHLYGPRPE-PEPR PUMP 1 ELGHSLGMGHSSDPNAVMYPTYG-N-GDPQNFKLSQDDIKGIQKLYGKRSNSRKK- *+** +*+ *+ *-I-*+ * * ** *cc +** +

Fig. 2. Multiple alignments of mctalloproteinases. The regions of the matrix metalloprotcinases homologous to PUMP-l werealigned. Residues identicalinall sequences are marked with an *. while those in which conservative substitutions have occurred are indicated by +.

-# !I#-# - N -# A 1 ~~~~~~~~~~~---~~~~~---~---~~~~~~~~~~~~~~~~~~___~~~~~~Q~~_Q~ 2, B 1 ITGTSTV--GVGR------GVLGDQKNINTTYSTYYYLQ------DNTRGDGIFTYDA 44 C 1 YTTDKAVSEGLTRPHTTWNGDNVFGKAANL -----TYSFLNTFSSTPNGHTGPVKFT-PV 54 + + * CC Jr* *+ ++

- #- -#- -## # - #+I#-# # s # -##+I - d###- A 28 QVRQTMAEALKVb4SDVTPLTFTEV-HEGRADIMIDFARYHAF 84 B 45 KYRTTLPGSL--WADADN-QF------FASY---DAPAVDAHYYAGV---TY 81 C 55 QMQQAKL-SLQSWADP1ANLTFTEVSPNQKANI--TFANYTD------TQAY 105 +++ +* Jr+*++ * *Jr+* -l--k+ c+ ++

# - ## ### -.ij #- _ _ II -# _ A 85 --FPKTHR------EGDVHFD--YDET-WT-----IGDDQGTDLLQVA------A 117 B 82 DYYKNVHNRLSYDGNNAAIRSSVHYSQGYNNAFWNGSEMVYGDGDGQTFIPLSGGIDWA 141 C 106 AAYPGTHP------VSGSAWFN--YNQS--S-----IRNPDTDEYGRHS-----FT 141 + c* +++ ++ Jr+++ + ++++ ++ +

##-## -#- #- # ;I _ 2’ A 118 ~~FG~LGLQ~TT~~~~~~~~~~~~--~~-~----~~~~~~,~~~~~~~.~~_~TFR~~ 143 B 142 HELTHAV-----TDYTAG--- LIYQNESGAINEAISDIFGTLVEFYANKN------184 C 142 HEIG~LGLSHPAEYNAGEGDISYKNSAAYAED--SRQFSIMS--Y-WEVENTGGDFKGH 196 Jr*++*++ +-I-+* + 3c

# ## #-- -## - A 144 YPLSLSPDDCRGVQHLYGQPWPTVT- 168 B 185 ------PD-WEIGEDVYTPGISGDSL 202 C 197 YSAGPLMDDIAAIQKLYGANMTTRTG 222 * ++-!-*i-+ ++ + Fig. 3. Comparison of stromclysin 3 (MMPI I)with bactcriel mctalloprotcinascs. The scqucnccs aligned wcrc: I\. Stromclysin 3 (MMPI I): U. thcrmolysin and C. Erwhi~r protcasc B. aligned as in Fig. 2. Rcsiducs identical in all three scqucnccs arc marked with an + below 111~’scqucnccs. while those in which conscrvativc substitutions have occurred arc indicated by +. Rcsiducs in MMPI I common to all MMPs ;lrCilldicatcd nbovc the scqucncc with a #. whilcconservativc substitutions arc indicated by -. The ungappcd rcsiduc numbers i\rc indicated on tllc Ick md ripht margins. Rcsiducs conacrvcd across al scqt~c~rccsarc in bold.

6 Volume 289, number 1 FEBS LE-l-rERS September 199 1

Fig. 1 indicates that in the most parsimonious tree, the not only give new insights into the molecular mecha- new MMP [43 did not group with any of the other MMP nisms of action of MMPs and their functions but also sequences, forming instead a separate branch between provide a considerable challenge to the computer ex- the thermolysin and E~rvinia sequences and those of the pert. other MMPs. It thus provides a link between the MMPs previously known [l] and the well-known bacterial me- REFERENCES talloproteinase, thermolysin. Previously, algorithms used to detect evolutionary relationships between zid [1] Docherty. A.J.P. and Murphy. G. (1990) Ann. Rheum. Dis. 49. 469-479. proteases failed to define the presence of such a link [ 151. [2] Murphy, G.. Reynolds, J.J. and Hembry. R.M. (1989) Int. J. We have also compared in more detail the predicted Cancer-44 757-f60. sequence of a number of human MMPs 131 Matrisian, L.M. (1990) Trends Genet. 6, 121-125. including the new gene product [4] in a multiple align- [41 Basset. P., Bellocq, J.P.. Wolf, C., Stoll, I., Hutin, P., Limacher, ment, using regions corresponding to PUMP. Multiple J.M.. Podhajcer, O-L., Chenard, M.P.. Rio, MC. and Chambon, P. (I 990) Nature 348, 699-704. alignments were carried out using CLUSTAL [16]. A I51 Whitham, SE.. Murphy, G., Angel, P., Rahmsdorf. H.-J., Smith, pairwise gap penalty of 1 was used, with multiple align- B.J., Lyons. A., Harris, T.J.R.. Reynolds, J.J., Herrlich, I’. and ment gap penalties of 7 (fixed) and 9 (varying). The Docherty. A.J.P. (1986) Biochem. J. 240, 913-916. alignments (Fig. 2) clearly demonstrate that the simi- 161 Vallee, B.L. and Auld, D.S. (1990) Biochemistry 29, X47-5659. 171 Springman, E.B., Angleton, E.L., Birkedal-Hansen, H. and Van larities extend over far greater regions of the molecules Wart, H.E. (1990) Proc. Natl. Acad. Sci. USA 87, 364-368. than just the zinc binding motif previously identified. PI Quinn, CO.. Scott, D.K., Brinckerhoff, C.E., Matrisian, LX., Since the phylogenetic tree suggests that MMPl 1 is the Jeffrey, J.J. and Partridge, NC. (1990) J. Biol. Chem. 265,22342- MMP nearest to thermolysin in an evolutionary sense, 22347. it was aligned with therrnolysin and the Ewiuia PI Shannon. J.D., Baramova. E.N., Bjarnason. J.B. and Fox, J.W. (1989) J. Biol. Chem. 264. 11575-l 1583. protease (Fig. 3). Although only twelve residues are [lOI Takeya, H., Onikura. A., Nikai, T., Sugihara. H. and Iwanaga, conserved throughout the sequences, a distinct clus- S. (1990) J. Biochcm. 108, 711-719. tering of conservatively replaced residues can be ob- [I 11 Delepelaire, P. and Wandersman, C. (1989) J. Biol. Chem. 264, served. These comparisons establish for the first time an 9083-9089. [12] Collier, I., Bruns. G.A.P.. Goldberg, G.I. and Gerhard, D.S. extended relationship between the mammalian MMPs (1991) Genomics 9, 4299434. and a bacterial metalloproteinase, although it does not [13] Dayhoff. M.O.. Barker, WC. and Hunt, L.T. (1983) Methods involve the other catalytic site motifs identified in ther- Enzymol. 91, 524-545. molysin [17]. Thermolysin has other regions of simi- [14] Hein. J. (1990) Methods Enzymol. 183, 626-645. [15] Jongeneel. C.V.. Bouvier. J. and Bairoch, A. (1989) FEBS Lett. larity with the E. chrysantherni enzymes but not all of 242. 21 I-214. these have similarity with the mammalian MMPs. [16] Higgins, D.G. and Sharp, P.M. (1988) Gene 73, 237-244. Further detailed computer analysis should be reward- [I71 Weaver. L.H.. Kestcr, W.R. and Matthews, B.W. (1977) J. Mol. ing now that the human MMPs can clearly be related Biol. 114, 119-132. back to enzymes of primitive origin, Such work could

7