REVIEW

Proteoglycans of the extracellular environment: clues from the gene and protein side offer novel perspectives in molecular diversity and function

RENATO V. IOZZO¹ AND ALAN D. MURDOCH
Department of Pathology, Anatomy, and Cell Biology, and the Jefferson Cancer Institute, Thomas Jefferson University, Philadelphia, Pennsylvania 19107, USA

ABSTRACT This review focuses on the extracellular proteoglycans. Special emphasis is placed on the structural features of their protein cores, their gene organization, and their transcriptional control. A simplified nomenclature comprising two broad groups of extracellular proteoglycans is offered: the small leucine-rich proteoglycans or SLRPs, pronounced “slurps,” and the modular proteoglycans. The first group encompasses at least five distinct members of a gene family characterized by a central domain composed of leucine-rich repeats flanked by two cysteine-rich regions. The second group consists of those proteoglycans whose unifying feature is the assembly of various protein modules in a relatively elongated and often highly glycosylated structure. This group is quite heterogeneous and includes a distinct family of proteoglycans, the “hyalectans,” that bind hyaluronan and contain a C-type lectin motif that is likely to bind carbohydrates, and a less distinct group that contains structural homologies but lacks hyaluronan-binding properties or lectin-like domains.
Iozzo, R. V., Murdoch, A. D. Proteoglycans of the extracellular environment: clues from the gene and protein side offer novel perspectives in molecular diversity and function. FASEB J. 10, 598-614 (1996)

Key Words: extracellular matrix · leucine-rich repeats · modular genes

UNDER PHYSIOLOGICAL CONDITIONS, the adhesive force between two proteoglycans of the marine sponge Microciona prolifera is about 400 pN. This strong bond is mediated by the homophilic and polyvalent, calcium-dependent interactions between two adhesive proteoglycans (1). Thus, a single pair of these molecules can hold the weight of 1600 cells. That’s incredible! If one had to consider the billions of proteoglycan molecules contained in a gram of cartilage or tendon, for instance, their importance as a major intercellular glue would be better appreciated. More astonishing is the fact that proteoglycans have evolved during the last 50 million years to reach an unprecedented level of sophistication. Proteoglycans comprise some of the most complex and multifunctional molecules of the animal kingdom. They carry a variety of structural units that vary from simple linear sugars to the most highly charged, sulfated polysaccharide in nature: heparin. They provide structural constraints, function as growth-supportive or suppressive molecules, possess adhesive and anti-adhesive properties, act as major biological filters, promote angiogenesis, induce neurite outgrowth, and bind, store, and deliver growth factors to target cells during normal development and in pathologic states. This review centers on the extracellular proteoglycans, those molecules whose broad determining roles are to be secreted into the pericellular environment and to provide bridging information to the cells. Special emphasis is placed on the structure of the protein cores, the gene organization, the transcriptional control, and the functional implications derived from the design of these molecules. First, we will offer a simplified nomenclature of these proteoglycan groupings. Then we will discuss the structure/function relationships of some paradigmatic proteoglycans, and examine novel aspects of proteoglycan biology derived from recent genetic and structural studies.

A SIMPLIFIED NOMENCLATURE FOR PERICELLULAR PROTEOGLYCANS

The growth of the proteoglycan gene family has been staggering in the past decade. To date, more than 25 distinct genes scattered throughout the mammalian genome code for protein cores that carry at least one glycosaminoglycan (GAG)² chain, the hallmark of proteoglycans. In addition, different proteoglycans exist as structural variants, further increasing the diversity of this class of macromolecules. The classification presented in Table 1 is based on the nature, overall structure, and biological properties of the protein cores. We have divided the secreted pericellular proteoglycans into two broad categories: the SLRPs (pronounced “slurps”), an acronym for small leucine-rich proteoglycans; and the modular proteoglycans. The former are typically compact proteins with alternating hydrophobic and hydrophilic amino acid residues that harbor multiple amino acid repeats with conserved leucine residues. The latter are generally multidomain assemblies of protein motifs with a relatively elongated and often highly glycosylated structure carrying numerous protein signatures or modules shared with other proteins involved in the control of cellular growth, differentiation, lipid metabolism, and adhesion.

¹To whom correspondence and requests for reprints should be addressed, at: Department of Pathology, Anatomy and Cell Biology, Room 249, Jefferson Alumni Hall, Thomas Jefferson University, 1020 Locust Street, Philadelphia, PA 19107, USA.
²Abbreviations: bFGF, basic fibroblast growth factor; CRP, complement regulatory protein; CS, chondroitin sulfate; DS, dermatan sulfate; EGF, epidermal growth factor; FGF, fibroblast growth factor; GAG, glycosaminoglycan; HBR, hyaluronan binding region; HS, heparan sulfate; hyalectan, hyaluronan- and lectin-binding proteoglycan; Ig, immunoglobulin; KS, keratan sulfate; LDL, low density lipoprotein; LRR, leucine-rich repeat; N-CAM, neural cell adhesion molecule; PG, proteoglycan; RI, ribonuclease inhibitor; SLRP, small leucine-rich proteoglycan; TGF-β, transforming growth factor beta.

Vol. 10 April 1996 0892-6638/96/0010-0598/$01.50. © FASEB

TABLE 1. Structure and properties of secreted pericellular proteoglycans

Gene product   Gene   Size (kDa)ᵃ  GAG type [number]  Chromosome (human; mouse)  Tissue distribution

SLRP: Small, ubiquitous PGs enriched in leucine, with 24 amino acid tandem repeats flanked by cysteine clusters.
Decorin        DCN    36           CS/DS [1]          12q21.3-q23; 10            Ubiquitous, collagenous matrices, bone, teeth, mesothelia, floor plate
Biglycan       BGN    38           CS/DS [1-2]        Xq28; X                    Interstitium and cell surfaces
Fibromodulin   FMOD   42           KS [4]             1q32                       Collagenous matrices
Lumican        LUM    38           KS [2-3]           12q21.3-q22; 10, distal    Cornea, intestine, liver, muscle, cartilage
Epiphycanᵇ     -      36           CS/DS [2]          -                          Epiphyseal cartilage

Modular: Multidomain proteoglycans with protein modules homologous to the Ig superfamily, selectin, EGF, laminin, LDL receptor, N-CAM and protease inhibitors.
Versican       CSPG2  265-370      CS/DS [10-30]      5q13.2; 13                 Blood vessels, brain, skin, cartilage
Aggrecan       AGC1   220          CS [100]           15q26; 7                   Cartilage, brain, blood vessels
Neurocan       MNC1   136          CS [3-7]           8                          Brain, cartilage
Brevican       -      100          CS [1-3]           -                          Brain
Perlecan       HSPG2  400-467      HS/CS [3-10]       1p36; 4, distal            Basement membranes, cell surfaces, sinusoidal spaces, cartilage
Agrin          AGRN   200          HS [3-6]           1p32-pter; 4               Synaptic sites of neuromuscular junctions, renal basement membranes
Testican       -      44           HS/CS [1-2]        21                         Seminal fluid

ᵃThe size is based on the amino acid sequence deduced from cDNA cloning. In general, however, the size of the individual protein cores is larger when estimated by SDS-PAGE due to varying degrees of N- and O-linked glycosylations.
ᵇThis proteoglycan, originally named PG-Lb (6), has been recently renamed epiphycan (L. Rosenberg and M. Hook, personal communication) to reflect its typical tissue distribution in the epiphyseal cartilage.

THE SMALL LEUCINE RICH PROTEOGLYCANS (SLRPs)

General structural features

SLRPs encompass a class of secreted proteoglycans that include five structurally related but genetically distinct members (Table 1 and Fig. 1): decorin (2), biglycan (3), fibromodulin (4), lumican (5), and epiphycan, which was originally called PG-Lb (6). These proteoglycans share the unique feature of being composed primarily of leucine-rich tandem repeats that confer most of the biological functions. A close examination of the overall protein core structure reveals that it consists of three main regions: an amino-terminal region, which contains the negatively charged GAGs or tyrosine sulfate; a central domain with varying numbers of LRRs; and a carboxyl end region of poorly defined function. In all cases, the central domain is flanked by cysteine-rich clusters (Fig. 1). In decorin,

sentially on how many conserved amino acids one requires for a given repeat. For simplicity’s sake, we consider true repeats only those flanked by the two clusters of cysteine residues. Accordingly, the four major members of the SLRP family contain 10 LRRs, with epiphycan containing 8 repeats (Fig. 1). To date, these repeats have been found in more than 40 gene products and are present as intracellular, transmembrane, and secreted products in mammalian cells, plant cells, yeast, and prokaryotes (8). The number of residues in a given LRR is between 20 and 29, with the most common being the 24 amino acid residues seen in the SLRPs. The consensus sequence derived from all the known LRR proteins contains leucine or other aliphatic residues at positions 2, 5, 7, 12, 16, 21, and 24, and cysteine/threonine (A-type) or asparagine (B-type) at position 10. The consensus sequence for the SLRP family is LxxLxLxxNxL/IxS/TxV/I, a B-type repeat, followed by a less homologous sequence of about 7-10 residues (9).