<<

HUMAN LYSYL KATI HYDROXYLASES RAUTAVUOMA Characterization of a novel isoenzyme and its , Research Unit, determination of the domain structure of the lysyl Biocenter Oulu, hydroxylase polypeptides and generation of knock-out Department of Medical Biochemistry mice for the novel isoenzyme and Molecular Biology, University of Oulu

OULU 2003

KATI RAUTAVUOMA

HUMAN LYSYL HYDROXYLASES Characterization of a novel isoenzyme and its gene, determination of the domain structure of the polypeptides and generation of knock-out mice for the novel isoenzyme

Academic Dissertation to be presented with the assent of the Faculty of Medicine, University of Oulu, for public discussion in the Auditorium of the Department of Medical Biochemistry and Molecular Biology, on October 23rd, 2003, at 1 p.m.

OULUN YLIOPISTO, OULU 2003 Copyright © 2003 University of Oulu, 2003

Supervised by Professor Kari I. Kivirikko Docent Johanna Myllyharju

Reviewed by Recearch Associate Professor Sirpa Aho Docent Tuula Kallunki

ISBN 951-42-7136-X (URL: http://herkules.oulu.fi/isbn951427136X/)

ALSO AVAILABLE IN PRINTED FORMAT Acta Univ. Oul. D 752, 2003 ISBN 951-42-7135-1 ISSN 0355-3221 (URL: http://herkules.oulu.fi/issn03553221/)

OULU UNIVERSITY PRESS OULU 2003 Rautavuoma, Kati, Human lysyl hydroxylases Characterization of a novel isoenzyme and its gene, determination of the domain structure of the lysyl hydroxylase polypeptides and generation of knock-out mice for the novel isoenzyme Collagen Research Unit, Biocenter Oulu, Department of Medical Biochemistry and Molecular Biology, University of Oulu, P.O.Box 5000, FIN-90014 University of Oulu, Finland Oulu, Finland 2003

Abstract Lysyl hydroxylase (E.C. 1.14.11.4) catalyzes the formation of hydroxylysine in and other with collagenous domains. The resulting hydroxylysine residues participate in the formation of collagen crosslinks, and serve as attachment sites for carbohydrate units. They have been regarded as non-essential, since the absence of lysyl hydroxylase 1 activity is not lethal, although it leads to the kyphoscoliotic type of Ehlers-Danlos syndrome, and since recombinant collagens I and III lacking any hydroxylysine form native-type fibrils in vitro. A novel human lysyl hydroxylase isoenzyme, lysyl hydroxylase 3, was identified, cloned and characterized here. The novel isoenzyme was expressed as a recombinant in insect cells, and the protein was shown to catalyze of residues in vitro. No differences were found in the catalytic properties between the recombinant lysyl hydroxylases 3 and 1. The human lysyl hydroxylase 3 gene was shown to be 11.6 kb in size and to contain 19 exons. The introns contain 15 full-length or partial Alu retroposons, which are known to be involved in most human gene rearrangements that occur by homologous recombination. The three recombinant human lysyl hydroxylase isoenzymes were isolated here for the first time as homogenous proteins. Limited proteolysis data suggested that the lysyl hydroxylase polypeptides might consist of at least three distinct domains, A-C. The N-terminal domain A was found to play no role in lysyl hydroxylase activity as a recombinant B-C polypeptide was a fully active hydroxylase. This work also confirmed that lysyl hydroxylase 3 has collagen glucosyltransferase activity as well as trace amounts of collagen galactosyltransferase activity. However, the levels of these activities were so low that their biological significance remains to be determined. In the last part of this work, lysyl hydroxylase 3 knock-out mice were produced and analyzed. The homozygous null embryos were found to die at a very early stage of development due to lack of type IV collagen in the basement membranes. The data demonstrated that hydroxylysine formed by lysyl hydroxylase 3 is essential for early mouse development and that lysyl hydroxylase 1 or 2 cannot compensate for the lack of its function.

Keywords: , collagen, fetal development, gene structure, lysyl hydroxylase, transgenic mice

Acknowledgements

This work was carried out at the Collagen Research Unit of the Department of Medical Biochemistry and Molecular Biology, University of Oulu, and Biocenter Oulu during the years 1997-2003. I wish to express my deepest gratitude to my supervisor, Professor Kari Kivirikko for his continuous encouragement and never-ending optimism throughout these years. His broad knowledge of science has provided an excellent basis for this work. My other supervisor, Docent Johanna Myllyharju, deserves my warmest thanks for her inspiring attitude and helpful advice. I would also like to express my sincere thanks to Professor Taina Pihlajaniemi for providing excellent research facilities during these years and for introducing me to the field of science by giving me the opportunity to begin my work in the Department. Professor Leena Ala-Kokko deserves my honest thanks for her contribution to the first manuscript in this thesis and for her many helpful comments during my first steps in the Department. I am also grateful to Professor Peppi Karppinen for her true interest and optimism towards my work. I also wish to express my respect to Professor Ilmo Hassinen for his distinguished scientific work in the energy . Associate Professor Sirpa Aho and Docent Tuula Kallunki are acknowledged for their thorough and valuable comments on the thesis. I wish to thank Malcolm Hicks for his careful revision of the language of the manuscript. My warmest thanks go to my co-authors Kaisa Passoja and Kati Takaluoma for the pleasure of their collaboration and friendship. Their presence has made this work so much more fun! I would also like to thank Asta Pirskanen for her advice and constant optimism towards my work and life itself. Taru Kosonen, Tarja Helaakoski, Ari-Pekka Kvist and Raija Sormunen are also acknowledged for their valuable help in this work. Special thanks go to Raija Soininen for her extremely valuable contribution to the last manuscript in this thesis, and her inspiring attitude towards my work. Laura Silvennoinen deserves my warmest thanks and a big hug for her friendship, support and many unforgettable moments spent both at work and outside the lab! I would also like to thank Laura for all the help with the computers and in drawing the figures for the thesis. I would like to acknowledge the whole staff of the Department, both present and former, for creating an inspiring and cheerful atmosphere. Especially Joni Mäki,

Liisa Kukkola, Outi Pakkanen, Juha Lantto, Tiina Holster, Lauri Eklund and Ritva Nissi deserve my sincere thanks. I warmly thank Annamari Pirneskoski, Reija Hieta and Ritva Ylikärppä for sharing many enjoyable and sometimes even desperate moments when preparing for the doctoral discussion and organizing the party. I wish to thank Anu Myllymäki for her excellent and thorough technical assistance at all stages during this work, and for her great optimism and friendship over all these years. Jaana Jurvansuu, Jaana Träskelin and Riitta Jokela are also acknowledged. Sincere thanks go to Auli Kinnunen and Marja-Leena Kivelä for their efficient secretarial assistance, and to Pertti Vuokila, Marja-Leena Karjalainen and Seppo Lähdesmäki for their help in a variety of practical matters. I thank the whole “computer-support-crew”, especially Risto Helminen, for excellent and efficient help with the computing matters. I am also grateful to the personnel of the Biocenter Oulu transgenic core facility for their invaluable practical help with the mice. I wish to express my gratitude to my parents, Marja-Leena and Veikko, for all their care and support, and for giving me an excellent basis for my life. I am also grateful to my sister Heli and her fiancé Mikko for all the enjoyable moments spent together with them. I also wish to thank all my friends and relatives, especially my aunts as well as my grandmother Katri and my late grandfather Toivo, for all their help and support throughout my whole life. There are no words to express the importance of my beloved husband, Marko, over the past years and in those that are to come! His unconditional love and support have been essential for the completion of this thesis, not to mention his care of our precious daughter, Oona, while I was doing long hours. Finally, and above all, I thank Oona for bringing so much joy and happiness into my life. All the tiredness and distress is swept away every time I go home and see her smile. This work was supported by the Oulu University Scholarship Foundation, the Finnish Cultural Foundation, the Emil Aaltonen Foundation, the Tyyni Tani Foundation and the Finnish Centre of Excellence programme (2000-2005) of the Academy of Finland.

Oulu, October 2003 Kati Rautavuoma Abbreviations bp (s) BSA bovine serum albumin C- carboxy cDNA complementary deoxyribonucleic acid DEAE diethyl aminoethyl ER EST expressed sequence tag(s)

Km Michaelis constant kb kilobase(s) kDa kilodalton(s) LH lysyl hydroxylase mRNA messenger RNA N- amino PAGE polyacrylamide gel electrophoresis PBS phosphate-buffered saline PCR polymerase chain reaction RACE rapid amplification of cDNA ends SDS sodium dodecyl sulphate UDP uridine diphosphate

Vmax maximal velocity X any amino acid Y any amino acid

List of original articles

This thesis is based on the following articles, which are referred to in the text by their Roman numerals:

I Passoja K, Rautavuoma K, Ala-Kokko L, Kosonen T & Kivirikko KI (1998) Cloning and characterization of a third human lysyl hydroxylase isoform. Proc. Natl. Acad. Sci. USA, 95: 10482-10486. II Rautavuoma K, Passoja K, Helaakoski T & Kivirikko KI (2000) Complete exon- intron organization of the gene for human lysyl hydroxylase 3 (LH3). Matrix Biology, 19: 73-79. III Rautavuoma K, Takaluoma K, Passoja K, Pirskanen A, Kvist AP, Kivirikko KI & Myllyharju J (2002) Identification of three domains in the monomers of the recombinant human lysyl hydroxylase isoenzymes 1, 2 and 3. J Biol Chem 277: 23084-23091. IV Rautavuoma K, Takaluoma K, Sormunen R, Myllyharju J, Kivirikko KI & Soininen R (2003) Hydroxylysine formed by lysyl hydroxylase-3 is essential for early mouse development. (Manuscript)

Contents

Acknowledgements Abbreviations List of original articles Contents 1 Introduction ...... 13 2 Review of the literature ...... 14 2.1 Collagens...... 14 2.1.1 Collagen types ...... 14 2.1.2 Collagen biosynthesis...... 15 2.1.3 and connective tissues...... 17 2.1.3.1 Basement membranes ...... 17 2.2 Hydroxylysine ...... 17 2.2.1 Hydroxylysine in collagens and other proteins...... 18 2.2.2 Functions of hydroxylysine ...... 18 2.2.2.1 Glycosylation of hydroxylysine...... 19 2.2.2.2 Stabilization of collagen crosslinks ...... 19 2.3 Lysyl hydroxylase...... 20 2.3.1 Molecular properties...... 20 2.3.2 Localization of lysyl hydroxylase in the cell ...... 21 2.3.3 Reaction catalyzed by lysyl hydroxylase...... 21 2.3.3.1 Substrates...... 23 2.3.3.2 Cosubstrates and inhibitors...... 23 2.3.4 Isoenzymes ...... 25 2.3.4.1 Lysyl hydroxylases 1 and 2...... 26 2.3.5 Structure of the lysyl hydroxylase 1 gene...... 27 2.3.5.1 Alu sequences ...... 27 2.4 Ehlers-Danlos syndrome type VI...... 28 2.5 Human diseases and animal models ...... 29 2.5.1 The mouse as a model...... 29 2.5.2 Gene targeting, the introduction of site-specific modifications into the genome by homologous recombination...... 30

3 Outlines of the present research...... 31 4 Materials and methods...... 33 4.1 Cloning and characterization of the third human lysyl hydroxylase isoenzyme (I, II) ...... 33 4.1.1 Cloning and characterization of cDNA and genomic clones (I, II)...... 33 4.1.2 Nucleotide sequencing (I, II, III, IV)...... 34 4.1.3 The BLAST program and other sequence analyses (I, II)...... 34 4.1.4 Isolation of RNA and Northern blot analysis (I, IV)...... 34 4.1.5 Nuclease S1 assay (II) ...... 35 4.1.6 Construction of baculovirus transfer vectors and generation of recombinant viruses (I, III)...... 35 4.1.7 Expression of recombinant proteins and their purification (I, III) ...... 36 4.1.8 activity assays (I, III)...... 37 4.2 Analysis of recombinant lysyl hydroxylase fragments (III)...... 37 4.2.1 Limited proteolysis of the recombinant proteins (III)...... 37 4.2.2 N-terminal sequence analysis (III)...... 37 4.2.3 Urea gradient gel electrophoresis and circular dichroism analysis (III) ...... 37 4.3 Generation and analysis of a mouse strain with an inactivated lysyl hydroxylase 3 gene (IV) ...... 38 4.3.1 Generation of the mutant mice (IV)...... 38 4.3.2 Light microscopy (IV)...... 38 4.3.3 Immunohistological stainings (IV) ...... 39 4.3.4 Electron microscopy (IV) ...... 39 4.3.5 Immunoelectron microscopy (IV) ...... 39 5 Results ...... 40 5.1 The third human lysyl hydroxylase isoenzyme (I, II)...... 40 5.1.1 Molecular cloning and characterization of the cDNA (I)...... 40 5.1.2 Expression of the mRNA in various human tissues (I)...... 41 5.1.3 Expression in insect cells and characterization of the recombinant protein (I)41 5.1.4 Organization of the gene (II) ...... 42 5.2 Purification and characterization of recombinant human lysyl hydroxylase polypeptides and their fragments (III) ...... 43 5.3 Embryonic lethality of the lysyl hydroxylase 3 knock-out mice (IV) ...... 45 5.3.1 Generation of a mouse strain with the lysyl hydroxylase 3 gene inactivated (IV)...... 45 5.3.2 Microscopic analysis ...... 46 6 Discussion ...... 47 6.1 A novel human lysyl hydroxylase isoenzyme, lysyl hydroxylase 3...... 47 6.2 Lysyl hydroxylase polypeptides consist of at least three distinct domains...... 50 6.3 Hydroxylysine formed by lysyl hydroxylase 3 is essential for early mouse development ...... 53 7 Future prospects...... 56 References 1 Introduction

The extracellular matrix surrounds the cells in every tissue, its primary role being to offer mechanical support and protection and to participate in cell attachment and migration. Collagens are the major components of this extracellular matrix and constitute one-third of the proteins in the human body. Collagen biosynthesis is characterized by the presence of many cotranslational and posttranslational modifications, most of which are unique to collagens and proteins with collagen-like domains. Eight specific are required to catalyze these modifications, one of them being lysyl hydroxylase. This is located within the endoplasmic reticulum and hydroxylates lysine residues in collagenous sequences. The resulting hydroxylysine residues take part in the formation of collagen crosslinks, and also serve as attachment sites for carbohydrate units. Two isoenzymes of lysyl hydroxylase have been recognized so far, termed lysyl hydroxylases 1 and 2, the latter having been cloned only recently. In the present work an additional, novel human lysyl hydroxylase isoenzyme, lysyl hydroxylase 3, is identified, cloned and characterized, the recombinant protein is expressed in insect cells and used to study the catalytic properties of the novel isoenzyme, and the organization of the human lysyl hydroxylase 3 gene is determined. The present work also reports for the first time on isolation of the three recombinant human lysyl hydroxylase isoenzymes 1-3 as homogeneous proteins, and suggests that the lysyl hydroxylase polypeptides may consist of at least three distinct domains. Hydroxylysine has been regarded as non-essential, since the absence of lysyl hydroxylase 1 activity is not lethal in humans, since it leads to the kyphoscoliotic type of Ehlers-Danlos syndrome, and since recombinant collagens I and III lacking in hydroxylysine residues form native-type fibrils in vitro. In this work, however, the mouse lysyl hydroxylase 3 was inactivated by gene targeting, and the null embryos were found to die at a very early stage of development. Hydroxylysine formed by lysyl hydroxylase 3 is thus essential for early mouse development, and lysyl hydroxylases 1 or 2 cannot compensate for the lack of its formation. 2 Review of the literature

2.1 Collagens

Collagens, the most abundant proteins in the human body, constitute a multigene family of extracellular matrix proteins. In addition to providing mechanical strength for various organs and tissues, they have a number of other important biological functions. Currently 27 collagen types with at least 42 distinct polypeptide chains have been identified, and their are dispersed among at least 15 . In addition, more than 20 proteins with collagen-like domains have been identified (Kivirikko 1993, Brown & Timpl 1995, Prockop & Kivirikko 1995, Fitzgerald & Bateman 2001, Koch et al. 2001, 2003, Myllyharju & Kivirikko 2001, 2003, Hashimoto et al. 2002, Kielty & Grant 2002, Sato et al. 2002, Tuckwell 2002, Banyard et al. 2003, Boot-Handford et al. 2003, Pace et al. 2003). All collagen molecules are built up of three polypeptide chains, called α-chains, containing the repeating triplet sequence Gly-X-Y, where X and Y represent amino acids other than . These α-chains then wind together to form a triple helix. The presence of glycine in every third position is essential because it is small enough to fit into the restricted space in the center of the triple helix. is commonly found in the X position and 4-hydroxyproline in the Y position of the Gly-X-Y triplets. These two amino acids are essential for the collagen molecule, in that they provide stability for the triple helix (van der Rest & Garrone 1991, Prockop & Kivirikko 1995, Myllyharju & Kivirikko 2001, Jenkins & Raines 2002).

2.1.1 Collagen types

Most collagen molecules form supramolecular assemblies, and the superfamily can be divided into eight families based on their polymeric structures or other features: collagens that form fibrils (types I-III, V, XI, XXIV and XXVII); collagens that are located on the surfaces of fibrils and are called fibril-associated collagens with interrupted triple helices 15 (FACIT and structurally related collagens, types IX, XII, XIV, XVI, XIX, XX, XXI, XXII and XXVI); collagens that form hexagonal networks (types VIII and X); the family of type IV collagens found in basement membranes; type VI collagen, forming beaded filaments; type VII collagen, forming anchoring fibrils; collagens with transmembrane domains (types XIII, XVII, XXIII and XXV) and the family of type XV and XVIII collagens (Byers PH 2001, Myllyharju & Kivirikko 2001, 2003). Increasing numbers of secreted proteins containing triple-helical collagenous domains but not defined as collagens are also being included within the collagen superfamily. According to Kielty and Grant (2002) these proteins can be divided into five categories: the collectin subgroup of C-type animal lectins, complement and related factors, metabolic molecules, cytokines and related molecules, and elastic fibre-associated molecules (Prockop & Kivirikko 1995, Beck & Brodsky 1998, Kivirikko & Pihlajaniemi 1998, Chung et al. 1999, Ezer et al. 1999, Myllyharju & Kivirikko 2001, 2003, Kielty & Grant 2002).

2.1.2 Collagen biosynthesis

Collagen biosynthesis is characterized by a large number of cotranslational and post- translational modifications, many of which are unique to collagens and other proteins with collagen-like amino acid sequences. The post-translational processing of fibril-forming collagens can be regarded as occurring in two stages (see Figure 1). Intracellular modifications, together with the synthesis of the polypeptide chains, result in the formation of the triple-helical procollagen molecule. These modifications, which take place when the procollagen chains are translocated across the ER membrane into the lumen and are being synthesized, include cleavage of the signal peptides; hydroxylation of certain proline and lysine residues to 4-hydroxyproline, 3-hydroxyproline and hydroxylysine; glycosylation of some of the hydroxylysine residues to galactosyl-hydroxylysine and glucosylgalactosyl-hydroxylysine; glycosylation of certain residues in one or both of the propeptides; chain association; disulphide bonding and formation of the triple helix (Prockop & Kivirikko 1995, Kadler 1995, Myllyharju & Kivirikko 2001, Kielty & Grant 2002). Extracellular processing then converts the procollagens to collagens and incorporates the collagen molecules into stable, crosslinked fibrils or other supramolecular aggregates. The extracellular steps include cleavage of the N and C propeptides, self-assembly of the collagen molecules into fibrils by nucleation and propagation, and formation of covalent crosslinks (Prockop & Kivirikko 1995, Kadler et al. 1996, Prockop et al. 1998, Myllyharju & Kivirikko 2001). The intracellular modifications require five specific enzymes: prolyl 4-hydroxylase, prolyl 3-hydroxylase, lysyl hydroxylase, hydroxylysyl galactosyltransferase and galactosyl-hydroxylysyl glucosyltransferase, while the extracellular modifications require procollagen N-proteinase, procollagen C-proteinase and lysyl (Kivirikko & Pihlajaniemi 1998, Prockop et al. 1998, Smith-Mungo & Kagan 1998, Myllyharju & Kivirikko 2001, Kagan & Li 2003, Myllyharju 2003). Additional important enzymes are 16 peptidyl proline cis-trans and protein disulphide , and collagen synthesis also involves a specific molecular chaperone, Hsp47 (Nagata 1998, Lamande & Bateman 1999, Hendershot & Bulleid 2000).

Fig. 1. Biosynthesis of a fibril-forming collagen. Procollagen polypeptide chains are synthesized on the ribosomes of the rough endoplasmic reticulum and secreted into the lumen, where they are modified by hydroxylation of certain proline and lysine residues and glycosylation before chain association and triple helix formation. The procollagen molecules are secreted into the extracellular space where the N and C propeptides are cleaved by specific proteases. The collagen molecules then assemble into fibrils, which are stabilized by the formation of covalent crosslinks (Reproduced from Myllyharju and Kivirikko 2001, with the permission of Taylor & Francis AB). 17 2.1.3 Extracellular matrix and connective tissues

The extracellular matrix (ECM) is a complex mixture of structural and functional proteins arranged into a unique, tissue-specific three-dimensional ultrastructure. These proteins provide a natural scaffold for tissue and organ morphogenesis, maintenance, and regeneration following injury (Alberts et al. 2002). Animal connective tissues consist mostly of extracellular matrix, with collagens and providing the tensile strength of the tissue. The cells producing collagen have different names in different tissues, being known as chondrocytes in cartilage, osteoblasts in bone and fibroblasts in skin and some other tissues (Alberts et al. 2002).

2.1.3.1 Basement membranes

Basement membranes are specialized extracellular matrices that underlie all epithelial cell sheets and tubes. They also surround individual muscle cells, fat cells and Schwann cells, separating these cells and cell sheets from the underlying or surrounding connective tissue. All basement membranes contain type IV collagen together with proteoglycans (primarily heparan sulphates) and the glycoproteins and entactin (Alberts et al. 2002). Basement membranes are largely synthesized by the cells that rest on them, and they provide a strong connection between the epithelia and the underlying connective tissue. Basement membranes also act as filtration barriers for substances moving between parenchymal cells and the connective tissue space, and provide a scaffold for the migration of cells during embryogenesis and regeneration (Leeson et al. 1988). The best-studied basement membrane proteins are the , which constitute the major non-collagenous basement membrane component (Timpl 1996). In order to study the initial function of basement membranes in tissue development during embryogenesis, Smyth et al. (1999) used homologous recombination to inactivate one or both of the alleles coding for the laminin γ1 subunit. Mice that where heterozygous for the mutation had a normal phenotype and were fertile, whereas homozygous mutant embryos did not survive beyond embryonic day 5.5. These embryos lacked basement membranes, having disorganized extracellular deposits of the basement membrane proteins collagen IV and perlecan. Surprisingly, basement membranes were not necessary for the formation of the first epithelium during embryogenesis, but were first required for extra-embryonic endoderm differentiation (Smyth et al. 1999).

2.2 Hydroxylysine

Hydroxylysine is found in vertebrate proteins mostly in the collagens and collagen-like domains of other proteins that are not defined as collagens. The hydroxyl groups of hydroxylysine residues have two important functions: they serve as attachment sites for 18 carbohydrate units and they are crucial for the stability of the intramolecular and intermolecular collagen crosslinks (Kivirikko & Pihlajaniemi 1998).

2.2.1 Hydroxylysine in collagens and other proteins

Hydroxylysine is almost exclusively found in the Y positions of the repeating Gly-X-Y sequences in collagens and proteins with collagen-like domains. An exception to this rule is found in some fibril-forming collagens, where the short non-triple-helical regions at the ends of the α chain contain the sequences X-Hyl-Ser and X-Hyl-Ala (Kivirikko et al. 1992). The amount of hydroxylysine varies greatly between collagen types, and also within the same collagen type from different sources and even in the same tissue in different physiological and pathological conditions (Kivirikko & Myllylä 1980, Kivirikko et al. 1992). This is at least partly explained by incomplete hydroxylation of many lysine residues in the Y position, and in some collagens a number of lysine residues in these positions are not hydroxylated at all (Kivirikko & Pihlajaniemi 1998). The hydroxylation of lysine residues also seems to be age-related, so that the hydroxylysine content of collagen from embryonic tissues is higher than of that of collagen from adult tissues (Bailey & Shimokomaki 1971). The amount of hydroxylysine is also increased in some diseases, e.g. osteoporosis (Bailey et al. 1992, Knott et al. 1995, Lo Cascio et al. 1999), osteogenesis imperfecta (Lehmann et al. 1995a, Bank et al. 2000), osteosarcoma (Shapiro & Eyre 1982, Lehmann et al. 1995b) and lipodermatosclerosis (Shapiro & Eyre 1982, Lehmann et al. 1995b, Brinckmann et al. 1999). Hydroxylysine is also found in most, if not all, proteins with collagenous sequences not defined as collagens (Kielty & Grant 2002, Myllyharju & Kivirikko 2003). It has recently been shown that the fat-derived-hormone adiponectin has the ability to reduce hyperglycaemia and to reserve insulin resistance, and that hydroxylation and subsequent glycosylation of the four in the collagenous domain of this protein may contribute to this activity (Y. Wang et al. 2002). In addition to proteins with collagenous sequences, hydroxylysine residues are found in some proteins without such sequences. One example is anglerfish somatostatin-28 which is a peptide hormone consisting of 28 amino acids, residue 23 being hydroxylysine in the sequence Trp-Hyl-Gly. The function of this hydroxylysine remains uncertain (Andrews et al. 1984).

2.2.2 Functions of hydroxylysine

Hydroxylysine residues have at least two important functions in collagens: they serve as attachment sites for carbohydrates and they play a crucial role in stabilizing intramolecular and intermolecular crosslinks (Kivirikko & Pihlajaniemi 1998). 19 2.2.2.1 Glycosylation of hydroxylysine

The carbohydrates linked to hydroxylysine residues are either the monosaccharide galactose or the disaccharide glucosylgalactose. Formation of these carbohydrate moieties involves two specific enzymes, a galactosyltransferase (E.C. 2.4.1.50) and a glucosyltransferase (E.C. 2.4.1.66). The extent of glycosylation of hydroxylysine residues varies greatly between collagen types, and even within the same collagen type in various physiological and pathological states (Kivirikko & Myllylä 1979, 1980, Kivirikko et al. 1992). The functions of the hyroxylysine-linked carbohydrate units are not fully understood, but it has been suggested that they may regulate the packing of collagen molecules into supramolecular assemblies, as they are situated on the surface of the collagen molecules (Kivirikko & Pihlajaniemi 1998). Studies of fibrillar collagens have indicated that collagens containing high amounts of hydroxylysine and hydroxylysine-linked carbohydrates form very thin fibrils, whereas collagens with low amounts of these modifications form thick fibrils (Notbohm et al. 1999). As mentioned above, hydroxylation and glycosylation of the four lysines in the collagenous domain of adiponectin may contribute to the insulin-sensitizing ability of this protein (Y. Wang et al. 2002).

2.2.2.2 Stabilization of collagen crosslinks

The formation of covalent intramolecular and intermolecular crosslinks is the final step in collagen biosynthesis that occurs as an extracellular process after cleavage of the propeptides and fibril assembly (Kadler et al. 1996). These crosslinks are essential in providing the tensile strength and mechanical stability of the collagen fibrils and other supramolecular assemblies (Hulmes 1992). Two crosslinking pathways can be distinguished for fibrillar collagens (Kielty & Grant 2002): the lysine aldehyde pathway , which occurs mainly in adult skin, cornea and sclera, and the hydroxylysine aldehyde pathway, which occurs mainly in the bone, cartilage, ligament, tendons, embryonic skin and most internal organs (Eyre 1987, Kielty & Grant 2002). Crosslinks formed in the hydroxylysine-derived aldehyde pathway are more stable than those formed in the lysine- derived pathway (Kielty & Grant 2002). Crosslinking of collagens is initiated by the enzyme (Smith-Mungo & Kagan 1998, Csiszar 2001, Kagan & Li 2003), which catalyzes oxidative deamination of the ε-amino group in certain lysine and hydroxylysine residues situated in the telopeptide regions of collagens to form reactive aldehydes ( and hydroxyallysine). These aldehydes then form crosslinks, either by aldol condensation between two of the aldehydes or by condensation between one aldehyde and one ε-amino group of an unmodified lysine, hydroxylysine or glycosylated hydroxylysine residue (Eyre 1987, Bailey et al. 1998, Kielty & Grant 2002). 20 2.3 Lysyl hydroxylase

Lysyl hydroxylase (protocollagen-lysine 2-oxoglutarate 5-dioxygenase, E.C. 1.14.11.4) catalyzes the hydroxylation of lysine residues in X-Lys-Gly triplets in collagens and other proteins with collagenous domains (Kivirikko et al. 1992, Kivirikko & Pihlajaniemi 1998). It was suggested at an early stage in collagen research that hydroxylation of lysine and proline residues might be catalyzed by a single enzyme, protocollagen hydroxylase (Kivirikko & Prockop 1967). The reaction mechanism of lysyl hydroxylase is in fact very similar to that of prolyl 4-hydroxylase, requiring similar substrates and the same cosubstrates (Kivirikko & Pihlajaniemi 1998). It was demonstrated later, however, that these two enzymatic activities involve two separate enzymatic sites (Weinstein et al. 1969) and that purified proline hydroxylase does not act on lysine residues (Halme et al. 1970). Kivirikko and Prockop (1972) partially purified lysyl hydroxylase from chick embryos by DEAE cellulose chromatography and gel filtration and showed that it is indeed a separate enzyme from prolyl hydroxylase. It was further purified from chick embryos and chick embryo cartilage by conventional methods (Ryhänen 1976), and subsequently by an affinity chromatography procedure involving concanavalin A-agarose (Turpeenniemi et al. 1977). Finally, it was purified to homogeneity from chick embryos and human placental tissues by means of two affinity column procedures (Turpeenniemi- Hujanen et al. 1980, 1981).

2.3.1 Molecular properties

Active lysyl hydroxylase is a homodimer with a molecular weight of about 170 kDa, consisting of only one type of monomer with a molecular weight of about 85 kDa in SDS-PAGE analysis (Turpeenniemi-Hujanen et al. 1980, 1981, Myllylä et al. 1988). A larger protein corresponding to 550 kDa is also recovered in gel filtration analysis, probably being an aggregate of the enzyme dimers because in SDS-PAGE analysis it is seen as one band with a molecular weight of 85 kDa (Turpeenniemi-Hujanen et al. 1980, 1981). Molecular cloning and sequencing have been reported so far for the chick (Myllylä et al. 1991), human (Hautala et al. 1992a, Yeowell et al. 1992), rat (Armstrong & Last 1995, Mercer et al. 2003), mouse (Ruotsalainen et al. 1999) and C. elegans (Norman & Moerman 2000) lysyl hydroxylases. An unexpected finding was that no significant homology was detected between the primary structures of lysyl hydroxylase and prolyl 4- hydroxylase in spite of the marked similarities in their catalytic properties (Myllylä et al. 1991). The human lysyl hydroxylase polypeptide contains 709 residues and a signal peptide of 18 additional amino acids. Its mRNA is 3.2 kb in size. The human and chick lysyl hydroxylases are 76% identical at the amino acid level, the identity being as high as 94% in the catalytically important C-terminal region (Hautala et al. 1992a). The identity between the human (Hautala et al. 1992a) and rat (Armstrong & Last 1995) polypeptides, 21 and also between the human and mouse (Ruotsalainen et al. 1999) polypeptides is 91%, and that between the chick (Myllylä et al. 1991) and rat polypeptides 77%. The chick, human and rat lysyl hydroxylases all contain four potential attachment sites for asparagine-linked oligosaccharides (Myllylä et al. 1991, Hautala et al. 1992a, Yeowell et al. 1992, Armstrong & Last 1995). A substantial heterogeneity is found in the extent of glycosylation of the polypeptides, and only some of the asparagine-linked carbohydrates are critical for maximal enzyme activity (Myllylä et al. 1988, Pirskanen et al. 1996).

2.3.2 Localization of lysyl hydroxylase in the cell

Lysyl hydroxylase has been shown by a variety of techniques to be located in the lumen of the endoplasmic reticulum (Kivirikko et al. 1992, Kellokumpu et al. 1994). The hydroxylation of lysine and proline residues and the glycosylation of hydroxylysine residues occur as co-translational modifications while the collagen polypeptide chains are being synthesized, continuing as post-translational modifications after the complete polypeptide chains have been released from the ribosomes. Formation of the collagen triple helix is a turning point which prevents any further modifications (Kivirikko & Myllylä 1980, Kivirikko et al. 1992, Kivirikko & Pihlajaniemi 1998). Even though lysyl hydroxylase is located almost exclusively in the rough endoplasmic reticulum (Peterkofsky & Assad 1979), it does not contain either of the two ER-specific KDEL or double lysine retention motifs in its primary structure (Myllylä et al. 1991, Hautala et al. 1992a). Furthermore, the enzyme seems to be present only in association with the ER-membrane, as a luminally-oriented peripheral membrane protein binding to it via weak electrostatic interactions (Kellokumpu et al. 1994). Recent studies have shown that the 40-amino acid C-terminal end of lysyl hydroxylase is able to convert cathepsin D from a soluble lysosomal protein into a membrane- associated protein (Suokas et al. 2000), the amino acids in the middle of this segment being the most crucial for ER localization (Suokas et al. 2003). The data thus reveal a novel localization mechanism by which the ER lumen can preserve its specific protein components (Suokas et al. 2003).

2.3.3 Reaction catalyzed by lysyl hydroxylase

Lysyl hydroxylase belongs to the family of 2-oxoglutarate dioxygenases, an family with over 15 members. The reaction catalyzed by lysyl 2+ hydroxylase in vitro requires Fe , 2-oxoglutarate, O2 and ascorbate and produces succinate and CO2 (see Figure 2). One atom of the molecule is incorporated into the hydroxyl group of the substrate, while the other is incorporated into the succinate (Kivirikko & Pihlajaniemi 1998) 22

Fig. 2. The hydroxylation reaction catalyzed by lysyl hydroxylase (Puistola et al. 1980a). The 2-oxoglutarate is decarboxylated during the hydroxylation of a lysine residue in a peptide 2+ linkage in the presence of Fe , O2 and ascorbate.

Kinetic studies have shown that the hydroxylation reactions of lysine and proline 2+ residues involve an ordered binding of Fe , 2-oxoglutarate, O2 and the peptide substrate to the enzyme followed by an ordered release of the hydroxylated peptide, CO2, succinate and Fe2+, in which Fe2+ is not released between catalytic cycles (Kivirikko & Pihlajaniemi 1998). Ascorbate is not a stoichiometric component in the hydroxylation reaction, and the collagen hydroxylases can complete many catalytic cycles at a maximal rate in the complete absence of it (Tuderman et al. 1977, de Jong & Kemp 1984, Myllylä et al. 1984). Hydroxylation then ends very rapidly, however, and ascorbate is required to reactivate the enzyme (Myllylä et al. 1978). The reaction requiring ascorbate is an uncoupled decarboxylation of 2-oxoglutarate, a reaction occurring without subsequent hydroxylation of the peptide substrate (de Jong & Kemp 1984, Myllylä et al. 1984). The rate of the uncoupled reaction is only 1-4% of that of the complete one. The collagen hydroxylases catalyze uncoupled decarboxylation cycles occasionally even in the presence of a saturating concentration of their peptide substrates, and certain non- hydroxylatable peptide sequences increase the rate of uncoupled decarboxylation (Tuderman et al. 1977, Counts et al. 1978, Myllylä et al. 1984). 23 2.3.3.1 Substrates

The minimum sequence requirement for lysyl hydroxylase is the tripeptide X-Lys-Gly. The corresponding free amino acid or the tripeptide Lys-Gly-Pro are not hydroxylated, whereas the tripeptide Ile-Lys-Gly is (Kivirikko et al. 1972b). Lysyl hydroxylase also acts on lysine-rich histones containing X-Lys-Gly triplets, and surprisingly, on -rich histones, which do not contain any X-Lys-Gly triplet but do contain sequences such as X- Lys-Ala and X-Lys-Ser (Ryhänen 1975). This agrees with the occurrence of hydroxylysine in the sequences X-Hyl-Ala and X-Hyl-Ser in the short non-collagenous regions at the ends of some collagen chains (Kivirikko & Myllylä 1980). Interaction with lysyl hydroxylase is further influenced by the amino acid sequence around the lysine residue, the peptide length and the peptide conformation (Kivirikko & Pihlajaniemi 1998). There are studies suggesting that some amino acid sequences around the lysine residue may prevent interaction with lysyl hydroxylase (Kivirikko et al 1972a, 1973). The peptide chain length appears to influence only the Km value, longer peptides having lower Km values than shorter ones, whereas the reaction rate (Vmax) seems to be the same (Kivirikko et al. 1972a). The non-triple-helical conformation of the substrates is an absolute requirement, in that a triple-helical conformation prevents hydroxylation (Kivirikko et al. 1973, Ryhänen & Kivirikko 1974). It has been suggested that the conformational requirement for lysine hydroxylation is the presence of a ‘bent’ structure, such as the γ- or β-turn at the catalytic site of the enzyme (Jiang & Ananthanarayanan 1991, Ananthanarayanan et al. 1992).

2.3.3.2 Cosubstrates and inhibitors

The hydroxylation reaction catalyzed by lysyl hydroxylase requires Fe2+, 2-oxoglutarate, 2+ O2 and ascorbate (Kivirikko et al. 1992). The Fe is loosely bound to the enzyme by three side chains (Kivirikko & Pihlajaniemi 1998), and it does not have to leave the enzyme between catalytic cycles (Puistola et al. 1980b). Lysyl hydroxylase activity is dependent on added Fe2+, whereas other divalent cations such as Zn2+, Cd2+, Co2+, Ca2+ 2+ 2+ and Cu have an inhibitory effect (Ryhänen 1976). The Km values for Fe of human lysyl hydroxylase and prolyl 4-hydroxylase are practically identical (2-4 µM), suggesting the presence of a similar Fe2+ (Helaakoski et al. 1995, Pirskanen et al. 1996, Myllyharju & Kivirikko 1997). The inactivation of collagen hydroxylases by diethyl pyrocarbonate suggests that residues may have a catalytic function, probably as Fe2+ binding sites (Myllylä et al. 1992). A search for conserved amino acids within the sequences of several 2-oxoglutarate dioxygenases and a related enzyme, isopenicillin N synthase, indicated a weak homology within two histidine-containing motifs, termed His- 1 and His-2, located about 50-70 amino acids apart in the C-terminal regions of the enzymes. Site-directed mutagenesis of human prolyl 4-hydroxylase demonstrated that mutation of either of these to will inactivate the enzyme completely (Lamberg et al. 1995, Myllyharju & Kivirikko 1997). Subsequent determination of the crystal structure of isopenicillin N synthase indicated that the third binding ligand is a conserved aspartate residue located in position +2 with respect to the histidine present in 24 the His-1 motif (Roach et al. 1995). This was confirmed by site-directed mutagenesis studies of lysyl hydroxylase and prolyl 4-hydroxylase (Pirskanen et al. 1996, Myllyharju & Kivirikko 1997). The corresponding ligands in human lysyl hydroxylase are His638, Asp640 and His690 (Pirskanen et al. 1996). 2-Oxoglutarate is a highly specific requirement for the hydroxylation reaction (Kivirikko & Myllylä 1980, Kivirikko et al. 1992). It can be replaced by 2-oxoadipinate, but the Km for the latter (4.8 mM) is strikingly higher (Majamaa et al. 1985). The 2- oxoglutarate binding sites of lysyl hydroxylase and prolyl 4-hydroxylase can be divided into two main subsites: subsite I consists of a positively charged side-chain which binds the C-5 carboxyl group, while subsite II consists of two coordination sites of the enzyme- bound Fe2+ and is chelated by the C1-C2 moiety (Hanauske-Abel & Günzler 1982). The Km of human lysyl hydroxylase for 2-oxoglutarate, 100 µM, is significantly higher than those of the human prolyl 4-hydroxylases (12-22 µM) (Helaakoski et al. 1995, Pirskanen et al. 1996), suggesting some differences in the structures of their 2-oxoglutarate binding sites (Kivirikko & Pihlajaniemi 1998). Passoja et al. (1998) have identified Arg700 as the positively charged residue that binds the C-5 carboxyl group of 2-oxoglutarate in recombinant human lysyl hydroxylase, while the residue that does this in the α subunit of human type I prolyl 4-hydroxylase is Lys493 (Myllyharju & Kivirikko 1997). The most compelling inhibitors with respect to 2-oxoglutarate all have functional groups that can bind to the Fe2+ at the catalytic site, and these compounds can also interact at the other subsites of the 2-oxoglutarate site (Majamaa et al. 1984, 1985, 1986). The molecular oxygen needed for the hydroxylation reaction comes from the atmosphere, the other atom of the O2 molecule being incorporated into the succinate (Fujimoto & Tamiya 1962, Cardinale et al. 1971). Molecular oxygen bound to the Fe2+ at the catalytic sites of lysyl and prolyl hydroxylases is assumed to be end-on in an axial position (Hanauske-Abel & Günzler 1982, Hanauske-Abel 1991). The Km values of oxygen for both enzymes are also quite similar, 40-50 µM (Myllylä et al. 1977, Puistola et al. 1980a, Turpeenniemi-Hujanen et al. 1981). Superoxide dismutase active copper chelates inhibit collagen hydroxylases competitively with respect to O2, the inhibition probably taking place by dismutation of the activated form of oxygen at the catalytic site (Myllylä et al. 1979). Ascorbate is a highly specific requirement for the hydroxylation reaction, although it can be replaced by or dithiothreitol, but with a significantly slower reaction rate (Puistola et al. 1980a). The ascorbate binding site probably contains the two cis- positioned coordination sites of the enzyme-bound iron, and is partially identical to the binding site of 2-oxoglutarate (Majamaa et al. 1986). The Km values of the lysyl and prolyl hydroxylases for ascorbate are basically identical (about 350 µM), suggesting a similar structure at the binding site (Helaakoski et al. 1995, Pirskanen et al. 1996). Minoxidil, an antihypertensive piperidinopyrimidine nitrooxide, has been reported to suppress lysyl hydroxylase activity and the proliferation of cultured human skin fibroblasts. It had no effect when added directly to cell extracts, however, showing a requirement for intact cells (Murad & Pinnell 1987). The enzyme-suppressing effect of minoxidil seems to be dependent on one of the two amino groups situated on each side of the nitrooxide oxygen (Murad et al. 1992). The drug reduces the steady-state mRNA level of lysyl hydroxylase and also the amount of lysyl hydroxylase protein (Hautala et al. 25 1992b, Yeowell et al. 1992), and the effect appears to be highly specific, as no changes were observed in the amount of mRNAs for prolyl 4-hydroxylase (Hautala et al. 1992b). Shigematsu et al. (1994) have shown that an aqueous extract of Salviae miltorrhizae Radix, a Chinese medicinal herb, inhibits collagen secretion by human skin fibroblasts. In view of its chemical structure, this compound was predicted to be an inhibitor of the lysyl and prolyl hydroxylases, reducing the extent of lysyl and prolyl in collagen by approximately 50%. Recent studies with cultured fetal rat fibroblasts have shown that the organophosphate insecticides malathion and malaoxon inhibit lysyl hydroxylation in newly synthesized collagen, apparently through direct inhibition of lysyl hydroxylase. Additional kinetic studies showed this inhibition to be competitive or partially competitive with respect to the synthetic peptide, partially uncompetitive with respect to Fe2+, and partially non- competitive with respect to ascorbic acid. It is possible that the same mechanism may also account for the inhibition of prolyl 4-hydroxylase, as it shares several common substrate and requirements with lysyl hydroxylase (Samimi & Last 2001a, 2001b). It has been reported that even in the presence of optimal concentrations of all the cosubstrates, maximal activity of the collagen hydroxylases is not obtained unless dithiothreitol, bovine serum albumin and catalase are added to the reaction mixture. The stimulation provided by dithiothreitol suggests that free –SH groups at the catalytic site are essential for the enzyme activity. The action of bovine serum albumin is in part explained by a non-specific ‘protein effect’ and in part by the presence of many free thiol groups on this protein. Catalase destroys peroxidase and acts partly through a non- specific protein effect as well (Kivirikko et al. 1992).

2.3.4 Isoenzymes

Isoenzymes are defined as proteins with the same catalytic activity. Many enzymes are known to have isoforms, which may differ in substrate turnover or in regulation ability. Their occurrence may be restricted either to certain organs or to certain steps in development. Opinions have differed on the possible existence of lysyl hydroxylase isoenzymes over the years. Many findings have suggested that such isoenzymes do exist, especially since large differences in the extent of lysine hydroxylation are found between the collagen types, and even within the same collagen type from different tissues and at different developmental stages (Barnes et al. 1974). The hydroxylysine deficiency observed in patients with Ehlers-Danlos syndrome type VI also shows great variation between tissues and collagen types (Krane et al. 1972, Ihme et al. 1984). Two novel isoenzymes, termed lysyl hydroxylase 2 (LH2) and lysyl hydroxylase 3 (LH3), have been cloned and characterized from human (Valtavaara et al. 1997, 1998, and paper I of the present study) and mouse (Ruotsalainen et al. 1999) tissues during the past few years. The previously known isoenzyme is now correspondingly termed lysyl hydroxylase 1 (LH1). 26 2.3.4.1 Lysyl hydroxylases 1 and 2

The cDNA for human lysyl hydroxylase 2 encodes a polypeptide of 737 amino acids, including a signal peptide of 25 residues. This novel polypeptide has an overall amino acid sequence similarity of 75% to the human and chick lysyl hydroxylase 1 polypeptides, and the similarity is even higher, over 90%, in the conserved C-terminus. The lysyl hydroxylase 2 polypeptide contains nine conserved cysteine residues and seven potential N-glycosylation sites. Furthermore, the two conserved , one aspartate and one arginine residue required for lysyl hydroxylase activity are present in the sequence (Valtavaara et al. 1997). The mRNA for human LH2 is 4.2 kb in size, the highest expression levels being found in the pancreas, placenta, heart and skeletal muscle (Valtavaara et al. 1997). The corresponding gene (PLOD2) has been localized to the region 3q23-q24 (Szpirer et al. 1997), while the lysyl hydroxylase 1 gene (PLOD1) is located on (see 2.3.5) (Hautala et al. 1992a). A new, alternatively spliced form of the human lysyl hydroxylase 2 cDNA has been cloned from human skin fibroblasts (Yeowell & Walker 1999b) and human kidney (Valtavaara, M., Risteli, M., Ruotsalainen, H., Wang, C. and Myllylä, R., unpublished data). This novel lysyl hydroxylase 2b encodes a protein of 758 amino acids, of which 21 amino acids are encoded by a new 63-bp exon, termed 13A, located between exons 13 and 14 of the originally described LH2 cDNA, now designated lysyl hydroxylase 2a. Analysis of genomic DNA from different tissues has shown that both transcripts are generated from the same lysyl hydroxylase 2 gene (Yeowell & Walker 1999b). The lysyl hydroxylase 2b mRNA, of size 4.4 kb, is well expressed in human skeletal muscle and heart but almost undetectable in human placenta and pancreas (Valtavaara, M., Risteli, M., Ruotsalainen, H., Wang, C. and Myllylä, R., unpublished data). PCR amplification studies indicate that it is expressed as the major lysyl hydroxylase 2 form in all tissues studied except kidney and spleen. Preliminary studies further suggest that this alternative splicing event is not developmentally regulated (Yeowell & Walker 1999b). Bank et al. (1999) have reported recently that lysine residues within the telopeptides of in bone are underhydroxylated in patients with Bruck syndrome, an autosomal recessive disease with characteristics of fragile bones, joint contractures, scoliosis and osteoporosis (Brenner et al. 1993), leading to aberrant crosslinking, but that the lysine residues in the triple helix are modified in the normal manner. Cartilage and ligament tissues nevertheless showed unaltered telopeptide hydroxylation. These results suggested the existence of a telopeptide lysyl hydroxylase with tissue-specific forms. Uzawa et al. (1999), studying the expression patterns of different lysyl hydroxylase isoenzymes during in vitro differentiation of human bone marrow stromal cells and skin fibroblasts, observed that expression of lysyl hydroxylase 2 mRNA is associated with lysine hydroxylation in the non-triple-helical domains of collagens, so that this isoenzyme could be partially responsible for the tissue-specific collagen cross-linking pattern. Furthermore, van der Slot et al. (2003) very recently reported two different homozygous missense mutations in exon 17 of the lysyl hydroxylase 2 gene in two families with the Bruck syndrome, but gave no data to indicate whether these mutations inactivated the enzyme protein. 27 2.3.5 Structure of the lysyl hydroxylase 1 gene

The gene for human lysyl hydroxylase 1 has been mapped to the region 1p36.2-1p36.3 on chromosome 1 (Hautala et al. 1992a). It is 40 kb in size and contains 19 exons, which altogether represent only about 7% of the total size of the gene. Exons 1 to 18 vary in size from 64 to 164 bp, whereas exon 19 is 887 b, containing a 732 bp 3’ untranslated sequence and the polyadenylation signal. Exons 2, 5 and 7 begin with the second base of a codon, exons 4, 11, 12 and 13 with the third base and all the other exons with the first base. The consensus sequence CAG-exon-GT(G/A) is found at the boundaries of all introns except for intron 13, which begins with a sequence GTC instead of GT(G/A) (Heikkinen et al. 1994). The sizes of the introns vary from 350 bp to about 12 500 bp, the first being the largest. Introns 9, 15 and 16 have been sequenced completely, because they are involved in the gene rearrangements seen in Ehlers-Danlos syndrome type VI patients (see section 2.4) (Heikkinen et al. 1994). Results obtained by primer extension and S1 nuclease mapping analysis suggest the presence of multiple transcription initiation sites, the size of the 5’ untranslated region varying from 49 to 136 bp. The main transcription initiation sites are located 55-65 bp upstream of the ATG codon. The promoter region lacks a TATAA box, a typical promoter element in highly regulated genes. On the other hand, this agrees with the presence of multiple transcription initiation sites. A CCAAT sequence, which is a potential binding site for the CTF/NF transcription factor, is located 370 bp upstream of the main transcription initiation site. This sequence is usually found 80 bp upstream of the transcription initiation site, but it is also functional at different distances. The gene contains two potential Sp1 transcription factor binding sites in the coding strand of the promoter region, one in the non-coding strand and one at the 5’ end of the first intron. On the whole, the 5’ flanking region of the human lysyl hydroxylase 1 gene resembles the promoters of many housekeeping genes which encode proteins having essential and/or growth-related function in cells (Heikkinen et al. 1994).

2.3.5.1 Alu sequences

Short interspersed elements (SINEs) are repetitive elements that make up almost half of the , and Alu sequences represent the most abundant human SINE family. It has been estimated that there are around 700 000 copies of Alu sequences in the haploid genome, constituting more than 5% of our DNA. In practice, Alu sequences are found in the introns of protein-coding genes in humans and other primates. Consensus Alu sequences are approximately 280 bp in length, consisting of two similar but distinct monomers linked by an adenine-rich region. The 3’ end of the Alu sequence also features an adenine-rich region resembling a polyadenyl tail (Makalowski et al. 1994, Mighell et al. 1997). Alu sequences have been assumed to be retrotransposons that have been inserted into the human genome via a single-stranded RNA intermediate generated by RNA polymerase III (Mighell et al. 1997). This insertion of Alu sequences began early in 28 primate evolution, approximately 65 million years ago. The rate of amplification appears to have reached a maximum between 35 and 60 million years ago, and is currently occurring only at 1% of the maximum rate. It has been estimated that there is one new insert in about every 200 new births, which is still high enough to represent a significant factor in human mutagenesis (Deininger & Batzer 1995, 1999). Alu repeats can be grouped into subfamilies, which differ in their respective consensus sequences. At least 12 subfamilies have now been characterized, the Alu-J subfamily being thought to be the oldest and the Alu-Y subfamily the youngest (Makalowski et al. 1994, Batzer et al. 1996). It has been suggested that Alu sequences function as recombinational hotspots, being involved in most human gene rearrangements that occur by homologous recombination. This can lead to many genetic disorders, but can also represent yet another way in which retroposons may act as ‘seeds’ of evolution (Rüdiger et al. 1995, Deininger & Batzer 1999). Alu sequences are found at least in introns 9 and 16 in the lysyl hydroxylase 1 gene. Intron 9 contains five different Alu sequences and intron 16 eight, and computer analysis indicates that these repeats belong to the Alu-Sp, Alu-Sx or Alu-Sc subfamilies or represent truncated forms of Alu repeats (Heikkinen et al. 1994). The large number of Alu repeats in introns 9 and 16 generates extensive homology between the two. The longest identical sequence found in both introns is 44 nucleotides, which appears to explain the extensive gene duplication found in families with Ehlers-Danlos syndrome type VI (see section 2.4) (Hautala et al. 1993, Heikkinen et al. 1994, 1997, Pousi et al. 1994).

2.4 Ehlers-Danlos syndrome type VI

Ehlers-Danlos syndrome (EDS) is a heterogeneous group of heritable connective tissue disorders characterized by articular hypermobility, skin extensibility and tissue fragility. This autosomally inherited syndrome has traditionally been divided into ten subtypes on clinical, genetic and other grounds (Krane et al. 1972, Yeowell & Pinnell 1993, Byers 1994, Steinmann et al. 2002), but Beighton et al. (1998) have proposed a new, simplified classification of the syndrome based primarily on the cause of each type. Patients with Ehlers-Danlos syndrome type VI (EDS VI), also classified as the kyphoscoliotic type, are clinically characterized by scoliosis, generalized joint laxity, skin fragility, ocular manifestations and severe muscle hypotonia (Beighton 1970, Krane et al. 1972, Sussman et al. 1974, Beighton et al. 1998, Yeowell & Walker 2000, Steinmann et al. 2002). EDS VI was the first subtype in which the biochemical abnormality was characterized, namely a deficiency in lysyl hydroxylase activity in most but not all cases (Krane et al. 1972, Quinn & Krane 1976, Beighton 1993, Byers 1994, Yeowell & Walker 2000, Steinmann et al. 2002). The type VI variant has thus been divided into two subclasses: those with a low lysyl hydroxylase activity, classified as EDS VIA, and those with normal lysyl hydroxylase activity, classified as EDS VIB (Judisch et al. 1976, McKusick 1983). Sequence analysis of human lysyl hydroxylase 1 cDNA and genomic DNA (Yeowell et al. 1992, Hautala et al. 1992a, Heikkinen et al. 1994) has made it possible to 29 characterize the mutations responsible for the clinical phenotype of this disorder. At least 20 different mutations have been identified (Yeowell & Walker 2000). The first was reported by Hyland et al. (1992), who identified a homozygous single basepair substitution converting the CGA codon (Arg319) to a TGA translation termination codon in two siblings with EDS VI. The healthy parents and two of the three healthy siblings of the patients were heterozygous carriers of the mutation. The mutation led to an almost complete absence of lysyl hydroxylase activity in skin fibroblasts. The most common mutation, with a prevalence of about 20% among the families studied, appears to be due to a homologous recombination event between identical Alu sequences in introns 9 and 16 of the lysyl hydroxylase 1 gene. This results in the duplication of seven exons and an abnormally large mRNA (4.2 kb) (Hautala et al. 1993, Pousi et al. 1994, Heikkinen et al. 1997). Three other mutations have been identified in more than one unrelated EDS VI patient. These include a 15-bp deletion in exon 11 (Yeowell & Walker 2000), and homozygous single base substitutions in exon 14 (Walker et al. 1999, Yeowell & Walker 1997, 1999a, Pousi et al. 2000) and exon 10 (Yeowell et al. 2000b), both generating a premature termination codon. Other mutations identified so far include two different deletions (Ha et al. 1994, Pousi et al. 1998), two separate insertions of a single C nucleotide (Heikkinen et al. 1999, Yeowell et al. 2000b), eight point mutations, five of them predicted to produce a premature termination codon (Hyland et al. 1992, Ha et al. 1994, Brinckmann et al. 1998, Yeowell et al. 2000a, 2000b), four splice site mutations (Yeowell & Walker 1997, Pajunen et al. 1998, Pousi et al. 1998, Heikkinen et al. 1999) and a combination of abnormal sequences in exon 17 causing a missense mutation and a frameshift producing a premature termination codon (Heikkinen et al. 1997).

2.5 Human diseases and animal models

Since the development of the technology needed to manipulate the germlines of animals almost two decades ago, a large number of transgenic animals have been produced worldwide for use in both basic and applied research. Animal models have now become valuable for studying human genetic diseases and for starting the testing of new therapeutic strategies, including gene therapy.

2.5.1 The mouse as a model

Of all the mammals studied by the early geneticists, the mouse became the first choice because of its small size, resistance to infection, large litter size and relatively rapid breeding time. The mouse is also genetically very similar to the human being, having a comparable genome size and number of genes and a similar basic physiology and patterns of development. A great number of inbred and genetically defined mouse strains are available nowadays. 30 2.5.2 Gene targeting, the introduction of site-specific modifications into the genome by homologous recombination

Gene targeting, defined as the introduction of site-specific modifications into the genome by homologous recombination, has revolutionized the field of mouse genetics and allowed the analysis of diverse aspects of gene function in vivo. It is now possible to produce mice with specific genetic alterations ranging from simple gene disruptions and point mutations to chromosomal rearrangements, and even tissue-specific inducible gene targeting has become possible more recently (Müller 1999). The development of embryonic (ES cell) technology (Evans & Kaufman 1981, Martin 1981), and the observation that introduced DNA can combine with its homologous chromosomal counterpart in ES cells (Thomas & Capecchi 1987), made it possible to apply exact modifications to the gene of interest. ES cell lines are derived from pluripotent, uncommitted cells of the inner cell mass of pre-implantation blastocyst- stage mouse embryos. Using standard gene technology, a targeting construct containing the desired alteration in the gene of interest and markers for selection flanked by sequences homologous to the endogenous target gene can easily be produced in vitro. The ES cells are then tranfected with the targeting construct, which will homologously recombine with the resident gene and introduce the mutation at the desired site in the genome (Matise et al. 2000). In most cases, introducing a positive selection marker, which will disrupt the gene structure, is sufficient to achieve targeted inactivation of the gene. The most commonly used positive selection marker is a cassette carrying the neomycin resistance gene under the control of a strong promoter (von Melchner et al. 1992). In addition, a reporter gene can be introduced into the gene of interest in order to analyze the expression of the targeting construct in different tissues. The β-galactosidase gene is a widely used reporter gene which makes it possible to visualize the expression of specific gene products by X- gal staining on whole embryos and sections already in a heterozygous state (Hasty et al. 2000). 3 Outlines of the present research

Lysyl hydroxylase plays an important role in the biosynthesis of all collagens. It participates in the intracellular modifications by hydroxylating lysine residues in collagenous sequences, and the resulting hydroxylysine residues take part in the formation of collagen crosslinks and also serve as attachment sites for carbohydrate units linked to collagens. The critical role of hydroxylysine in collagens is demonstrated by the marked changes in the mechanical properties of various tissues seen in patients with Ehlers-Danlos syndrome type VI, a heritable disease caused by a deficiency in lysyl hydroxylase activity. Many previous studies have suggested that lysyl hydroxylase may have isoenzymes, and one such isoenzyme, termed lysyl hydroxylase 2, has recently been cloned. The existence of multiple lysyl hydroxylase isoforms would provide an explanation of why the deficiency in lysyl hydroxylase activity observed in Ehlers- Danlos syndrome type VI shows a wide variation between tissues. The specific aims of this work were:

1. to attempt to identify, clone and characterize an additional human lysyl hydroxylase isoenzyme. The subsequent goals were to study the expression of the mRNA for this novel isoenzyme, lysyl hydroxylase 3, in human tissues and to produce a recombinant protein in insect cells to enable its catalytic properties to be studied.

2. to study the exon-intron organization of the novel human lysyl hydroxylase isoenzyme 3 gene and the sequence of its promoter region.

Recombinant lysyl hydroxylase 3 was found to be more soluble than lysyl hydroxylase 1, but no attempt has been reported to purify any of the lysyl hydroxylase isoenzymes to homogeneity, due to a marked insolubility and aggregation tendency. Lysyl hydroxylase 3 has very recently been reported to have collagen glucosyltransferase activity as well (Heikkinen et al. 2000), and the amino acid residues involved in the latter have been reported to differ from those required for lysyl hydroxylase activity (C. Wang et al. 2002). The specific aim in this respect was thus:

32 3. to purify all three recombinant human lysyl hydroxylase isoenzymes to homogeneity, to identify possible protease-sensitive regions that may correspond to domain boundaries, and to express properly folded fragments in order to illustrate their properties. An additional important aim was to determine whether pure recombinant lysyl hydroxylase 3 has any collagen glucosyltransferase activity, what the level of such activity might be, how this activity may be related to the domain structure of the enzyme, and whether the enzyme also has collagen galactosyltransferase activity.

We still know very little about the specific functions of the lysyl hydroxylase isoenzymes. Their human mRNAs have been found to be expressed in a variety of tissues, but no clear tissue-specific differences have yet emerged. It has also been shown that there is no correlation between the expression of lysyl hydroxylase isoforms and individual collagen types, suggesting that they lack collagen type specificity. A further specific goal was thus:

4. to produce lysyl hydroxylase 3 knock-out mice and analyze the consequences of the lack of the gene product. 4 Materials and methods

More detailed descriptions of the materials and methods are presented in the original papers I-IV.

4.1 Cloning and characterization of the third human lysyl hydroxylase isoenzyme (I, II)

4.1.1 Cloning and characterization of cDNA and genomic clones (I, II)

In order to obtain a probe for cDNA library screening (I), PCR primers F1 and R3 were designed based on a 343-bp EST sequence (AA 340 606) and used to obtain a 236-bp product from an adult human kidney λgt-10 cDNA library (Clontech). The purified PCR product was used to screen foetal (λgt-11) and adult (λgt-10) human kidney and human placenta (λgt-11) cDNA libraries (BD Biosciences). Altogether, five positive clones were obtained from the kidney libraries and 16 from the placenta library. Seven of these clones were characterized in detail. To obtain the 5’ end of the cDNA, two new probes were generated by PCR from the human placenta library and used for rescreening this library. Eight positive clones were obtained as a result, four of which were characterized in detail. In an independent approach, rapid amplification of the 5’ cDNA ends was performed using pooled human placenta cDNAs (Marathon-Ready cDNA, BD Biosciences) as a template and oligonucleotides generated from existing sequences as primers. To characterize the structure of the human lysyl hydroxylase 3 gene (II), two P1 genomic clones, 18050 and 18051, were obtained from the P1 Human Library screening services (Genome Systems, Inc.) using a sequence corresponding to nucleotides 2135- 2516 of lysyl hydroxylase 3 cDNA as a probe. The P1 DNA was isolated using the QIAGEN Plasmid Midi Kit. Sequencing was performed directly on the P1 DNA in the case of all the exons and most of the introns. The three largest introns (6, 15 and 16) were 34 first amplified by PCR with a forward primer corresponding to the 3’ end of the preceding exon and a reverse primer corresponding to the 5’ end of the next exon, and the resulting PCR products were then sequenced. The sizes of the PCR products, and thus the sizes of the introns, were analyzed with a 1% agarose gel. All the PCR products were also isolated, cloned into pUC18 (Amersham Biosciences) and sequenced with vector and cDNA-specific primers.

4.1.2 Nucleotide sequencing (I, II, III, IV)

DNA sequencing was performed with an Abi Prism 377 automated sequencer (Applied Biosystems, Inc.) using BigDye, dRhodamine Terminator Cycle Sequencing Ready Reaction (PE Biosystems) (I, II, III) or DYEnamic ET terminator Cycle Sequencing (Amersham Biosciences) (IV) kits.

4.1.3 The BLAST program and other sequence analyses (I, II)

The BLAST (Basic Local Alignment Search Tool) program was used to search the database of human expressed sequence tags (EST) with the lysyl hydroxylase 1 cDNA sequence in order to obtain part of the cDNA sequence of human lysyl hydroxylase 3 (I), and to confirm the new sequence by cDNA library screening (I). It was also used to detect Alu repeats when analyzing the gene for human lysyl hydroxylase 3 (II). DNASIS and PROSIS version 6.00 software (Amersham Biosciences) was used to analyze the nucleotide sequence data after sequencing (I, II, III, IV). The Transcription Factor Database using the TESS (Transcription Element Search Software on the WWW), Version 3.2, and MatInspector, Version 2.2, programs, were employed when searching the consensus sites for the binding of transcription factors in the human lysyl hydroxylase 3 gene (II).

4.1.4 Isolation of RNA and Northern blot analysis (I, IV)

The Northern blot analysis reported in paper I was carried out using the ready-made human multitissue Northern blots I and III and foetal blot II (BD Bioscences), containing 2 µg of poly(A)+ RNA per sample, and samples of 15 µg of total RNA isolated from cultured adult human skin fibroblasts and human Saos-2 osteosarcoma cells with the RNeasy Midi Kit (Qiagen). The samples were hybridized using ExpressHyb solution (BD Biosciences) under the stringent conditions recommended by the manufacturer. A 352-bp PCR fragment from the 3’ end of the lysyl hydroxylase 3 cDNA was used as a probe. In paper IV total RNA was extracted from E9.5 mouse embryos using TriReagent (Sigma) and Northern blots were prepared by routine methods. The membrane was hybridized with a cDNA probe containing exons 2-19 of the mouse lysyl hydroxylase 3 gene, and the integrity of RNA was confirmed by hybridization with a β-actin probe. 35 4.1.5 Nuclease S1 assay (II)

Nuclease S1 protection analysis (Berk & Sharp 1977, Pihlajaniemi & Myers 1987, Sambrook et al. 1989) was performed to localize the transcription initiation site of the lysyl hydroxylase 3 gene. The assay is described in detail in paper II.

4.1.6 Construction of baculovirus transfer vectors and generation of recombinant viruses (I, III)

To construct the full-length lysyl hydroxylase 3 cDNA (I), two pairs of sequence-specific primers were designed. The 5’ end of the cDNA (nucleotides 37-1360) was amplified with the first pair of primers and the 3’ end (nucleotides 1361-2317) with the second pair (Figure 3). The resulting fragments were ligated to the baculovirus transfer vector pVL1393 (Luckow 1993) and the whole construct was sequenced using Abi Prism 377 (Applied Biosystems, Inc.).

Fig. 3. Amplification of the lysyl hydroxylase 3 cDNA in two parts by PCR. The 5’ part contained an artificial BamHI site and a natural NarI site and the 3’ part a natural NarI site and an artificial XbaI site. The arrows indicate the locations of the oligonucleotide primers used in the amplification.

Construction of the expression plasmids for the production of the histidine-tagged full- length lysyl hydroxylase 1, 2 and 3 polypeptides, the histidine-tagged lysyl hydroxylase 1 and 3 fragments and the fragment pairs is described in detail in paper III. The constructs contained a baculovirus GP67 signal sequence, a histidine tag, and cDNAs coding for the processed LH polypeptides and their fragments (Figure 4). The sequences were verified by DNA sequencing using Abi Prism 377. 36

Fig. 4. Construction of the expression plasmids for the production of the histidine-tagged lysyl hydroxylase 1 (LH1) polypeptide and its fragments. The corresponding nucleotides for the full-length lysyl hydroxylase 2 and 3 polypeptides and lysyl hydroxylase 3 fragments can be found in paper III.

The recombinant baculovirus vectors were cotransfected into Spodoptera frugiperda Sf9 cells with modified Autographa californica nuclear polyhedrosis virus DNA (BD PharMingen) by calcium phosphate transfection, and the recombinant viruses were selected (Crossen & Gruenwald 1998).

4.1.7 Expression of recombinant proteins and their purification (I, III)

High Five insect cells (Invitrogen) infected with a baculovirus coding for human lysyl hydroxylase 3 (I) and for the histidine-tagged full-length lysyl hydroxylases 1-3 and the lysyl hydroxylase 1 and 3 fragments and fragment pairs (III) were cultured as monolayers in TNM-FH medium supplemented with 10% foetal bovine serum (Life Technologies) (I) or in suspension in Sf900IISFM serum-free medium (Invitrogen) (III). The cells were infected at a multiplicity of 5, harvested 24-72 h (I) or 48 h (III) after infection, and washed and homogenized as described in paper I. Aliquots of the soluble fractions of the cell homogenates were analyzed by 8 or 12% SDS-PAGE under reducing conditions, and the cell pellets were further solubilized in 1% SDS and analyzed in the same manner. Purification of the histidine-tagged full-length lysyl hydroxylase 1-3 polypeptides and lysyl hydroxylase 1 and 3 fragments and fragment pairs is described in detail in paper III. The media samples from the suspension cultures were harvested by centrifugation and incubated with ProBond metal chelate affinity resin (Invitrogen), which was packed into a column, washed and the bound proteins eluted with 0.2 or 0.4 M imidazole. The fractions containing the full-length lysyl hydroxylase polypeptides or the fragments or fragment pairs were pooled and the imidazole removed on PD-10 columns (Amersham Biosciences). 37 4.1.8 Enzyme activity assays (I, III)

Lysyl hydroxylase activity (I, III) was assayed by a method based on the hydroxylation- 14 coupled decarboxylation of 2-oxo[1- C]glutarate using a synthetic peptide (Ile-Lys-Gly)3 as a substrate (Kivirikko & Myllylä 1982). Pooled soluble fractions of the cell homogenate (I) or the purified recombinant full-length lysyl hydroxylase 1 and 3 isoenzymes and their recombinant fragments (III) were used as the enzyme source. Km values were determined as described previously (Kivirikko & Myllylä 1982). Collagen galactosyltransferase and glucosyltransferase activities (III) were assayed by a method based on the transfer of [14C]galactose or [14C]glucose from radioactive UDP- galactose or UDP-glucose to hydroxylysine or galactosylhydroxylysine residues, respectively, in a denatured citrate-soluble rat skin collagen substrate. The values were converted to mmol of the radioactive product synthesized per mol enzyme per min with a saturating concentration of the collagen substrate (Myllylä et al. 1975, Kivirikko & Myllylä 1982).

4.2 Analysis of recombinant lysyl hydroxylase fragments (III)

4.2.1 Limited proteolysis of the recombinant proteins (III)

Purified full-length recombinant human lysyl hydroxylase 1-3 polypeptides were digested with thermolysin, trypsin, or proteinase K at varying protease:lysyl hydroxylase ratios at 37oC for 30-90 min. The digestions were stopped with Pefabloc SC (Roche Molecular Biochemicals), or EDTA for thermolysin, and samples were analyzed by 12% Tris- Tricine PAGE.

4.2.2 N-terminal sequence analysis (III)

The digested samples were transferred from 12% Tris-Tricine PAGE to a ProBlott membrane (Applied Biosystems, Inc.) by electro-blotting and stained with Coomassie blue. The protease-resistant peptides were cut from the membrane and their N-termini sequenced with an Applied Biosystems Inc. 477A pulse-liquid protein sequencer.

4.2.3 Urea gradient gel electrophoresis and circular dichroism analysis (III)

Transverse 0-8 M urea gradient gel electrophoresis (Wingfield & Pain 1998) was carried out using 50 µg of purified recombinant lysyl hydroxylase 1 and 3 A fragments and 38 fragment pairs. CD spectrum measurements were performed on these same lysyl hydroxylase polypeptides with a Jasco J500A spectropolarimeter in a 1-mm quartz cuvette.

4.3 Generation and analysis of a mouse strain with an inactivated lysyl hydroxylase 3 gene (IV)

4.3.1 Generation of the mutant mice (IV)

A genomic clone containing murine lysyl hydroxylase 3 gene sequences was isolated from the 129SV mouse genomic library (Stratagene) using a human lysyl hydroxylase 3 cDNA as a probe. To construct a targeting vector, a 6.2 kb fragment containing 5 kb of the 5’-flanking region and exons 2, 3, 4 and part of exon 5 was subcloned into the BlueScript plasmid (Promega). A 132 bp PstI-SalI fragment containing the first exon and a small part of the first intron was replaced with a 5 kb β-gal-PGK-neo cassette, which is in frame for translation. The linearized targeting construct was electroporated into GS-1 embryonic stem (ES) cells (Genome Systems). Genomic DNA from the G418-resistant ES clones was screened by PCR, and homologous recombination was verified by Southern blotting (Matise et al. 2000). Correctly targeted ES cells were injected into C57BL/6 blastocysts (Papaioannou & Johnson 2000), which were then implanted into pseudopregnant females. Three heterozygous mouse lines were obtained by breeding the resulting chimeras with C57BL/6 mice. Mutant embryos were obtained by using heterozygous siblings. All the embryos were genotyped by PCR and Southern blotting, and total RNA was extracted from them. Northern blots were prepared from the extracted total RNA by routine methods.

4.3.2 Light microscopy (IV)

Embryos were collected from heterozygous females 8 or 9 days after detection of the copulation plug, stained with X-Gal as described previously (Gossler & Zachgo 1994) and fixed overnight in 10% buffered formalin. The fixed samples were embedded in paraffin, sectioned and stained with hematoxylin-eosin by standard methods. The whole embryos and the hematoxylin-eosin stained sections were observed under a light microscope. 39 4.3.3 Immunohistological stainings (IV)

The E8.5 embryos for indirect immunofluorescence stainings were snap-frozen in liquid nitrogen-cooled isopentane, sectioned and fixed in precooled ethanol for 10 min. After washing in PBS, the sections were blocked by incubating with 1% BSA in PBS for 60 min at room temperature. They were then incubated with appropriately diluted primary antibodies overnight at 4oC, washed with PBS, stained with Cy2 (Jackson ImmunoResearch) or ALEXA Fluor 568-conjugated (Molecular Probes) secondary antibody, washed with PBS, mounted (Immu-mountTM, Shandon Inc.) and visualized under UV light.

4.3.4 Electron microscopy (IV)

Preparation of the whole E9.5 embryos for ultrastructural examinations is described in detail in paper IV. Thin sections were cut with a Reichert Ultracut ultramicrotome and examined in a Philips CM100 transmission electron microscope.

4.3.5 Immunoelectron microscopy (IV)

Preparation of the E9.5 embryo samples for immunoelectron microscopy is described in detail in paper IV. Thin cryosections were cut with Leica Ultracut UCT microtome. For immunolabelling, the sections were first incubated in 0.05 M glycine in PBS followed by incubation in 5% BSA with cold water fish skin (CWFS, Aurion) in PBS. They were then incubated with an appropriately diluted collagen IV antibody (Chemicon) followed by a protein A-gold complex, prepared as described previously (Slot & Geuze 1985). Finally they were embedded in methylcellulose and examined in a Philips CM100 transmission electron microscope. 5 Results

5.1 The third human lysyl hydroxylase isoenzyme (I, II)

5.1.1 Molecular cloning and characterization of the cDNA (I)

To study the possible existence of additional human lysyl hydroxylase isoenzymes, the database of human expressed sequence tags (EST) was searched using the lysyl hydroxylase 1 sequence (Hautala et al. 1992a) as a probe. This resulted in the identification of a 343-bp sequence in a foetal kidney database showing 81% identity to the 3’ end of the coding region of the lysyl hydroxylase 1 cDNA at the nucleotide level and 77% identity to that of the lysyl hydroxylase 2. PCR primers were designed based on this EST sequence and used to obtain a product from an adult human kidney cDNA library, which was in turn used to screen foetal and adult human kidney and human placenta cDNA libraries. Seven out of the total of 21 positive clones were characterized in detail, but none of them contained the 5’ end of the cDNA, which was subsequently obtained by rescreening the placenta library with two new PCR products and simultaneous 5’-RACE analysis with pooled human placenta cDNAs. The cDNA clones for human lysyl hydroxylase 3 cover 61 nucleotides of the 5’ untranslated sequence, the whole coding region corresponding to 738 amino acids, and 295 nucleotides of the 3’ untranslated sequence. The 3’ untranslated sequence contains the canonical polyadenylation signal AATAA, which is accompanied 11 bp downstream by a poly(A)+ tail of 46 nucleotides. A putative signal sequence is present in the N terminus of the coded polypeptide, the most likely first amino acid of the mature polypeptide being serine according to the computational parameters of von Heijne (1986). Thus the size of the signal sequence is likely to be 24 residues and that of the processed polypeptide 714 amino acids. The overall amino acid sequence identity between the human lysyl hydroxylases 3 and 1 is 59%, and that between the human lysyl hydroxylases 3 and 2 57%, whereas the identity between human lysyl hydroxylase 3 and the C. elegans lysyl hydroxylase is only 45%. The degree of identity is highest within the 41 catalytically important C-terminal region (Pirskanen et al. 1996, Kivirikko & Pihlajaniemi 1998), the 87 extreme C-terminal residues being 72% identical between human lysyl hydroxylases 3 and 1, 69% between 3 and 2, and 64% between human lysyl hydroxylase 3 and the C. elegans lysyl hydroxylase. All the critical amino acid residues required for the binding of the Fe2+ atom (Pirskanen et al. 1996) and the C-5 carboxyl group of 2-oxoglutarate (Passoja et al. 1998), namely His643, Asp645, His695 and Arg705, are conserved in the lysyl hydroxylase 3 polypeptide. The lysyl hydroxylase 3 polypeptide contains ten cysteine residues, nine of them being conserved in all three human lysyl hydroxylases. Six of these are also found in the C. elegans enzyme, and these residues may be especially important structurally. All the lysyl hydroxylase isoenzymes have potential attachment sites for asparagine-linked oligosaccharides, human lysyl hydroxylase 3 having two such sites and human lysyl hydroxylases 1 and 2 four and seven sites, respectively, whereas the C. elegans enzyme has only one site (Figure 1 in I).

5.1.2 Expression of the mRNA in various human tissues (I)

Northern hybridization with a cDNA probe for lysyl hydroxylase 3 indicated the existence of only a single mRNA of about 3.0 kb, which was expressed in a variety of tissues (Figure 2 in I). The highest expression levels among the tissues studied were found in the placenta, pancreas, spinal cord and heart. Foetal lung, kidney and liver had higher expression levels than the corresponding adult tissues. High expression levels were also detected in human Saos-2 osteosarcoma cells and fibroblasts, the level being significantly higher in the former. The highest expression levels of isoenzyme 1 mRNA have earlier been found in the liver and skeletal muscle (Heikkinen et al. 1994), whereas those of isoenzyme 2 mRNA have been found in the pancreas, placenta, heart and skeletal muscle (Valtavaara et al. 1997).

5.1.3 Expression in insect cells and characterization of the recombinant protein (I)

In order to study the catalytic properties of human lysyl hydroxylase 3, a baculovirus coding for this isoenzyme was generated and used to infect High Five cells. The infected cells were found to express a new 80-85 kDa polypeptide, not seen 24 h after infection (Figure 3 in I). This was present in all three fractions, i.e. the Nonidet P-40 and glycerol buffer extracts and the remaining pellets solubilized with 1% SDS, the highest levels being found in the SDS-soluble fraction. The novel polypeptide migrated in multiple bands, probably because of heterogeneity in glycosylation, as has previously been shown for the recombinant lysyl hydroxylase 1 (Pirskanen et al. 1996). The Nonidet P-40 and glycerol buffer extracts were analyzed for lysyl hydroxylase activity with an assay based on the hydroxylation-coupled decarboxylation of 2-oxo[1- 14C]glutarate (Table 1 in I). The Nonidet P-40 buffer extract was found to give a high 2- 42 oxo[1-14C]glutarate decarboxylation rate even in non-infected cells. A slight increase in enzyme activity level was seen in both extracts 24 h after infection, the activity in the Nonidet P-40 extract reaching its maximum value at 48 h, whereas that in the glycerol extract continued to increase throughout the 72 h experiment. 2+ The Km values for Fe , 2-oxoglutarate, ascorbate and two peptide substrates with soluble cell extracts as sources of the enzyme were 2, 100, 300, 600 and 800 µM, respectively (Table 2 in I). All these values were found to be very similar to those determined for recombinant human lysyl hydroxylase 1 (Pirskanen et al. 1996).

5.1.4 Organization of the gene (II)

Of the two P1 genomic clones obtained from the P1 Human Library screening services, clone 18050 spanned the entire lysyl hydroxylase 3 gene, whereas clone 18051 contained only the 3’ end. The exon-intron organization of the gene was determined by sequencing with cDNA-specific primers and comparing the result with the human lysyl hydroxylase 3 cDNA sequence. All the exons and most of the introns were characterized by direct sequencing of clone 18050, introns 6, 15 and 16, which were difficult to sequence, being amplified by PCR before sequencing. Although the human lysyl hydroxylase 3 gene contains 19 exons and 18 introns, its overall size is only 11.6 kb (Figure 1 and Table 1 in II). Exon 1 is 358-444 bp in size and contains 109 bp of a translated sequence, while exons 2-18 vary in size from 64 to 164 bp. The last exon, number 19, is the largest, 454 bp, and consists of 156 bp of the coding region and the entire 3’ untranslated region. The sizes of the exons are well conserved as compared with those of the human lysyl hydroxylase 1 gene (Heikkinen et al. 1994), but the introns are very much shorter. The boundaries of all the exons and introns follow the GT/AG consensus rule (Table 1 in II). The BLAST program was used to study the existence of Alu sequences in the gene. A total of 15 full-length Alu retroposons, or partial Alu fragments of more than 100 bp, were found. The longest intron, number 6, of size 1883 bp, contains three full-length Alu repeats and one partial one, while intron 12, 916 bp in size, contains two full-length Alu repeats. Introns 5, 15 and 16, ranging in size from 755 to 1136 bp, each contain one full- length Alu repeat, intron 5 also having one, and introns 15 and 16 two partial ones. Intron 17, 673 bp in size, contains one full-length Alu repeat. The transcription initiation site of the gene was determined by S1 nuclease protection analysis. After S1 nuclease digestion, nine protected bands were found, ranging in size from 288 to 374 bp. Comparison of the protected fragments with the adjacent dideoxynucleotide sequencing reactions indicated the presence of one major transcription initiation site and several minor ones (Figure 2 in II). Thus the length of the 5’ untranslated region of the mRNA ranges from 249 to 335 nucleotides, the 311-nucleotide form being the most abundant. The 5’ flanking region of the lysyl hydroxylase 3 gene was found to show no nucleotide sequence similarity to that of the human lysyl hydroxylase 1 gene (Heikkinen et al. 1994). The lysyl hydroxylase 3 gene lacks the typical TATAA and CCAAT boxes (Figure 3 in II), and the sequence at the transcription start site does not contain the loose 43 consensus sequence of known promoter initiator (Inr) elements (Javahery et al. 1994). Two MAZ binding sites were found, however, one of them in the 5’ flanking region and the other in the 5’ untranslated sequence. The factors binding to the MAZ sites interact with Sp1 transcription factors and could be important regulators of TATA-less promoters (Parks & Shenk 1996, 1997). The 5’ flanking region contains three potential binding sites for the transcription factor Sp1, one for PEA-3, one for NF-1, one for USF and two for AP-1, while the 5’ untranslated region contains two additional Sp1, one NF-1, one AP-1, one AP-2 and one USF binding site. The first intron contains one further Sp1, one AP-1 and one MAZ site.

5.2 Purification and characterization of recombinant human lysyl hydroxylase polypeptides and their fragments (III)

In order to purify and characterize the human lysyl hydroxylase 1, 2 and 3 polypeptides, baculoviruses coding for the isoenzymes were generated and used to infect High Five cells. cDNAs coding for the processed lysyl hydroxylase polypeptides were cloned into a pAcGP67A expression vector, and a histidine tag was inserted between the GP67 signal sequence and the lysyl hydroxylase cDNA in each case. The use of a baculovirus signal peptide GP67 has been reported previously to lead to efficient secretion of a recombinant lysyl hydroxylase 1 polypeptide expressed in insect cells (Krol et al. 1996). High Five insect cells cultured in suspension were infected with baculoviruses coding for the histidine-tagged lysyl hydroxylase polypeptides, and the purification was performed as described in detail in paper III. The final preparations were found to be pure after one- step metal chelate affinity purification (Figure 1 in III) and to have lysyl hydroxylase activity (Table 1 in III). Gel filtration experiments indicated that all three lysyl hydroxylase isoenzymes had identical molecular weights, about 180 kDa. In order to identify possible domain structures of the lysyl hydroxylase isoenzymes, the purified recombinant enzymes were subjected to limited proteolysis with thermolysin, trypsin or proteinase K. Five major protease-resistant peptides of about 68, 37, 33, 18 and 16 kDa were found after digestion (Figure 2 in III), except that the 18-kDa fragment was not obtained for lysyl hydroxylase 3. N-terminal sequencing of the major protease-resistant lysyl hydroxylase 1 and 3 polypeptides was consistent with the presence of at least two main protease-sensitive regions in these polypeptides and suggested the existence of at least three structural domains, termed fragments A, B and C, respectively, from the N terminus (Figure 3 in III). The calculated molecular masses with these cleavage sites are 29.6, 36.2 and 15.9 kDa for the non-glycosylated lysyl hydroxylase 1 polypeptide fragments and 30.4, 36.2 and 15.8 for the lysyl hydroxylase 3 polypeptide fragments. The 18-kDa peptide obtained only from lysyl hydroxylase 1 is probably due to utilization of the N-glycosylation site present in the 16-kDa fragment C of lysyl hydroxylase 1 but not in that of lysyl hydroxylase 3. Protease cleavages were also occasionally observed at additional sites in some of the purified lysyl hydroxylase samples, probably representing protease-sensitive sites within a folded domain. The N-termini of the protease-resistant lysyl hydroxylase 2 fragments were not determined, but they were assumed to be located in the corresponding 44 regions, due to their similarity in size to those of the protease-resistant lysyl hydroxylase 1 and 3 peptides. Baculoviruses coding for the recombinant lysyl hydroxylase 1 and 3 fragments A-C were generated as described above for the full-length polypeptides. The recombinant viruses were used to infect High Five cells in suspension, and the cells and medium samples were harvested 48 h after infection. The cells were homogenized as described in detail in paper III. All the recombinant lysyl hydroxylase fragments were found to be partly soluble in the glycerol buffer, but the highest amounts were found in the SDS- soluble fraction. Fragment A in the case of lysyl hydroxylases 1 and 3 was secreted into the culture medium and could be purified to homogeneity from medium samples (Figure 4 in III), whereas the levels of fragments B and C in the culture medium were too low for any attempts to be made to purify them to homogeneity. To study whether it is possible to express the lysyl hydroxylase 1 and 3 fragment pairs A-B and B-C in folded forms, recombinant pAcGP67A vectors coding for these polypeptides were generated, and the recombinant viruses were used to infect High Five cells in suspension. Considerable portions of these fragment pairs were found to be secreted into the culture medium and could be purified to homogeneity as above. Urea gradient gel analyses indicated that the recombinant lysyl hydroxylase 1 and 3 A fragments and the A-B and B-C fragment pairs were folded, as a sigmoidal transition in mobility resulting from unfolding along the urea gradient was observed in all cases (Figure 4 in III). The far-UV CD spectra of the purified recombinant lysyl hydroxylase 1 and 3 A fragments and A-B and B-C fragment pairs were also typical of a folded protein (Figure 5 in III). The purified recombinant full-length lysyl hydroxylase 1 and 3 isoenzymes and their recombinant fragments were analyzed for lysyl hydroxylase activity with an assay based on the hydroxylation-coupled decarboxylation of 2-oxo[1-14C]glutarate (Table 1 in III). As reported previously (Heikkinen et al. 2000), the imidazole that was used to elute the nickel affinity column rapidly abolished the lysyl hydroxylase activity, especially in the case of isoenzyme 3, which lost its activity particularly fast. The recombinant lysyl hydroxylase 1 and 3 A fragments and A-B pairs that lacked the C-terminal catalytic region had no lysyl hydroxylase activity. One surprising finding was that the recombinant lysyl hydroxylase 1 pair B-C, lacking the N-terminal 30-kDa fragment, was a fully active lysyl hydroxylase. The B-C pair of lysyl hydroxylase 3 is probably also a fully active enzyme, as the activity levels obtained with it were found in some experiments to be identical to those measured for the full-length lysyl hydroxylase 3 purified at the same 2+ time. The Km values of the lysyl hydroxylase 1 B-C pair for Fe , 2-oxoglutarate, ascorbate and the peptide substrate were 5, 120, 350 and 400 µM, respectively, being essentially identical to those of the full-length lysyl hydroxylase 1 and 3 enzymes (Table 2 in III). The purified recombinant full-length lysyl hydroxylase 1 and 3 isoenzymes and their recombinant fragments and fragment pairs were analyzed for collagen galactosyltransferase and glucosyltransferase activities by a method based on the transfer of [14C]galactose or [14C]glucose from radioactive UDP-galactose or UDP-glucose to hydroxylysine or galactosylhydroxylysine residues, respectively, in a denatured citrate- soluble rat skin collagen substrate. As reported previously (Heikkinen et al. 2000), full- length recombinant lysyl hydroxylase 3 showed collagen glucosyltransferase activity but 45 not lysyl hydroxylase 1 (Table 3 in III). The level of this activity was nevertheless only about 2% of that previously obtained with purified collagen glucosyltransferase from chick embryos (Anttinen et al. 1978, Kivirikko & Myllylä 1982). The collagen glucosyltransferase activity level in two human serum samples was also assayed to make sure that the assay was not giving markedly low values due to some artifact. The values obtained for the two serum samples were 0.32 and 0.36 milliunit/ml, i.e. well within the range of the previously reported control values of 0.24-0.60 milliunit/ml (Kuutti- Savolainen 1979). The Km values of lysyl hydroxylase 3 for UDP-glucose and the denatured citrate-soluble rat skin collagen substrate were about 150 µM and 4 g/litre, respectively. Both values are higher than those reported for UDP-glucose with collagen glucosyltransferases from various sources (5-30 µM) and for denatured citrate-soluble rat skin collagen with collagen glucosyltransferase from chick embryos (0.5-1 g/litre) (Kivirikko & Myllylä 1982). Lysyl hydroxylase 3 was also found to have collagen galactosyltransferase activity, but this wea not the case with any of the five lysyl hydroxylase 1 preparations or lysyl hydroxylase 3 B-C pairs. The activity level was so low, however, that no studies could be carried out to establish the possible specificity.

5.3 Embryonic lethality of the lysyl hydroxylase 3 knock-out mice (IV)

5.3.1 Generation of a mouse strain with the lysyl hydroxylase 3 gene inactivated (IV)

ES cells carrying an inactivated lysyl hydroxylase 3 gene were generated by replacing the coding region of exon 1 with the lacZ-PGK-neo cassette, so that the bacterial β- galactosidase reporter gene was under the control of the lysyl hydroxylase 3 promoter (Figure 1a in IV). More than 20 ES cell clones with the targeted mutation were identified, three of which were injected into mouse blastocysts, resulting in high-level chimerism and germ line transmission. Heterozygous mice, which are viable and fertile, were interbred to produce homozygous mutants. Genotyping of the offspring at three weeks of age showed that only wild-type and heterozygous mice were viable, and therefore embryos at different developmental stages were analyzed. The homozygous mutant embryos were found to develop apparently normally until embryonic day 8.5, after which their development was retarded. At embryonic day 9.5 the null embryos were very fragile and about 60% of the size of the wild-type and heterozygous embryos (Figure 2a in IV). Some null embryos could be recovered at embryonic day 10.5 (Table 1 in IV), but they were so fragile that they could not be touched without disruption. Northern blot analysis of total embryonic RNA using the 3’ end of the mouse lysyl hydroxylase 3 cDNA as a probe revealed a message in the -/- lane as well (Figure 1c in IV), and this was analyzed by RT-PCR to find out whether the RNA was functional. The analysis revealed that the transcript starts from the PGK promoter, which can function 46 bidirectionally, contains sequences from the neo gene, continues to intron 2 of the lysyl hydroxylase 3 gene, and contains some of the exon sequences downstream but does not have a continuous reading frame for protein synthesis. Thus the homozygous mutant embryos did not have a functional lysyl hydroxylase 3 mRNA. Heterozygous embryos analyzed by RT-PCR at the same time were found to produce a continuous lysyl hydroxylase 3 cDNA sequence.

5.3.2 Microscopic analysis

Histologically, cell density was significantly lower in the E9.5 homozygous mutant embryos than in the controls, and the cell layers also seemed to be separated from each other, suggesting loss of contact between the developing tissues. The homozygous mutant embryos showed an intense X-Gal staining, which appeared to be ubiquitous in the mesenchyme at E8.5-9.5. Intense staining was also observed in the developing neural tube (Figure 2 in IV). A normal distribution of laminin and perlecan was seen in the basement membranes of both the wild-type and homozygous mutant embryos, but staining for type IV collagen, which was intense and localized in the basement membranes of the wild-type embryos, was almost absent in the homozygous mutants. Brightly stained intracellular and extracellular amorphous particles were observed in some areas of the homozygous mutant embryos, but not in the wild-type samples (Figure 2 in IV). Electron microscopy revealed loss of contacts and a small number of filopodia in mesenchymal cells, a dilated endoplasmic reticulum (ER) and enlarged lysosomes in the homozygous mutant embryos, but normal looking Golgi complexes. The basement membranes of the wild-type embryos were continuous, whereas the homozygous mutants had only stretches of amorphous deposits on the cell surfaces (Figure 3 in IV). Immunoelectron microscopy with antibodies against type IV collagen showed decoration of the basement membrane zone in the wild-type and heterozygous embryos., whereas in the homozygous mutant embryos the gold particles were mainly detected in the dilated ER and large lysosome-like vesicles inside the cells, and also in the aggregated material outside the cells. The cross-banded collagen fibrils that were seen in the mesoderm of the wild-type embryos were not found in the mutants (Figure 3 in IV). 6 Discussion

6.1 A novel human lysyl hydroxylase isoenzyme, lysyl hydroxylase 3

A new member of the lysyl hydroxylase family, termed lysyl hydroxylase 3, was cloned and characterized in this work. The processed novel polypeptide is 714 amino acids long, being very similar in size to the corresponding human lysyl hydroxylases 1 (Hautala et al. 1992a), 2a (Valtavaara et al. 1997) and 2b (Yeowell & Walker 1999b) and the C. elegans lysyl hydroxylase (paper I of the present work, Norman & Moerman 2000), which have 709, 712, 733 and 714 residues, respectively (Figure 1 in I). The overall amino acid sequence identity between polypeptides of all three human lysyl hydroxylases and the alternatively spliced form of human lysyl hydroxylase 2 is about 47%. The catalytically important C-terminal region is particularly well conserved, the overall identity between the human isoenzymes in this region being about 70%. The recently characterized mouse isoenzymes are 91% identical to the corresponding human enzymes (Ruotsalainen et al. 1999). Thus, the similarity between the corresponding isoenzymes in the two mammalian species is higher than that between the two isoenzymes in the same species. All the human lysyl hydroxylase polypeptides contain nine cysteine residues in conserved positions, and lysyl hydroxylases 1 and 2 contain an additional cysteine conserved between them but not found in lysyl hydroxylase 3, while lysyl hydroxylases 2 and 3 contain yet another unique cysteine each. The C. elegans lysyl hydroxylase contains seven cysteine residues, six of which are conserved in all four human polypeptides, while its extreme N-terminal cysteine is conserved only in human lysyl hydroxylase 3. The six cysteines that are conserved in all human lysyl hydroxylases and the C. elegans enzyme may be especially important structurally in that they take part in disulphide bond formation (Freedman et al. 1998, Debarbieux & Beckwith 1999). Yeowell et al. (2000a) have shown recently that mutation of residues Cys369, Cys375, Cys552 and Cys687 in the lysyl hydroxylase 1 polypeptide virtually eliminated its lysyl hydroxylase activity, these cysteines also being found in conserved positions in all the other human lysyl hydroxylase isoenzymes. On the other hand, mutation of Cys267, Cys270 and Cys680 had an intermediary effect on enzyme activity, while mutation of the other three cysteines did not cause any loss of activity (Yeowell & Walker 2000). It was 48 also shown that each of these mutant polypeptides behaved in electrophoresis under non- reducing, non-denaturing conditions in the manner of a dimer with a molecular weight similar to that of the wild-type enzyme. This suggests that even if disulphide bond formation may affect the relative contribution of each cysteine to the enzyme activity, catalytic activity does not appear to be directly related to dimerization of the enzyme (Yeowell et al. 2000a). All the lysyl hydroxylase isoenzymes have potential attachment sites for asparagine- linked oligosaccharides. Human lysyl hydroxylase 3 has three such sites, lysyl hydroxylase 1 has four, and lysyl hydroxylases 2a and 2b each have seven sites, whereas the C. elegans lysyl hydroxylase has only one. Site-directed mutagenesis studies have indicated that glycosylation of the second site present in the human lysyl hydroxylase 1 polypeptide may be required for full catalytic activity (Pirskanen et al. 1996). This position also has a potential N-glycosylation site in human lysyl hydroxylase 2, but surprisingly, not in human lysyl hydroxylase 3 or the C. elegans lysyl hydroxylase. This suggests that the various isoforms may have different glycosylation requirements for their catalytic activity. The sizes of the mRNAs of the processed human lysyl hydroxylase polypeptides show major variation, ranging from the 3.0 kb transcript for lysyl hydroxylase 3 to the 4.4 kb transcript for lysyl hydroxylase 2b (Valtavaara, M., Risteli, M., Ruotsalainen, H., Wang, C. and Myllylä, R., unpublished data). This is mainly due to variation in the lengths of the untranslated regions. The highest expression levels of the lysyl hydroxylase 3 mRNA among the tissues studied were found in the placenta, pancreas, spinal cord and heart (Figure 2 in I). The relative level of lysyl hydroxylase 3 mRNA expression in the adult liver was very much lower than that of lysyl hydroxylase 1 mRNA, and much lower than that of lysyl hydroxylase 2 mRNA in the skeletal muscle (Heikkinen et al. 1994, Valtavaara et al. 1997). The expression of lysyl hydoxylases 2 and 3 seems to be more tightly regulated both in man and in the mouse (Valtavaara et al. 1997, Ruotsalainen et al. 1999) than that of lysyl hydroxylase 1. It should be noted that comparison of mRNA levels in whole tissues does not take into account the possible existence of major differences between certain cell types. Preliminary studies at the cell level indicate that lysyl hydroxylase 1 is the major form in skin fibroblasts, pancreas cells and placenta trophoblastic tumour cells, lysyl hydroxylase 2 is the major form in fibrosarcoma, osteosarcoma and hepatoblastoma cells, and lysyl hydroxylase 3 seems to be the major form in kidney adenocarcinoma cells (Wang et al. 2000). It is also possible that differences may exist between lysyl hydroxylase isoenzymes with respect to the hydroxylation of various collagen types. The initial suggestion of the existence of collagen type-specific lysyl hydroxylase isoforms was based on the findings that lysine residues in type II, IV and V collagens are hydroxylated in EDS VI patients to the same extent as in healthy subjects (Ihme et al. 1984). Furthermore, the residual lysyl hydroxylase activity in cells from patients with this syndrome was reported to preferentially hydroxylate lysine residues in type IV collagen (Risteli et al. 1980). When the mRNA levels of the different lysyl hydroxylase isoenzymes were compared with those of the most abundant collagen types I, III, IV and V, however, no significant correlation was found (Wang et al. 2000). This result indirectly argues against any collagen type specificity in lysine hydroxylation. 49 The baculovirus expression system was used here to produce the novel lysyl hydroxylase 3 isoenzyme in insect cells. The recombinant protein was found to migrate in multiple bands in SDS/PAGE, probably because of heterogeneity in glycosylation. The Nonidet P-40 and glycerol buffer extracts were analyzed for lysyl hydroxylase activity using synthetic peptides known to be hydroxylated by lysyl hydroxylase 1 in vitro as a substrate. The recombinant expression data differ from those obtained in similar experiments with a virus coding for lysyl hydroxylase 1, in that little, if any, lysyl hydroxylase 1 could be extracted with the Nonidet P-40 buffer and only about 10% was soluble in the glycerol buffer 72 h after infection, 90% being found in the 1% SDS- soluble portion at that point. By contrast, significant amounts of the lysyl hydroxylase 3 polypeptide were found in the Nonidet P-40 extracts, as has also been reported for the lysyl hydroxylase 2 polypeptide (Valtavaara et al. 1997). These results indicate the presence of definite differences in solubility between the isoenzymes, lysyl hydroxylase 3 being solubilized more readily than lysyl hydroxylase 1. The Km values of lysyl hydroxylase 3 for the cosubstrates and peptide substrates were found to be essentially identical to those determined for the recombinant human lysyl hydroxylase 1 (Pirskanen et al. 1996). Recent preliminary studies have shown, however, that mRNA levels of lysyl hydroxylase 3 do not correlate with those of the other lysyl hydroxylase isoenzymes or that of the α subunit of type I prolyl 4-hydroxylase, thus suggesting a difference in function between lysyl hydroxylase 3 and the other isoenzymes in spite of the similarities in their Km values (Wang et al. 2000). Cloning of the human lysyl hydroxylase 3 gene revealed that it consists of 19 exons and is 11.6 kb in size, being considerably smaller than the lysyl hydroxylase 1 gene, which is 40 kb in size (Heikkinen et al. 1994). The genes for lysyl hydroxylases 1, 2 and 3 have been assigned to 1p36.2-36.3 (Hautala et al. 1992a), 3q23-24 (Szpirer et al. 1997) and 7q36 (Valtavaara et al. 1998), respectively. The exons of the lysyl hydroxylase 3 gene are well conserved in terms of size as compared with those of the lysyl hydroxylase 1 gene, but the introns are very much shorter. The exons represent altogether about 7% of the size of the lysyl hydroxylase 1 gene, but as much as about 25% in the case of the lysyl hydroxylase 3 gene. The boundaries of all the exons and introns follow the GT/AG consensus rule in both genes. The lysyl hydroxylase 3 gene contains at least 15 full-length or partial Alu repeats dispersed among six introns, while the lysyl hydroxylase 1 gene contains at least 13 Alu repeats in two introns. As discussed in section 2.3.5.1, Alu repeats are involved in many gene rearrangements that have occurred by homologous recombination, leading to many human genetic diseases (Rüdiger et al. 1995, Deininger & Batzer 1999). In the case of the lysyl hydroxylase 1 gene, an Alu-mediated homologous recombination is the most common mutation leading to Ehlers-Danlos syndrome type VI (Pousi et al. 1994, Heikkinen et al. 1994, 1997). It is currently unknown whether any heritable disorder is due to a recombination of any of the Alu repeats present in the human lysyl hydroxylase 3 gene. The 5’ flanking region of the lysyl hydroxylase 3 gene shows no nucleotide sequence similarity to that of the lysyl hydroxylase 1 gene (Heikkinen et al. 1994). The length of the 5’ untranslated region in the lysyl hydroxylase 1 mRNA is 60-70 nucleotides, whereas in the case of lysyl hydroxylase 3 it ranges from 249 to 335 nucleotides, a 311-nucleotide form being the major species. The lysyl hydroxylase 3 and 1 genes both lack a TATAA or 50 related sequence and a core promoter element, called an initiator (Inr), both elements that are capable of determining the precise site of transcription initiation (Kaufmann & Smale 1994). The two genes also share a high G+C nucleotide content in their 5’ flanking regions, a feature characteristic of genes having multiple transcription initiation sites. The 5’ flanking region of the lysyl hydroxylase 3 gene also lacks a CCAAT sequence, which is found in the lysyl hydroxylase 1 gene, but it does contain potential binding sites for several transcription factors, although the functional significance of these possible regulatory elements remains to be determined. According to phylogenetic analysis it seems likely that all the lysyl hydroxylase isoforms are derived from an ancestral gene by two duplication events, lysyl hydroxylases 1 and 2 being more closely related and having been brought about by a more recent duplication than lysyl hydroxylase 3 (Ruotsalainen et al. 1999). This is consistent with our findings concerning the origin of the Alu sequences present in the lysyl hydroxylase 3 gene, in that they seem to belong to the oldest Alu subfamilies. There is likewise considerable heterogeneity between the 5’ flanking regions of the human lysyl hydroxylase 1 and 3 genes, not only at the nucleotide level but also when comparing the potential binding sites for different transcription factors, suggesting the existence of major differences in regulation of the expression of these two genes.

6.2 Lysyl hydroxylase polypeptides consist of at least three distinct domains

We were able in this work to isolate the three recombinant human lysyl hydroxylase isoenzymes for the first time as homogeneous proteins. Previous attempts to purify recombinant human lysyl hydroxylase 1 expressed into the endoplasmic reticulum of insect cells had been hampered by the marked insolubility and aggregation tendency of the recombinant enzyme (Pirskanen et al. 1996). A key feature in the purification procedure adopted here was the use of a signal sequence of the baculoviral glycoprotein GP67 (Krol et al. 1996), which enabled these normally intracellular isoenzymes to be secreted into the insect cell culture medium. Another key feature was the use of an N- terminal histidine tag that remained in the processed polypeptides, which made it possible to purify the isoenzymes to homogeneity with a nickel affinity column. Unfortunately, it was not possible to isolate the isoenzymes with maximal catalytic activity, because the imidazole that was used to elute the proteins from the column rapidly abolishes lysyl hydroxylase activity (Heikkinen et al. 2000), especially in the case of the lysyl hydroxylase 3 polypeptide. The highest catalytic centre activity measured for any recombinant lysyl hydroxylase preparation was 65 mol/mol/min, i.e. mol of the radioactive product synthesized per mol of polypeptide in 1 min at saturating concentrations of 2-oxoglutarate and the peptide substrate, but in most cases the values were only about 10-15 mol/mol/min. The highest catalytic centre activities measured for non-recombinant lysyl hydroxylase 1 isolated by a procedure involving the use of two affinity columns reached 60-100 mol/mol/min (Turpeenniemi-Hujanen et al. 1980, 1981), but the non-recombinant enzyme also rapidly becomes inactivated. 51 The gel filtration data indicated that all three lysyl hydroxylase isoenzymes are homodimers with molecular weights of about 180 kDa. Thus the differences in their mobilities seen in analyses by non-denaturing PAGE (Figure 1 in III) are likely to represent differences in their charges rather than in their molecular weights. The calculated pI values, i.e. the pH values at which a molecule carries no charge, are 6.47, 6.14 and 5.56 for lysyl hydroxylases 1, 2 and 3, respectively. The presence of at least two protease-sensitive regions in the lysyl hydroxylase polypeptides was indicated by limited proteolysis experiments, suggesting that the polypeptides may consist of at least three distinct domains, termed A, B and C (Figure 5). These three domains were expressed as recombinant polypeptide fragments in insect cells, and purified on a nickel affinity column, but only the N-terminal recombinant fragment A was secreted into the culture medium in significant amounts, whereas the B and C fragments mainly accumulated inside the insect cells. An interesting finding was that when these domains were expressed as recombinant double-fragment polypeptides A-B and B-C, they were secreted and could be purified to homogeneity. It was thus possible to study the properties of one recombinant fragment and two fragment pairs. Urea gradient gel electrophoresis and CD spectrum analysis indicated that all these recombinant polypeptides were folded, suggesting that they may indeed represent distinct domains.

Fig. 5. Schematic representation of the human lysyl hydroxylase 1, 2 and 3 and C.elegans lysyl hydroxylase polypeptides. Numbers represent the lengths of the processed polypeptides. Cysteine residues and potential attachment sites for asparagine-linked oligosaccharides are shown below the polypeptides, and the catalytically important residues are shown above them. The possible domain structure of the human lysyl hydroxylases is represented by A, B and C. The domain structure of the C.elegans lysyl hydroxylase has not been determined, but its amino acid similarity to the human lysyl hydroxylases suggests that it is probably the same. 52 A highly surprising finding was that the N-terminal fragment A plays no critical role in the catalytic activity of lysyl hydroxylase, as the recombinant B-C polypeptide was a fully active hydroxylase. The Km values determined for the full-length enzyme and for the B-C polypeptide were also identical. Similar data have been reported previously for another 2-oxoglutarate dioxygenase, aspartyl (asparaginyl) β-hydroxylase, in which a recombinant fragment lacking the 310 N-terminal amino acids possesses full catalytic activity (Jia et al. 1994). The catalytic subunit of prolyl 4-hydroxylase has separate peptide substrate-binding and catalytic domains, the former being located between residues 138 and 244 and the latter in the C-terminal region of the 517-amino acid residue human α(I) subunit (Myllyharju & Kivirikko 1999, Hieta et al. 2003). The present data clearly show that fragment A is not a peptide substrate-binding domain of lysyl hydroxylase, but they do not indicate whether the peptide-binding site is located in fragment B, which may correspond to the peptide-binding domain in the catalytic subunit of prolyl 4-hydroxylase. The present work confirms the highly interesting and surprising finding that lysyl hydroxylase 3 also possesses collagen glucosyltransferase activity, whereas lysyl hydroxylases 1 and 2 do not (Heikkinen et al. 2000). The level of this activity was found to be very low, however, only about 2% of that reported for a purified collagen glucosyltransferase from chick embryos. The Km values of lysyl hydroxylase 3 for UDP- glucose and denatured citrate-soluble rat skin collagen substrate were also 4-5-fold relative to the highest corresponding values reported with collagen glucosyltransferases from other sources (Anttinen et al. 1978, Kivirikko & Myllylä 1982). A distinctly lower Km of lysyl hydroxylase 3 for UDP-glucose was reported by C. Wang et al. (2002), but the reason for this difference is currently unknown. The glucosyltransferase activity of lysyl hydroxylase 3, unlike the hydroxylase activity, was not abolished by imidazole (Heikkinen et al. 2000), so that the very low activity level measured is not likely to be due to a marked degree of inactivation. The highest catalytic centre activity measured for the glucosyltransferase activity of lysyl hydroxylase 3 is only about 1 mol/mol/min, and is thus markedly lower than those of prolyl 4-hydroxylase (400 mol/mol/min) with a synthetic peptide (Pro-Pro-Gly)10 as a substrate (Vuori et al. 1992) or of lysyl hydroxylase 1 (60-100 mol/mol/min) with the peptide Ala-Arg-Gly-Ile-Lys-Gly-Ile-Arg- Gly-Phe-Ser-Gly (Turpeenniemi-Hujanen et al. 1980, 1981), both after correction for saturating concentrations of 2-oxoglutarate and the peptide substrate. The hydroxylation of lysine residues and subsequent glycosylation of hydroxylysine residues must take place in at least three consecutive steps: first, some of the lysine residues are converted to hydroxylysine, second, some of the hydroxylysine residues are glycosylated to galactosylhydroxylysine, and third, some of the galactosylhydroxylysine residues are glycosylated to glucosylgalactosylhydroxylysine (Kivirikko & Myllylä 1979, Kivirikko & Pihlajaniemi 1998). The two glycosyltransferase reactions have been shown to be carried out by distinct enzymes that can be separated from each other and from lysyl hydroxylase activity in gel filtration (Kivirikko & Myllylä 1979). We have shown in this work that lysyl hydroxylase 3 may also have collagen galactosyltranferase activity, but that the catalytic centre activity is very low, less than 0.1 mol/mol/min. As the galactosyltransferase must act before the glucosyltransferase, it would seem virtually impossible for the lysyl hydroxylase 3 polypeptide to perform all three enzymatic reactions successively in vivo. C. Wang et al. (2002) have also demonstrated very 53 recently that lysyl hydroxylase 3 has collagen galactosyltransferase activity, and have suggested that one gene product could catalyze all three consecutive steps in hydroxylysine-linked carbohydrate formation. The level of galactosyltranferase activity determined by them was distinctly higher than that measured by us, but it was still only about 2 mol/mol/min, i.e. markedly lower than the level of lysyl hydroxylase activity discussed above. All the glucosyltransferase and galactosyltransferase activity present in the lysyl hydroxylase 3 polypeptide was found to reside in fragment A, which was found to play no role in the hydroxylase activity of the polypeptide. Despite the high degrees of similarity between the A fragments of human lysyl hydroxylases 1 and 3 (77%) and of human lysyl hydroxylases 2 and 3 (77%), lysyl hydroxylase 3 is the only isoenzyme possessing collagen glycosyltransferase activity. Recent in vitro mutagenesis experiments have also demonstrated that the amino acids involved in the glucosyltransferase differ from those required for lysyl hydroxylase 3 activity (Heikkinen et al. 2000, C. Wang et al. 2002). The collagen glycosyltransferase activities reported for the lysyl hydroxylase 3 polypeptide in our work are so low that they may be of little biological significance. It is thus evident that human tissues must have additional collagen glycosyltransferases that are responsible for most of the collagen glycosylation in vivo.

6.3 Hydroxylysine formed by lysyl hydroxylase 3 is essential for early mouse development

To study whether the lysyl hydroxylase isoenzymes have specific functions, the murine lysyl hydroxylase 3 gene was inactivated. Hydroxylysine residues have been regarded as non-essential for vertebrate survival, since patients with the kyphoscoliotic type of Ehlers-Danlos syndrome (EDS VI) have a virtual absence of hydroxylysine in the collagen of their skin and yet a relatively mild disease phenotype (Pinnell et al. 1972, Yeowell & Walker 2000), and since recombinant collagens I and III with no hydroxylysine form native-type fibrils in vitro (Nokelainen et al. 2001). We therefore expected that the lysyl hydroxylase 3 knock-out would lead to a relatively mild phenotype and started genotyping the mice at the age of three weeks. No homozygous mutants were obtained, however, until we moved to the prenatal stage and found that the homozygous mutant embryos die at embryonic day 9.0-10.5. The mutant embryos have a normal appearance but suffer severe developmental retardation, being about 60% of the size of the wild-type and heterozygous embryos. Our data demonstrate that loss of lysyl hydroxylase 3 activity leads to the synthesis of non-functional collagen IV, which is not incorporated into the basement membranes but is retained inside the cells or deposited in an abnormal, aggregated form. Our preliminary data indicate that synthesis of the basement membrane-associated collagen XVIII is affected in the same fashion as shown for type IV collagen, suggesting that type IV collagen is not the only substrate for lysyl hydroxylase 3. Activity of this isoenzyme is therefore essential for the assembly of a functional basement membrane. 54 Laminin and perlecan were distributed normally in the basement membranes of both the wild-type and homozygous mutant embryos. It has been shown recently that laminin is essential for the organization of the first basement membranes formed at the blastocyst stage, since deletion of the laminin γ1 gene leads to embryonic lethality at embryonic day 5.5 (Smyth et al. 1999). In contrast, lack of perlecan, another ubiquitous basement membrane component, is lethal at embryonic day 10 at the earliest (Costell et al. 1999). Type IV collagen is already expressed at the blastocyst stage (Leivo et al. 1980), but our results indicate that development can proceed normally without any of this collagen, or with only very small amounts of it, until embryonic day 8.5. When proline 4-hydroxylation is inhibited, collagen triple helix formation is prevented, leading to an accumulation of non-triple-helical collagen chains within the ER (Prockop et al. 1979). Although the dilatation of the ER seen in the lysyl hydroxylase 3 knock-out mice is similar to this, the mechanism involved must be different, as hydroxylysine residues play no role in the stability of the collagen triple helix (Nokelainen et al. 1998). Our immunoelectron microscopy data indicated that type IV collagen is retained in the ER of homozygous mutant embryos, possibly because molecules that are deficient in hydroxylysine, and consequently in hydroxylysine-linked carbohydrates, aggregate prematurely. They are then mainly transferred into lysosomes, while some are secreted as aggregated particles (Figure 3 in IV). These aggregates cannot be incorporated into the basement membranes, which therefore lack the mechanical strength necessary for further embryonic development. Norman and Moerman (2000) have shown recently that type IV collagen is expressed in C. elegans in the absence of the only lysyl hydroxylase, but it is retained within the type IV collagen-producing cells. No data have been available on any possible role for hydroxylysine in the synthesis of vertebrate non-fibrillar collagens, but it is evident from this work that hydroxylysine residues play a much more crucial role in the synthesis of type IV collagen, and probably also of other non-fibrillar collagens, than is known to be the case in the synthesis of fibril-forming collagens in vertebrates. We have also shown that lysyl hydroxylase 3 is the main, if not the only, isoenzyme involved in the synthesis of type IV collagen during early development, and that lysyl hydroxylase 1 or 2 cannot compensate for the lack of its function. Wang et al. (2000) compared lysyl hydroxylase mRNA levels with those of the most abundant collagens, types I, III, IV and V, and found no correlation between lysyl hydroxylase isoforms and the individual collagen types. According to their data, lysyl hydroxylase 3 mRNA levels do not correlate with those of lysyl hydroxylases 1 or 2 or the α subunit of the type I prolyl 4-hydroxylase, clearly indicating a difference in the regulation of expression between lysyl hydroxylase 3 and lysyl hydroxylases 1 and 2. It is therefore possible that lysyl hydroxylase 3, being the oldest form and evolutionarily less related to the others (Ruotsalainen et al. 1999), may hydroxylate other types of sequences in vivo in addition to collagenous ones (Wang et al. 2000). Lysyl hydroxylase 1 has been shown to be a helical lysyl hydroxylase catalyzing the conversion of lysine residues in triple-helical domains into hydroxylysines (Yeowell & Walker 2000, Steinmann et al. 1995), while lysyl hydroxylase 2 has very recently been reported to be a telopeptide hydroxylase and to play a significant role in the irreversible accumulation of collagen in fibrosis (van der Slot et al. 2003). Nevertheless, all three purified lysyl hydroxylase isoenzymes were found in our work to hydroxylase the 55 synthetic peptide (Ile-Lys-Gly)3, which represents a sequence typical of a triple-helical domain, with identical Km values. The substrate specificity of lysyl hydroxylase 3 has not yet been elucidated, and so far no disease has been associated with defects in it. Our data suggest, however, that it may be the main enzyme responsible for catalyzing the hydroxylation of type IV and type XVIII collagens, and possibly other non-fibrillar collagens as well.

7 Future prospects

Altogether three lysyl hydroxylase isoenzymes have been characterized in human tissues (Valtavaara et al. 1997, 1998, Yeowell & Walker 1999b, paper I of the present study) and in the mouse (Ruotsalainen et al. 1999) and rat (Mercer et al. 2003). Little is yet known about the functional differences between them, but monoclonal and polyclonal antibodies prepared against them should provide more information about their tissue distribution, and especially their expression in various cell types. Lysyl hydroxylase 3 was shown here to be essential for the synthesis of type IV collagen, as inactivation of its gene led to instability in the basement membranes, probably due to premature aggregation of the type IV collagen molecules, which are mainly transported into lysosomes, while some are secreted as aggregated particles (paper IV of the present study). Additional studies with type IV collagen that is lacking in hydroxylysine residues are needed to elucidate the mechanism involved. More research will also be needed to indicate whether hydroxylysine formed by lysyl hydroxylase 3 plays a crucial role in the synthesis of other non-fibrillar collagens. The levels of collagen glycosyltransferase activities in the lysyl hydroxylase 3 knock-out mice also remain to be determined. Experiments are in progress to generate knock-out mice for studying the roles of lysyl hydroxylase isoenzymes 1 and 2. Deficiency of lysyl hydroxylase 1 activity leads to the kyphoscoliotic type of Ehlers- Danlos syndrome in humans (Yeowell & Walker 2000, Steinmann et al. 2002), while lysyl hydroxylase 2 has very recently been reported to be a telopeptide hydroxylase and to play a significant role in the irreversible accumulation of collagen in fibrosis (van der Slot et al. 2003). More research will clearly be needed to determine the differences in catalytic properties between the lysyl hydroxylase isoenzymes. None of those characterized so far shows any distinct collagen-type-specificity (Wang et al. 2000). Instead it may be the amino acid sequences surrounding the individual lysine residues that are important for determining which lysyl hydroxylase isoenzyme catalyzes the hydroxylation reaction. Purification of lysyl hydroxylase isoenzymes 1-3 as homogeneous proteins and characterization of the possible domain structure of the lysyl hydroxylase polypeptides (paper III of the present study) should provide a means of resolving the three-dimensional structures of these enzymes and their domains by crystallization and NMR. References

Alberts B, Bray D, Lewis J, Raff M, Roberts K & Watson J (2002) Molecular biology of the cell, 4th ed. Garland Publishing, Inc. Ananthanarayanan VS, Saint-Jean A & Jiang P (1992) Conformation of a synthetic hexapeptide substrate of collagen lysyl hydroxylase. Arch Biochem Biophys 298: 21-28. Andrews PC, Hawke D, Shively JE & Dixon JE (1984) Anglerfish preprosomatostatin II is processed to somatostatin-28 and contains hydroxylysine at residue 23. J Biol Chem 259: 15021-15024. Anttinen H, Myllylä R & Kivirikko KI (1978) Further characterization of galactosylhydroxylysyl glucosyltranferase from chick embryos. Amino acid composition and acceptor specificity. Biochem J 175: 737-742. Armstrong LC & Last JA (1995) Rat lysyl hydroxylase: molecular cloning, mRNA distribution and expression in a baculovirus system. Biochim Biophys Acta 1264: 93-102. Bailey AJ, Paul RG & Knott L (1998) Mechanisms of maturation and ageing of collagen. Mech Ageing Dev 106: 1-56. Bailey AJ & Shimokomaki MS (1971) Age related changes in the reducible cross-links of collagen. FEBS Lett 16: 86-88. Bailey AJ, Wotton SF, Sims TJ & Thompson PW (1992) Post-translational modifications in the collagen of human osteoporotic femoral head. Biochem Biophys Res Commun 185: 801-805. Bank RA, Robins SP, Wijmenga C, Breslau-Siderius LJ, Bardoel AF, van der Sluijs HA, Pruijs HE & TeKoppele JM (1999) Defective collagen crosslinking in bone, but not in ligament or cartilage, in Bruck syndrome: indications for a bone-specific telopeptide lysyl hydroxylase on chromosome 17. Proc Natl Acad Sci U S A 96: 1054-1058. Bank RA, TeKoppele JM, Janus GJ, Wassen MH, Pruijs HE, van der Sluijs HA & Sakkers RJ (2000) Pyridinium cross-links in bone of patients with osteogenesis imperfecta: evidence of a normal intrafibrillar collagen packing. J Bone Miner Res 15: 1330-1336. Banyard J, Bao L & Zetter BR (2003) Type XXIII collagen, a new transmembrane collagen identified in metastatic tumor cells. J Biol Chem 278: 20989-20994. Barnes MJ, Constable BJ, Morton LF & Royce PM (1974) Age-related variations in hydroxylation of lysine and proline in collagen. Biochem J 139: 461-468. 58 Batzer MA, Deininger PL, Hellmann-Blumberg U, Jurka J, Labuda D, Rubin CM, Schmid CW, Zietkiewicz E & Zuckerkandl E (1996) Standardized nomenclature for Alu repeats. J Mol Evol 42: 3-6. Beck K & Brodsky B (1998) Supercoiled protein motifs: the collagen triple-helix and the α-helical coiled coil. J Struct Biol 122: 17-29. Beighton P (1993) The Ehlers-Danlos syndrome. In Beighton P (ed) McKusicks´s Heritable Disorders of Connective Tissue 189-251. Mosby-Year Book, Inc., St. Louis. Beighton P (1970) Serious ophthalmological complications in the Ehlers-Danlos syndrome. Br J Ophthalmol 54: 263-268. Beighton P, De Paepe A, Steinmann B, Tsipouras P & Wenstrup RJ (1998) Ehlers-Danlos syndromes: revised nosology, Villefranche, 1997. Ehlers-Danlos National Foundation (USA) and Ehlers-Danlos Support Group (UK). Am J Med Genet 77: 31-37. Berk AJ & Sharp PA (1977) Sizing and mapping of early adenovirus mRNAs by gel electrophoresis of S1 endonuclease-digested hybrids. Cell 12: 721-732. Boot-Handford RP, Tuckwell DS, Plumb DA, Farrington RC & Poulsom R (2003) A novel and highly conserved collagen [proα1(XXVII)] with a unique expression pattern and unusual molecular characteristics establishes a new clade within the vertebrate fibrillar collagen family. J Biol Chem, in press Brenner RE, Vetter U, Stoss H, Muller PK & Teller WM (1993) Defective collagen fibril formation and mineralization in osteogenesis imperfecta with congenital joint contractures (Bruck syndrome). Eur J Pediatr 152: 505-508. Brinckmann J, Acil Y, Feshchenko S, Katzer E, Brenner R, Kulozik A & Kugler S (1998) Ehlers- Danlos syndrome type VI: lysyl hydroxylase deficiency due to a novel point mutation (W612C). Arch Dermatol Res 290: 181-186. Brinckmann J, Notbohm H, Tronnier M, Acil Y, Fietzek PP, Schmeller W, Muller PK & Batge B (1999) Overhydroxylation of lysyl residues is the initial step for altered collagen cross-links and fibril architecture in fibrotic skin. J Invest Dermatol 113: 617-621. Brown JC & Timpl R (1995) The collagen superfamily. Int Arch Allergy Immunol 107: 484-490. Byers PH (2001) Disorders of Collagen Biosynthesis and Structure. In: Scriver CR, Beaudet AL, Sly WS, & Valle D (eds) The Metabolic & Molecular Bases of Inherited Disease, Vol. IV 5241- 5285. McGraw-Hill Medical Publishing Division, Byers PH (1994) Ehlers-Danlos syndrome: recent advances and current understanding of the clinical and genetic heterogeneity. J Invest Dermatol 103: 47S-52S. Cardinale GJ, Rhoads RE & Udenfriend S (1971) Simultaneous incorporation of 18 O into succinate and hydroxyproline catalyzed by collagen proline hydroxylase. Biochem Biophys Res Commun 43: 537-543. Chung CH, Au LC & Huang TF (1999) Molecular cloning and sequence analysis of aggretin, a collagen-like platelet aggregation inducer. Biochem Biophys Res Commun 263: 723-727. Costell M, Gustafsson E, Aszodi A, Morgelin M, Bloch W, Hunziker E, Addicks K, Timpl R & Fässler R (1999) Perlecan maintains the integrity of cartilage and some basement membranes. J Cell Biol 147: 1109-1122. Counts DF, Cardinale GJ & Udenfriend S (1978) Prolyl hydroxylase half reaction: peptidyl prolyl- independent decarboxylation of α-ketoglutarate. Proc Natl Acad Sci U S A 75: 2145-2149. Crossen R & Gruenwald S (1998) Baculovirus Expression Vector System, Instruction Manual.San Diego, CA. 59 Csiszar K (2001) Lysyl : a novel multifunctional family. Prog Nucleic Acid Res Mol Biol 70: 1-32. de Jong L & Kemp A (1984) Stoicheiometry and kinetics of the prolyl 4-hydroxylase partial reaction. Biochim Biophys Acta 787: 105-111. Debarbieux L & Beckwith J (1999) Electron avenue: pathways of disulphide bond formation and isomerization. Cell 99: 117-119. Deininger PL & Batzer MA (1995) SINE master genes and population biology. In: Maraia, R (ed) The Impact of Short, Interspersed Elements (SINEs) on the Host Genome 43-60.Georgetown . Deininger PL & Batzer MA (1999) Alu repeats and human disease. Mol Genet Metab 67: 183-193. Evans MJ & Kaufman MH (1981) Establishment in culture of pluripotential cells from mouse embryos. Nature 292: 154-156. Eyre D (1987) Collagen cross-linking amino acids. Methods Enzymol 144: 115-139. Ezer S, Bayes M, Elomaa O, Schlessinger D & Kere J (1999) Ectodysplasin is a collagenous trimeric type II membrane protein with a tumor necrosis factor-like domain and co-localizes with cytoskeletal structures at lateral and apical surfaces of cells. Hum Mol Genet 8: 2079-2086. Fitzgerald J & Bateman JF (2001) A new FACIT of the collagen family: COL21A1. FEBS Lett 505: 275-280. Freedman RB, Dunn AD & Ruddock LW (1998) Protein folding: a missing redox link in the endoplasmic reticulum. Curr Biol 8: R468-R470. Fujimoto D & Tamiya N (1962) Incorporation of 18O from air into hydroxyproline by chick embryos. Biochem J 84: 333. Gossler A & Zachgo J (1994) Gene targeting. A practical approach. IRL Press, Oxford. Ha VT, Marshall MK, Elsas LJ, Pinnell SR & Yeowell HN (1994) A patient with Ehlers-Danlos syndrome type VI is a compound heterozygote for mutations in the lysyl hydroxylase gene. J Clin Invest 93: 1716-1721. Halme J, Kivirikko KI & Simons K (1970) Isolation and partial characterization of highly purified protocollagen proline hydroxylase. Biochim Biophys Acta 198: 460-470. Hanauske-Abel HM (1991) Prolyl 4-hydroxylase, a target enzyme for drug development. Design of suppressive agents and the in vitro effects of inhibitors and proinhibitors. J Hepatol 13 Suppl 3: S8-15. Hanauske-Abel HM & Günzler V (1982) A stereochemical concept for the catalytic mechanism of prolylhydroxylase: applicability to classification and design of inhibitors . J Theor Biol 94: 421- 455. Hashimoto T, Wakabayashi T, Watanabe A, Kowa H, Hosoda R, Nakamura A, Kanazawa I, Arai T, Takio K, Mann DM & Iwatsubo T (2002) CLAC: a novel Alzheimer amyloid plaque component derived from a transmembrane precursor, CLAC-P/collagen type XXV. EMBO J 21: 1524-1534. Hasty P, Abuin A & Bradley A (2000) Gene targeting, principles, and practice in mammalian cells. In: Joyner, A L (ed) Gene Targeting. A Practical Approach 1-34. Oxford University Press, New York. Hautala T, Byers MG, Eddy RL, Shows TB, Kivirikko KI & Myllylä R (1992a) Cloning of human lysyl hydroxylase: complete cDNA-derived amino acid sequence and assignment of the gene (PLOD) to chromosome 1p36.3-p36.2. Genomics 13: 62-69. Hautala T, Heikkinen J, Kivirikko KI & Myllylä R (1992b) Minoxidil specifically decreases the expression of lysine hydroxylase in cultured human skin fibroblasts. Biochem J 283: 51-54. 60 Hautala T, Heikkinen J, Kivirikko KI & Myllylä R (1993) A large duplication in the gene for lysyl hydroxylase accounts for the type VI variant of Ehlers-Danlos syndrome in two siblings. Genomics 15: 399-404. Heikkinen J, Hautala T, Kivirikko KI & Myllylä R (1994) Structure and expression of the human lysyl hydroxylase gene (PLOD): introns 9 and 16 contain Alu sequences at the sites of recombination in Ehlers-Danlos syndrome type VI patients. Genomics 24: 464-471. Heikkinen J, Pousi B, Pope M & Myllylä R (1999) A null-mutated lysyl hydroxylase gene in a compound heterozygote British patient with Ehlers-Danlos syndrome type VI. Hum Mutat 14: 351. Heikkinen J, Risteli M, Wang C, Latvala J, Rossi M, Valtavaara M & Myllylä R (2000) Lysyl hydroxylase 3 is a multifunctional protein possessing collagen glucosyltransferase activity. J Biol Chem 275: 36158-36163. Heikkinen J, Toppinen T, Yeowell H, Krieg T, Steinmann B, Kivirikko KI & Myllylä R (1997) Duplication of seven exons in the lysyl hydroxylase gene is associated with longer forms of a repetitive sequence within the gene and is a common cause for the type VI variant of Ehlers- Danlos syndrome. Am J Hum Genet 60: 48-56. Helaakoski T, Annunen P, Vuori K, MacNeil IA, Pihlajaniemi T & Kivirikko KI (1995) Cloning, baculovirus expression, and characterization of a second mouse prolyl 4-hydroxylase α-subunit

isoform: formation of an α2β2 tetramer with the protein disulphide-isomerase/β subunit. Proc Natl Acad Sci U S A 92: 4427-4431. Hendershot LM & Bulleid NJ (2000) Protein-specific chaperones: the role of hsp47 begins to gel. Curr Biol 10: R912-R915. Hieta R, Kukkola L, Permi P, Pirilä P, Kivirikko KI, Kilpeläinen I & Myllyharju J (2003) The peptide-substrate binding domain of human collagen prolyl 4-hydroxylases. Backbone assignments, secondary structure and binding of proline-rich peptides. J Biol Chem, in press Hulmes DJ (1992) The collagen superfamily-diverse structures and assemblies. Essays Biochem 27: 49-67. Hyland J, Ala-Kokko L, Royce P, Steinmann B, Kivirikko KI & Myllylä R (1992) A homozygous stop codon in the lysyl hydroxylase gene in two siblings with Ehlers-Danlos syndrome type VI. Nat Genet 2: 228-231. Ihme A, Krieg T, Nerlich A, Feldmann U, Rauterberg J, Glanville RW, Edel G & Muller PK (1984) Ehlers-Danlos syndrome type VI: collagen type specificity of defective lysyl hydroxylation in various tissues. J Invest Dermatol 83: 161-165. Javahery R, Khachi A, Lo K, Zenzie-Gregory B & Smale ST (1994) DNA sequence requirements for transcriptional initiator activity in mammalian cells. Mol Cell Biol 14: 116-127. Jenkins CL & Raines RT (2002) Insights on the conformational stability of collagen. Nat Prod Rep 19: 49-59. Jia S, McGinnis K, VanDusen WJ, Burke CJ, Kuo A, Griffin PR, Sardana MK, Elliston KO, Stern AM & Friedman PA (1994) A fully active catalytic domain of bovine aspartyl (asparaginyl) beta- hydroxylase expressed in Escherichia coli: characterization and evidence for the identification of an active-site region in vertebrate alpha-ketoglutarate-dependent dioxygenases. Proc Natl Acad Sci U S A 91: 7227-7231. Jiang P & Ananthanarayanan VS (1991) Conformational requirement for lysine hydroxylation in collagen. Structural studies on synthetic peptide substrates of lysyl hydroxylase. J Biol Chem 266: 22960-22967. 61 Judisch GF, Waziri M & Krachmer JH (1976) Ocular Ehlers-Danlos syndrome with normal lysyl hydroxylase activity. Arch Ophthalmol 94: 1489-1491. Kadler K (1995) Extracellular matrix 1: Fibril-forming collagens. Protein Profile 2: 491-619. Kadler KE, Holmes DF, Trotter JA & Chapman JA (1996) Collagen fibril formation. Biochem J 316: 1-11. Kagan HM & Li W (2003) Lysyl Oxidase: Properties, specificity, and biological roles inside and outside of the cell. J Cell Biol 88: 660-672. Kaufmann J & Smale ST (1994) Direct recognition of initiator elements by a component of the transcription factor IID complex. Genes Dev 8: 821-829. Kellokumpu S, Sormunen R, Heikkinen J & Myllylä R (1994) Lysyl hydroxylase, a collagen processing enzyme, exemplifies a novel class of luminally-oriented peripheral membrane proteins in the endoplasmic reticulum. J Biol Chem 269: 30524-30529. Kielty CM & Grant ME (2002) The collagen family: Structure, assembly, and organization in the extracellular matrix. In:Royce, PM & Steinmann, B (eds) Connective Tissue and Its Heritable Disorders. Molecular, Genetics, and Medical Aspects 2nd ed.: 159-221. Wiley-Liss, Inc., New York. Kivirikko KI (1993) Collagens and their abnormalities in a wide spectrum of diseases. Ann Med 25: 113-126.

Kivirikko KI, Kishida Y, Sakakibara S & Prockop DJ (1972a) Hydroxylation of (X-Pro-Gly)n by protocollagen proline hydroxylase. Effect of chain length, helical conformation and amino acid sequence in the substrate. Biochim Biophys Acta 271: 347-356. Kivirikko KI & Myllylä R (1979) Collagen glycosyltransferases. Int Rev Connect Tissue Res 8: 23- 72. Kivirikko KI & Myllylä R (1980) The hydroxylation of prolyl and lysyl residues. In: Freedman BR & Hawkins HC (eds) The Enzymology of Post-Translational Modifications of Proteins 53-104. Academic Press, London. Kivirikko KI & Myllylä R (1982) Posttranslational enzymes in the biosynthesis of collagen: intracellular enzymes. Methods Enzymol 82: 245-304. Kivirikko KI, Myllylä R & Pihlajaniemi T (1992) Hydroxylation of proline and lysine residues in collagens and other animal and plant proteins. In: Harding, JJ & Crabbe, M (eds) Post- Translational Modifications of Proteins 1-51. CRC Press, Inc., Boca Raton, Florida. Kivirikko KI & Pihlajaniemi T (1998) Collagen hydroxylases and the protein disulphide isomerase subunit of prolyl 4-hydroxylases. Adv Enzymol Relat Areas Mol Biol 72: 325-398. Kivirikko KI & Prockop DJ (1967) Enzymatic hydroxylation of proline and lysine in protocollagen. Proc Natl Acad Sci U S A 57: 782-789. Kivirikko KI & Prockop DJ (1972) Partial purification and characterization of protocollagen lysine hydroxylase from chick embryos. Biochim Biophys Acta 258: 366-379. Kivirikko KI, Ryhänen L, Anttinen H, Bornstein P & Prockop DJ (1973) Further hydroxylation of lysyl residues in collagen by protocollagen lysyl hydroxylase in vitro. Biochemistry 12: 4966- 4971. Kivirikko KI, Shudo K, Sakakibara S & Prockop DJ (1972b) Studies on protocollagen lysine hydroxylase. Hydroxylation of synthetic peptides and the stoichiometric decarboxylation of α- ketoglutarate. Biochemistry 11: 122-129. Knott L, Whitehead CC, Fleming RH & Bailey AJ (1995) Biochemical changes in the collagenous matrix of osteoporotic avian bone. Biochem J 310: 1045-1051. 62 Koch M, Foley JE, Hahn R, Zhou P, Burgeson RE, Gerecke DR & Gordon MK (2001) α1(XX) collagen, a new member of the collagen subfamily, fibril-associated collagens with interrupted triple helices. J Biol Chem 276: 23120-23126. Koch M, Laub F, Zhou P, Hahn R, Tanaka S, Burgeson RE, Gerecke DR, Rao VH & Gordon MK (2003) Collagen XXIV, a vertebrate fibrillar collagen with structural features of invertebrate collagens: selective expression in developing cornea and bone. J Biol Chem, in press Krane SM, Pinnell SR & Erbe RW (1972) Lysyl-protocollagen hydroxylase deficiency in fibroblasts from siblings with hydroxylysine-deficient collagen. Proc Natl Acad Sci U S A 69: 2899-2903. Krol BJ, Murad S, Walker LC, Marshall MK, Clark WL, Pinnell SR & Yeowell HN (1996) The expression of a functional, secreted human lysyl hydroxylase in a baculovirus system. J Invest Dermatol 106: 11-16. Kuutti-Savolainen ER (1979) Enzymes of collagen biosynthesis in skin and serum and dermatological diseases. II. Serum enzymes. Clin Chim Acta 96: 53-58. Lamande SR & Bateman JF (1999) Procollagen folding and assembly: the role of endoplasmic reticulum enzymes and molecular chaperones. Semin Cell Dev Biol 10: 455-464. Lamberg A, Pihlajaniemi T & Kivirikko KI (1995) Site-directed mutagenesis of the α subunit of human prolyl 4-hydroxylase. Identification of three histidine residues critical for catalytic activity. J Biol Chem 270: 9926-9931. Leeson TS, Leeson CR & Paparo AA (1988) Connective Tissue Proper. In: Wonsiewicz M (ed) Text/Atlas of Histology 126-158. W.B. Saunders Company, Lehmann HW, Rimek D, Bodo M, Brenner RE, Vetter U, Worsdorfer O, Karbowski A & Muller PK (1995a) Hydroxylation of collagen type I: evidence that both lysyl and prolyl residues are overhydroxylated in osteogenesis imperfecta. Eur J Clin Invest 25: 306-310. Lehmann HW, Wolf E, Roser K, Bodo M, Delling G & Muller PK (1995b) Composition and posttranslational modification of individual collagen chains from osteosarcomas and osteofibrous dysplasias. J Res Clin Oncol 121: 413-418. Leivo I, Vaheri A, Timpl R & Wartiovaara J (1980) Appearance and distribution of collagens and laminin in the early mouse embryo. Dev Biol 76: 100-114. Lo Cascio V, Bertoldo F, Gambaro G, Gasperi E, Furlan F, Colapietro F, Lo CC & Campagnola M (1999) Urinary galactosyl-hydroxylysine in postmenopausal osteoporotic women: A potential marker of bone fragility. J Bone Miner Res 14: 1420-1424. Luckow VA (1993) Baculovirus systems for the expression of human gene products. Curr Opin Biotechnol 4: 564-572. Majamaa K, Günzler V, Hanauske-Abel HM, Myllylä R & Kivirikko KI (1986) Partial identity of the 2-oxoglutarate and ascorbate binding sites of prolyl 4-hydroxylase. J Biol Chem 261: 7819- 7823. Majamaa K, Hanauske-Abel HM, Günzler V & Kivirikko KI (1984) The 2-oxoglutarate binding site of prolyl 4-hydroxylase. Identification of distinct subsites and evidence for 2-oxoglutarate decarboxylation in a ligand reaction at the enzyme-bound ferrous ion. Eur J Biochem 138: 239- 245. Majamaa K, Turpeenniemi-Hujanen TM , Latipää P, Günzler V, Hanauske-Abel HM, Hassinen IE & Kivirikko KI (1985) Differences between collagen hydroxylases and 2-oxoglutarate dehydrogenase in their inhibition by structural analogues of 2-oxoglutarate. Biochem J 229: 127-133. 63 Makalowski W, Mitchell GA & Labuda D (1994) Alu sequences in the coding regions of mRNA: a source of protein variability. Trends Genet 10: 188-193. Martin GR (1981) Isolation of a pluripotent cell line from early mouse embryos cultured in medium conditioned by teratocarcinoma stem cells. Proc Natl Acad Sci U S A 78: 7634-7638. Matise M, Auerbach W & Joyner A (2000) Production of targeted embryonic stem cell clones. In: Joyner, A L (ed) Gene Targeting. A Practical Approach 101-131. Oxford University Press, New York. McKusick V (1983) Mendelian inheritance in man. Catalogs of autosomal dominant, autosomal recessive, and X-linked phenotypes. John Hopkins University Press, Baltimore. Mercer DK, Nicol PF, Kimbembe C & Robins SP (2003) Identification, expression, and tissue distribution of the three rat lysyl hydroxylase isoforms. Biochem Biophys Res Commun 307: 803-809. Mighell AJ, Markham AF & Robinson PA (1997) Alu sequences. FEBS Lett 417: 1-5. Müller U (1999) Ten years of gene targeting: targeted mouse mutants, from vector design to phenotype analysis. Mech Dev 82: 3-21. Murad S & Pinnell SR (1987) Suppression of fibroblast proliferation and lysyl hydroxylase activity by minoxidil. J Biol Chem 262: 11973-11978. Murad S, Tennant MC & Pinnell SR (1992) Structure-activity relationship of minoxidil analogs as inhibitors of lysyl hydroxylase in cultured fibroblasts. Arch Biochem Biophys 292: 234-238. Myllyharju J (2003) Prolyl 4-hydroxylases, the key enzymes of collagen biosynthesis. Matrix Biol 22: 15-24. Myllyharju J & Kivirikko KI (1997) Characterization of the iron- and 2-oxoglutarate-binding sites of human prolyl 4-hydroxylase. EMBO J 16: 1173-1180. Myllyharju J & Kivirikko KI (1999) Identification of a novel proline-rich peptide-binding domain in prolyl 4-hydroxylase. EMBO J 18: 306-312. Myllyharju J & Kivirikko KI (2001) Collagens and collagen-related diseases. Ann Med 33: 7-21. Myllyharju J & Kivirikko KI (2003) Collagens and their mutations: from man to Drosophila and Caenorhabditis elegans. Trends Genet (in press) Myllylä R, Günzler V, Kivirikko KI & Kaska DD (1992) Modification of vertebrate and algal prolyl 4-hydroxylases and vertebrate lysyl hydroxylase by diethyl pyrocarbonate. Evidence for histidine residues in the catalytic site of 2-oxoglutarate-coupled dioxygenases. Biochem J 286: 923-927. Myllylä R, Kuutti-Savolainen ER & Kivirikko KI (1978) The role of ascorbate in the prolyl hydroxylase reaction. Biochem Biophys Res Commun 83: 441-448. Myllylä R, Majamaa K, Günzler V, Hanauske-Abel HM & Kivirikko KI (1984) Ascorbate is consumed stoichiometrically in the uncoupled reactions catalyzed by prolyl 4-hydroxylase and lysyl hydroxylase. J Biol Chem 259: 5403-5405. Myllylä R, Pajunen L & Kivirikko KI (1988) Polyclonal and monoclonal antibodies to human lysyl hydroxylase and studies on the molecular heterogeneity of the enzyme. Biochem J 253: 489- 496. Myllylä R, Pihlajaniemi T, Pajunen L, Turpeenniemi-Hujanen T & Kivirikko KI (1991) Molecular cloning of chick lysyl hydroxylase. Little homology in primary structure to the two types of subunit of prolyl 4-hydroxylase. J Biol Chem 266: 2805-2810. Myllylä R, Risteli L & Kivirikko KI (1975) Glucosylation of galactosylhydroxylysyl residues in collagen in vitro by collagen glucosyltransferase. Inhibition by triple-helical conformation of the substrate. Eur J Biochem 58: 517-521. 64 Myllylä R, Schubotz LM, Weser U & Kivirikko KI (1979) Involvement of superoxide in the prolyl and lysyl hydroxylase reactions. Biochem Biophys Res Commun 89: 98-102. Myllylä R, Tuderman L & Kivirikko KI (1977) Mechanism of the prolyl hydroxylase reaction. 2. Kinetic analysis of the reaction sequence. Eur J Biochem 80: 349-357. Nagata K (1998) Expression and function of heat shock protein 47: a collagen-specific molecular chaperone in the endoplasmic reticulum. Matrix Biol 16: 379-386. Nokelainen M, Helaakoski T, Myllyharju J, Notbohm H, Pihlajaniemi T, Fietzek PP & Kivirikko KI (1998) Expression and characterization of recombinant human type II collagens with low and high contents of hydroxylysine and its glycosylated forms. Matrix Biol 16: 329-338. Nokelainen M, Tu H, Vuorela A, Notbohm H, Kivirikko KI & Myllyharju J (2001) High-level production of human type I collagen in the yeast . Yeast 18: 797-806. Norman KR & Moerman DG (2000) The let-268 of Caenorhabditis elegans encodes a procollagen lysyl hydroxylase that is essential for type IV collagen secretion. Dev Biol 227: 690-705. Notbohm H, Nokelainen M, Myllyharju J, Fietzek PP, Muller PK & Kivirikko KI (1999) Recombinant human type II collagens with low and high levels of hydroxylysine and its glycosylated forms show marked differences in fibrillogenesis in vitro. J Biol Chem 274: 8988- 8992. Pace JM, Corrado M, Missero C & Byers PH (2003) Identification, characterization and expression analysis of a new fibrillar collagen gene, COL27A1. Matrix Biol 22: 3-14. Pajunen L, Suokas M, Hautala T, Kellokumpu S, Tebbe B, Kivirikko KI & Myllylä R (1998) A splice-site mutation that induces exon skipping and reduction in lysyl hydroxylase mRNA levels but does not create a nonsense codon in Ehlers-Danlos syndrome type VI. DNA Cell Biol 17: 117-123. Papaioannou V & Johnson R (2000) Production of chimeras by blastocyst and morula injection of targeted ES cells. In: Joyner, A L (ed) Gene Targeting. A practical approach 2nd ed.: 133-174. Oxford University Press Inc., New York. Parks C-L & Shenk T (1996) The serotonin 1a receptor gene contains a TATA-less promoter that responds to MAZ and Sp1. J Biol Chem 271: 4417-4430. Parks CL & Shenk T (1997) Activation of the adenovirus major late promoter by transcription factors MAZ and Sp1. J Virol 71: 9600-9607. Passoja K, Myllyharju J, Pirskanen A & Kivirikko KI (1998) Identification of arginine-700 as the residue that binds the C-5 carboxyl group of 2-oxoglutarate in human lysyl hydroxylase 1. FEBS Lett 434: 145-148. Peterkofsky B & Assad R (1979) Localization of chick embryo limb bone microsomal lysyl hydroxylase at intracisternal and intramembrane sites. J Biol Chem 254: 4714-4720. Pihlajaniemi T & Myers JC (1987) Characterization of a pro-α2(I) collagen gene mutation by nuclease S1 mapping. Meth Enzymol 145: 213-222. Pinnell SR, Krane SM, Kenzora JE & Glimcher MJ (1972) A heritable disorder of connective tissue. Hydroxylysine-deficient collagen disease. N Engl J Med 286: 1013-1020. Pirskanen A, Kaimio AM, Myllylä R & Kivirikko KI (1996) Site-directed mutagenesis of human lysyl hydroxylase expressed in insect cells. Identification of histidine residues and an residue critical for catalytic activity. J Biol Chem 271: 9398-9402. Pousi B, Hautala T, Heikkinen J, Pajunen L, Kivirikko KI & Myllylä R (1994) Alu-Alu recombination results in a duplication of seven exons in the lysyl hydroxylase gene in a patient with the type VI variant of Ehlers- Danlos syndrome. Am J Hum Genet 55: 899-906. 65 Pousi B, Hautala T, Hyland JC, Schroter J, Eckes B, Kivirikko KI & Myllylä R (1998) A compound heterozygote patient with Ehlers-Danlos syndrome type VI has a deletion in one allele and a splicing defect in the other allele of the lysyl hydroxylase gene. Hum Mutat 11: 55- 61. Pousi B, Heikkinen J, Schroter J, Pope M & Myllylä R (2000) A nonsense codon of exon 14 reduces lysyl hydroxylase mRNA and leads to aberrant RNA splicing in a patient with Ehlers- Danlos syndrome type VI. Mutat Res 432: 33-37. Prockop DJ & Kivirikko KI (1995) Collagens: molecular biology, diseases, and potentials for therapy. Annu Rev Biochem 64: 403-434. Prockop DJ, Sieron AL & Li SW (1998) Procollagen N-proteinase and procollagen C-proteinase. Two unusual metalloproteinases that are essential for procollagen processing probably have important roles in development and cell signaling. Matrix Biol 16: 399-408. Puistola U, Turpeenniemi-Hujanen TM, Myllylä R & Kivirikko KI (1980a) Studies on the lysyl hydroxylase reaction. I. Initial velocity kinetics and related aspects. Biochim Biophys Acta 611: 40-50. Puistola U, Turpeenniemi-Hujanen TM, Myllylä R & Kivirikko KI (1980b) Studies on the lysyl hydroxylase reaction. II. Inhibition kinetics and the reaction mechanism. Biochim Biophys Acta 611: 51-60. Quinn RS & Krane SM (1976) Abnormal properties of collagen lysyl hydroxylase from skin fibroblasts of siblings with hydroxylysine-deficient collagen. J Clin Invest 57: 83-93. Risteli L, Risteli J, Ihme A, Krieg T & Muller PK (1980) Preferential hydroxylation of type IV collagen by lysyl hydroxylase from Ehlers-Danlos syndrome type VI fibroblasts. Biochem Biophys Res Commun 96: 1778-1784. Roach PL, Clifton IJ, Fulop V, Harlos K, Barton GJ, Hajdu J, Andersson I, Schofield CJ & Baldwin JE (1995) Crystal structure of isopenicillin N synthase is the first from a new structural family of enzymes. Nature 375: 700-704. Ruotsalainen H, Sipilä L, Kerkelä E, Pospiech H & Myllylä R (1999) Characterization of cDNAs for mouse lysyl hydroxylase 1, 2 and 3, their phylogenetic analysis and tissue-specific expression in the mouse. Matrix Biol 18: 325-329. Ryhänen L (1975) Hydroxylation of lysyl residues in lysine-rich and arginine-rich histones by lysyl hydroxylase in vitro. Biochim Biophys Acta 397: 50-57. Ryhänen L (1976) Lysyl hydroxylase. Further purification and characterization of the enzyme from chick embryos and chick embryo cartilage. Biochim Biophys Acta 438: 71-89. Ryhänen L & Kivirikko KI (1974) Hydroxylation of lysyl residues in native and denatured protocollagen by protocollagen lysyl hydroxylase in vitro. Biochim Biophys Acta 343: 129-137. Rüdiger NS, Gregersen N & Kielland-Brandt MC (1995) One short well conserved region of Alu- sequences is involved in human gene rearrangements and has homology with prokaryotic chi. Nucleic Acids Res 23: 256-260. Sambrook J, Fritsch EF & Maniatis T (1989) Mapping RNA with nuclease S1. In: Molecular Cloning, a Laboratory Manual 7.51-7.62. Cold Spring Harbor, New York. Samimi A & Last JA (2001a) Inhibition of lysyl hydroxylase by malathion and malaoxon. Toxicol Appl Pharmacol 172: 203-209. Samimi A & Last JA (2001b) Mechanism of inhibition of lysyl hydroxylase activity by the organophosphates malathion and malaoxon. Toxicol Appl Pharmacol 176: 181-186. 66 Sato K, Yomogida K, Wada T, Yorihuzi T, Nishimune Y, Hosokawa N & Nagata K (2002) Type XXVI collagen, a new member of the collagen family, is specifically expressed in the testis and ovary. J Biol Chem 277: 37678-37684. Shapiro FD & Eyre DR (1982) Collagen polymorphism in extracellular matrix of human osteosarcoma. J Natl Cancer Inst 69: 1009-1016. Shigematsu T, Tajima S, Nishikawa T, Murad S, Pinnell SR & Nishioka I (1994) Inhibition of collagen hydroxylation by lithospermic acid magnesium salt, a novel compound isolated from Salviae miltiorrhizae Radix. Biochim Biophys Acta 1200: 79-83. Slot J & Geuze H (1985) A novel method to make gold probes for multiple labeling cytochemistry. Eur J Cell Biol 38: 87-93. Smith-Mungo LI & Kagan HM (1998) Lysyl oxidase: properties, regulation and multiple functions in biology. Matrix Biol 16: 387-398. Smyth N, Vatansever HS, Murray P, Meyer M, Frie C, Paulsson M & Edgar D (1999) Absence of basement membranes after targeting the LAMC1 gene results in embryonic lethality due to failure of endoderm differentiation. J Cell Biol 144: 151-160. Steinmann B, Eyre DR & Shao P (1995) Urinary pyridinoline cross-links in Ehlers-Danlos syndrome type VI. Am J Hum Genet 57: 1505-1508. Steinmann B, Royce PM & Superti-Furga A (2002) The Ehlers-Danlos syndrome. In: Royce PM & Steinmann B (eds) Connective tissue and its heritable disorders. 2nd ed.: 431-523. Wiley-Liss, Inc, New York, Suokas M, Lampela O, Myllylä R & Kellokumpu S (2003) Retrieval-independent localization of lysyl hydroxylase in the endoplasmic reticulum via a peptide fold in its iron-binding domain. Biochem J 370: 913-920. Suokas M, Myllylä R & Kellokumpu S (2000) A single C-terminal peptide segment mediates both membrane association and localization of lysyl hydroxylase in the endoplasmic reticulum. J Biol Chem 275: 17863-17868. Sussman M, Lichtenstein JR, Nigra TP, Martin GR & McKusick VA (1974) Hydroxylysine- deficient skin collagen in a patient with a form of the Ehlers-Danlos syndrome. J Bone Joint Surg 56: 1228-1234. Szpirer C, Szpirer J, Riviere M, Vanvooren P, Valtavaara M & Myllylä R (1997) Localization of the gene encoding a novel isoform of lysyl hydroxylase. Mamm Genome 8: 707-708. Thomas KR & Capecchi MR (1987) Site-directed mutagenesis by gene targeting in mouse embryo- derived stem cells. Cell 51: 503-512. Timpl R (1996) Macromolecular organization of basement membranes. Curr Opin Cell Biol 8: 618- 624. Tuckwell D (2002) Identification and analysis of collagen α1(XXI), a novel member of the FACIT collagen family. Matrix Biology 21: 63-66. Tuderman L, Myllylä R & Kivirikko KI (1977) Mechanism of the prolyl hydroxylase reaction. 1. Role of co-substrates. Eur J Biochem 80: 341-348. Turpeenniemi-Hujanen TM, Puistola U & Kivirikko KI (1980) Isolation of lysyl hydroxylase, an enzyme of collagen synthesis, from chick embryos as a homogeneous protein. Biochem J 189: 247-253. Turpeenniemi-Hujanen TM, Puistola U & Kivirikko KI (1981) Human lysyl hydroxylase: purification to homogeneity, partial characterization and comparison of catalytic properties with those of a mutant enzyme from Ehlers-Danlos syndrome type VI fibroblasts. Coll Relat Res 1: 355-366. 67 Turpeenniemi TM, Puistola U, Anttinen H & Kivirikko KI (1977) Affinity chromatography of lysyl hydroxylase on concanavalin A-agarose. Biochim Biophys Acta 483: 215-219. Uzawa K, Grzesik WJ, Nishiura T, Kuznetsov SA, Robey PG, Brenner DA & Yamauchi M (1999) Differential expression of human lysyl hydroxylase genes, lysine hydroxylation, and cross- linking of type I collagen during osteoblastic differentiation in vitro. J Bone Miner Res 14: 1272-1280. Valtavaara M, Papponen H, Pirttilä AM, Hiltunen K, Helander H & Myllylä R (1997) Cloning and characterization of a novel human lysyl hydroxylase isoform highly expressed in pancreas and muscle. J Biol Chem 272: 6831-6834. Valtavaara M, Szpirer C, Szpirer J & Myllylä R (1998) Primary structure, tissue distribution, and chromosomal localization of a novel isoform of lysyl hydroxylase (lysyl hydroxylase 3). J Biol Chem 273: 12881-12886. van der Rest M & Garrone R (1991) Collagen family of proteins. FASEB J 5: 2814-2823. van der Slot AJ, Zuurmond AM, Bardoel AF, Wijmenga C, Pruijs HE, Sillence DO, Brinckmann J, Abraham DJ, Black CM, Verzijl N, DeGroot J, Hanemaaijer R, TeKoppele JM, Huizinga TW & Bank RA (2003) Identification of PLOD2 as telopeptide lysyl hydroxylase, an important enzyme in fibrosis. J Biol Chem, in press von Heijne G (1986) A new method for predicting signal sequence cleavage sites. Nucleic Acids Res 14: 4683-4690. von Melchner H, DeGregori JV, Rayburn H, Reddy S, Friedel C & Ruley HE (1992) Selective disruption of genes expressed in totipotent embryonal stem cells. Genes Dev 6: 919-927. Vuori K, Pihlajaniemi T, Marttila M & Kivirikko KI (1992) Characterization of the human prolyl 4-hydroxylase tetramer and its multifunctional protein disulphide-isomerase subunit synthesized in a baculovirus expression system. Proc Natl Acad Sci U S A 89: 7467-7470. Walker LC, Marini JC, Grange DK, Filie J & Yeowell HN (1999) A patient with Ehlers-Danlos syndrome type VI is homozygous for a premature termination codon in exon 14 of the lysyl hydroxylase 1 gene. Mol Genet Metab 67: 74-82. Wang C, Luosujarvi H, Heikkinen J, Risteli M, Uitto L & Myllyla R (2002) The third activity for lysyl hydroxylase 3: galactosylation of hydroxylysyl residues in collagens in vitro. Matrix Biol 21: 559-566. Wang C, Risteli M, Heikkinen J, Hussa AK, Uitto L & Myllyla R (2002) Identification of amino acids important for the catalytic activity of the collagen glucosyltransferase associated with the multifunctional lysyl hydroxylase 3 (LH3). J Biol Chem 277: 18568-18573. Wang C, Valtavaara M & Myllyla R (2000) Lack of collagen type specificity for lysyl hydroxylase isoforms. DNA Cell Biol 19: 71-77. Wang Y, Xu A, Knight C, Xu LY & Cooper GJ (2002) Hydroxylation and glycosylation of the four conserved lysine residues in the collagenous domain of adiponectin. Potential role in the modulation of its insulin-sensitizing activity. J Biol Chem 277: 19521-19529. Weinstein E, Blumenkrantz N & Prockop DJ (1969) Hydroxylation of proline and lysine in protocollagen involves two separate enzymatic sites. Biochim Biophys Acta 191: 747-750. Wingfield PT & Pain RH (1998) Transverse urea-gradient gel electrophoresis. In: Coligan, J E, Dunn, B E, Ploegh, H L, Speicher, D W, & Wingfield, P T (eds) Current Protocols in Protein Science 7.4.1-7.4.13. John Wiley and Sons Inc., New York. Yeowell HN, Allen JD, Walker LC, Overstreet MA, Murad S & Thai SF (2000a) Deletion of cysteine 369 in lysyl hydroxylase 1 eliminates enzyme activity and causes Ehlers-Danlos syndrome type VI. Matrix Biol 19: 37-46. 68 Yeowell HN, Ha V, Walker LC, Murad S & Pinnell SR (1992) Characterization of a partial cDNA for lysyl hydroxylase from human skin fibroblasts; lysyl hydroxylase mRNAs are regulated differently by minoxidil derivatives and hydralazine. J Invest Dermatol 99: 864-869. Yeowell HN & Pinnell SR (1993) The Ehlers-Danlos syndromes. Semin Dermatol 12: 229-240. Yeowell HN & Walker LC (1997) Ehlers-Danlos syndrome type VI results from a nonsense mutation and a splice site-mediated exon-skipping mutation in the lysyl hydroxylase gene. Proc Assoc Am Physicians 109: 383-396. Yeowell HN & Walker LC (1999a) Prenatal exclusion of Ehlers-Danlos syndrome type VI by mutational analysis. Proc Assoc Am Physicians 111: 57-62. Yeowell HN & Walker LC (1999b) Tissue specificity of a new splice form of the human lysyl hydroxylase 2 gene. Matrix Biol 18: 179-187. Yeowell HN & Walker LC (2000) Mutations in the lysyl hydroxylase 1 gene that result in enzyme deficiency and the clinical phenotype of Ehlers-Danlos syndrome type VI. Mol Genet Metab 71: 212-224. Yeowell HN, Walker LC, Farmer B, Heikkinen J & Myllylä R (2000b) Mutational analysis of the lysyl hydroxylase 1 gene (PLOD) in six unrelated patients with Ehlers-Danlos syndrome type VI: prenatal exclusion of this disorder in one family. Hum Mutat 16: 90.