Evolution of the contact phase of vertebrate blood coagulation

Michal B. Ponczek (1,3), David Gailani (2) and Russell F. Doolittle (3)

(1) Department of General Biochemistry, University of Lodz, Lodz, Poland, European Union
(2) Departments of Pathology and Medicine, Vanderbilt University, Nashville, Tennessee, United States
(3) Department of Chemistry & Biochemistry, University of California San Diego, La Jolla, California, United States

Introduction

In placental mammals, the contact phase proteases of blood coagulation are factor XI (fXI), factor XII (fXII) and plasma prekallikrein (PK). FXII and PK are activated in vitro in the presence of negatively charged surfaces such as kaolin. Activated factor XII (fXIIa) converts fXI to the protease factor XIa, which in turn activates factor IX to IXa (Fig. 1). The availability of numerous whole genome sequences of vertebrates enables bioinformatics reconstruction of the step-by-step evolution of complex pathways like blood coagulation. We searched the genomes of opossum, platypus, chicken, frog, puffer fish, zebra fish and lamprey for genes encoding PK, fXI and fXII. A simple phylogeny of these creatures is presented in Fig. 2.

Figure 1. Simplified outline of "contact factor" involvement in mammalian blood coagulation (activations of factors IX and X by other routes not shown).

Figure 2. Diagrammatic depiction of the phylogenetic relationships of organisms discussed in this poster. A whole genome sequence is not yet available for any reptile.

Homologs, genes descended from a common ancestor, are divided into two kinds:
- orthologs: straight-line genetic descendants that are thought to have the same function
- paralogs: the result of gene duplications, which may have different, albeit similar, functions

Methods

Several different data sources were used during this bioinformatics analysis, including the National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EMBL-EBI) and the Washington University in St. Louis Genome Center. The protocols available at the EMBL-EBI and the Wellcome Trust-Sanger Centre (WTSI), including ENSEMBL, were particularly helpful. BLAST searches of the sequences of the human factors against the various whole genome sequence databases were followed by BLAST against the NCBI database. Reconstructions between exons were made with GeneScan, but final versions were made manually. Alignments were made both with CLUSTAL and by an older progressive method. Phylogenetic trees were calculated from a multiple alignment of the serine protease domains of numerous proteases, including some not involved in the contact phase of clotting (Fig. 3, 6). Phylogenetic trees were drawn on the phylodendron website (iubio.bio.Indiana.edu). The possibility of syntenic gene arrangements for these various organisms was explored with ENSEMBL (www.ensembl.org).

Figure 3. Methods applied to search for orthologs of mammalian contact phase coagulation factors. Factors and some paralogs used and their domains: SP, serine protease domain; K, kringle domain; E, EGF domain; P, PAN domain; F1, fibronectin type 1 domain; F2, fibronectin type 2 domain.

Results

Genes for FXII In Vertebrate Genomes.
Genes for hepatocyte growth factor activator (HGFA), a paralog of fXII, were found in lamprey, puffer fish (but not zebra fish), frog, chicken, platypus and opossum. Genes for authentic fXII were identified in frog, platypus and opossum genomes; the gene is not present in the chicken genome (Table 1). Phylogenetic evidence and chromosomal considerations indicate that the gene has been lost on the lineage leading to birds (Fig. 5).

Table 1. Occurrence of genes for contact phase factors and some paralogs in assorted vertebrate genomes.

Organism     FXI    PK     FXII   HGFA   HGF    Plg    tPA
Human        Yes    Yes    Yes    Yes    Yes    Yes    Yes
Opossum      Yes    Yes    Yes    Yes    Yes    Yes    Yes
Platypus     [Yes - Yes]*  Yes    Yes    Yes    Yes    Yes
Chicken      [Yes - Yes]*  No     Yes    Yes    Yes    Yes
Frog         [Yes - Yes]*  Yes    Yes    Yes    Yes    Yes
Zebrafish    No     No     No     No†    Yes    Yes    Yes
Pufferfish   No     No     No     Yes    Yes    Yes    Yes
Lamprey      No     No     No     Yes    Yes    Yes    Yes

* Denotes single gene for evolutionary predecessor of factor XI and prekallikrein.
† We did not find a gene corresponding to HGFA in the zebra fish, but it does have one for HGF, which would need activation.

Figure 5. Arrangement of neighboring genes for factor XII and/or HGFA in human, chicken (HGFA only) and frog. Genes: RGS14, regulator of G-protein signaling 14; Na+/Pi, sodium-dependent phosphate transport protein 2A; F12, factor XII; GRK6, G protein-coupled receptor kinase 6; RGS12, regulator of G-protein signaling 12; HGFA, hepatocyte growth factor activator; DOK7, protein Dok-7 (downstream of tyrosine kinase 7).

Genes for Factor XI and Prekallikrein In Vertebrate Genomes.
A single paralog of fXI and PK occurs in frog, chicken and platypus, suggesting its first appearance among early tetrapods (Table 1). In contrast, the opossum genome has genes for both PK and fXI, indicating that the gene duplication leading to separate factors occurred early in mammalian evolution but after the divergence of monotremes (platypus).

The Chicken Prekallikrein-Factor XI Predecessor Gene.
The case for there being one PK-fXI gene in the chicken genome is greatly strengthened not only by there being a single gene with appropriate sequence similarity and domain arrangement, but also by its detailed chromosomal location relative to neighboring genes. In humans, the genes for PK and fXI are adjacent to each other and lie between genes for a cytochrome P-450 and a melatonin receptor (MTNR1). In the chicken, only a single gene occurs between the same two genes (Fig. 4).

Figure 4. Locations of genes neighboring factor XI-prekallikrein on human and chicken chromosomes (both coincidentally numbered 4). Genes: CYTP450, cytochrome P450; NOVEL, factor XI/PK predecessor; PK, prekallikrein; F11, factor XI; PG, pseudogene; MNTR1A, melatonin receptor type 1A (slightly simplified from the EMBL-EBI website as viewed with ENSEMBL).

The Platypus Prekallikrein-Factor XI Predecessor Gene.
The platypus genome sequence is still at the "draft" stage, and numerous short regions remain unsequenced. The contigs on which the relevant exons occur are not fully assembled. There is an anomalous termination codon (TAG) in the middle of the protease portion of the PK-fXI predecessor sequence; in reality it most likely encodes a tryptophan (TGG). The rest of the region is free of termination codons, and the sequence following the anomalous codon fits exactly into the expected pattern at the same high level of similarity (>60% identity). Moreover, when the putative sequence is examined phylogenetically with the serine protease domains of other clotting factors, it assumes an appropriate place at the base of the cluster composed of fXI and PK (Fig. 6). The amino acid sequence of the protease domain of the platypus predecessor is about 68% identical with the corresponding regions of either human fXI or PK.

The terminal three PAN domains of the platypus PK-fXI predecessor are clustered on a single supercontig (co29087); the fourth PAN domain is on a contig that also contains the amino-terminal half of the serine protease domain (co41273). No terminator codons occur in either of these contigs. The region containing the second half of the serine protease domain is on a third contig (co73301), which contains the anomalous terminator codon.

The platypus, frog, and chicken proteins all lack the putative fIX-binding site found in the third PAN domain of fXI. An examination of sequences for the fourth PAN domain shows Cys residues at positions 321 and 326 (Fig. 7). This indicates that the predecessor protein, like PK, is a monomer. It is not clear whether the activity of the predecessor is more like that of PK or fXI; the dimeric structure of fXI, however, is a feature acquired after duplication of the predecessor gene.

Figure 6. Phylogenetic tree generated from the alignment of serine protease portions of various coagulation factors. In deference to the artifactually truncated platypus factor XI-prekallikrein predecessor, which lacks 13 residues at its carboxy-terminus, all sequences were shortened to the same point. The circled regions labeled 1, 2 and 3 correspond to key gene duplication sites. Trees with virtually the same topology were generated independently by other methods. Organism abbreviations: HU, human; MO, mouse; OP, opossum; PL, platypus; CH, chicken; FR, frog; ZF, zebra fish; FU, puffer fish; LA, lamprey. Factor abbreviations: PK, prekallikrein; F11, factor XI; F11PK, factor XI-prekallikrein predecessor; F12, factor XII; PLG, plasminogen; HGF, hepatocyte growth factor; HGFA, hepatocyte growth factor activator; TPA, tissue plasminogen activator; UPA, urokinase.

Separate Genes in the Opossum.
The opossum fXI gene has several features that are highly similar to fXI in placental mammals. Human and opossum fXI are 70% identical at the amino acid level; for comparison, human and mouse fXI are 78% identical. A conserved sequence (amino acids 183-191) in the third PAN domain in placental mammals, which likely represents a binding exosite for the substrate factor IX, is conserved in the opossum (Fig. 7). FXI is a disulfide bond-linked homodimeric protein with the fourth PAN domain forming the interchain interface. Amino acids in human fXI that are critical for forming the homodimer are conserved in the opossum, including residues Leu284, Ile290, and Tyr329 that form the interface, and an unpaired Cys321 that forms the interchain disulfide bond. Like human PK, opossum PK appears to be a monomer, with a cysteine residue at position 326 that forms an intrachain disulfide bond with Cys321 (Fig. 7).

Figure 7. Alignment of factor IX-binding regions of various mammalian factors XI and corresponding regions of human PK and putative predecessor proteins from frog, chicken and platypus.

Phylogenetic Trees.
In a perfect phylogenetic tree, predecessor genes are expected to appear prior to the duplications leading to new genes. If the duplication occurs within a short interval of a species divergence, however, the tree can be slightly muddled. In the case of the putative PK-fXI predecessor, the frog sequence appears well in advance of the duplication, as expected, but the chicken and platypus sequences are near the bottoms of the fXI and PK clusters, respectively. The internal branch lengths are very short, however, and either entry could be shifted on the tree merely by a few amino acid replacements (Fig. 6).

Conclusions

Factor XII is absent from fish; it is present in frog, platypus and opossum, but is absent in chicken, an apparent example of gene loss.
A single gene corresponding to the evolutionary predecessor of factor XI and prekallikrein occurs in frog, chicken and platypus.
The opossum has both prekallikrein and factor XI, completing the full complement of these genes that occurs in eutherian mammals.
The expansion of the vertebrate clotting system to include fXI, PK and fXII has occurred by way of a series of widely spaced gene duplications during the course of several hundred million years of evolution, beginning with the appearance of fXII and a PK-fXI predecessor in amphibians (Fig. 6, 8).

Figure 8. Depiction of vertebrate evolutionary tree.
1. Duplication of HGF- or plasminogen.
2. Duplication of gene for HGFA gives rise to fXII.
3. Duplication of gene for factor XI-PK predecessor.
4. Loss of fXII gene on lineage leading to birds.

Acknowledgments

Funding for this project was provided by the European Molecular Biology Organization (EMBO). M. Ponczek was the recipient of EMBO Short Term Fellowship 83-2008.

For further information

Please contact [email protected]. More information on this and related projects can be obtained at www.biol.uni.lodz.pl/~biochogl. A link to an online PDF version of the poster: http://www.biol.uni.lodz.pl/~biochogl/ecpvbc.pdf.
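
Note on the anomalous platypus codon. The reading of the anomalous TAG in the platypus PK-fXI predecessor as a tryptophan codon (TGG) is consistent with a single-base difference, which is the kind of error a draft assembly could plausibly contain. The Python sketch below is an illustration only and is not the procedure used for this poster: it builds the standard genetic code and lists every codon one substitution away from TAG. The names `CODON_TABLE` and `single_base_neighbors` are hypothetical helpers introduced here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order at each position
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_base_neighbors(codon):
    """Return {neighbor_codon: amino_acid} for codons one substitution away."""
    neighbors = {}
    for pos, base in product(range(3), BASES):
        if base != codon[pos]:
            mutant = codon[:pos] + base + codon[pos + 1:]
            neighbors[mutant] = CODON_TABLE[mutant]
    return neighbors


# Hypothetical usage: which sense codons could TAG be a one-base misread of?
neighbors = single_base_neighbors("TAG")
sense_neighbors = {c: aa for c, aa in neighbors.items() if aa != "*"}
print(sorted(sense_neighbors.items()))
```

The check shows only that TGG (Trp) is among the single-substitution neighbors of TAG; it does not by itself single out tryptophan over the other one-base alternatives, a choice that has to rest on comparison with the homologous sequences.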