View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by Elsevier - Publisher Connector

Genomics 103 (2014) 147–153

Contents lists available at ScienceDirect

Genomics

journal homepage: www.elsevier.com/locate/ygeno

Identification and characterization of transforming growth factor β induced (TGFBIG) from Branchiostoma belcheri:Insightsinto evolution of TGFBI family

Xiaojun Song a,b,LuCaia,b,YafangLia,b, Jiu Zhu a,b,PingJina,b,LimingChenc,FeiMaa,b,⁎

a Laboratory for Comparative Genomics and Bioinformatics, College of Life Science, Nanjing Normal University, Nanjing 210046, PR China b Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Science, Nanjing Normal University, Nanjing 210046, PR China c The Key Laboratory of Developmental and Human Disease, Ministry of Education, Institute of Life Science, Southeast University, Nanjing 210009, PR China

article info abstract

Article history: The transforming growth factor β induced gene (TGFBIG)encodesaprotein(TGFBI)whichplaysimportantrolesin Received 5 August 2013 many biological processes. However, no TGFBIG homolog has been reported in B. belcheri. Here, we identified a Accepted 7 October 2013 TGFBI-like gene from B. belcheri and extensively studied the evolutionary history of TGFBI family. We found that Available online 16 October 2013 the amphioxus genome contains a TGFBIG homolog designated as AmphiTGFBI which encodes a with 5 Fas1 domains. The TGFBIGs were present in a common ancestor with Amphimedon queenslandica.Wealso Keywords: demonstrated expression patterns of AmphiTGFBI in five amphioxus tissues. Interestingly, the gene structures and Amphioxus Domain loss conserved motifs of invertebrate TGFBIGs were found to present regular changes in the evolution. Positive selection Evolution and Fas1 domain loss might cause the regular changes of gene structures and conserved motifs in invertebrate Gene duplication TGFBIGs during evolution. Together, our findings provided an insight into the evolution of the TGFBI family. TGFBIG © 2013 Elsevier Inc. All rights reserved. TGFBI family Persiostin

1. Introduction [12,13]. In addition, TGFBI was required for embryogenesis through regulation of canonical Wnt signaling and suppression of mesothelioma The transforming growth factor β induced gene (TGFBIG) encoding an progression through the Akt/mTOR pathway [8,14]. (ECM) protein (TGFBI), was identified more than two TGFBI family contains two genes: TGFBI gene and POSTN (periostin, decades ago from a TGFβ1 treated lung adenocarcinoma cell line[1].The OSF-2) gene. Interestingly, the TGFBI paralog POSTN which is a secreted TGFBI contains a secretary signal sequence, an N-terminal cysteine-rich extracellular matrix glycoprotein has a similar domain structure to (EMI) domain, four homologous internal domains, and a cell attachment TGFBI with the following exceptions: 1) TGFBI has 17 exons compared (Arg-Gly-Asp, RGD) site recognizing [2].TheTGFBIplaysan to 23 exons in POSTN in human; 2) TGFBI is notably shorter and lacks important role in cell adhesion, migration, proliferation, apoptosis, and completely the C-terminal region that is subject to alternative splicing angiogenesis, and mutations of the TGFBIG have been shown to be in POSTN [15]. The fundamentally divergence of TGFBI and POSTN involved in several corneal dystrophies [3–5]. Recent studies showed sequences are after the fourth fasciclin domain. The POSTN protein as that the TGFBI can serve as a tumor suppressor in several cancers, such a secreted protein also has an N-terminal signal sequence, in addition as mesothelioma, breast cancer, and radiation carcinogenesis [6–8].The to an EMI domain encoded by exons 2 and 3, and four Fasciclin (Fas1) expression levels of TGFBI in tumor cells were significantly lower than domains [16]. Periostin were initially characterized as an adhesion that in normal cells [6,7]. And TGFBI can sensitize ovarian cancer cells to protein on the basis of its apparent homology to insect fasciclin and paclitaxel by inducing microtubule stabilization and that the loss of this characterization subsequently were refined by expanding the TGFBI induces drug resistance and mitotic spindle abnormalities in collection of known ECM binding partners and illuminating functional ovarian cancer cells [9]. TGFBI can also participate in mesoderm aspects consistent with a role in contact-signaling [17]. Periostin, differentiation and tissue branching morphogenesis [10,11].TGFBI- whose expression has been found to be promoted by several factors knockdown disrupts zebrafish somitogenesis and homozygous null including TGFβ1, 2, 3 and BMP-2 in multiple studies [18–20], can bind mutations in mice may retard postnatal development and has been to certain receptors, which subsequently activate the Akt/PKB reported to predispose the animals to spontaneous and induced cancers pathway via FAK and PI3K, leading to an enhanced migratory or invasive phenotype [21]. Otherwise, periostin may be also involved in the development of FVMs (fibrovascular membranes) [22]. ⁎ Corresponding author at: Laboratory for Comparative Genomics and Bioinformatics, College of Life Science, Nanjing Normal University, Nanjing 210046, PR China. Although TGFBI and periostin have important roles in cell adhesion, E-mail address: [email protected] (F. Ma). migration, proliferation, apoptosis and cancer, there are also mounting

0888-7543/$ – see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ygeno.2013.10.002 148 X. Song et al. / Genomics 103 (2014) 147–153 evidence that periostin and TGFBI can have complementary or opposite with tBLASTn against the genome sequence and EST libraries of the roles in cancer and development. In cancers, reports highlighted tumor- relevant organism. These data are shown in Table S1. suppressive roles of TGFBI, and complementary expression patterns of POSTN and TGFBI in the developing heart were explicitly studied 2.3. Cloning of full-length AmphiTGFBI cDNA by 3′-and5′-RACE [13,23,24]. TGFBI is dedicated to certain relatively common mutations causing a variety for corneal dystrophies [25], a functional aspect for Total RNA of the whole organism was extracted using Trizol reagent which there is no known counterpart for periostin. Like periostin, (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's TGFBI is a TGFβ-induced, secreted, integrin-binding ECM protein instruction. The first strand of cDNA was synthesized with Moloney expressed during cardiac development and differentially expressed in Murine Leukemia Virus reverse transcriptase (M-MLV; TaKaRa, Dalian, certain epithelial cancers [26]. Although an RGD binding motif for a China). According to the expressed sequence tag (EST) sequence, a subgroup of integrin adhesion receptors is found in TGFBI, TGFBI putative B. floridae TGFBI gene partial sequence was obtained from α3β1-integrin binding has been shown to be mediated not via the AmphiEST database [28]. Two primers, forward primer: 5′-TCGGATG canonical RGD binding motif, which via two pentapeptides containing ATCCGTTATGAGTGC -3′ and reverse primer: 5′-TAGCGCGAATCATC a central Asp-Ile (DI) dimer found in the second and fourth to be TCCAACTT-3′, were designed to amplify the Branchiostoma belcheri conformationary similar to RGD peptides [27]. While an RGD motif is TGFBI gene sequence by RT-PCR. To obtain a complete cDNA sequence, not present in periostin, the DI dimers in Fas1 domains 2 and 4 are 5′-and3′-RACE were performed using a SMARTer RACE cDNA conserved between the two strongly suggesting a mechanism Amplification Kit (Clontech, CA, USA) according to the manufacturer's for periostin integrin binding [15]. instruction. The specific primers 5′-CCTGGTGATGAAGGCGGAGCA-3′ The amphioxus is the modern survivors of an ancient chordate and 5′-TTGCTGGGTGCGAAGACGGTG-3′ for 5′-RACE, and 5′-CACCG lineage, with a fossil record dating back to the Cambrian period. The ACACGGCGTTCCAGAAGG-3′ and 5′-GCCAGATGTGGCACCGCAAGG-3′ unique evolutionary position indicates that amphioxus is the basal for 3′-RACE, were designed according to the RT-PCR product. Primers chordate and a crucial organism for the study of chordate evolution. 5′-GACAGACGATTATCCGTTATGAGTGC-3′ (forward) and 5′-GCTAGA Thus, identification and characterization of TGFBIGs in amphioxus and TACTAACACGGAACAAGGGA-3′ (reverse) were used to amplify the studying the origin and evolution of TGFBI family will not only improve full-length cDNA sequence. The amplified fragments were cloned into our understanding of the process of development and disease of a pMD_19-T Simple Vector (TaKaRa, Dalian, China) and sequenced. amphioxus, but also provide the proximate evolution of TGFBIGs from invertebrates to vertebrates. Here we identified and characterized a 2.4. The spatial express patterns of B. belcheri TGFBI gene detected by TGFBI-like gene from B. belcheri, and also studied extensively the real-time PCR evolutionary history of TGFBI family genes. The amphioxus genome contains an AmphiTGFBI gene encoding a TGFBI homolog with 5 fas1 In the study of spatial distribution of the AmphiTGFBI in different domains. We also found positive selection and lost of Fas1 domain in tissues, total RNA was extracted from muscles, gills, intestine, notochord the evolution history of invertebrate. and hepatic cecum of fifteen healthy amphioxuses. The forward primer 5′-GCCAAGGACGAGGAGAGTAACG-3′ and reverse primer 5′-CCGCC 2. Material and methods GCCGACGATGACG-3′ were used in real-time PCR, and PCR reactions were run in triplicates using the following conditions: 10 min at 95 °C 2.1. Amphioxus cultivation followed by 40 cycles of 30 s at 95 °C and 15 s at 60 °C. Real-time PCR was carried out using IQ™ SYBRs Green Supermix (Bio-Rad) with a Mature amphioxus adults (B. belcheri) were obtained from Rotor-Gene 3000 real-time PCR machine (Corbett Research). Data were Zhanjiang (Guangdong province, China), cultured at 24–25 °C in a quantified using the 2−ΔΔCt method based on Ct values of AmphiTGFBI tank filled with air-pumped circulating artificial seawater, and fed and β-actin. Values were considered to be significant at P b 0.05. daily with Chlorella. 2.5. Alignment, conserved motifs and phylogenetic analysis 2.2. Data collection A multiple sequence alignment was performed for homology To identify homologs of TGFBI family genes, we employed BLASTp sequences of TGFBI family genes using MUSCLE [29] with default and tBLASTn to search TGFBI-homologous proteins in the genomes of parameters, which were manually curated if necessary. Gene structure the following organisms: single-celled eukaryotes (Saccharomyces and position of motifs were checked by hand using data from cerevisiae, Monosiga brevicollis), sponge (Amphimedon queenslandica), genes, and domain conservation was predicted by SMART software sea anemone (Nematostella vectensis), nematode (Caenorhabditis (http://smart.embl-heidelberg.de/). Conserved motifs were analyzed elegans), fruit fly(Drosophila melanogaster), amphioxus (Branchiostoma online using a MEME system Version 4.8.1 (http://meme.ebi.edu.au/ floridae), sea squirt (Ciona intestinalis), medaka (Oryzias latipes), meme/cgi-bin/meme.cgi) with default settings. Except for the (zebrafish Danio rerio), stickleback (Gasterosteus aculeatus), fugufish distribution of motif occurrences, the minimal and maximal motif (Takifugu rubripes), frog (Xenopus tropicalis), chicken (Gallus gallus), widths and the number of different motifs were defined as any number turkey (Meleagris gallopavo), rat (Rattus norvegicus), mouse (Mus of repetitions, 6, 30, and 15, respectively. Phylogenetic analyses were musculus) and human (Homo sapiens). All human TGFBI and POSTN conducted using MrBayes for a Bayesian analysis [30] employing a sequence were searched against the Refseq protein data from different mixed amino acid substitution model, using PhyML [31] for a maximum organisms. All blast hits were filtered, and only sequences with a blast likelihood (ML) analysis with default parameters and using MEGA5 [32] score of N150 and length of N50 were examined. We then tested each for Neighbor Joining (NJ) analysis. Phylogenetic support was verified sequence with a reciprocal blast search, and assigned it as a homologous with a bootstrap consensus tree inferred from 1000 replicates. gene if the best hit of one blast search matched the best hit of the other. If the sequences were not found reciprocally in two genomes, we picked 2.6. Codon-based sequence analysis the one with the best coverage, with a score of N150, a relative identity of N30% and a relative similarity of N40%. If the parameters of a protein The ratio of nonsynonymous/synonymous substitution rates (ω = were lower or domains of a protein were not similar to its Refseq dn/ds) can provide a measurement for the change of selective pressures. protein, we would assign it as “non-homologous gene”. When we did Values of ω =1,b1andN1 will indicate neutral evolution, purifying not find an ortholog, we verified the lack of orthologous sequences selection, and positive selection on the target gene, respectively. Here, X. Song et al. / Genomics 103 (2014) 147–153 149

Fig. 1. The AmphiTGFBI of Chinese amphioxus and a broad phylogenetic survey of TGFBI family genes. A) The nucleotide, deduced amino acid sequences and structure characterization of AmphiTGFBI from Chinese amphioxus (Brachiastoma belcheri). The AmphiTGFBI gene contains a SP (signal peptide) domain, an EMI domain and five Fas1 domains. B) The line chart showing the conservation index of aligned TGFBI family genes. These Fas1 domains have high conservation index except Fas1e domain. C) Diagram of the conservation of TGFBI family genes in different species. The topology of different animals was downloaded from NCBI database. 150 X. Song et al. / Genomics 103 (2014) 147–153 a codon-substitution model implemented in the CODEML program in homolog (POSTN) in the A. queenslandica genome and two TGFBI family the PAML4.4 software package was used to analyze changes in selective genes (one TGFBI and one POSTN) in the N. vectensis genome, but only pressure, which allow for variable selection patterns among amino acid one TGFBI family genes (best match to human TGFBI gene) in C. elegans, sites, M0, M1a (nearly neutral), M2a (positive selection), M7 (beta), M8 D. melanogaster, B. floridae,andC. intestinalis genomes. Therefore, it can (beta & ω), to test for the presence of sites under positive selection [33]. be inferred that the POSTN gene had been lost in bilateria and chordata The presence of codons evolving under positive selection was further in the evolutionary history. However, there have been two TGFBI family tested by contrasting the M1a and M2a models, and the M7 and M8 genes (one TGFBI and one POSTN) in vertebrates. Otherwise, the models by likelihood ratio tests (LRTs). To explore the divergence of invertebrate TGFBI family genes don't have EMI domain (Fig. 1C). In different branches of TGFBI family genes in the evolutionary history, a conclusion, the TGFBI family genes might be present in a common branch model of the CODEML software was used to compute the ancestor of A. queenslandica and duplication in N. vectensis genome. nonsynonymous/synonymous substitution rate ratio of different Subsequently, the POSTN gene had been lost in bilateria and chordata branches. Finally, positive selection sites in the B. belcheri AmphiTGFBI and happened gene duplication in vertebrate in the evolutionary history sequence were detected by applying a branch-site model and statistical of TGFBI family genes. analysis by Bayes empirical Bayes (BEB) methods. 3.2. Differential AmphiTGFBI in different tissues 3. Results The spatial expression pattern of AmphiTGFBI in adult B. belcheri was 3.1. Cloning and characterization of B. belcheri AmphiTGFBI gene investigated by quantitative real-time RT-PCR analysis of total RNA samples from five different tissues (intestine, muscles, gills, notochord AfragmentoftheAmphiTGFBI gene transcription product (1707 bp) and hepatic cecum) obtained from 15 individuals. Expression of was obtained by RT-PCR. Then based on the 1707 bp fragment, a AmphiTGFBI was detected in all tissues, and varied expression levels 1855 bp and a 489 bp fragments were amplified by 3′ RACE and 5′ were observed among the different tissues. As shown in Fig. 2,the RACE, respectively. After sequence assembly, full length AmphiTGFBI expression of AmphiTGFBI was high in notochord and intestine, low in cDNA of 3095 nucleotides was identified and an open reading frame the muscles, gills and hepatic cecum. (ORF) with 2604 bp was predicted. Besides the ORF, AmphiTGFBI cDNA ′ ′ contains a 5 -UTR with 84 bp and a 407 bp 3 -UTR including a 11 bp 3.3. Phylogenetic, conservation motifs and gene structure analyses of TGFBI poly (A) tail and a canonical polyadenylation signal (AATAAA). The family genes ORF encodes a putative AmphiTGFBI protein containing 867 amino acids (Fig. 1A). The complete AmphiTGFBI cDNA sequence and deduced To explore the evolutionary dynamics of TGFBI family genes, TGFBI AmphiTGFBI protein sequence have been submitted to GenBank and POSTN protein sequences from different animals were used to (GenBank ID: KF366657). construct a phylogenetic tree. The evolutionary relationships of TGFBI fi The AmphiTGFBI contains a signal peptide (SP), an EMI domain, ve family genes were revealed by the topology of a phylogenetic tree Fas1 domains (Fas1a-e) compared to no more than 4 Fas1 domains in supported by high bootstrap values (Fig. 3). As expected, the TGFBI family other animals. These Fas1 domains have high conservation index genes were classified into three groups: vertebrate TGFBI group, except Fas1e domain (Fig. 1). The AmphiTGFBI was conserved from vertebrate POSTN group and invertebrate TGFBI/POSTN-like group, – A. queenslandica to H. sapiens and shared 27 87.8% similarity to other invertebrate TGFBI/POSTN group may be further classified into three known TGFBI homolog (Table 1). Interestingly, AmphiTGFBI was more subclusters: a chordata TGFBI-like subcluster, a bilateria TGFBI-like similar to vertebrate POSTN (range from 47.1 to 50.9) than vertebrate subcluster and a metazoa TGFBI/POSTN-like subcluster. Therefore, we TGFBI (range from 47.2 to 47.9), while the AmphiTGFBI has the similar could assume that the TGFBI family genes has happened one time gene gene structure (the number of exons an introns) with the vertebrate loss and two times gene duplications in the evolutionary history, resulting TGFBIG.Therefore,theAmphiTGFBI might be an ancestor gene of in the POSTN genes loss in the genomes of bilateria and chordata. vertebrate TGFBI family genes. The conserved motifs of TGFBI family genes were changed in the When TGFBI family genes from different animals was analyzed, only evolutionary history. Although the conserved motif orders of vertebrate one TGFBI family gene was detected in the genome of A. queenslandica TGFBI and POSTN genes and chordata TGFBI gene are very similar, the and the gene was similar to human POSTN (Fig. 1C and Table S1). invertebrate TGFBI family genes conserved motif orders are not Therefore, the TGFBI family genes might be present in a common conserved with that of vertebrate (Table S2). An Arg-Gly-Asp (RGD) ancestor of A. queenslandica. Gene loss and duplication can be detected binding motif for a subgroup of integrin adhesion receptors was found in the evolutionary history of TGFBI family genes. There has one TGFBI in human and mouse TGFBI, while TGFBI α3β1-integrin binding has been shown to be mediated not via the canonical RGD binding motif, but via two pentapeptides containing a central Asp-Ile (DI). The Table 1 Homology analysis of AmphiTGFBI amino acid sequence with other known TGFBI family alignment analysis found that there were two DI motifs in vertebrate amino acid sequences determined by MatGat 2.01 software.

Species Similarity Identify

H. sapiens TGFBI 47.3 31.9 G. gallus TGFBI 47.9 31.6 X. tropicalis TGFBI 47.2 31.3 D. rerio TGFBI 47.6 29.7 H. sapiens POSTN 50.9 29.6 G. gallus POSTN 49.5 29.1 X. tropicalis POSTN 47.1 28.9 D. rerio POSTN 49.1 29.6 B. floridae TGFBI 87.8 81.2 D. melanogaster Mfas 34.4 16.5 Fig. 2. AmphiTGFBI gene expression in different tissues. Real-time RT-PCR was carried out C. elegans F26E4.7 33.9 15.6 with RNA samples from muscles, gills, intestine, notochord and hepatic cecum of fifteen N. vectensis POSTN 31.9 18 healthy B. belcheri. Each sample was run in triplicates. Vertical bars represent means ± N. vectensis TGFBI 34.9 19.5 SE (n = 3). The β-actin gene of B. belcheri was used as an internal control to calibrate A. queenslandica POSTN 27 15.1 the cDNA template for all samples. X. Song et al. / Genomics 103 (2014) 147–153 151

Fig. 3. Phylogenetic tree of TGFBI family proteins including B. belcheri TGFBI protein (AmphiTGFBI). The ML tree and NJ tree have a similar topology. Phylogenetic support was verified with the bootstrap consensus tree inferred from 1000 replicates. The ω value on the different branches of phylogenetic tree represents nonsynonymous/synonymous substitution rate (ω = dn/ds), which provides a measurement for the change of selective pressures.

TGFBI and POSTN and one RGD motif in TGFBI/POSTN from human and been found in chordata and invertebrate [1,2,14].Thevertebrate mouse. However, we only found one DI motif in B. belcheri TGFBI and chordata TGFBI family genes contain a signal peptide, EMI without RGD motif (Fig. S1). domain, and Fas1 domain, while the invertebrate TGFBI family genes To further explore the difference of conserved motifs between lost EMI domain in the evolutionary history (Fig. 1C). In this paper, vertebrates and invertebrates, the Fas1 domain and gene structure of we not only identified a TGFBIG homolog in B. belcheri AmphiTGFBI, TGFBI family genes were identified. The Fas1 domains and gene struc- but also discovered TGFBI family genes in N. vectensis genome and tures of TGFBI family genes have changed in the evolutionary history. As shown in Fig. 4,theFas1domainofA. queenslandica, N. vectensis and C. elegans which were more similar to vertebrate Fas1c were not clustered together with their relevant groups. The results indicated that the Fas1 domains in TGFBI genes may happen in domain rearrangement and lost in the evolutionary history. Although the gene structure and phase between vertebrate TGFBI/POSTN genes and B. floridae TGFBI gene are very similar, the invertebrate TGFBI family genes exhibits variable gene structures and phases (Fig. 5). All these results show that the TGFBI family genes have happened very big changes in invertebrate, but very conserved in vertebrate in the evolutionary history.

3.4. Selective pressure analysis of TGFBI family genes

ToexploretheselectivepressureofTGFBIfamilygenes,theCODEML software of the PAML package was used to compute the nonsynonymous to synonymous substitution ratio (ω). The M0 model showed that the TGFBI family genes underwent very strong purifying selection (ω =0.11)(Table 2). Both M1a vs. M2a and M7 vs. M8 all detected four positive selection sites, which is 5 F, 22 P, 23 A, 766 P (Table 2). We also detected the positive selection of different branches and found that there are many positive selection branches in invertebrate (Fig. 3). Therefore, the positive selection sites and positive selection branches in TGFBI family genes may be indicated functional diversity and structural changes between different TGFBI family genes.

4. Discussion Fig. 4. The phylogenetic tree of different Fas1 domains. Fas1 domains have happened 4.1. TGFBI family genes are conserved from metazoa to mammals domain duplication and loss in the evolutionary history. The topology was built by MrBaye 3.2 software. Hs: Homo sapiens,Mm:Mus musculus,Rn:Rattus norvegicus,Gg: Gallus gallus,Xt:Xenopus (Silurana) tropicalis,Dr:Danio rerio,Dm:Drosophila melanogaster, Since the discovery of the TGFBI and POSTN, the TGFBI and POSTN Ce: Caenorhabditis elegans,Nv:Nematostella vectensis,Aq:Amphimedon queenslandica,-T-: genes have been found in many vertebrates, but none of them have TGFBI, -P-: POSTN. 152 X. Song et al. / Genomics 103 (2014) 147–153

Fig. 5. Genomic structure and conservation domains of TGFBI family genes.

A. queenslandica genome in addition to investigations of the origin and [42]. Proteins are composed of different domains and the combinations evolution of TGFBI family genes. The C-terminal part of POSTN is of these domains, also known as domain architectures or domain markedly more variable among vertebrates than others in terms of arrangements, change during evolution, generating rearrangements exon count, length, and splicing pattern, which could be interpreted as [43]. These more complicated events could play very important roles in a consequence of neofunctionalization after the split between periostin generating novel genes and function, as they are the primary source of and its paralog transforming growth factor beta-induced (TGFBI) [15]. new domain architectures that are thought to be a main source of The Xenopus TGFBI is required for embryogenesis through regulation biological complexity in the and other species [44]. of canonical Wnt signaling and the extracellular matrix protein TGFBI With the derivation of a validated gold-standard corpus and better promotes myofibril bundling and muscle fiber growth in the zebrafish data integration with big experiments, gene fusion detection can truly embryo [12,14]. In addition, the expression of TGFBI is either decreased become a valuable tool for large-scale experimental biology [45]. or absent in 14 human tumor cell lines as compared to their normal Positive selection, also known as Darwinian selection, which is the counterparts, indicating that TGFBI may exert an inhibition on main mechanism that Darwin envisioned as giving rise to evolution, tumorigenesis and/or tumor progression [34]. TGFBI protein can also specific molecular genetic examples are very difficult to detect, is the suppress many diseases and cancer through different signaling process by which new advantageous genetic variants sweep a pathways [8,14,35,36]. Recent studies showed that the transforming population. The nonsynonymous/synonymous substitution rate ratio growth factor beta 1 (TGF-β1) may be involved into the human immune (ω = dn/ds) can provide a measurement for the change of selective response [37,38], so TGFBI may be also involved into immune response. pressures. Values of ω =1,b1andN1 will indicate neutral evolution, This should be further studied in the future. purifying selection, and positive selection on the target gene, respectively. Recent studies have shown that positive selection plays 4.2. The positive selection of TGFBI family genes has some relationships with an important role in gene duplication, functional diversification and their gene architecture genomic structural divergence [46,47]. Otherwise, the positive selection evolution may also be involved in adaptive evolution, gene function Evolution can change the structures and functions of genes in many losses and pseudogene [48–50]. Fortunately, we found four positive ways [39]. For example, gene duplication has long been identified as a selection sites and positive selection branches in the evolutionary major mechanism for generating new genes and functions, whereas history of TGFBI family genes. These results suggested that the gene gene loss plays a similar important role in shaping genomics content duplication, domain losses and duplications of TGFBI family genes may [40,41]. These events, as well as several others such as horizontal gene correlate with their positive selections. However, we can't conclude transfer, gene conversion, and domain rearrangement, interact together the TGFBI family genes underwent adaptive evolution, because the to generate “gene families” clusters of orthologous and paralogous genes function of TGFBI-like genes of low invertebrate is still unknown. with detectable common ancestry. However, gene duplication, loss, and In conclusion, the TGFBI family genes play very important roles in other events can occur at the subgene level, and it has been suggested animal development and diseases including cancers. Our studies extend that homology inference be applied to domains rather than proteins our understanding of the evolution of TGFBI family.

Table 2 Likelihood values and parameter estimates of computing position selection site by site-specific model for the TGFBI family genes.

Model lnL Parameter Positive selection site 2ΔL LRT

M0 (one rate) −39652.633623 ω = 0.11 None M1a (neutral) −39506.280287 ω0 = 0.10312, p0 = 0.90787 Not allowed 0 P N 0.9 ω1 = 1.00000, p1 = 0.09213 (M1a vs. M2a) M2a (selection) −39506.280287 ω0 = 0.10312, p0 = 0.90787 5 F, 22 P, 23 A, 766 P ω1 = 1.00000, p1 = 0.00015 ω2 = 1.00000, p2 = 0.17119 M7 (beta) −39189.188791 p = 1.35537, q = 8.98939 Not allowed M8 (beta & ω) −39182.022771 p0 = 0.98455, p = 1.46535, 5 F, 22 P, 23 A, 766 P 7.166 P b 0.05 q = 10.48036, p1 = 0.01545, (M7 vs. M8) ω = 1.00000 X. Song et al. / Genomics 103 (2014) 147–153 153

Supplementary data to this article can be found online at http://dx. [21] M. Shimazaki, K. Nakamura, I. Kii, T. Kashima, N. Amizuka, M. Li, M. Saito, K. Fukuda, T. Nishiyama, S. Kitajima, Y. Saga, M. Fukayama, M. Sata, A. Kudo, Periostin is doi.org/10.1016/j.ygeno.2013.10.002. essential for cardiac healing after acute myocardial infarction, J. Exp. Med. 205 (2008) 295–303. [22] S. Yoshida, K. Ishikawa, R. Asato, M. Arima, Y. Sassa, A. Yoshida, H. Yoshikawa, K. Acknowledgments Narukawa, S. Obika, J. Ono, S. Ohta, K. Izuhara, T. Kono, T. Ishibashi, Increased expression of periostin in vitreous and fibrovascular membranes obtained from This work was jointly supported by the Major Program of Natural patients with proliferative diabetic retinopathy, Invest. Ophthalmol. Vis. Sci. 52 – Science Research of Jiangsu Higher Education Institutions (No. (2011) 5670 5678. [23] R.A. Norris, C.B. Kern, A. Wessels, E.E. Wirrig, R.R. Markwald, C.H. Mjaatvedt, 12KJA180005), the Ph.D. Programs Foundation of Ministry of Education Detection of betaig-H3, a TGFbeta induced gene, during cardiac development and of China (No. 20113207110009), the Jiangsu Province Ordinary its complementary pattern with periostin, Anat. Embryol. (Berl.) 210 (2005) 13–23. University Innovative Research Project (No. CXZZ13_0411), and the [24] G.M. Calaf, C. Echiburu-Chau, Y.L. Zhao, T.K. Hei, BigH3 protein expression as a marker for breast cancer, Int. J. Mol. Med. 21 (2008) 561–568. Priority Academic Program Development of Jiangsu Higher Education [25] G.K. Klintworth, The molecular genetics of the corneal dystrophies—current status, Institutions. Front. Biosci. 8 (2003) d687–d713. [26] C. Ma, Y. Rong, D.R. Radiloff, M.B. Datto, B. Centeno, S. Bao, A.W. Cheng, F. Lin, S. Jiang, T.J. Yeatman, X.F. Wang, Extracellular matrix protein betaig-h3/TGFBI References promotes metastasis of colon cancer by enhancing cell extravasation, Genes Dev. 22 (2008) 308–321. [1] J. Skonier, M. Neubauer, L. Madisen, K. Bennett, G.D. Plowman, A.F. Purchio, cDNA [27] S.J. Park, S. Park, H.C. Ahn, I.S. Kim, B.J. Lee, Conformational resemblance between cloning and sequence analysis of beta ig-h3, a novel gene induced in a human the structures of integrin-activating pentapetides derived from betaig-h3 and RGD adenocarcinoma cell line after treatment with transforming growth factor-beta, peptide analogues in a membrane environment, Peptides 25 (2004) 199–205. DNA Cell Biol. 11 (1992) 511–522. [28] P. Jin, X. Ji, H. Wang, J. Li-Ling, F. Ma, AmphiEST: enabling comparative analysis of [2] J. Skonier, K. Bennett, V. Rothwell, S. Kosowski, G. Plowman, P. Wallace, S. Edelhoff, ESTs from five developmental stages of amphioxus, Mar. Genomics 3 (2010) C. Disteche, M. Neubauer, H. Marquardt, et al., beta ig-h3: a transforming growth 151–155. factor-beta-responsive gene encoding a secreted protein that inhibits cell [29] R.C. Edgar, MUSCLE: multiple sequence alignment with high accuracy and high attachment in vitro and suppresses the growth of CHO cells in nude mice, DNA throughput, Nucleic Acids Res. 32 (2004) 1792–1797. Cell Biol. 13 (1994) 571–584. [30] F. Ronquist, J.P. Huelsenbeck, MrBayes 3: Bayesian phylogenetic inference under [3] P. Romero, M. Vogel, J.M. Diaz, M.P. Romero, L. Herrera, Anticipation in familial mixed models, Bioinformatics 19 (2003) 1572–1574. lattice corneal dystrophy type I with R124C mutation in the TGFBI (BIGH3) gene, [31] S. Guindon, O. Gascuel, A simple, fast, and accurate algorithm to estimate large Mol. Vis. 14 (2008) 829–835. phylogenies by maximum likelihood, Syst. Biol. 52 (2003) 696–704. [4] V. Poulaki, K. Colby, Genetics of anterior and stromal corneal dystrophies, Semin. [32] K. Tamura, D. Peterson, N. Peterson, G. Stecher, M. Nei, S. Kumar, MEGA5: molecular Ophthalmol. 23 (2008) 9–17. evolutionary genetics analysis using maximum likelihood, evolutionary distance, [5] B.H. Lee, J.S. Bae, R.W. Park, J.E. Kim, J.Y. Park, I.S. Kim, betaig-h3 triggers signaling and maximum parsimony methods, Mol. Biol. Evol. 28 (2011) 2731–2739. pathways mediating adhesion and migration of vascular smooth muscle cells [33] Z. Yang, PAML 4: phylogenetic analysis by maximum likelihood, Mol. Biol. Evol. 24 through alphavbeta5 integrin, Exp. Mol. Med. 38 (2006) 153–161. (2007) 1586. [6] T.K. Hei, Y. Zhao, H. Zhou, V. Ivanov, Mechanism of radiation carcinogenesis: role of [34] Y.L. Zhao, C.Q. Piao, T.K. Hei, Downregulation of Betaig-h3 gene is causally linked to the TGFBI gene and the inflammatory signaling cascade, Adv. Exp. Med. Biol. 720 tumorigenic phenotype in asbestos treated immortalized human bronchial (2011) 163–170. epithelial cells, Oncogene 21 (2002) 7471–7477. [7] B. Li, G. Wen, Y. Zhao, J. Tong, T.K. Hei, The role of TGFBI in mesothelioma and breast [35] S. Bao, G. Ouyang, X. Bai, Z. Huang, C. Ma, M. Liu, R. Shao, R.M. Anderson, J.N. Rich, cancer: association with tumor suppression, BMC Cancer 12 (2012) 239. X.F. Wang, Periostin potently promotes metastatic growth of colon cancer by [8] G. Wen, M. Hong, B. Li, W. Liao, S.K. Cheng, B. Hu, G.M. Calaf, P. Lu, M.A. Partridge, J. augmenting cell survival via the Akt/PKB pathway, Cancer Cell 5 (2004) 329–339. Tong, T.K. Hei, Transforming growth factor-beta-induced protein (TGFBI) [36] G. Ouyang, M. Liu, K. Ruan, G. Song, Y. Mao, S. Bao, Upregulated expression of suppresses mesothelioma progression through the Akt/mTOR pathway, Int. J. periostin by hypoxia in non-small-cell lung cancer cells promotes cell survival via Oncol. 39 (2011) 1001–1009. the Akt/PKB pathway, Cancer Lett. 281 (2009) 213–219. [9] A.A. Ahmed, A.D. Mills, A.E. Ibrahim, J. Temple, C. Blenkiron, M. Vias, C.E. Massie, N.G. [37] K. Kawamoto, A. Pahuja, B.J. Hering, P. Bansal-Pakala, Transforming growth factor Iyer, A. McGeoch, R. Crawford, B. Nicke, J. Downward, C. Swanton, S.D. Bell, H.M. Earl, beta 1 (TGF-beta1) and rapamycin synergize to effectively suppress human T cell R.A. Laskey, C. Caldas, J.D. Brenton, The extracellular matrix protein TGFBI induces responses via upregulation of FoxP3+ Tregs, Transpl. Immunol. 23 (2010) 28–33. microtubule stabilization and sensitizes ovarian cancers to paclitaxel, Cancer Cell [38] N.J. Thornburg, B. Shepherd, J.E. Crowe Jr., Transforming growth factor beta is a 12 (2007) 514–527. major regulator of human neonatal immune responses following respiratory [10] S.Y.Chan,A.K.Chan,B.P.Cheung,Y.Liang,M.P.Leung,Identification of genes expressed syncytial virus infection, J. Virol. 84 (2010) 12895–12902. during myocardial development, Chin. Med. J. (Engl.) 116 (2003) 1329–1332. [39] Y.C. Wu, M.D. Rasmussen, M. Kellis, Evolution at the subgene level: domain [11] K. Schwab, L.T. Patterson, B.J. Aronow, R. Luckas, H.C. Liang, S.S. Potter, A catalogue of rearrangements in the Drosophila phylogeny, Mol. Biol. Evol. 29 (2012) 689–705. gene expression in the developing kidney, Kidney Int. 64 (2003) 1588–1604. [40] M. Lynch, J.S. Conery, The evolutionary fate and consequences of duplicate genes, [12] H.R. Kim, P.W. Ingham, The extracellular matrix protein TGFBI promotes myofibril Science 290 (2000) 1151–1155. bundling and muscle fibre growth in the zebrafish embryo, Dev. Dyn. 238 (2009) [41] M. Long, E. Betran, K. Thornton, W. Wang, The origin of new genes: glimpses from 56–65. the young and old, Nat. Rev. Genet. 4 (2003) 865–875. [13] Y. Zhang, G. Wen, G. Shao, C. Wang, C. Lin, H. Fang, A.S. Balajee, G. Bhagat, T.K. Hei, Y. [42] C.P. Ponting, R.R. Russell, The natural history of protein domains, Annu. Rev. Zhao, TGFBI deficiency predisposes mice to spontaneous tumor development, Biophys. Biomol. Struct. 31 (2002) 45–71. Cancer Res. 69 (2009) 37–44. [43] A.K. Bjorklund, D. Ekman, S. Light, J. Frey-Skott, A. Elofsson, Domain rearrangements [14] F. Wang, W. Hu, J. Xian, S. Ohnuma, J.D. Brenton, The Xenopus Tgfbi is required for in protein evolution, J. Mol. Biol. 353 (2005) 911–923. embryogenesis through regulation of canonical Wnt signalling, Dev. Biol. 379 [44] S. Pasek, J.L. Risler, P. Brezellec, Gene fusion/fission is a major contributor to (2013) 16–27. evolution of multi-domain bacterial proteins, Bioinformatics 22 (2006) 1418–1423. [15] S. Hoersch, M.A. Andrade-Navarro, Periostin shows increased evolutionary plasticity [45] V.J. Promponas, C.A. Ouzounis, I. Iliopoulos, Experimental evidence validating the in its alternatively spliced region, BMC Evol. Biol. 10 (2010) 30. computational inference of functional associations from gene fusion events: a critical [16] Y. Kudo, B.S. Siriwardena, H. Hatano, I. Ogawa, T. Takata, Periostin: novel diagnostic survey, Brief. Bioinform. (2013), http://dx.doi.org/10.1093/bib/bbs072(in press). and therapeutic target for cancer, Histol. Histopathol. 22 (2007) 1167–1174. [46] X. Song, P. Jin, S. Qin, L. Chen, F. Ma, The evolution and origin of animal Toll-like [17] T.K. Borg, R. Markwald, Periostin: more than just an adhesion molecule, Circ. Res. receptor signaling pathway revealed by network-level molecular evolutionary 101 (2007) 230–231. analyses, PLoS One 7 (12) (2012) e51657. [18] K. Inai, R.A. Norris, S. Hoffman, R.R. Markwald, Y. Sugi, BMP-2 induces cell migration [47] X. Song, P. Jin, J. Hu, S. Qin, L. Chen, J. Li-Ling, F. Ma, Involvement of AmphiREL, a and periostin expression during atrioventricular valvulogenesis, Dev. Biol. 315 Rel-like gene identified in Brachiastoma belcheri, in LPS-induced response: (2008) 383–396. implication for evolution of Rel subfamily genes, Genomics 99 (2012) 361–369. [19] P. Snider, R.B. Hinton, R.A. Moreno-Rodriguez, J. Wang, R. Rogers, A. Lindsley, F. Li, [48] H. Zhao, J.R. Yang, H. Xu, J. Zhang, Pseudogenization of the umami taste receptor D.A. Ingram, D. Menick, L. Field, A.B. Firulli, J.D. Molkentin, R. Markwald, S.J. gene Tas1r1 in the giant panda coincided with its dietary switch to bamboo, Mol. Conway, Periostin is required for maturation and extracellular matrix stabilization Biol. Evol. 27 (2010) 2669–2673. of noncardiomyocyte lineages of the heart, Circ. Res. 102 (2008) 752–760. [49] L. Yu, X.Y. Wang, W. Jin, P.T. Luan, N. Ting, Y.P. Zhang, Adaptive evolution of digestive [20] D.P. Wallace, M.T. Quante, G.A. Reif, E. Nivens, F. Ahmed, S.J. Hempson, G. Blanco, T. RNASE1 genes in leaf-eating monkeys revisited: new insights from ten additional Yamaguchi, Periostin induces proliferation of human autosomal dominant colobines, Mol. Biol. Evol. 27 (2010) 121–131. polycystic kidney cells through alphaV-integrin receptor, Am. J. Physiol. Renal [50] H. Zhao, S.J. Rossiter, E.C. Teeling, C. Li, J.A. Cotton, S. Zhang, The evolution of color Physiol. 295 (2008) F1463–F1471. vision in nocturnal mammals, Proc. Natl. Acad. Sci. U.S.A. 106 (2009) 8980–8985.