Analysis of Genetic Association Using Hierarchical Clustering and Cluster Validation Indices

Total Page:16

File Type:pdf, Size:1020Kb

Analysis of Genetic Association Using Hierarchical Clustering and Cluster Validation Indices Analysis of genetic association using Hierarchical Clustering and Cluster Validation Indices Inti A. Pagnuco1,2,3, Juan I. Pastore1,2,3, Guillermo Abras1, Marcel Brun1,2, and Virginia L. Ballarin1 1Digital Image Processing Lab, ICyTE, UNMdP, 2Department of Mathematics, School of Engineering, UNMdP, 3CONICET Abstract It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, based on some criteria of similarity. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiment. In this work we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. 1 Introduction approach, while still providing, as shown in the results, a New technologies for genomic analysis measure the good approach to the optimal results. expression of thousand of genes simultaneously. An The Matlab code is available from https://subversion. important goal in some studies is to find sets of assembla.com/svn/pdilab-gaa/tags/Tag.1.1.1 co-expressed genes, to study co-regulation and biological In the next section we present a) an introduction to functions. This task is usually performed by clustering pattern recognition tools, b) the proposed algorithm, c) algorithms, where the whole family of genes, or a subset of the result in genomic data. The final conclusions show them, are clustered into meaningful groups based on their that these algorithms could result in a useful tool to the expression values in a set of experiment. These techniques researchers as a preliminar technique to data analysis. provide insight on the possible co-regulation between genes, under the hypothesis that co-expression may 2 Approach indicate evidence for co-regulation, and these hypotheses 2.1 Pattern Recognition of co-regulation must eventually be corroborated or be Pattern Recognition techniques have been widely used rejected by further experiments. Existing clustering to identify objects according their features [7]. Clustering algorithms are based, usually, on a global approach algorithms is a collection of these techniques that groups [1, 2, 3, 4, 5], or on just pairwise analysis [6]. While unclassified patterns based on similarity. some of them provide methods for the selection of the best In this work, the objects to be classified are genes. number of clusters [2], they result in a very large number The data consists of a set of m samples S1;S2; ::; Sm and of clusters, depending on the number of groups required n genes g1; g2; ::gn that are normally represented by a or determined by the algorithm. The large size of these matrix of two dimension M where M(i; j) represents the sets makes their use as hypothesis generator improbable. expression of the gen gi for the sample Sj. The expression A more comprehensive approach consist on the analysis of each gen gi, across all samples, correspond to a row of of co-expression of all possible sub-sets of genes. While the matrix M, and it is represented by a feature vector simple, this approach fails because of the need to analyze Xi(xi1; xi2; ::; xim), where each value xij represent the an exponential number of candidate sets. This search of expression of the gen gi for the sample Sj. The feature meaningful sets would be impractical if the number of vector is then composed by m samples, so that the regions available genes for the analysis is large, which is usual of the partition H belong to Rm. in this field. Each gen can be assigned to one of k possible groups, In this work we present a new algorithm, that combine and the result of a clustering algorithm is a partition of the the particularities of clustering hierarchical algorithm with set of genes, as a family W = W1;W2; ::; Wk, where each individual cluster validation based on the Silhoutte and Wi is a group of genes with a certain degree of similarity. Dunn indices, to generate a ranked list of gene group, avoiding the exhaustive search, but providing high quality 2.2 Hierarchical Clustering results. As in [2] we use the cluster validation index as a Hierarchical Clustering is one of most used clustering quality/homogeneity measure, but instead of using it to algorithms in bioinformatics [8]. Given a set of N items to generate the partitions, we use it to select the best sets, be clustered, and a N × N distance (or similarity) matrix, from several partitions, obtained by different variants of the basic process of hierarchical clustering [9] is described hierarchical clustering. This approach permits the use of here: (i) Start by assigning each item to a different cluster, many different clustering algorithms, combining the best so that for N items, there are N clusters, each containing results of each one of them, avoiding the use of a full search just one item. Let the distances between the clusters be 1 the same as the distances between the items they contain. (ii) Find the closest pair of clusters and merge them into a single cluster, so that now there is one cluster less. (iii) Compute distances between the new cluster and each of the old clusters. (iv) Repeat steps 2 and 3 until all items are clustered into a single cluster of size N.A more detailed description of the hierarchical clustering algorithms can be found in [9, 10]. Step 3 can be done in different ways, which is what distinguishes single-linkage, complete-linkage and average-linkage clustering. In single-linkage clustering, the distance between two clusters is the shortest distance from any member of one cluster to any member of the other cluster. If the data consist of similarities, the similarity between two clusters is the largest similarity from any member of one cluster to any member of the other cluster. In complete-linkage clustering, the distance between two clusters is the largest distance from any member of one cluster to any member of the other cluster. In average-linkage clustering, the distance between two clusters is the average distance from any member of one cluster to any member of the other cluster. 2.3 Quality measures A common problem of clustering algorithms is the validation of results. In the next subsection we explain the two types of internal validation [11] used in this work. 2.3.1 Silhouette Index One of the methods of internal validation used is Silhouette index, which measures the quality of clustering as the average quality of its elements [12]. The Silhouette value for an element ranges between −1 and 1. A high Silhouette value indicates that the element is closer to its own cluster elements than to the ones that do not belong to its own cluster. To define the Silhouette index for the points, two measures need to be defined first. Lets N be the number of clusters. Let X 2 Ck a point (element) that belongs to the cluster Ck, and let nk the number of elements of Ck, then the first measure a(X), the average distance of X to the points of Ck is defined by: a(X) = 1 P d(X; Y ) nk−1 Y 2Ck;Y 6=X The second measure, b(X), the average distance to the nearest cluster, for a point X 2 Ck, is defined by: 1 P b(X) = minh=1::N;h6=k d(X; Y ) nh Y 2Ch Finally, the Silhouette index S(X) is defined by: b(X S(X) = on a list. (c) Once a list of sets and scores is obtained, ∗ d(Ck;C ) D(Ck) = (5) sort the list based on score (higher to lower). ∆(Ck) As an additional approach, we combine all the variants, ∗ where C is the set of points not in Ck (the same idea forming a large sets of candidate groups, still smaller ∗ N used with Silhoutte), d(Ck;C ) is the distance between than the maximum 2 sets, and select the best groups the two clusters and (∆Ck) is the size of the cluster Ck. from this set. This approach extends the search space, We call this index a Cluster Specific Dunn Index. relative to the use of only one algorithm, maintaining the A key aspect of this work is the selection of a cluster computational requirements still low. Another advantage score that reflects the compactness and separateness of of this approach is that new clustering algorithms may each set of profiles relative to the rest of the profiles. be added to the pool, generating more candidate groups, Many cluster validation measures can be found in the therefore improving the overall quality of the result. In literature grouped in three categories: internal, relative addition, this approach does not depend on the selection and external [17, 18]. On this specific problem, external of a specific combination. Every algorithm, or variant of indices are not viable because we just don't know what is an algorithm, used to partition the set of profiles, can be the expected partition. In the other hand, relative indices added to the combination, because it could only improve were proven to be unreliable regarding cluster quality in the results, by adding new sets not previously detected [21, a probabilistic framework [19].
Recommended publications
  • A University of Sussex Phd Thesis Available Online Via
    A University of Sussex PhD thesis Available online via Sussex Research Online: http://sro.sussex.ac.uk/ This thesis is protected by copyright which belongs to the author. This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the Author The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the Author When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given Please visit Sussex Research Online for more information and further details Exploring interactions between Epstein- Barr virus transcription factor Zta and The Human Genome By IJIEL BARAK NARANJO PEREZ FERNANDEZ A Thesis submitted for the degree of Doctor of Philosophy University Of Sussex School of Life Sciences September 2017 ii I hereby declare that this thesis has not been and will not be, submitted in whole or in part to another University for the award of any other degree. Signature:…………………………..…………………………..……………………… iii Acknowledgements I want to thank Professor Alison J Sinclair for her guidance, mentoring and above all continuous patience. During the time that I’ve been part of her lab I’ve appreciated her wisdom as an educator her foresight as a scientist and tremendous love as a parent. I wish that someday soon rather than later her teachings are reflected in my person and career; hopefully inspiring others like me. Thanks to Professor Michelle West for her help whenever needed or offered. Her sincere and honest feedback, something that I only learned to appreciate after my personal scientific insight was developed.
    [Show full text]
  • Revealing the Role of the Human Blood Plasma Proteome in Obesity Using Genetic Drivers
    ARTICLE https://doi.org/10.1038/s41467-021-21542-4 OPEN Revealing the role of the human blood plasma proteome in obesity using genetic drivers Shaza B. Zaghlool 1,11, Sapna Sharma2,3,4,11, Megan Molnar 2,3, Pamela R. Matías-García2,3,5, Mohamed A. Elhadad 2,3,6, Melanie Waldenberger 2,3,7, Annette Peters 3,4,7, Wolfgang Rathmann4,8, ✉ Johannes Graumann 9,10, Christian Gieger2,3,4, Harald Grallert2,3,4,12 & Karsten Suhre 1,12 Blood circulating proteins are confounded readouts of the biological processes that occur in 1234567890():,; different tissues and organs. Many proteins have been linked to complex disorders and are also under substantial genetic control. Here, we investigate the associations between over 1000 blood circulating proteins and body mass index (BMI) in three studies including over 4600 participants. We show that BMI is associated with widespread changes in the plasma proteome. We observe 152 replicated protein associations with BMI. 24 proteins also associate with a genome-wide polygenic score (GPS) for BMI. These proteins are involved in lipid metabolism and inflammatory pathways impacting clinically relevant pathways of adiposity. Mendelian randomization suggests a bi-directional causal relationship of BMI with LEPR/LEP, IGFBP1, and WFIKKN2, a protein-to-BMI relationship for AGER, DPT, and CTSA, and a BMI-to-protein relationship for another 21 proteins. Combined with animal model and tissue-specific gene expression data, our findings suggest potential therapeutic targets fur- ther elucidating the role of these proteins in obesity associated pathologies. 1 Department of Physiology and Biophysics, Weill Cornell Medicine-Qatar, Doha, Qatar.
    [Show full text]
  • S41467-020-18249-3.Pdf
    ARTICLE https://doi.org/10.1038/s41467-020-18249-3 OPEN Pharmacologically reversible zonation-dependent endothelial cell transcriptomic changes with neurodegenerative disease associations in the aged brain Lei Zhao1,2,17, Zhongqi Li 1,2,17, Joaquim S. L. Vong2,3,17, Xinyi Chen1,2, Hei-Ming Lai1,2,4,5,6, Leo Y. C. Yan1,2, Junzhe Huang1,2, Samuel K. H. Sy1,2,7, Xiaoyu Tian 8, Yu Huang 8, Ho Yin Edwin Chan5,9, Hon-Cheong So6,8, ✉ ✉ Wai-Lung Ng 10, Yamei Tang11, Wei-Jye Lin12,13, Vincent C. T. Mok1,5,6,14,15 &HoKo 1,2,4,5,6,8,14,16 1234567890():,; The molecular signatures of cells in the brain have been revealed in unprecedented detail, yet the ageing-associated genome-wide expression changes that may contribute to neurovas- cular dysfunction in neurodegenerative diseases remain elusive. Here, we report zonation- dependent transcriptomic changes in aged mouse brain endothelial cells (ECs), which pro- minently implicate altered immune/cytokine signaling in ECs of all vascular segments, and functional changes impacting the blood–brain barrier (BBB) and glucose/energy metabolism especially in capillary ECs (capECs). An overrepresentation of Alzheimer disease (AD) GWAS genes is evident among the human orthologs of the differentially expressed genes of aged capECs, while comparative analysis revealed a subset of concordantly downregulated, functionally important genes in human AD brains. Treatment with exenatide, a glucagon-like peptide-1 receptor agonist, strongly reverses aged mouse brain EC transcriptomic changes and BBB leakage, with associated attenuation of microglial priming. We thus revealed tran- scriptomic alterations underlying brain EC ageing that are complex yet pharmacologically reversible.
    [Show full text]
  • Brain-Specific Knock-Out of Hypoxia-Inducible Factor-1Α
    The Journal of Neuroscience, April 20, 2005 • 25(16):4099–4107 • 4099 Neurobiology of Disease Brain-Specific Knock-Out of Hypoxia-Inducible Factor-1␣ Reduces Rather Than Increases Hypoxic–Ischemic Damage Rob Helton,1* Jiankun Cui,2* John R. Scheel,1* Julie A. Ellison,1 Chris Ames,1 Claire Gibson,2 Barbara Blouw,3 Ling Ouyang,1 Ioannis Dragatsis,4 Scott Zeitlin,5 Randall S. Johnson,3 Stuart A. Lipton,2 and Carrolee Barlow1 1Laboratory of Genetics, The Salk Institute for Biological Studies, and 2Center for Neuroscience and Aging, The Burnham Institute, La Jolla, California 92037, 3Molecular Biology Section, Division of Biology, University of California, San Diego, La Jolla, California 92093, 4Department of Physiology, The University of Tennessee, Health Science Center, Memphis, Tennessee 38163, and 5Department of Neuroscience, University of Virginia School of Medicine, Charlottesville, Virginia 22908 ␣ ␣ Hypoxia-inducible factor-1 (HIF-1 ) plays an essential role in cellular and systemic O2 homeostasis by regulating the expression of genes important in glycolysis, erythropoiesis, angiogenesis, and catecholamine metabolism. It is also believed to be a key component of the cellular response to hypoxia and ischemia under pathophysiological conditions, such as stroke. To clarify the function of HIF-1␣ in the brain, we exposed adult mice with late-stage brain deletion of HIF-1␣ to hypoxic injuries. Contrary to expectations, the brains from the HIF-1␣-deficient mice were protected from hypoxia-induced cell death. These surprising findings suggest that decreas- ing the level of HIF-1␣ can be neuroprotective. Gene chip expression analysis revealed that, contrary to expectations, the majority of hypoxia-dependent gene-expression changes were unaltered, whereas a specific downregulation of apoptotic genes was observed in the HIF-1␣-deficient mice.
    [Show full text]
  • Gene-Based and Pathway-Based Genome-Wide Association Study of Alcohol Dependence
    • 111 • Shanghai Archives of Psychiatry, 2015, Vol. 27, No. 2 •Original research article• Gene-based and pathway-based genome-wide association study of alcohol dependence Lingjun ZUO1, Clarence K. ZHANG2,3*, Frederick G. SAYWARD4,5, Kei-Hoi CHEUNG4, Kesheng WANG6, John H. KRYSTAL1, Hongyu ZHAO2, Xingguang LUO1* Background: The organization of risk genes within signaling pathways may provide clues about the converging neurobiological effects of risk genes for alcohol dependence. Aims: Identify risk genes and risk gene pathways for alcohol dependence. Methods: We conducted a pathway-based genome-wide association study (GWAS) of alcohol dependence using a gene-set-rich analytic approach. Approximately one million genetic markers were tested in the discovery sample which included 1409 European-American (EA) alcohol dependent individuals and 1518 EA healthy comparison subjects. An additional 681 African-American (AA) cases and 508 AA healthy subjects served as the replication sample. Results: We identified several genome-wide replicable risk genes and risk pathways that were significantly associated with alcohol dependence. After applying the Bonferroni correction for multiple testing, the ‘cell- extracellular matrix interactions’ pathway (p<2.0E-4 in EAs) and the PXN gene (which encodes paxillin) (p=3.9E-7 in EAs) within this pathway were the most promising risk factors for alcohol dependence. There were also two nominally replicable pathways enriched in alcohol dependence-related genes in both EAs (0.015≤p≤0.035) and AAs (0.025≤p≤0.050): the ‘Na+/Cl- dependent neurotransmitter transporters’ pathway and the ‘other glycan degradation’ pathway. Conclusions: These findings provide new evidence highlighting several genes and biological signaling processes that may be related to the risk for alcohol dependence.
    [Show full text]
  • Mouse Ccnl2 Conditional Knockout Project (CRISPR/Cas9)
    https://www.alphaknockout.com Mouse Ccnl2 Conditional Knockout Project (CRISPR/Cas9) Objective: To create a Ccnl2 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering. Strategy summary: The Ccnl2 gene (NCBI Reference Sequence: NM_207678 ; Ensembl: ENSMUSG00000029068 ) is located on Mouse chromosome 4. 11 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 11 (Transcript: ENSMUST00000030944). Exon 5 will be selected as conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Ccnl2 gene. To engineer the targeting vector, homologous arms and cKO region will be generated by PCR using BAC clone RP23-128M14 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Exon 5 starts from about 37.9% of the coding region. The knockout of Exon 5 will result in frameshift of the gene. The size of intron 4 for 5'-loxP site insertion: 2559 bp, and the size of intron 5 for 3'-loxP site insertion: 2341 bp. The size of effective cKO region: ~565 bp. The cKO region does not have any other known gene. Page 1 of 7 https://www.alphaknockout.com Overview of the Targeting Strategy Wildtype allele gRNA region 5' gRNA region 3' 1 5 11 Targeting vector Targeted allele Constitutive KO allele (After Cre recombination) Legends Exon of mouse Ccnl2 Homology arm cKO region loxP site Page 2 of 7 https://www.alphaknockout.com Overview of the Dot Plot Window size: 10 bp Forward Reverse Complement Sequence 12 Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats.
    [Show full text]
  • Circulating Protein Signatures and Causal Candidates for Type 2 Diabetes
    Diabetes Volume 69, August 2020 1843 Circulating Protein Signatures and Causal Candidates for Type 2 Diabetes Valborg Gudmundsdottir,1,2 Shaza B. Zaghlool,3 Valur Emilsson,2,4 Thor Aspelund,1,2 Marjan Ilkov,2 Elias F. Gudmundsson,2 Stefan M. Jonsson,1 Nuno R. Zilhão,2 John R. Lamb,5 Karsten Suhre,3 Lori L. Jennings,6 and Vilmundur Gudnason1,2 Diabetes 2020;69:1843–1853 | https://doi.org/10.2337/db19-1070 GENETICS/GENOMES/PROTEOMICS/METABOLOMICS The increasing prevalence of type 2 diabetes poses More than 240 genetic loci have been associated with type a major challenge to societies worldwide. Blood-based 2 diabetes in genome-wide association studies (GWAS) factors like serum proteins are in contact with every (1–5), and blood-based biomarker candidates for type organ in the body to mediate global homeostasis and 2 diabetes have begun to emerge, perhaps most notably may thus directly regulate complex processes such as the branched-chain amino acids (BCAAs) (6,7), the catab- aging and the development of common chronic dis- olism of which has recently been proposed as a novel eases. We applied a data-driven proteomics approach, treatment target for obesity-associated insulin resistance measuring serum levels of 4,137 proteins in 5,438 elderly (8). However, only fragmentary data are available for Icelanders, and identified 536 proteins associated with serum protein links to type 2 diabetes (9). While few prevalent and/or incident type 2 diabetes. We validated biomarker candidates provide much improvement in a subset of the observed associations in an independent type 2 diabetes prediction over conventional measures case-control study of type 2 diabetes.
    [Show full text]
  • Gene Expression Responses to DNA Damage Are Altered in Human Aging and in Werner Syndrome
    Oncogene (2005) 24, 5026–5042 & 2005 Nature Publishing Group All rights reserved 0950-9232/05 $30.00 www.nature.com/onc Gene expression responses to DNA damage are altered in human aging and in Werner Syndrome Kasper J Kyng1,2, Alfred May1, Tinna Stevnsner2, Kevin G Becker3, Steen Klvra˚ 4 and Vilhelm A Bohr*,1 1Laboratory of Molecular Gerontology, National Institute on Aging, National Institutes of Health, 5600 Nathan Shock Drive, Baltimore, MD 21224, USA; 2Danish Center for Molecular Gerontology, Department of Molecular Biology, University of Aarhus, DK-8000 Aarhus C, Denmark; 3Gene Expression and Genomics Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD 21224, USA; 4Institute for Human Genetics, University of Aarhus, Denmark The accumulation of DNA damage and mutations is syndromes, caused by heritable mutations inactivating considered a major cause of cancer and aging. While it is proteins that sense or repair DNA damage, which known that DNA damage can affect changes in gene accelerate some but not all signs of normal aging (Hasty expression, transcriptional regulation after DNA damage et al., 2003). Age is associated withan increase in is poorly understood. We characterized the expression of susceptibility to various forms of stress, and sporadic 6912 genes in human primary fibroblasts after exposure to reports suggest that an age-related decrease in DNA three different kinds of cellular stress that introduces repair may increase the susceptibility of cells to agents DNA damage: 4-nitroquinoline-1-oxide (4NQO), c-irra- causing DNA damage. Reduced base excision repair has diation, or UV-irradiation. Each type of stress elicited been demonstrated in nuclear extracts from aged human damage specific gene expression changes of up to 10-fold.
    [Show full text]
  • The Limb-Girdle Muscular Dystrophies and the Dystrophinopathies Review Article
    Review Article 04/25/2018 on mAXWo3ZnzwrcFjDdvMDuzVysskaX4mZb8eYMgWVSPGPJOZ9l+mqFwgfuplwVY+jMyQlPQmIFeWtrhxj7jpeO+505hdQh14PDzV4LwkY42MCrzQCKIlw0d1O4YvrWMUvvHuYO4RRbviuuWR5DqyTbTk/icsrdbT0HfRYk7+ZAGvALtKGnuDXDohHaxFFu/7KNo26hIfzU/+BCy16w7w1bDw== by https://journals.lww.com/continuum from Downloaded Downloaded from Address correspondence to https://journals.lww.com/continuum Dr Stanley Jones P. Iyadurai, Ohio State University, Wexner The Limb-Girdle Medical Center, Department of Neurology, 395 W 12th Ave, Columbus, OH 43210, Muscular Dystrophies and [email protected]. Relationship Disclosure: by mAXWo3ZnzwrcFjDdvMDuzVysskaX4mZb8eYMgWVSPGPJOZ9l+mqFwgfuplwVY+jMyQlPQmIFeWtrhxj7jpeO+505hdQh14PDzV4LwkY42MCrzQCKIlw0d1O4YvrWMUvvHuYO4RRbviuuWR5DqyTbTk/icsrdbT0HfRYk7+ZAGvALtKGnuDXDohHaxFFu/7KNo26hIfzU/+BCy16w7w1bDw== Dr Iyadurai has received the Dystrophinopathies personal compensation for serving on the advisory boards of Allergan; Alnylam Stanley Jones P. Iyadurai, MSc, PhD, MD; John T. Kissel, MD, FAAN Pharmaceuticals, Inc; CSL Behring; and Pfizer, Inc. Dr Kissel has received personal ABSTRACT compensation for serving on a consulting board of AveXis, Purpose of Review: The classic approach to identifying and accurately diagnosing limb- Inc; as journal editor of Muscle girdle muscular dystrophies (LGMDs) relied heavily on phenotypic characterization and & Nerve; and as a consultant ancillary studies including muscle biopsy. Because of rapid advances in genetic sequencing for Novartis AG. Dr Kissel has received research/grant methodologies,
    [Show full text]
  • Human Induced Pluripotent Stem Cell–Derived Podocytes Mature Into Vascularized Glomeruli Upon Experimental Transplantation
    BASIC RESEARCH www.jasn.org Human Induced Pluripotent Stem Cell–Derived Podocytes Mature into Vascularized Glomeruli upon Experimental Transplantation † Sazia Sharmin,* Atsuhiro Taguchi,* Yusuke Kaku,* Yasuhiro Yoshimura,* Tomoko Ohmori,* ‡ † ‡ Tetsushi Sakuma, Masashi Mukoyama, Takashi Yamamoto, Hidetake Kurihara,§ and | Ryuichi Nishinakamura* *Department of Kidney Development, Institute of Molecular Embryology and Genetics, and †Department of Nephrology, Faculty of Life Sciences, Kumamoto University, Kumamoto, Japan; ‡Department of Mathematical and Life Sciences, Graduate School of Science, Hiroshima University, Hiroshima, Japan; §Division of Anatomy, Juntendo University School of Medicine, Tokyo, Japan; and |Japan Science and Technology Agency, CREST, Kumamoto, Japan ABSTRACT Glomerular podocytes express proteins, such as nephrin, that constitute the slit diaphragm, thereby contributing to the filtration process in the kidney. Glomerular development has been analyzed mainly in mice, whereas analysis of human kidney development has been minimal because of limited access to embryonic kidneys. We previously reported the induction of three-dimensional primordial glomeruli from human induced pluripotent stem (iPS) cells. Here, using transcription activator–like effector nuclease-mediated homologous recombination, we generated human iPS cell lines that express green fluorescent protein (GFP) in the NPHS1 locus, which encodes nephrin, and we show that GFP expression facilitated accurate visualization of nephrin-positive podocyte formation in
    [Show full text]
  • Comparative Transcriptomics Reveals Similarities and Differences
    Seifert et al. BMC Cancer (2015) 15:952 DOI 10.1186/s12885-015-1939-9 RESEARCH ARTICLE Open Access Comparative transcriptomics reveals similarities and differences between astrocytoma grades Michael Seifert1,2,5*, Martin Garbe1, Betty Friedrich1,3, Michel Mittelbronn4 and Barbara Klink5,6,7 Abstract Background: Astrocytomas are the most common primary brain tumors distinguished into four histological grades. Molecular analyses of individual astrocytoma grades have revealed detailed insights into genetic, transcriptomic and epigenetic alterations. This provides an excellent basis to identify similarities and differences between astrocytoma grades. Methods: We utilized public omics data of all four astrocytoma grades focusing on pilocytic astrocytomas (PA I), diffuse astrocytomas (AS II), anaplastic astrocytomas (AS III) and glioblastomas (GBM IV) to identify similarities and differences using well-established bioinformatics and systems biology approaches. We further validated the expression and localization of Ang2 involved in angiogenesis using immunohistochemistry. Results: Our analyses show similarities and differences between astrocytoma grades at the level of individual genes, signaling pathways and regulatory networks. We identified many differentially expressed genes that were either exclusively observed in a specific astrocytoma grade or commonly affected in specific subsets of astrocytoma grades in comparison to normal brain. Further, the number of differentially expressed genes generally increased with the astrocytoma grade with one major exception. The cytokine receptor pathway showed nearly the same number of differentially expressed genes in PA I and GBM IV and was further characterized by a significant overlap of commonly altered genes and an exclusive enrichment of overexpressed cancer genes in GBM IV. Additional analyses revealed a strong exclusive overexpression of CX3CL1 (fractalkine) and its receptor CX3CR1 in PA I possibly contributing to the absence of invasive growth.
    [Show full text]
  • Nº Ref Uniprot Proteína Péptidos Identificados Por MS/MS 1 P01024
    Document downloaded from http://www.elsevier.es, day 26/09/2021. This copy is for personal use. Any transmission of this document by any media or format is strictly prohibited. Nº Ref Uniprot Proteína Péptidos identificados 1 P01024 CO3_HUMAN Complement C3 OS=Homo sapiens GN=C3 PE=1 SV=2 por 162MS/MS 2 P02751 FINC_HUMAN Fibronectin OS=Homo sapiens GN=FN1 PE=1 SV=4 131 3 P01023 A2MG_HUMAN Alpha-2-macroglobulin OS=Homo sapiens GN=A2M PE=1 SV=3 128 4 P0C0L4 CO4A_HUMAN Complement C4-A OS=Homo sapiens GN=C4A PE=1 SV=1 95 5 P04275 VWF_HUMAN von Willebrand factor OS=Homo sapiens GN=VWF PE=1 SV=4 81 6 P02675 FIBB_HUMAN Fibrinogen beta chain OS=Homo sapiens GN=FGB PE=1 SV=2 78 7 P01031 CO5_HUMAN Complement C5 OS=Homo sapiens GN=C5 PE=1 SV=4 66 8 P02768 ALBU_HUMAN Serum albumin OS=Homo sapiens GN=ALB PE=1 SV=2 66 9 P00450 CERU_HUMAN Ceruloplasmin OS=Homo sapiens GN=CP PE=1 SV=1 64 10 P02671 FIBA_HUMAN Fibrinogen alpha chain OS=Homo sapiens GN=FGA PE=1 SV=2 58 11 P08603 CFAH_HUMAN Complement factor H OS=Homo sapiens GN=CFH PE=1 SV=4 56 12 P02787 TRFE_HUMAN Serotransferrin OS=Homo sapiens GN=TF PE=1 SV=3 54 13 P00747 PLMN_HUMAN Plasminogen OS=Homo sapiens GN=PLG PE=1 SV=2 48 14 P02679 FIBG_HUMAN Fibrinogen gamma chain OS=Homo sapiens GN=FGG PE=1 SV=3 47 15 P01871 IGHM_HUMAN Ig mu chain C region OS=Homo sapiens GN=IGHM PE=1 SV=3 41 16 P04003 C4BPA_HUMAN C4b-binding protein alpha chain OS=Homo sapiens GN=C4BPA PE=1 SV=2 37 17 Q9Y6R7 FCGBP_HUMAN IgGFc-binding protein OS=Homo sapiens GN=FCGBP PE=1 SV=3 30 18 O43866 CD5L_HUMAN CD5 antigen-like OS=Homo
    [Show full text]