Kingsford, C. School of Computer Science H-Index: 26

I. CURRICULUM VITAE

CARLETON KINGSFORD

EDUCATION

Ph.D. in Computer Science (advisor: Mona Singh), Princeton University, 2005
M.A. in Computer Science, Princeton University, 2002
B.S. in Computer Science (second major: Mathematics), Duke University, 2000

EMPLOYMENT

2019 – Present: Professor, Computational Biology Department, School of Computer Science, Carnegie Mellon University. (Courtesy appointment, Department of Biological Sciences.)

2016 – 2019: Associate Professor (with tenure), Computational Biology Department, School of Computer Science, Carnegie Mellon University. (Courtesy appointment, Department of Biological Sciences.)

2012 – 2016: Associate Professor (without tenure), Computational Biology Department (formerly the Lane Center for Computational Biology), School of Computer Science, Carnegie Mellon University.

2007 – 2012: Assistant Professor, Computer Science Department, University of Maryland, College Park. (Courtesy appointments in Institute for Advanced Computer Studies, Applied Mathematics & Statistics, Biological Sciences Graduate Program, and Department of Bioengineering.)

2005 – 2007: Postdoctoral Fellow, Steven L. Salzberg Group, Center for Bioinformatics and Computational Biology, University of Maryland, College Park.

II. PUBLICATION LIST

H-index: 26

CHAPTERS IN BOOKS

1. P. Spealman, H. Wang, G.E. May, C. Kingsford, and C.J. McManus. Exploring ribosome positioning on translating transcripts with ribosome profiling. Methods in Molecular Biology, Springer (2015).

2. H. Lee and C. Kingsford. Accurate assembly and typing of HLA using a graph-guided assembler Kourami. HLA Typing: Methods and Protocols, Sebastian Boegel, editor. (2018).

REFEREED JOURNAL PAPERS – PUBLISHED

1. B. Chazelle, C. Kingsford, and M. Singh. A semidefinite programming approach to side-chain positioning with new rounding strategies. INFORMS Journal on Computing, Special Issue on Computational Molecular Biology / Bioinformatics, 16:380-392 (2004). [Cited ≥ 91 times; Conference version appeared as: The side-chain positioning problem: a semidefinite programming formulation with new rounding schemes. In Proceedings of ACM FCRC 2003, Principles of Computing and Knowledge: Paris Kanellakis Memorial Workshop, pages 86-94 (2003).]

2. C. Kingsford, B. Chazelle, and M. Singh. Solving and analyzing side-chain positioning problems using linear and integer programming. Bioinformatics 21:1028-1039 (2005). [Cited ≥ 184 times.]

3. C. Kingsford, K. Ayanbule, and S. Salzberg. Rapid, accurate, computational discovery of Rho-independent transcription terminators illuminates their relationship to DNA uptake. Genome Biology 8:R22 (2007). [Cited ≥ 346 times.]

4. S. L. Salzberg, C. Kingsford, G. Cattoli, D. J. Spiro, D. A. Janies, M. Mehrez Aly, I. H. Brown, E. Couacy-Hymann, G. Mario De Mia, D.H. Dung, A. Guercio, T. Joannis, A. S. Maken Ali, A. Osmani, I. Padalino, M. D. Saad, V. Savic, N. A. Sengamalay, S. Yingst, J. Zaborsky, O. Zorman-Rojs, E. Ghedin, and I. Capua. Genome analysis linking recent European and African influenza (H5N1) viruses. Emerging Infectious Diseases 13(5):713-718 (2007). [Cited ≥ 199 times.]

5. C. Kingsford, A. Delcher, and S. L. Salzberg. A unified model explaining the offsets of overlapping and near-overlapping prokaryotic genes. Molecular Biology and Evolution 24:2091-2098 (2007).

6. C. Kingsford and S. L. Salzberg. What are decision trees? (Review), Nature Biotechnology 26(9):1011-1013 (2008). [Cited ≥ 140 times.]

7. S. Navlakha, M. C. Schatz, and C. Kingsford. Revealing biological modules using graph summarization. Journal of Computational Biology 16(2):253-264 (2009). [Presented at RECOMB-SB/RG satellite conference, 2008; cited ≥ 75 times.]

8. C. Kingsford†, N. Nagarajan†, and S. Salzberg. 2009 Swine-Origin Influenza A (H1N1) resembles previous influenza isolates. PLoS ONE 4(7):e6402 (2009). †First two authors contributed equally; C.K. corresponding author. [Cited ≥ 68 times.]

9. S. Navlakha, J. White, N. Nagarajan, M. Pop, and C. Kingsford. Finding biologically accurate clusterings in hierarchical decompositions using the variation of information. Journal of Computational Biology 17(3):503-516 (2010). [Conference version appeared in Proceedings of 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB), pages 400-418 (2009).]

10. C. Kingsford, M. Schatz, M. Pop. Assembly complexity of prokaryotic genomes using short reads. BMC Bioinformatics 11:21 (2010). [Designated highly accessed; top-10 most-viewed articles Jan/Feb 2010; top-10 cited article in BMC Bioinformatics for 2010; cited ≥ 102 times.]

11. S. Navlakha and C. Kingsford. The power of protein interaction networks for associating diseases with genes. Bioinformatics 26(8):1057-1063 (2010). [Recommended by the Faculty of 1000 (http://f1000.com/3314959); cited ≥ 255 times.]

12. J. White, S. Navlakha, N. Nagarajan, C. Kingsford, and M. Pop. Alignment and clustering of phylogenetic markers – implications for microbial diversity studies. BMC Bioinformatics 11:152 (2010). [Designated highly accessed by the journal; cited ≥ 73 times.]

13. C. Kingsford, E. Zaslavsky, and M. Singh. A cost-aggregating integer linear program for motif finding. Journal of Discrete Algorithms 9(4):326-334 (2011). [Conference version appeared as: A compact mathematical programming formulation for DNA motif finding. In Proceedings of the 17th Annual Symposium on Combinatorial Pattern Matching, LNCS 4009, pages 233-245 (2006).]

14. N. Nagarajan and C. Kingsford. GiRaF: Robust, computational identification of influenza reassortments via graph mining. Nucleic Acids Research 39(6):e34 (2011). [Cited ≥ 39 times.]

15. G. Marçais and C. Kingsford. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27(6):764-770 (2011). [Cited ≥ 650 times.]

16. D. R. Kelley and C. Kingsford. Extracting between-pathway models from E-MAP interactions using expected graph compression. Journal of Computational Biology 18(3):379-390 (2011). [Conference version appeared in Proceedings of the 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB), pages 248-262 (2010).]

17. S. Navlakha and C. Kingsford. Network archaeology: Uncovering ancient networks from present-day interactions. PLoS Computational Biology 7(4):e1001119. [Selected for oral presentation at the RECOMB-Systems Biology satellite conference, 2010; cited ≥ 57 times.]

18. J. Wetzel, C. Kingsford, and M. Pop. Assessing the benefits of using mate-pairs to resolve repeats in de novo short-read prokaryotic assemblies. BMC Bioinformatics 12:95 (2011). [Cited ≥ 53 times. Designated “Highly Accessed” by the journal.]

19. R. Patro, E. Sefer, J. Malin, G. Marcais, S. Navlakha, and C. Kingsford. Parsimonious reconstruction of network evolution. Algorithms for Molecular Biology 7:25 (2012). [Conference version appeared in Proceedings of the 11th Workshop on Algorithms in Bioinformatics (WABI), pages 237-249 (2011).]

20. R. Patro and C. Kingsford. Global network alignment using multiscale spectral signatures. Bioinformatics 28(23):3105-3114 (2012). [Cited ≥ 104 times.]

21. G. Duggal and C. Kingsford. Graph rigidity reveals well-constrained regions of chromosome conformation embeddings. BMC Bioinformatics 13:241 (2012).

22. D. Filippova, A. Gadani, C. Kingsford. Coral: an integrated suite of visualizations for comparing clusterings. BMC Bioinformatics 13:276 (2012). [Designated “Highly Accessed” by the journal.]

23. G. Duggal, R. Patro, E. Sefer, H. Wang, D. Filippova, S. Khuller and C. Kingsford. Resolving spatial inconsistencies in chromosome conformation measurements. Algorithms for Molecular Biology 8:8 (2013). [Conference version appeared as: Resolving spatial inconsistencies in chromosome conformation data. In Proceedings of 12th Annual Workshop on Algorithms in Bioinformatics (WABI), pages 288-300 (2012).]

24. G. Duggal, H. Wang, and C. Kingsford. Higher-order chromatin domains link eQTLs with the expression of far-away genes. Nucleic Acids Research 42(1):87-96 (2014).

25. R. Patro, S. M. Mount, and C. Kingsford. Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. Nature Biotechnology 32:462-464 (2014). [Cited ≥ 270 times.]

26. D. Filippova, R. Patro, G. Duggal, and C. Kingsford. Identification of alternative topological domains in chromatin. Algorithms for Molecular Biology 9:14 (2014). [Cited ≥ 79 times; Designated “Highly Accessed” by the journal. Conference version appeared as: Multiscale identification of topological domains in chromatin. In Proceedings of Workshop on Algorithms in Bioinformatics (WABI), pages 300-312 (2013).]

27. H. Xin, J. Greth, J. Emmons, G. Pekhimenko, C. Kingsford, C. Alkan, and O. Mutlu. Shifted Hamming Distance: A fast and accurate SIMD-friendly filter for local alignment in read mapping. Bioinformatics 31(10):1553-60 (2015).

28. C. Kingsford and R. Patro. Compression of short-read sequences using path encoding. Bioinformatics 31(12):1920-1928 (2015). http://dx.doi.org/10.1093/bioinformatics/btv071

29. R. Patro and C. Kingsford. Data-dependent bucketing improves reference-free compression of sequencing reads. Bioinformatics 31(17):2770-2777 (2015). http://dx.doi.org/10.1093/bioinformatics/btv248

30. Hongyi Xin, Richard Zhu, Sunny Nahar, John Emmons, Gennady Pekhimenko, Carl Kingsford, Can Alkan and Onur Mutlu. Optimal seed solver: Optimizing seed selection in read mapping.