
Dr. rer. nat. Johannes Köster

Homepage: https://koesterlab.github.io
Birth: May 5, 1985
Nationality: German
Family status: Married, two children

Positions

since June 2020
  DKTK Investigator at the German Consortium of Translational Cancer Research (DKTK) of WTZ and DKFZ.
since August 2017
  Group leader (permanent position), Algorithms for reproducible bioinformatics (https://koesterlab.github.io), Genome Informatics, Institute of Human Genetics, Faculty of Medicine, University of Duisburg-Essen. Currently five group members (graduate students).
August 2016 – August 2017
  Researcher (NWO Veni funded) at Life Sciences, Centrum Wiskunde & Informatica, Amsterdam.
May 2016 – July 2016
  Postdoctoral research fellow under the supervision of Alexander Schönhuth, Life Sciences, Centrum Wiskunde & Informatica, Amsterdam, Netherlands.
since May 2016
  Affiliated consulting scientist in Myles Brown’s group (Division of Molecular and Cellular Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School), USA.
2016
  Consultant (data analysis and workflow management) for Juno Therapeutics, 307 Westlake Avenue, Seattle, USA.
April 2015 – April 2016
  Postdoctoral research fellow in Xiaole Shirley Liu’s group (Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard School of Public Health) and Myles Brown’s group (Division of Molecular and Cellular Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School), USA.
2013 – 2014
  Volunteership at NIH regarding the adaptation of Snakemake for the biowulf computer cluster in cooperation with Sean Davis, Center for Cancer Research, NCI/NIH, USA.

2011 – April 2015
  Research fellow at both the Chair of Genome Informatics of Sven Rahmann, University of Duisburg-Essen and in the “European Network for Cancer in Children and Adolescents” project under Alexander Schramm, University Hospital Essen, Germany.
2011
  Research fellow in Sven Rahmann’s group Bioinformatics for High-Throughput-Technologies, Computer Science XI, TU Dortmund, Germany.
2009 – 2015
  Guest member in the group of Eli Zamir, Systems Biology of Cell-Matrix Adhesion, Department of Systemic Cell Biology, Max Planck Institute of Molecular Physiology, Germany.

Education

2011 – 2015
  Ph.D. in Computer Science (grade “ausgezeichnet”/summa cum laude) at TU Dortmund, Germany. Thesis title: “Parallelization, Scalability, and Reproducibility in Next-Generation Sequencing Analysis”. Supervisor: Sven Rahmann, Chair of Genome Informatics, University of Duisburg-Essen. Second assessor: Axel Mosig, Bioinformatics Group, University .
2005 – 2010
  Diploma (equiv. to M.Sc.) in Computer Science (grade 1.0/A with honors) at TU Dortmund, Germany. Thesis title: “Propagating Interaction Logic toward Predictive Protein Hypernetworks”. Supervisors: Sven Rahmann, Bioinformatics for High-Throughput-Technologies, Computer Science XI, TU Dortmund and Eli Zamir, Systems Biology of Cell-Matrix Adhesion, Department of Systemic Cell Biology, Max Planck Institute of Molecular Physiology Dortmund.

Third-party funding

2020 – 2021
  Chan Zuckerberg Initiative (CZI) grant for maintenance and further development of the Bioconda project. Amount: 80,000 €
2019 – 2020
  Google funding for implementing Google Cloud Pipelines API support in Snakemake. The project will be conducted in collaboration with Stanford University. Amount: 80,000 €
2019 – 2022
  Workpackage leader in immuno-oncology study NEOpredict-Lung funded by the International Immuno-Oncology Network (II-ON) of Bristol-Myers Squibb. Amount for workpackage: 204,000 €
2017
  Lorentz center workshop “Making sense of millions of single cells: approaching the era of Single Cell Data Science, experimental advances and computational challenges”. Amount: 12,000 €

2016 – 2020
  NWO Veni Grant “Fully reproducible workflows scaling from workstations to the cloud” (016.Veni.173.076). Amount: 250,000 €
since 2015
  Free unlimited organizational plan from Anaconda Inc. for the Bioconda project. Amount: 1176 $ per year

Awards

2019
  Friedmund Neumann Prize 2019 for outstanding fundamental research on reproducibility in biomedicine. Amount: 10,000 €
2016
  Travel fellowship for HiTSeq 2016, Orlando, USA. Amount: 1000 $
2014
  NVIDIA Hardware Grant received for publication “Massively parallel read mapping on GPUs with the q-group index and PEANUT”. Amount: 2000 € (Tesla GPU)
2011
  Hans-Uhde-Preis 2011 by the ThyssenKrupp Uhde GmbH for extraordinary diploma thesis with practical application in Computer Science. Amount: 500 €
  Travel award of the 9th International Conference on Pathways, Networks and Systems Medicine, Crete. Amount: 1000 $
  Honorable mention at the “Doktorandenkolleg Informatik Ruhr”, Germany.
  Best poster award at the science day of the University Hospital Essen, Germany. Amount: 100 €

Invited Talks

August 2020
  University Medicine : Reproducibility and transparency in bioinformatics
September 2019
  Bertinoro Computational Biology Meeting: Towards a unified theory of variant calling
May 2019
  NIH: Reproducible data analysis with Snakemake
May 2019
  Harvard University: Towards a unified theory of variant calling
February 2019
  Sanger Institute: Reproducible data analysis with Snakemake
December 2018
  Martin Luther University -Wittenberg: Reproducible data analysis with Snakemake
September 2018
  Bernstein Conference on computational neuroscience, : Reproducible data analysis with Snakemake
November 2017
  Robert-Koch-Institute, Berlin: Benchmarking RNA-seq data analysis

February 2017
  Gutenberg University Mainz: Snakemake and Bioconda
February 2017
  Forschungszentrum Jülich: Snakemake and Bioconda
November 2016
  Helmholtz Open Science Workshop, , Germany: Keynote about sustainable software development in science.
September 2016
  Netherlands Cancer Institute (NKI), Amsterdam: Snakemake and Bioconda.
July 2016
  Rust bay area meeting, Mozilla Foundation Los Angeles, USA: Bioinformatics with Rust.
June 2016
  Aquatic Ecosystem Research, University of Duisburg-Essen, Germany: Snakemake and Bioconda.
June 2016
  Leiden University Medical Center, Netherlands: Algebraic variant calling, Snakemake and Bioconda.
2015
  Broad Institute, Boston, USA: Workflow Management with Snakemake
2014
  Workshop on Fast Data Processing on GPUs, CUDA Center of Excellence, Dresden: Massively parallel read mapping on graphics cards.
  DTL Focus Meeting, Utrecht, Netherlands: Snakemake

Organized events

August 2019
  Organizer: Tutorial “Reproducible data analysis with Snakemake”, University of Duisburg-Essen, Germany
July 2019
  Co-Organizer: Tutorial “Tools for reproducible research”, ISMB 2019, , Switzerland
May 2019
  Organizer: Tutorial “Reproducible data analysis with Snakemake”, Dana-Farber Cancer Institute, Boston, USA
February 2019
  Co-Organizer: Data structures in Bioinformatics (DSB) 2019, Dortmund, Germany
June 2018
  Co-Organizer: Lorentz workshop, Leiden, The Netherlands: Making sense of millions of single cells: approaching the era of Single Cell Data Science, experimental advances and computational challenges
September 2017
  PC member: CIBB 2017, Cagliari
March 2017
  Organizer: Tutorial “Reproducible data analysis with Snakemake” at CWI Amsterdam
February 2017
  Co-Organizer and session chair: workshop “data structures in bioinformatics” (DSB) 2017 (https://dsb2017.github.io)

Teaching Experience

September 2020
  German Conference on Bioinformatics 2020: Tutorial “Reproducibility with Snakemake and Bioconda” (1 day)
May 2020
  Wageningen University, Netherlands: Tutorial “Reproducible data analysis with Snakemake” (1 day)
August 2019
  University of Duisburg-Essen: Tutorial “Reproducible data analysis with Snakemake” (2 days)
SS 2019
  TU Dortmund, Faculty of Computer Science: Fachprojekt “Algorithm Engineering: Entwicklung einer Rust-Bibliothek am Beispiel von Wavelet Trees” (4 SWS)
May 2019
  NIH: Tutorial “Reproducible data analysis with Snakemake” (1 day)
May 2019
  Harvard University: Tutorial “Reproducible data analysis with Snakemake” (1 day)
February 2019
  Sanger Institute: Tutorial “Reproducible data analysis with Snakemake” (1 day)
WS 2018
  TU Dortmund, Faculty of Computer Science: Fachprojekt “Bioinformatik: Reproduzierbare Datenanalyse mit Snakemake am Beispiel der Bioinformatik” (4 SWS)
SS 2018
  TU Dortmund, Faculty of Computer Science: Fachprojekt “Algorithm Engineering: Entwicklung einer Rust-Bibliothek am Beispiel von Succinct Trees” (4 SWS)
September 2018
  Institut National de la Recherche Agronomique (INRA), Toulouse: Tutorial “Reproducible data analysis with Snakemake” (1 day)
August 2018
  Universiteit Utrecht: Tutorial “Reproducible data analysis with Snakemake” (1 day)
June 2018
  Bioinformatics open source conference (BOSC 2018), Portland: Tutorial “Reproducible data analysis with Snakemake” (1 day)
November 2017
  ETH Zürich: Tutorial “Reproducible data analysis with Snakemake” (1 day)
March 2017
  CWI Amsterdam: Tutorial “Reproducible data analysis with Snakemake”
December 2016
  Berlin Institute of Health: Tutorial “Reproducible data analysis with Snakemake” (1 day)
2013
  Guest lecture about variant calling with next-generation sequencing in the course “Statistic in Genetics”, Faculty of Statistics, TU Dortmund, Germany.

2012
  Guest lectures about protein networks, complex prediction and interaction dependencies in the course “Computational Omics”, Faculty of Computer Science, TU Dortmund, Germany.
2011
  Teaching assistant for “Datastructures, Algorithms and Programming 1”, at the Chair of Computer Science X, TU Dortmund, Germany. (2 SWS)
2007 – 2009
  Student teaching assistant for “Datastructures, Algorithms and Programming 1”, at the Chair of Computer Science X, TU Dortmund, Germany. (2 SWS)

Supervision Experience

since 2017
  Supervision of five PhD students within my group
2020
  Supervisor of bachelor thesis on developing a probabilistic approach for clustering RAD-seq data into loci-specific stacks
2019
  Supervisor of bachelor thesis on visualizing read alignments via a client server architecture
2018
  Co-supervisor of master thesis on comparison of genomic sequences using fast cross-correlation
2017
  Co-supervisor of master thesis about building a pan-genome data structure for viruses, Bastiaan van der Roest
2015
  Supervision of intern for the documentation of a workflow for the analysis of CRISPR/Cas9 knockout screens.
  Co-supervisor of master thesis “Entwurf von q-Gramm-basierten Readmappingverfahren für lange Reads”, Bianca Patro, TU Dortmund, Germany.
2013 – 2015
  Supervision of student assistants for the development and maintenance of the web-based Exomate platform for interactive, database-supported exploration and filtration of variant calls, University of Duisburg-Essen, Germany.
2013 – 2015
  Supervision of student assistant for the development and implementation of logic-based protein network models incorporating interaction dependencies, University of Duisburg-Essen, Germany.
2012
  Co-supervisor of bachelor thesis “Rekonstruktion von Protein-Interaktionsabhängigkeiten mit dem Quine-McCluskey-Algorithmus”, Bianca Patro, TU Dortmund, Germany.
2011
  Co-supervisor of diploma thesis “Entwurf einer Datenstruktur für Pangenome”, Christiane Küch, TU Dortmund, Germany.
  Co-supervisor of bachelor thesis “Konstruktion von Protein-Hypernetzwerken durch Text-Mining in der PubMed Datenbank”, Michael Nimbs, TU Dortmund, Germany.

Peer reviewed Publications

2020
  Köster, J. (corresponding author), Dijkstra, L. J., Marschall, T., Schönhuth, A. Varlociraptor: enhancing sensitivity and controlling false discovery rate in somatic indel discovery. Genome Biology 21, 98 (2020).
  David Lähnemann, Johannes Köster (shared first and last author), Ewa Szczurek, Davis J. McCarthy, Stephanie C. Hicks, Mark D. Robinson, Catalina A. Vallejos, Kieran R. Campbell, et al. Eleven grand challenges in single-cell data science. Genome Biology 21, 31 (2020).
2019
  Baaijens, J. A., Van der Roest, B., Köster, J., Stougie, L., Schönhuth, A. Full-length de novo viral quasispecies assembly through variation graph construction. Bioinformatics, btz443.
  Johannes Köster (corresponding author), Myles Brown, X. Shirley Liu. A Bayesian Model for Single Cell Transcript Expression Analysis on MERFISH Data. Bioinformatics 35, 995–1001.
2018
  Grüning, B., [...], Köster, J. et al. Practical Computational Reproducibility in the Life Sciences. Cell Systems 6, 631–635 (2018).
  Björn Grüning, Ryan Dale, Andreas Sjödin, Brad A. Chapman, Jillian Rowe, Christopher H. Tomkins-Tinch, Renan Valieris, the Bioconda Team, Johannes Köster (corresponding author). Bioconda: Sustainable and Comprehensive Software Distribution for the Life Sciences. Nature Methods 15, 475–476.
  Cornwell, M., Vangala, M., Taing, L., Herbert, Z., Köster, J., Li, B., Sun, H., Li, T., Zhang, J., Qiu, X., Pun, M., Jeselsohn, R., Brown, M., Liu, X. S., and Long, H. (2018). VIPER: Visualization Pipeline for RNA-seq, a Snakemake workflow for efficient and complete RNA-seq analysis. BMC Bioinformatics 19:135.
  Schulte, M., Köster, J., Rahmann, S., Schramm, A. Cancer evolution, mutations, and clonal selection in relapse neuroblastoma. Cell Tissue Res 372:2.
2016
  Stöcker, B. K., Köster, J., Rahmann, S. (2016). SimLoRD: Simulation of Long Read Data. Bioinformatics, 32(17), 2704–2706.
  Ma, J., Köster, J., Qin, Q., Hu, S., Li, W., Chen, C., ... Liu, X. S. (2016). CRISPR-DO for genome-wide CRISPR design and optimization. Bioinformatics.
  Townsend, E. C., Murakami, M. A., Christodoulou, A., Christie, A. L., Köster, J. (first collaborative author), DeSouza, T. A., ... Weinstock, D. M. (2016). The Public Repository of Xenografts Enables Discovery and Randomized Phase II-like Trials in Mice. Cancer Cell, 29(4), 574–586.

2015
  Li, W., Köster, J. (co-first author), Xu, H., Chen, C.-H., Xiao, T., Liu, J. S., ... Liu, X. S. (2015). Quality control, modeling, and visualization of CRISPR screens with MAGeCK-VISPR. Genome Biology, 16(1), 281.
  Köster, J. (2015). Rust-Bio: a fast and safe bioinformatics library. Bioinformatics.
  Schramm, A., Köster, J., Assenov, Y., Althoff, K., Peifer, M., Mahlow, E., Odersky, A., Beisser, D., Ernst, C., Henssen, A. G., Stephan, H., Schröder, C., Heukamp, L., Engesser, A., Kahlert, Y., Theissen, J., Hero, B., Roels, F., Altmüller, J., Nürnberg, P., Astrahantseff, K., Gloeckner, C., De Preter, K., Plass, C., Lee, S., Lode, H. N., Henrich, K., Gartlgruber, M., Speleman, F., Schmezer, P., Westermann, F., Rahmann, S., Fischer, M., Eggert, A., Schulte, J. H. (2015). Mutational dynamics between primary and relapse neuroblastomas. Nature Genetics, 47(8), 872–877.
  Schwermer, M., Lee, S., Köster, J., Maerken, T. van, Stephan, H., Eggert, A., Morik, K., Schulte, J. H., Schramm, A., van Maerken, T. (2015). Sensitivity to cdk1-inhibition is modulated by p53 status in preclinical models of embryonal tumors. Oncotarget, 6(17), 15425–15435.
2014
  Köster, J. (corresponding author), Rahmann, S. (2014). Massively parallel read mapping on GPUs with the q-group index and PEANUT. PeerJ 2, e606.
2013
  Schramm, A., Köster, J., Marschall, T., Martin, M., Schwermer, M., Fielitz, K., Büchel, G., Barann, M., Esser, D., Rosenstiel, P., Rahmann, S., Eggert, A., Schulte, J. H. (2013). Next-generation RNA sequencing reveals differential expression of MYCN target genes and suggests the mTOR pathway as a promising therapy target in MYCN-amplified neuroblastoma. International Journal of Cancer 132, E106–E115.
  Rahmann, S., Martin, M., Schulte, J. H., Köster, J., Marschall, T., Schramm, A. (2013). Identifying transcriptional miRNA biomarkers by integrating high-throughput sequencing and real-time PCR data. Methods 59, 154–163.
  Althoff, K., Beckers, A., Odersky, A., Mestdagh, P., Köster, J., Bray, I.M., Bryan, K., Vandesompele, J., Speleman, F., Stallings, R. L., Schramm, A., Eggert, A., Sprüssel, A., Schulte, J. H. (2013). MiR-137 functions as a tumor suppressor in neuroblastoma by downregulating KDM1A. International Journal of Cancer 133, 1064–1073.
2012
  Köster, J. (corresponding author), Rahmann, S. (2012). Snakemake – a scalable bioinformatics workflow engine. Bioinformatics 28, 2520–2522.

  Köster, J., Zamir, E., Rahmann, S. (2012). Efficiently mining protein interaction dependencies from large text corpora. Integr. Biol. 4, 805–812.
  Köster, J. (corresponding author), Rahmann, S. (2012). Building and Documenting Workflows with Python-Based Snakemake, in: Böcker, S., Hufsky, F., Scheubert, K., Schleicher, J., Schuster, S. (Eds.), German Conference on Bioinformatics 2012, OpenAccess Series in Informatics (OASIcs). Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, pp. 49–56.
  Schramm, A., Schowe, B., Fielitz, K., Heilmann, M., Martin, M., Marschall, T., Köster, J., Vandesompele, J., Vermeulen, J., de Preter, K., Koster, J., Versteeg, R., Noguera, R., Speleman, F., Rahmann, S., Eggert, A., Morik, K., Schulte, J. H. (2012). Exon-level expression analyses identify MYCN and NTRK1 as major determinants of alternative exon usage and robustly predict primary neuroblastoma outcome. Br J Cancer 107, 1409–1417.

Software (excerpt)

Varlociraptor
  Generic Bayesian framework for FDR controlled variant calling in arbitrary scenarios (https://varlociraptor.github.io)
Snakemake
  Popular workflow management system (https://snakemake.readthedocs.io)
Bioconda
  Popular distribution of bioinformatics software realized as a channel for the versatile package manager Conda (https://bioconda.github.io).
MERFISHtools
  Bayesian framework for single-cell transcript expression analysis on multiplexed error-robust fluorescence in-situ hybridization data (https://merfishtools.github.io).
Rust-Bio
  Bioinformatics algorithm and data structure library for the Rust language (https://rust-bio.github.io).
VISPR
  Web-based, interactive visualization framework for CRISPR/Cas9 knockout screen experiments (https://bitbucket.org/liulab/vispr).
