PhD Degree in Molecular Medicine (Curriculum in Computational Biology)

PhD degree in Molecular Medicine (curriculum in Computational Biology)
European School of Molecular Medicine (SEMM), University of Milan and University of Naples “Federico II”
Disciplinary sector: MED/04

Computational frameworks for the identification of somatic and germline variants contributing to cancer predisposition and development

Giorgio Enrico Maria Melloni
IIT@SEMM, Milan
Student ID no. R10338

Supervisor: Prof. Piergiuseppe Pelicci, IEO, Milan
Added Supervisor: Dr. Laura Riva, IIT@SEMM, Milan

Academic year 2015-2016

TABLE OF CONTENTS

1 ABSTRACT
2 INTRODUCTION
2.1 CANCER AS AN EVOLUTIONARY PROCESS
2.2 ACCUMULATING DRIVER MUTATIONS
2.3 TUMOR HETEROGENEITY
2.4 CANCER GENOME LANDSCAPES
2.5 DRIVER VS PASSENGER: A PROBLEM OF MUTATION RATE
2.6 CANCER GENOMICS AND HUMAN GENETICS
2.7 CANCER GENOMICS IN THE NGS ERA
3 MATERIAL AND METHODS
3.1 DATA FORMAT
3.1.1 VCF format
3.1.2 MAF format
3.2 DATA RETRIEVAL
3.2.1 TCGA
3.2.2 ICGC
3.2.3 cBioPortal
3.2.4 COSMIC and CGC
3.2.5 ExAC
3.3 DATA PROCESSING AND MANIPULATION
4 RESULTS
4.1 DOTS-FINDER: A COMPREHENSIVE TOOL FOR ASSESSING DRIVER GENES IN CANCER GENOMES
4.1.1 Abstract
4.1.2 Introduction
4.1.3 Implementation
4.1.3.1 Overview of DOTS-Finder
4.1.3.2 The Functional Step: finding tumor suppressor gene and oncogene candidates
4.1.3.3 The Frequentist step: assessing the possible drivers
4.1.4 Material and Methods
4.1.4.1 Availability
4.1.4.2 Input Format
4.1.4.3 Requirements
4.1.4.4 Mutation data
4.1.4.5 Databases
4.1.4.6 DOTS-Finder step by step
4.1.4.6.1 Setting the threshold for TSG-S and OG-S
4.1.5 Results
4.1.5.1 Application of DOTS-Finder to individual cancer types
4.1.5.2 Driver genes and tissue specificity
4.1.5.3 Breast carcinoma
4.1.5.4 Thyroid Carcinoma
4.1.5.5 Acute Myeloid Leukemia
4.1.5.6 Bladder Carcinoma
4.1.5.7 Atypical tumor suppressor genes and oncogenes
4.1.5.8 The importance of considering subsets of samples
4.1.5.9 Small sample size analysis. The --lax option
4.1.5.10 Comparison of DOTS-Finder to existing tools using Pan-Cancer12 data
4.1.5.11 Statistical power using a small number of cancer samples
4.1.6 Discussion
4.2 LOWMACA: EXPLOITING PROTEIN FAMILY ANALYSIS FOR THE IDENTIFICATION OF RARE DRIVER MUTATIONS IN CANCER
4.2.1 Abstract
4.2.2 Introduction
4.2.3 Materials and Methods
4.2.3.1 Software Implementation and Overview
4.2.3.2 Input Data
4.2.3.3 Alignment and Mapping
4.2.3.4 Statistical Testing
