Study of Differential Allelic Expression in the Breast Cancer Intermediate-Risk Susceptibility Genes CHEK2, ATM and TP53 Binh Thieu Tu Nguyen-Dumont

Study of differential allelic expression in the breast cancer intermediate-risk susceptibility genes CHEK2, ATM and TP53 Binh Thieu Tu Nguyen-Dumont To cite this version: Binh Thieu Tu Nguyen-Dumont. Study of differential allelic expression in the breast cancer intermediate-risk susceptibility genes CHEK2, ATM and TP53. Human health and pathology. Uni- versité Claude Bernard - Lyon I, 2010. English. NNT : 2010LYO10344. tel-00838546 HAL Id: tel-00838546 https://tel.archives-ouvertes.fr/tel-00838546 Submitted on 26 Jun 2013 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. N˚d’ordre: 3442010 Année 2010 THESE DE L’UNIVERSITE DE LYON délivrée par L’Université Claude Bernard Lyon I Ecole Doctorale BMIC diplôme de Doctorat (arrêté du 7 août 2006) soutenue publiquement le 15 décembre 2010 par Tú NGUYEN-DUMONT Study of differential allelic expression in the breast cancer intermediate-risk susceptibility genes CHEK2 , ATM and TP53 Jury : Dr Sean V. TAVTIGIAN Directeur de thèse Dr Francine DUROCHER Rapporteur Dr Ana Teresa MAIA Rapporteur Dr Janet HALL Examinateur Dr Sylvie MAZOYER Examinateur Acknowledgements During my Ph.D, I have received support from many persons, both in my professional and personal life. I would like to thank all my colleagues, friends and family who supported me and contributed to the completion of this work. i Acknowledgements I especially want to express my gratitude to Sean Tavtigian for giving me that exceptional opportunity to work and learn with him. Thank you for accepting me in the GCS group and for proposing me such an interesting project. Thank you for trusting and encouraging me all these years. Lastly, than k you for your support, despite the distance (and LATEXediting). I am also grateful to Fabienne Lesueur for her constant support during my Ph.D work. Thank you for your commitment to this project, your permanent availability for discussions and your help in correcting the drafts to this mémoire. I would like to thank Francine Durocher and AnaTeresa Maia for accepting without any hesitation to review this work. I would like to thank Janet Hall and Sylvie Mazoyer, my comité de suivi de thèse, for their valuable guidance and advices throughout the carrying out of this project. Thank you also for providing the last pieces of material needed to complete this project. Many thanks to all the persons I worked with in the GCS Group every day, for their help and encouragements. A special acknowledgment to Lars Jordheim. Although not my supervisor on the paper, you have been acting in this role at the beginning of the DAE project. Thank you for your advices to the inexperienced Masters student in the first years. Thank you for always being supportive, open to answer all the questions and solve the issues encountered along the road. I have learnt a lot from you. I would like to thank Melissa Southey. Your mentorship during my Masters studies encouraged me to pursue a scientific career. I also want to acknowledge the financial support I received from Fondation de France and IARC. ii Acknowledgements Je souhaite maintenant remercier mes amis. Ils savent tout ce que je leur dois. En particulier, Mathilde (Pub’Math!) et Karima, malgré notre parcours parsemé d’obstacles (la rentrée 2005...), nous savons maintenant que nous avons eu raison de persévérer. Merci à toutes les deux pour tout, depuis tout ce temps. Je remercie bien sûr Amélie, Maxime et Camille, Véra et Stéphane, Sandrine et James, Thomas, Bertrand, le meilleur soutien moral tous les jours, c’est le gang du CIRC. Enfin, je remercie ma famille et mes proches. Cette thèse est en particulier dédiée à mes parents. Cám on Ba và Má. Je tiens aussi à remercier ma soeur, mes beau-parents et mes belle-soeurs, pour votre soutien moral et aussi logistique! Merci aussi à tous les autres babysitters (Tom!). Sans vous, la fin aurait été (encore plus) difficile. Merci à Simplice d’avoir cru en moi depuis si longtemps. Ong Nôi, mes cousins Nguyên, gia dình Pham, et mes cousins Chanfreau: merci pour votre présence et votre soutien. Pour finir, j’aimerais te remercier, Arnaud, le meilleur assistant-éditeur, assistant-illustrateur, assistant-technique dont on puisse rêver. C’est la fin d’une première étape, allons voir ce que l’avenir nous réserve à Melbourne, avec Nam Son dans notre valise. iii Contents List of abbreviations ix List of figures xi List of tables xiii Introduction 2 Bibliographical review 6 I The genetic bases of cancer development 7 I.1 Initiation and control of tumorigenesis . 7 I.2 The genetic determinants of cancer susceptibility . 10 I.2.1 Genetic changes . 10 I.2.2 Epigenetic changes . 12 I.3 Evidence for an effect of heredity in common cancers . 13 I.4 Inherited cancer syndromes . 18 II Inherited breast cancer susceptibility 20 II.1 Background . 20 II.2 The genetic epidemiology of hereditary breast cancer . 21 iv Contents II.2.1 Strategies for identifying breast cancer susceptibility genes and variants . 21 - Linkage studies . 22 - Association studies . 23 - Casecontrol mutation screening studies . 25 II.2.2 Genetic risk, a continuous variable. 25 II.2.3 A polygenic model for breast cancer susceptibility . 29 II.3 Known breast cancer susceptibility genes and their associated hereditary syndromes . 30 II.3.1 Highrisk genes . 31 II.3.2 Intermediaterisk genes . 34 II.4 Outlook on the search for the "missing heritability" of breast cancer 36 III Gene expression regulation 38 III.1 The determinants of gene expression regulation . 39 III.1.1 Local regulatory variation . 40 III.1.2 Distant regulatory variation . 42 III.1.3 Epigenetic factors . 43 III.2 The study of gene expression phenotypes . 44 III.2.1 What is the extent of natural variation in gene expression? . 44 III.2.2 How heritable are patterns of gene expression? . 47 III.3 Strategies for determining the architecture of gene expression regulation . 48 III.3.1 Expression quantitative trait loci (eQTL) mapping . 48 III.3.2 Differential allelic expression (DAE) assays . 50 III.4 The relevance of expression studies to disease susceptibility . 54 III.4.1 eQTL studies . 54 III.4.2 DAE studies . 56 v Contents Aims of the thesis 60 Introducing high-resolution melting curve analysis 66 I The basic underlying principles to HRM analysis 67 II Amplicon melting analysis 68 III Probe melting analysis 71 III.1 Fluorescein labeled probes . 72 III.2 SimpleProbes . 73 III.3 Unlabeled probes . 74 IV Applications of HRM analysis 75 IV.1 Clinical purposes . 76 IV.2 Research purposes . 76 Article I: Description and Validation of HighThroughput Simultaneous Genotyping and Mutation Scanning by High- Resolution Melting Curve Analysis . 79 Results 90 I Implementation of an HRM assay for DAE analysis 91 I.1 DAE assessment by HRM: the general experimental approach . 91 I.1.1 General description of the approach . 91 I.1.2 The hardware . 93 I.1.3 Assembling informative samples . 95 I.1.4 Statistical criteria for evidence of DAE . 96 TM I.2 The early stages of the method: the HR1 instrument . 97 I.2.1 Preliminary study: assessment of the TP53 gene . 97 vi Contents Article II: Differential Allelic Expression in Leukoblast from Patients with Acute Myeloid Leukemia Suggests Genetic Regulation of CDA, DCK, NT5C2, NT5C3, and TP53 . 101 I.2.2 Preliminary assessment of the CHEK2 gene . 108 I.3 Upscaling the method . 111 I.3.1 A higher throughput with the LightScannerR instrument . 111 I.3.2 Automation of the measurements with an Rscript . 112 II DAE assessment of the CHEK2 gene 119 Article III: Detecting differential allelic expression using high- resolution melting curve analysis: assessment of the breast cancer susceptibility gene CHEK2 (accepted) . 121 III DAE assessment of the TP53 and ATM genes 143 Article IV: Differential allelic expression assessment of the breast cancer susceptibility genes TP53 and ATM (in preparation) 145 Discussion 162 Conclusion 168 Annexes 172 Annex I: Determining the effectiveness of high-resolution melting analysis for SNP genotyping and mutation scanning at the TP53 locus 173 Annex II: Rare, evolutionarily unlikely missense substitutions in CHEK2 contribute to breast cancer susceptibility: results from a breast cancer family registry case-control mutation-screening study 187 vii Contents Bibliography 202 viii List of abbreviations A : Adenosin A-T : AtaxiaTelangiectasia ATM : Ataxia Telangiectasia Mutated BRCA1 : Breast Cancer 1 BRCA2 : Breast Cancer 2 C : Cytosin cDNA : Complementary DNA CI : Confidence Interval CIN : Chromosomal instability CHEK2 : Protein kinase CHK2 DAE : Differential allelic expression DNA : Deoxyribonucleic acid dsDNA : Doublestranded DNA EBV : EpsteinBarr Virus eQTL : Expression Quantitative Trait Loci ix Contents G : Guanin GWA : Genome wide association HRM : Highresolution melting curve LCL : Lymphoblastoid cell line mRNA : messenger RNA miRNA : micro RNA MSI : Microsatellite instability NMD : Nonsense mediated mRNA decay OMIM : Online Mendelian Inheritance in man PBS : Phosphate buffered saline PCR : Polymerase chain reaction RFLP : Restriction fragment length polymorphism RNA : Ribonucleic acid rs : RefSNP, Single Nucleotide Polymorphism reference number RT-PCR : Reverse TranscriptionPCR SNP : Single Nucleotide Polymorphism siRNA : Small interfering RNA ssDNA : Singlestranded DNA T : Thymin x Tm : Melting temperature TP53 : Tumor Protein p53 U : Unit UTR : Untranslated Region List of Figures I.1 The hallmarks of cancer ......................... 8 II.1 The risk spectrum of breast cancer susceptibility genes . 26 III.1 Cellular phenomena associated with cis-effect regulation .

Study of Differential Allelic Expression in the Breast Cancer Intermediate-Risk Susceptibility Genes CHEK2, ATM and TP53 Binh Thieu Tu Nguyen-Dumont

Examples of Successful Protein Expression with SUMO Reference Protein Type Family Kda System (Pubmed ID)

Lung Cancer Signature Biomarkers: Tissue Specific Semantic Similarity

Physiological Significance and Molecular Genetics of Red Cell Enzymes Involved in the Ribonucleotide Metabolism

Prognostic Markers in Acute Myeloid Leukemia – a Candidate Gene Approach

A Genome-Wide Association Study Identifies Susceptibility Loci for Nonsyndromic Sagittal Craniosynostosis Near BMP2 and Within BBS9

Identification of Molecular Targets in Head and Neck Squamous Cell Carcinomas Based on Genome-Wide Gene Expression Profiling

Polymorphisms at Microrna Binding Sites of Ara-C and Anthracyclines

Identification of Gastric Cancer–Related Genes Using a Cdna Microarray Containing Novel Expressed Sequence Tags Expressed in Gastric Cancer Cells

Red Blood Cell Ageing in Vivo and in Vitro: the Integrated Omics Perspective

Evaluation of the Impact of Single-Nucleotide Polymorphisms

Inbred Mouse Strains Expression in Primary Immunocytes Across

Association with Lymphoblastoid Cell Expression