biomolecules

Article Whole Genome Sequencing of Familial Non-Medullary Thyroid Cancer Identifies Germline Alterations in MAPK/ERK and PI3K/AKT Signaling Pathways

Aayushi Srivastava 1,2,3,4 , Abhishek Kumar 1,5,6 , Sara Giangiobbe 1, Elena Bonora 7, Kari Hemminki 1, Asta Försti 1,2,3 and Obul Reddy Bandapalli 1,2,3,* 1 Division of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), D-69120 Heidelberg, Germany; [email protected] (A.S.); [email protected] (A.K.); [email protected] (S.G.); [email protected] (K.H.); [email protected] (A.F.) 2 Hopp Children’s Cancer Center (KiTZ), D-69120 Heidelberg, Germany 3 Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), German Cancer Consortium (DKTK), D-69120 Heidelberg, Germany 4 Medical Faculty, Heidelberg University, D-69120 Heidelberg, Germany 5 Institute of Bioinformatics, International Technology Park, Bangalore 560066, India 6 Manipal Academy of Higher Education (MAHE), Manipal, Karnataka 576104, India 7 S.Orsola-Malphigi Hospital, Unit of Medical Genetics, 40138 Bologna, Italy; [email protected] * Correspondence: [email protected]; Tel.: +49-6221-42-1709

 Received: 29 August 2019; Accepted: 10 October 2019; Published: 13 October 2019 

Abstract: Evidence of familial inheritance in non-medullary thyroid cancer (NMTC) has accumulated over the last few decades. However, known variants account for a very small percentage of the genetic burden. Here, we focused on the identification of common pathways and networks enriched in NMTC families to better understand its pathogenesis with the final aim of identifying one novel high/moderate-penetrance germline predisposition variant segregating with the disease in each studied family. We performed whole genome sequencing on 23 affected and 3 unaffected family members from five NMTC-prone families and prioritized the identified variants using our Familial Cancer Variant Prioritization Pipeline (FCVPPv2). In total, 31 coding variants and 39 variants located in upstream, downstream, 50 or 30 untranslated regions passed FCVPPv2 filtering. Altogether, 210 affected by variants that passed the first three steps of the FCVPPv2 were analyzed using Ingenuity Pathway Analysis software. These genes were enriched in tumorigenic signaling pathways mediated by receptor tyrosine kinases and G- coupled receptors, implicating a central role of PI3K/AKT and MAPK/ERK signaling in familial NMTC. Our approach can facilitate the identification and functional validation of causal variants in each family as well as the screening and genetic counseling of other individuals at risk of developing NMTC.

Keywords: papillary thyroid cancer; germline mutations; whole genome sequencing; predisposition markers; pathway analysis

1. Introduction Thyroid cancer is the most common endocrine malignancy with an age adjusted incidence of 0.5-20/100,000 persons per year [1]. Significant regional differences exist with Italy being among the countries with the highest incidence rates in the world [1]. An increasing incidence has been observed worldwide during the past decades, which can to a certain extent be related to changes in the availability of medical services and in standard clinical practice. On the other hand, regional

Biomolecules 2019, 9, 605; doi:10.3390/biom9100605 www.mdpi.com/journal/biomolecules Biomolecules 2019, 9, 605 2 of 20 differences in incidence as well as changes over time may also be related to lifestyle, nutritional iodine, ionizing radiation and genetic factors [2]. For instance, the high incidence of thyroid cancer in Italy can be attributed to the disruptive and carcinogenic effect of volcanic environments on the endocrine system [3]. The familial relative risk of developing thyroid cancer is estimated to be increased 6.7-fold in a study based on the Swedish Family-Cancer Database, in which 3.4% of all thyroid cancer cases had a concordant family history [4]. Approximately 90–95% of all thyroid cancers are non-medullary thyroid cancers (NMTC) [5] and can be classified into four histological subtypes: papillary, follicular, Hürthle cell and anaplastic thyroid cancer, with papillary thyroid cancer (PTC) being the most common one. Familial NMTC (FNMTC) accounts for only a small percentage of all NMTCs and can be divided into non-syndromic and syndromic forms. In the first, it occurs as the primary feature and in the second, as a minor component of a familial cancer syndrome, such as familial adenomatous polyposis, Gardner’s syndrome, Cowden’s disease, Carney’s complex type 1, Werner’s syndrome, papillary renal neoplasia, and DICER1 syndrome [6]. Known syndromes explain only a small proportion of all FNMTCs. Unlike the case of familial medullary thyroid cancer, in which there is extensive evidence linking germline point mutations in the RET proto-oncogene to the development of thyroid cancer, the genetic causes for FNMTC remain largely unknown. Over the years, studies seeking genetic factors predisposing to NMTC have been performed using linkage analysis, candidate sequencing and recently also whole genome sequencing. These studies have suggested several genes as potential NMTC-predisposing genes, including, FOXE1, SRGAP1, TITF-1/NKX2.1, SRRM2, and HABP2 [7–11]. In addition, an imbalance of the - complex has been demonstrated in the peripheral blood of familial PTC patients [12]. Nonetheless, NMTC is one of the most heritable cancers wherein first degree relatives of an affected individual have an 8-10-fold increased risk of developing the disease [13]. Therefore, there are many underlying germline mutations that are yet to be discovered. The identification of such predisposition genes could be of great value in the screening of individuals at risk of developing NMTCs as well as in the development of personalized adjuvant therapies based on the affected pathways. It has been observed that hereditary NMTC is characterized by early onset, a higher degree of aggressiveness and more frequent multifocal disease and recurrence compared with its sporadic counterpart [13]. Thus, medical centers recommend more aggressive treatment of affected family members, reinforcing the importance of identifying such cases. Here we report the germline genomic landscape of five families with NMTC aggregation consistent with an autosomal dominant pattern of inheritance. The aim of the current study was to use whole genome sequencing (WGS) data to discern pathways affected in the FNMTC families to facilitate the identification of possible disease-causing high/moderate-penetrance germline variants in each family. With our results, we hope to facilitate genetic counseling and targeted therapy in these families and improve screening of other individuals at risk of developing NMTC.

2. Materials and Methods

2.1. Ethical Approval Blood samples were collected from the participants with informed consent following ethical guidelines approved by “Comitato Etico Indipendente dell ‘Azienda Ospedaliero-Universitaria di Bologna, Policlinico S. Orsola-Malpighi (Bologna, Italy)” and “comité utiltative de protection des personnes dans la recherche biomédicale, Le centre de ute contre le cancer Léon-Bérard (Lyon, France)”.

2.2. NMTC Families Five families with NMTC aggregation consistent with an autosomal dominant pattern of inheritance were provided by the S. Orsola-Malpighi Hospital, Unit of Medical Genetics in Bologna, Italy. Samples from a total of 23 affected and 3 unaffected family members from the five families were submitted for WGS. Their respective pedigrees are shown in Figure1. In family 1, the mother (I-1) Biomolecules 2019, 9, 605 3 of 20

Biomolecules 2019, 9, x 3 of 23 was affected by insular carcinoma of the thyroid whereas three of her children and her grandchild diagnosedwere diagnosed with PTC with or PTCmicro-PTC or micro-PTC (II-2, II-3, (II-2, II-6, II-3,III-1) II-6, and III-1)one child and onewithchild benign with nodules benign (II-1). nodules Her unaffected(II-1). Her son una ffwasected deemed son was a reliable deemed control a reliable (II-4) control. WGS (II-4).(*) was WGS performed (*) was performedon five family on fivemembers. family Inmembers. family 2, Inthere family were 2, theresix cases were (III-1, six cases III-3, (III-1,III-4, III-3,IV-3, III-4,IV-4, IV-3,IV-5), IV-4, one IV-5),probable one case probable (IV-1) case and (IV-1) one controland one (IV-2) control out (IV-2)of which out six of whichunderwent six underwent WGS. Family WGS. 3 consisted Family 3 of consisted two related of two cases related (IV-4, cases IV-5)(IV-4, and oneIV-5) unrelated and one case unrelated (III-1) of case which (III-1) all three of which underwen all threet WGS. underwent Family 4 is WGS. characterized Family 4 by is bilateral characterized PTCs concurrentby bilateral with PTCs other concurrent subtypes withof NMTCs other subtypes(Hürthle ce ofll NMTCscancer, follicular (Hürthle canc celler). cancer, Four follicularfamily members cancer). wereFour diagnosed family members with thyroid were diagnosed cancer of withwhich thyroid all underwent cancer of WGS which (II-2, all III-1, underwent III-2, III-3). WGS WGS (II-2, was III-1, performedIII-2, III-3). on WGS eight was family performed members on eight of family family 5. members Five members of family were 5. Fiveaffected members by PTC, were Hürthle affected cell by cancer,PTC, Hürthle micro-PTC cell or cancer, a combination micro-PTC of ortwo a combinationof the subtypes of (II-2, two ofII-3, the II- subtypes5, II-8, II-9). (II-2, Four II-3, members II-5, II-8, were II-9). possibleFour members carriers were either possible affected carriers by benign either nodules affected or bydeceased benign (I-1, nodules II-4, II-6) or deceased and two (I-1, were II-4, unaffected II-6) and (II-1,two wereII-7). una ffected (II-1, II-7).

FigureFigure 1. Pedigrees 1. Pedigrees of the of five the non-medullary five non-medullary thyroid thyroid canc cancerer (NMTC)-prone (NMTC)-prone families families analyzed analyzed in in this this study. study.

2.3.2.3. Whole Whole Genome Genome Sequenci Sequencingng and and Variant Variant Evaluation Evaluation WGSWGS for for 23 23 cases cases and and 3 3controls controls was was performed performed us usinging Illumina-based Illumina-based small small read read sequencing sequencing after after ® DNADNA was was isolated isolated from from peripheral peripheral blood blood using using the the QIAamp QIAamp ® DNADNA Mini Mini Kit Kit (Qiagen, (Qiagen, Cat Cat No. No. 51104) 51104) accordingaccording to to the the manufacturer’s manufacturer’s instructions. instructions.

2.4.2.4. Variant Variant Calling Calling Annotation Annotation and and Filtering Filtering SequencingSequencing data data was mappedmapped toto a a reference reference genome (assembly (assembly version version Hs37d5) Hs37d5) using using BWA BWAmem (versionmem (version 0.7.8) and0.7.8) duplicates and duplicates were removed were re usingmoved biobambam using biobambam (version 0.0.148).(version Single 0.0.148). nucleotide Single nucleotidevariants (SNVs) variants and (SNVs) indels and were indels called were from called all thefrom samples all the insamples a family in togethera family usingtogether Platypus using Platypus(version 0.8.1).(version ANNOVAR, 0.8.1). ANNOVAR, 1000 Genomes, 1000 dbSNPGenomes, and dbSNP ExAC (Exomeand ExAC Aggregation (Exome Aggregation Consortium) Consortium)were used in were the annotationused in the ofannotation variants asof explainedvariants as in explained detail in ourin detail previous in our paper previous [14]. paper Variants [14]. to Variantsbe evaluated to be furtherevaluated were further selected were using selected the using following the following criteria: (i)criteria: A quality i) A scorequality greater score thangreater 20, thanand a20, coverage and a coverage greaterthan greater 5x; (ii)than All 5x; Platypus ii) All Platypus filters were filters met. were Variants met. withVariants a minor with allele a minor frequency allele frequency(MAF) less (MAF) than 0.1less % than in 1000 0.1 % genome in 1000 and genome ExAC-nonTCGA and ExAC-nonTCGA data were data selected were forselected further for analysis.further analysis.A pairwise A pairwise comparison comparison of shared of rareshared variants rare va amongriants theamong cohort the wascohort performed was performed to check to for check sample for sampleswaps andswaps family and family relatedness. relatedness. Biomolecules 2019, 9, 605 4 of 20 Biomolecules 2019, 9, x 4 of 23

2.5.2.5. Variant Variant Filtering Filtering following following the the FCVPPv2 FCVPPv2 VariantVariant evaluation was performedperformed usingusing thethe criteriacriteriaof of our our in-house in-house developed developed Familial Familial Cancer Cancer VariantVariant PrioritizationPrioritization Pipeline Pipeline v2 (FCVPPv2)v2 (FCVPPv2) [14]. [14]. This processThis process is summarized is summarized in Figure 2in and Figure explained 2 and explainedin the following in the following text. text.

FigureFigure 2. Summary 2. Summary of the of familial the familial cancer cancer variant variant prioritization prioritization pipeline versionpipeline 2 (FCVPPv2).version 2 (FCVPPv2). 2.5.1. Segregation in Pedigrees 2.5.1. TheSegregation variants in were Pedigrees filtered based on pedigree data considering family members diagnosed with NMTCThe or variants micro-PTC were as filtered cases, benignbased on nodules pedigree or goiterdata considering as potential family variant members carriers and diagnosed unaffected with NMTCmembers or asmicro-PTC controls. as The cases, probability benign ofnodules an individual or goiter being as potential a Mendelian variant case carriers or true and control unaffected was membersconsidered. as Thecontrols. general The rule probability was that variants of an individual had to be present being ina allMendelian cases and case absent or fromtrue allcontrol controls. was considered. The general rule was that variants had to be present in all cases and absent from all controls. 2.5.2. Variant Ranking Using In Silico Tools

2.5.2. AfterVariant filtering Ranking variants Using basedIn Silico on Tools pedigree segregation, the CADD tool v1.3 [15] was applied. Variants with a scaled PHRED-like CADD score greater than 10, which accounts for the top 10% of After filtering variants based on pedigree segregation, the CADD tool v1.3 [15] was applied. probable deleterious variants in the human genome, were prioritized. Variants were then selected Variants with a scaled PHRED-like CADD score greater than 10, which accounts for the top 10% of according to their conservation scores. High evolutionary conservation suggests functional importance probable deleterious variants in the human genome, were prioritized. Variants were then selected of a position. Genomic Evolutionary Rate Profiling (GERP), PhastCons and PhyloP were used to assess according to their conservation scores. High evolutionary conservation suggests functional importance conservation of the variant position, whereby GERP scores >2.0, PhastCons scores >0.3 and PhyloP of a position. Genomic Evolutionary Rate Profiling (GERP), PhastCons and PhyloP were used to assess scores >3.0 indicate a high level of conservation and are therefore used as thresholds in the selection of conservation of the variant position, whereby GERP scores >2.0, PhastCons scores >0.3 and PhyloP potentially causative variants. After that, all missense variants were assessed for deleteriousness using scores >3.0 indicate a high level of conservation and are therefore used as thresholds in the selection of the following tools: SIFT, PolyPhen V2-HDIV, PolyPhen V2-HVAR, LRT, MutationTaster, Mutation potentially causative variants. After that, all missense variants were assessed for deleteriousness using Assessor, FATHMM, MetaSVM, MetLR, PROVEAN, VEST3 and RI using dbNSFP [16]. Variants the following tools: SIFT, PolyPhen V2-HDIV, PolyPhen V2-HVAR, LRT, MutationTaster, Mutation predicted to be deleterious by at least 60% of these tools were shortlisted for further analysis. Lastly, Assessor, FATHMM, MetaSVM, MetLR, PROVEAN, VEST3 and RI using dbNSFP [16]. Variants intolerance scores were considered. These were merely used to rank the variants and not as cutoffs predicted to be deleterious by at least 60% of these tools were shortlisted for further analysis. Lastly, for selection. The ranking of variants according to the intolerance scores of the corresponding genes intolerance scores were considered. These were merely used to rank the variants and not as cutoffs for relies on the assumption that a variant in a gene intolerant to functional genetic variation is more selection. The ranking of variants according to the intolerance scores of the corresponding genes relies Biomolecules 2019, 9, 605 5 of 20 likely to be deleterious than one that is tolerant to functional variation. We used three intolerance scores based on NHLBI-ESP6500, ExAC datasets and a local dataset, all of which were developed with allele frequency data. The ExAC consortium has developed two additional scoring systems using large-scale exome sequencing data including intolerance scores (pLI) for loss-of-function variants and Z-scores for missense and synonymous variants. These were used for nonsense and missense variants respectively. In our final list, we also included missense variants in known tumor suppressor genes and oncogenes independent of their deleteriousness and intolerance scores. However, all variants had to meet previous cut-offs, i.e., MAF >0.1, pedigree segregation, CADD-PHRED >10 and positive conservation scores.

2.5.3. Analysis of Non-Coding Variants Non-coding regions make up over 98% of the genome and possess millions of potentially regulatory elements and noncoding RNA genes. Hence it is crucial to analyze the potential pathogenic impact of such variants in a Mendelian disease. Putative miRNA targets at variant positions within 30 untranslated regions (UTRs) and 1 kb downstream of transcription end sites were detected by scanning the entire dataset of the human miRNA target atlas from TargetScan 7.0 [17] with the help of the intersect function of bedtools. We scanned the 5UTRs and 1 kb regions upstream of transcription start sites for transcription factor binding sites using SNPnexus (version 3; Dec 2017) [18]. For regulatory variants, we merged enhancer [19] and promoter [20,21] data from the FANTOM5 consortium and super-enhancer data from the super-enhancer archive (SEA) [22] and dbSUPER [23] using the intersect function of bedtools to identify putative enhancers, promoters and super-enhancers in our dataset. We accessed epigenomic data and marks from 127 cell lines from the NIH Roadmap Epigenomics Mapping Consortium via CADD v.1.3 [15], which gave us information on chromatin states from ChromHmm [24] and Segway [25]. The CADD analysis of 30 UTRs also gave us mirSVR scores for putative miRNA targets; a score lower than -0.1 is indicative of a “good” miRNA target [26]. Furthermore, we used SNPnexus to obtain non-coding scores for each variant and to identify regulatory variants located in CpG islands. Top 3 ’UTR and downstream variants that had CADD scores >10 and miRNA target site matches with mirSVR scores < 0.1 were short-listed. Similarly, upstream and 5 − 0 UTR variants in enhancers, promoters, super-enhancers or transcription factor binding sites with CADD scores >10 were selected.

2.6. Variant Validation In order to increase the confidence in variant calls and reduce the risk of false positives, we visually inspected the sequencing data of all short-listed variants for correctness using the Integrative Genomics Viewer (IGV; version 2.4.10) [27].

2.7. Ingenuity Pathway Analysis (IPA) IPA (Qiagen; http://www.qiagen.com/ingenuity; analysis date 08/04/2019) was used to perform a core analysis to identify relationships, mechanisms, functions, networks, and pathways relevant to the genes affected by variants that passed the mean allele frequency cut-off, fulfilled family-based segregation criteria, had CADD scores >10 and were not intergenic or intronic variants. Data were analyzed for all five families together. Top canonical pathways were identified from the IPA pathway library and ranked according to their significance to our input data. This significance was determined by p-values calculated using the right tailed Fisher’s exact test. These values indicated the probability of association of genes from the input dataset with the canonical pathway by random chance alone. Ratios were also calculated for each pathway by dividing the number of genes from the input dataset that map to the pathway by the total number of genes in that pathway. The ratios did not influence the ranking of the canonical pathways. IPA was also used to generate gene networks in which upstream regulators were connected to the input dataset genes while taking advantage of paths that involved more than one link Biomolecules 2019, 9, 605 6 of 20

(i.e., through intermediate regulators). These connections represent experimentally observed cause-effect relationships that relate to expression, transcription, activation, molecular modification and transport as well as binding events.

2.8. STRING Analysis A protein-protein interaction network was generated for each of the prioritized candidates using STRING (https://string-db.org; v11, 19/01/2019).

3. Results

3.1. Whole Genome Sequencing In this study, five families with reported recurrence of NMTC were analyzed. WGS identified a total of 112254, 207873, 120323, 91427 and 101081 variants which were reduced by pedigree-based filtering to 6368, 9373, 3123, 7060 and 2708 in families 1-5, respectively. Non-synonymous SNVs were the most common exonic variants (Figure S1).

3.2. Final Prioritization of Candidates according to the FCVPPv2 After applying the FCVPPv2, the number of potential pathogenic protein coding variants was reduced to 31. These variants are listed in Table1. A number of genes are of high significance to our study as they are either related to cancer or play a role in thyroid metabolism. CHEK2 is a known tumor suppressor gene involved in DNA damage response [28]. EWSR1 generates a powerful oncogenic protein causing Ewing sarcoma [29], RET is a proto-oncogene well-known in hereditary medullary thyroid carcinoma NRP1 is known to be positively associated with the progression of breast cancer [30], POT1 is a known predisposing gene in malignant [31] and TG encodes the precursor of iodinated thyroid hormones and is associated with susceptibility to autoimmune thyroid diseases (AITD) [32]. FCVPPv2 also identified 14 upstream and 50 UTR variants, which are shown in Table2. Among them, three variants are of particular interest in thyroid cancer. The PCM1 variant is a 50 UTR variant that our data showed to affect three transcription factor binding sites (Egr-3, AP-2alphaA and AP-2 gamma). Chromosomal aberrations involving this gene have been associated with PTC and a variety of hematological malignancies [33]. The other 50 UTR variant is located in the P4HB gene which is known to be involved in the structural modification of the thyroglobulin precursor in hormone biogenesis [34]. Both variants are present in CpG islands and have been predicted to be localized at an active transcription start site by ChromHmm and Segway. The third variant is an upstream variant in the DAPL1 gene, shown to affect the binding sites of MAZR and Sp1, a potential tumor suppressor in thyroid cancer, by SNPnexus and Segway. Furthermore, 25 variants located downstream and in 30 UTRs were shortlisted (Table3). Among them, two genes of importance can be highlighted, namely ACVR1B and NLK. Mutations in the ACVR1B gene are associated with pancreatic cancer [35]. The variant in the 30UTR of ACVR1B is localized at a target site for miR-6871-5p with a context ++ percentile score of 53, indicating a relatively good context for repression of the mRNA due to this miRNA. Altered expression of NLK is associated with cancer development and has been shown to be an independent prognostic factor in [36]. The corresponding variant to this gene has two predicted miRNA target sites for miR-6818-5p and miR-6867-5p with high context ++ percentile scores (88 and 79, respectively). Variants prioritized by the FCVPPv2 were also present in pathways, networks, and disease categories shown to be significantly enriched in FNMTC by IPA. Biomolecules 2019, 9, 605 7 of 20

Table 1. Top exonic variants prioritized following the FCVPPv2. Chromosomal positions, classifications, PHRED-like CADD scores and the percentage of positive intolerance (Int) and deleteriousness (Del) scores are included for each variant. Additional information regarding protein-protein interactions (STRING), localization in protein domains (InterPro [37]) and the biological function of the respective protein (GeneCards [38]) is included.

Exonic CADD_PHRED Int Del Family Gene Chrom_Pos_Ref_Alt Interactions (STRING) Domain Function Classification Score (%) (%) ATM, ATR, CDC25C, Serine/ DNA repair, cell cycle arrest or nonsynonymous CDC25A, TP53BP1, TP53, 1 CHEK2 22_29107974_C_T 24.8 75 42 threonine-protein apoptosis in response to DNA SNV MRE11A, BRCA1, RAD50, kinase-like domain damage; tumor suppressor gene H2AFX Pyrimidine nucleotide-sugar nonsynonymous SCAMP3, PRKAA1, Nucleotide-sugar transmembrane transporter activity, 1 SLC35A4 5_139947647_T_C 26.5 50 75 SNV SLC35B2, SLC35D2, ABCB10 transporter sialic acid transmembrane transporter activity STX4, SNAP23, STXBP2, Phospoholipase A2 inhibition, nonsynonymous ANXA11, ANXA4, FPR1, Annexin repeat, anti-coagulant properties, formation 1 ANXA3 4_79531211_C_T 27.9 50 75 SNV CACNA1B, NLRP3, SUMF1, conserved site of inositol 1-phosphate from inositol FPR2 1,2-cyclic phosphate BARD1, ETV1, TAF5, TAF5L, nonsynonymous , cell signaling, RNA 1 EWSR1 22_29687556_C_A 22.7 100 75 FUS, TAF12, DHX9, TP53, NA SNV processing and transport; oncogene PIOK2, POLR2G INVS, LEFTY2, DNAH11, Involved in the genetic cascade that nonsynonymous CCDC102B, EN1, CCDC178, governs left-right specification and 1 RTTN 18_67776873_G_A 26.7 25 83 NA SNV L3MBTL4, CHML, CHM, in the maintenance of a normal DLL1 ciliary structure. Modulates the activity of Rho CDC42, SRC, RAC1, EFNB1, GTP-binding , connects nonsynonymous Dbl homology (DH) 1 TIAM1 21_32526579_G_A 35 100 92 RAC2, NME1, EPHA2, extracellular signals to cytoskeletal SNV domain RHOA, PARD6A, ARF6 activities, activates Rac1, CDC42, and to a lesser extent RhoA. carbohydrate binding, MAN2C1, NAAA, SIAE, nonsynonymous Glycosyl hydrolase alpha-mannosidase activity, 1 MAN2B2 4_6612617_C_T 34 25 100 GLB1L3, GLB1, PYGB, PYGL, SNV family 38, C-terminal involved in metabolism and other PYGM, NAGA glycan degradation nonsynonymous Epidermal growth Ca2+ independent binding of 2 CLEC18B 16_74446758_G_A 23.3 50 67 FRAS1, LEO1, FREM2 SNV factor-like domain polysaccharides Member of GPCR family 1, receptor HTR7, NPS, AVP, VIP, ADM, nonsynonymous for prostacyclin, elicits potent 2 PTGIR 19_47124811_C_T 35 100 67 AVPR2, ADRB2, PTH, NA SNV vasodilation and inhibition of ADCY6, GNB1 platelet aggregation Biomolecules 2019, 9, 605 8 of 20

Table 1. Cont.

Exonic CADD_PHRED Int Del Family Gene Chrom_Pos_Ref_Alt Interactions (STRING) Domain Function Classification Score (%) (%) Novel regulator of senescence, involved in DNA damage/telomere ASF1A, HIRA, CABIN1, RB1, nonsynonymous Ubinuclein middle stress induced senescence and 2 UBN1 16_4911084_G_A 34 75 67 TP53, EP400, HMGA1, SNV domain , required for HMGA2, H1F0, HIST1H1C replication independent chromatin assembly MUC7, MUC1, C1GALT1, Catalyzes the initial reaction in nonsynonymous MUC5AC, GCNT1, 2 GALNT10 5_153789322_G_C 24.6 100 67 Ricin B-related lectin O-linked oligosaccharide SNV ST6GALNAC1, B3GNT6, biosynthesis MUC2, MUC16, C1GALT1C1 CALB1, CA7, DECR1, Possibly involved in meiosis or the nonsynonymous 2 OSGIN2 8_90921899_A_T 23.7 100 67 DECR2, CALB2, NBN, NA maturation of germ cells, associated SNV SLC39A3 with retinitis pigmentosa Precursors of iodinated thyroid TPO, LRP2, TSHR, ASGR1, hormones (T4) and triiodothyronine nonsynonymous Thyroglobulin type-1 2 TG 8_133900661_A_C 25 0 75 NKX2-1, INS, SLC5A5, PAX8, (T3), associated with susceptibility to SNV domain ASGR2, ALB autoimmune thyroid diseases (AITD) Pyridine GPX1, GPX3, GPX2, CAT, nucleotide-disulphide nonsynonymous Oxidoreductase activity and flavin 2 GSR 8_30585111_C_T 34 100 75 GPX4, GSS, GPX7, HPGDS, oxidoreductase, SNV adenine dinucleotide binding TXN, ACLY FAD/NAD(P)-binding domain Sodium/Chloride/Calcium-activated nonsynonymous GPR55, C11orf40, ASRGL1, potassium channel subunit, 2 KCNT1 9_138676399_A_G 11.1 100 75 NA SNV SLC11A1 activated upon stimulation of GPCRs COPS5, GPKOW, CNIH4, nonsynonymous COPS6, PDE7A, CNIH3, Galactose oxidase, Involved in the ubiquination process, 2 KLHL18 3_47385160_A_G 27.4 100 75 SNV PDE/B, PDE6D, EEF1G, beta-propeller specific role has yet to be elucidated CNIH2 CDRT1: a protein-ubiquitin ligase; WD40/YVTN CDRT1, nonsynonymous RP11: a component of the 2 17_15501921_G_A 25.3 - 83 - repeat-like-containing RP11-385D13.1 SNV complex, one of several domain retinitis pigmentosa-causing genes Biomolecules 2019, 9, 605 9 of 20

Table 1. Cont.

Exonic CADD_PHRED Int Del Family Gene Chrom_Pos_Ref_Alt Interactions (STRING) Domain Function Classification Score (%) (%) GDNF, GFRA1, NRTN, Proto-oncogene, receptor tyrosine nonsynonymous SHC1, PSPN, PIK3CA, kinase; involved in cell 2 RET 10_43600559_T_C 26.3 75 83 Cadherin-like domain SNV GFRA2, PIK3CD, PIK3CB, differentiation, growth, migration GRB2 and survival Tetrodotoxin-resistant channel that SCN5A, CALM2, SCN8A, mediates the voltage-dependent nonsynonymous SCN2A, SCN11A, SCH3A, 2 SCN10A 3_38755465_C_A 35 50 92 Ion transport sodium ion permeability of excitable SNV SCN1A, SCN9A, SCN4A, membranes, plays a role in SCN1B neuropathic pain mechanisms nonsynonymous Possible involvement in the 3 C1orf27 1_186355211_G_A 25.1 0 67 DRAM1, PID1, TXLNG ODR-4-like domain SNV trafficking of a subset of GPCRs Binds collagen, involved in nonsynonymous Peptidase M14, adipogenesis through extracellular 3 CPXM1 20_2776248_C_T 32 100 75 FAM196A, PPP2R2B, SEC13 SNV carboxypeptidase A matrix remodeling, may act as a TSG in breast cancer POTEE, POTEI, POTEJ, nonsynonymous POTEF, SKIV2L, CFHR4, May be involved in transcriptional 3 ZBTB41 1_197128680_C_T 23.1 100 75 NA SNV RIPK4, PHLPP2, PHLPP1, regulation C7orf73 NCOA2, NCOA4, KLK3, Steroid-hormone activated nonframeshift KDM1A, FOXA1, SRC, transcription factor. Stimulates 3 AR X_66765158_T_TGCAGCAGCA 12.8 67 - insertion HSP90AA1, FKBP5, NCOA1, domain transcription of androgen responsive CCND1 genes. TMEM2, CUEDC1, PKHD1, nonsynonymous PKD1P1, C2orf74, Signaling receptor activity, immune 4 PKHD1L1 8_110477162_G_A 27.5 0 100 NA SNV RAD21-AS1, FAM135B, response CSMD3, MUM1L1, HSPA12B Metalloprotease involved in the Peptidase M13, generation of functionally RPS6KA2, EDN3, EDNRA, nonsynonymous neprilysin, pleiotropic members of the 4 ECE2 3_184008594_G_C 32 75 100 DHX40, MYSM1. EDNRB, SNV C-terminal/Metallopeptidase, endothelin vasoactive family, EDN1, EZR, LARP6, PRKCE catalytic domain possibly involved in amyloid-beta processing Biomolecules 2019, 9, 605 10 of 20

Table 1. Cont.

Exonic CADD_PHRED Int Del Family Gene Chrom_Pos_Ref_Alt Interactions (STRING) Domain Function Classification Score (%) (%) RIPK4, PPIE, POTEI, POTEE, Regulates fibrillogenesis by nonsynonymous POTEJ, POTEF, PRKAR1B, 5 EPYC 12_91365726_C_G 27 25 67 Leucine-rich repeat interacting with collagen fibrils and SNV PRKAR1A, CNBD2, other extracellular matrix proteins PRKAR2B SPARC, MMP16, FST, Calcium ion binding, cysteine-type nonsynonymous MMP14, SPARCL1, MMP2, Proteinase inhibitor I1, endopeptidase inhibitor activity, 5 SPOCK1 5_136448179_G_A 25.7 100 67 SNV CITED2, CHD1L, CFTR, Kazal cell-cell interactions, may contribute HMCN1 to various neuronal mechanisms MYH3, TTN, TNNT3, NEB, Member of the myosin-binding nonsynonymous Immunoglobulin 5 MYBPC1 12_102046527_A_G 25.9 100 67 TNNI2, DMD, MYL1, protein C family, binds actin and SNV subtype TMOD4, TNNI1, MYL3 titin, modulates muscle contraction ALDH2, ALDH3A2, nonsynonymous EHHADH, ACLY, ECHDC1, AMP-dependent Activates acetate for use in lipid 5 ACSS3 12_81593172_T_G 32 100 83 SNV ACADM, ALDH6A1, synthetase/ligase synthesis or energy generation ALDH9A1, ALDH1B1 Membrane-bound coreceptor to a SEMA3A, KDR, FLT1, tyrosine kinase receptor for both nonsynonymous PLXNA1, PLXNA2, Neuropilin-1, VEGF and semaphorin family 5 NRP1 10_33469205_G_C 24.2 75 83 SNV SEMA3C, PLXNA4, SEMA3F, C-terminal members; plays roles in PLXNA3, SEMA3E angiogenesis, guidance, cell survival, migration and invasion Member of the complex; involved in regulating telomere TERF1, TINF2, ACD, TERF2, nonsynonymous Nucleic acid-binding, length and protecting 5 POT1 7_124532359_C_A 32 50 92 TERF2IP, RAD50, MRE11A, SNV OB-fold ends from illegitimate H2AFx, DCLRE1B, BRCA1 recombination, catastrophic instability and abnormal segregation Biomolecules 2019, 9, 605 11 of 20

Table 2. Top upstream and 50 UTR variants prioritized according to the FCVPPv2. Variant annotation, chromosomal position, and regulatory consequences according to FANTOM5, SEA, CADD and SNPnexus are listed. The FANTOM5 database gives information on known promoters. CADD gives an overall deleteriousness score together with chromatin state information based on ChromHmm and Segway scores and information on transcription factor binding sites (TFBSs). Location of the variants within a specific TFBS and CpG island were obtained from SNPnexus. A cumulative non-coding score is shown as a percentage of positive scores from all scores listed in the footnote. Cut-offs for these scores are also indicated in the footnote.

Variant Details FANTOM5, SEA Annotations From CADD SNPnexus Chromatin StateII TFBS I Promoter/Enhancer_ Start..End, CADD_PHRED In a CpG Non-coding F Gene Variation_ Annotation Chrom_Pos_Ref_Alt Chrom-Hmm TFBS TFs III Strand Score Score Segway TFBS Island? scores (%) State PeaksI Egr-3, 1 PCM1 SNV_UTR5 8_17780410_G_A 17.2 TssA 0.95 TSS 50 92 AP-2alphaA, Yes 67 − AP-2gamma 1 STAP1 SNV_UTR5 4_68424468_A_G Promoter_68424462..68424469,+ 15.4 Quies 0.71 GM0 18 24 No 71 − 1 DAPL1 SNV_ Upstream 2_159651789_C_T 13.1 TF0 1 2 MAZR, Sp1 No 50 − − − 2 LRRC48 SNV_UTR5 17_17876279_G_T 10.8 TssA 0.803 GS 28 56 No 50 − − 2 P4HB SNV_UTR5 17_79818442_T_G 11.4 TssA 0.945 TSS 51 78 Yes 60 − − 2 FAM118B SNV_UTR5 11_126081608_C_T 10.5 TssA 0.969 TSS 60 129 Yes 33 − − 2 AZIN1 SNV_UTR5 8_103876327_G_A 12.4 TssA 0.929 TSS 47 76 No 50 − − 2 RPS3A SNV_UTR5 4_152020778_C_G Promoter_152020736..152020788,+ 16.0 TssA 0.961 TSS 81 184 Yes 86 − 3 C20orf194 SNV_ Upstream 20_3388577_C_A 13.4 TssA 0.921 TSS 17 26 Egr-2, Egr-3 Yes 50 − 4 DNAI1 SNV_UTR5 9_34458888_T_C Promoter_34458851..34458908,+ 14.4 TssAFlnk 0.575 GM1 10 13 Yes 71 − 4 PNPLA2 SNV_UTR5 11_819602_G_C Promoter_819601..819612,+ 10.6 TssA 0.945 TSS 38 65 Yes 50 − 4 GNB2 Indel_UTR5 7_100271438_G_GCGCCGCCGCCGC 17.5 TssA 0.992 TSS 65 115 CUTL-1 Yes 25 − CREB, delta 4 PHTF1 SNV_UTR5 1_114301745_G_T 16.2 TssA 0.961 TSS 20 28 Yes 50 − CREB 4 ZKSCAN1 SNV_UTR5 7_99613211_C_G 21.4 TssA 0.937 TSS 65 140 Elk-1, LCR-F1 Yes 67 − [I] = Family ID, [II] = ChromHmm and Segway; ChromHmm shows the proportion of 127 cell types in a particular chromatin state (x). Scores closer to 1 indicate a higher proportion of cell types in the specified chromatin state. X can be the following: active transcription start sites (TssA), enhancers (Enh), bivalent TSS (TssBiv), bivalent enhancers (EnhBiv), genic enhancers (EnhG), flanking transcription states (TxFlnk), flanking bivalent TSS (TssBiv), active transcription flanking sites (TssAFlnk), transcription states (Tx) and weak transcription states (TxWk), repressed polycomb (ReprPC) and weak repressed polycomb regions (PeprPCWk), heterochromatin (Het) and quiescent regions (Quies). Segway is a software that transforms multiple datasets on chromatin properties into a single annotation of the genome. The annotations can be as follows: D: dead, F: FAIRE, R: repression, H3K9me1: 3 lysine 9 monomethylation, L: low, GE: gene end, TF: transcription factors, C: CTCF, TSS: transcription start site, GS: gene start, E: enhancer, GM: gene middle and ZnfRpts: zinc finger repeats. [III] = Non-coding scores with their cut-offs in brackets: FitCons Score ( 0.2), FitCons P-Value ( 0.05), EIGEN (>0, at least 1 of 2 must be positive), FatHMM (>0.5), GWAVA (>0.4, at least 2 of 3 must be positive), DeepSEA (>0.5, at least 2 of 3 must be positive),≥ FunSeq2 (>3), ReMM≤ (>0.5). [IV] = TFBS peaks: regions with enrichment of transcription factor binding sites (TFBS). Biomolecules 2019, 9, 605 12 of 20

Table 3. Top downstream and 30 UTR variants prioritized according to the FCVPPv2. Variant annotation, chromosomal position, and regulatory consequences according to TargetScan, CADD and SNPnexus are listed. Information on miRNA target sites from TargetScan and chromatin states from CADD are also included. A cumulative non-coding score is shown as a percentage of positive scores from all scores listed in the footnote. Cut-offs for these scores are also indicated in the footnote.

Variant Details TargetScan Annotations from CADD SNP-nexus Context Chromatin StateIII I CADD_ PHRED Non-Coding F Gene Variation_ Annotation Chrom_Pos_Ref_Alt miRNA Target Sites Score ++ Site Type mirSVR-Score IV Score Chrom-Hmm Score Segway Scores (%) PercentileII 1 DESI2 SNV_UTR3 1_244872281_A_G DESI2:miR-3651 94 7mer-m8 -0.84 15.6 TxWk 0.73 R2 60 7mer-1a, 1 DPYSL3 SNV_UTR3 5_146770537_A_T DPYSL3:miR-4693-5p, DPYSL3:miR-4768-3p, DPYSL3:miR-6888-5p 20, 52, 59 7mer-m8, -0.24 11.1 L1 40 − − 7mer-m8 1 MECP2 SNV_UTR3 X_153295452_G_A MECP2:miR-6812-3p 72 7mer-1a NA 10.4 Tx 0.46 TF2 25 RYK:miR-548aq-3p/548am-3p/548aj-3p/548ah-3p/548ae-3p/548j-3p/548x-3p; 7mer-m8, 1 RYK SNV_UTR3 3_133876591_C_T 93, 95 1.25 12.7 TxWk 0.50 F1 80 RYK:miR-5582-3p 7mer-m8 − 7mer-m8, 1 SGTB SNV_UTR3 5_64965337_A_C SGTB:miR-3187-3p, SGTB:miR-4529-5p 84, 46 0.75 16.8 TxWk 0.68 GE0 67 7mer-1a − 1 SLC25A12 SNV_UTR3 2_172641178_G_A SLC25A12:miR-3622b-5p 62 7mer-1a 0.31 15.1 GE1 60 − − − 2 ACVR1B SNV_UTR3 12_52388057_A_G ACVR1B:miR-6871-5p 53 7mer-m8 14.8 60 − − − 2 NCAM2 Indel_UTR3 21_22913891_AT_A NCAM2:miR-6885-3p 46 7mer-m8 NA 11.3 Quies 0.99 F0 50 2 NOP2 SNV_UTR3 12_6666047_A_T NOP2:miR-3662 98 7mer-1a 1.29 14.2 Tx 0.48 GE0 50 − 2 NUPL1 SNV_UTR3 13_25909315_T_C NUPL1:miR-3145-3p 69 8mer 11.3 80 − − − − 7mer-m8, 2 PNPLA8 SNV_UTR3 7_108112453_A_G PNPLA8:miR-3163, PNPLA8:miR-4668-3p, PNPLA8:miR-551b-5p 65, 56, 62 1.25 13.3 TxWk 0.53 F0 80 7mer-1, 7mer-m8 − 8mer, 7mer-1a, 2 STK32A SNV_UTR3 5_146763869_G_A STK32A:miR-4484, STK32A:miR-548an, STK32A:miR-6768-3p 99, 80, 74 NA 11.8 Quies 0.48 F1 40 7mer-1a 2 SVEP1 SNV_UTR3 9_113128472_T_C SVEP1:miR-1468-3p 96 7mer-m8 1.32 17.0 Quies 0.77 F1 60 − 2 TFCP2 Indel_UTR3 12_51487616_A_AACAC TFCP2:miR-8485 95 7mer-m8 NA 10.2 Tx, TxWk 0.47, 0.52 GE0 67 2 MRPL51 SNV_ downstream 12_6600160_C_T MRPL51: miR-6802-3p 90 7mer-m8 NA 13.4 TxWk 0.63 H3K9 me1 50 2 ZNF45 SNV_ncRNA_UTR3 19_44417402_A_G ZNF45: miR-6777-3p 96 8mer 0.39 11.3 ZnfRpts 0.78 GE1 60 − 3 NLK Indel_UTR3 17_26522009_T_TCACA NLK:miR-6818-5p, NLK:miR-6867-5p 88, 79 7mer-m8, 8mer 0.62 11.7 TxWk 0.63 TF1 100 − 4 ADAMTS1 SNV_UTR3 21_28208629_T_C ADAMTS1:miR-325, ADAMTS1:miR-628-3p 88, 97 7mer-1a, 8mer 1.31 16.0 TxWk 0.58 F1 60 − 7mer-1a, 4 GRIA2 SNV_UTR3 4_158284635_G_A GRIA2:miR-486-5p, GRIA2:miR-7152-5p 88, 84 0.87 22.3 Quies 0.84 L1 60 7mer-m8 − 8mer,7mer-m8, 4 IGSF9 Indel_UTR3 1_159896866_TCACA_T IGSF9:miR-377-3p, IGSF9:miR-5582-3p, IGSF9:miR-8485 98, 82, -1 0.92 17.0 - TF1 50 3’compensatory − − 7mer-m8, 4 MPP6 SNV_UTR3 7_24727611_A_G MPP6:miR-138-2-3p, MPP6:miR-205-3p, MPP6:miR-498 93, 50, 48 7mer-1a, NA 15.5 TxWk, Quies 0.50, 0.45 GE0 60 7mer-1a 4 ZNF532 SNV_UTR3 18_56651809_T_C ZNF532: miR-1277-5p 53 7mer-m8 0.86 15.2 TxWk 0.73 R0 80 − 7mer-1a, 5 KLF7 Indel_UTR3 2_207945783_ATATGTG_A KLF7:miR-511-3p, KLF7:miR-223-5p 82, 59 1.10 11.9 Tx 0.73 F1 50 7mer-1a − 7mer-m8, SATB2:miR-3156-5p, SATB2:miR-3126-3p, 37, 86, 74, 7mer-m8, 5 SATB2 SNV_UTR3 2_200134548_A_G -1.22 15.2 Quies 0.74 F1 60 SATB2:miR-4720-5p/4799-3p/5588-5p, SATB2:miR-3128, SATB2:miR-6868-5p 76, 83 7mer-1a, 8mer, 7mer-1a 5 ZNF608 SNV_ downstream 5_123972606_C_A ZNF608: miR-4786-3p 87 7mer-m8 NA 16.8 TxWk 0.69 D 60 [I] = Family ID, [II] = Context score ++ percentile: a higher percentile score indicates a better context for repression of an mRNA due to a miRNA, [III] = ChromHmm and Segway; ChromHmm shows the proportion of 127 cell types in a particular chromatin state (x). Scores closer to 1 indicate a higher proportion of cell types in the specified chromatin state. X can be the following: active transcription start sites (TssA), enhancers (Enh), bivalent TSS (TssBiv), bivalent enhancers (EnhBiv), genic enhancers (EnhG), flanking transcription states (TxFlnk), flanking bivalent TSS (TssBiv), active transcription flanking sites (TssAFlnk), transcription states (Tx) and weak transcription states (TxWk), repressed polycomb (ReprPC) and weak repressed polycomb regions (PeprPCWk), heterochromatin (Het) and quiescent regions (Quies). Segway is a software that transforms multiple datasets on chromatin properties into a single annotation of the genome. The annotations can be as follows: D: dead, F: FAIRE, R: repression, H3K9me1: histone 3 lysine 9 monomethylation, L: low, GE: gene end, TF: transcription factors, C: CTCF, TSS: transcription start site, GS: gene start, E: enhancer, GM: gene middle and ZnfRpts: zinc finger repeats. [IV] = Non-coding scores with their cut-offs in brackets: FitCons Score ( 0.2), FitCons P-Value ( 0.05), EIGEN (> 0, at least 1 of 2 must be positive), FatHMM (>0.5), GWAVA (>0.4, at least 2 of 3 must be positive), DeepSEA (>0.5, at least 2 of 3 must be positive),≥ FunSeq2 (>3), ReMM≤ (>0.5).

Biomolecules 2019, 9, 605 13 of 20

Biomolecules 2019, 9, x 16 of 23 3.3. Ingenuity Pathway Analysis (IPA) Shows Enrichment of GPCR and RTK Mediated Pathways 3.3. Ingenuity Pathway Analysis (IPA) Shows Enrichment of GPCR and RTK Mediated Pathways In order to identify key biological functions and signaling pathways affected in FNMTC, we filtered theIn variants order accordingto identify to key pedigree biological segregation, functions CADD and signaling scores and pathways location, affected excluding in intronicFNMTC, and we filteredintergenic the variants variants. according The variants to pedigree were in segregation, 339 genes, with CADD 92, 122,scores 14, and 72 andlocation, 39 genes excluding coming intronic from andfamilies intergenic 1-5 respectively. variants. The Of variants these genes, were 210 in 339 gene ge IDsnes, could with 92, be mapped122, 14, 72 by and IPA 39 and genes were coming part of from the familiessubsequent 1-5 respectively. analysis (Table Of S1).these The genes, remaining 210 gene 129 IDs genes could were be uncharacterized mapped by IPA genes and were with part RP11 of IDs, the subsequentand thus could analysis not be(Table mapped. S1). The remaining 129 genes were uncharacterized genes with RP11 IDs, and thusOf thecould top not 150 be diseases mapped. and bio functions, 123 were cancer-related with thyroid cancer at position Of the top 1505 diseases and bio functions, 123 were cancer-related5 with thyroid cancer at position 99 (p = 3.17 10− ), NMTC at position 120 (p = 6.39 10− ), differentiated thyroid cancer (DTC) at 99 (p= 3.17 × 10×−5), NMTC at position5 120 (p=6.39 × 10−5),× differentiated 5thyroid cancer (DTC) at position position 125 (p = 7.88 10− ) and PTC at position 148 (p = 2.16 10− ) (Table S1B). There was a high −5 × −5 × 125overlap (p=7.88 of molecules× 10 ) andamong PTC at the position four thyroid 148 (p= cancer 2.16 × related10 ) (Table categories. S1B). There This overlap was a high of eight overlap genes of moleculesincluded among two genes the prioritizedfour thyroid using cancer our related pipeline categories. (RET and ThisTG overlap), that areof eight of particular genes included interest two in genesthyroid prioritized cancer. using our pipeline (RET and TG), that are of particular interest in thyroid cancer. WithWith the the aim aim of of evaluating the canonical pathwaypathway resultsresults toto determinedetermine thethe mostmost significantsignificant pathwayspathways in in our our dataset, dataset, we we created created a network a network of the of top the 18 top overlapping 18 overlapping canonical canonical pathways pathways (Table S1C,(Table Figure S1C, 3). Figure The 3threshold). The threshold of common of common genes betw geneseen between the pathways the pathways was set was at 2. set G-protein at 2. G-protein coupled receptorcoupled (GPCR) receptor and (GPCR) receptor and tyrosine receptor kinase tyrosine (RTK kinase) mediated (RTK) mediatedpathways, pathways, as major mediators as major mediators of thyroid cancerof thyroid development, cancer development, were represented were represented by 12 pathwa by 12ys pathways (Figure 3). (Figure The 3genes). The involved genes involved in the intop the 18 pathwaystop 18 pathways along with along their with correspondi their correspondingng variants variants are listed are in listed Table in S2. Table S2.

Figure 3. Top 18 overlapping canonical pathways visualized as a network, which shows each pathway Figureas a single 3. Top “node” 18 overlapping colored proportionally canonical pathways to the Fisher’s visualized Exact as Test a network, p-value, wherewhich brighter shows each red indicates higher significance. Nodes marked with asterisk (*) belong to the group of GPCR and RTK pathway as a single “node” colored proportionally to the Fisher’s Exact Test p-value, where mediated pathways. brighter red indicates higher significance. Nodes marked with asterisk (*) belong to the group 3.4.of Network GPCR Analysisand RTK Reinforces mediated the pathways. Central Role of PI3K/AKT and MAPK/ERK Signaling in FNMTC

3.4. NetworkWe conducted Analysis a Reinforces network analysisthe Central using Role the of IPAPI3K/AKT software and to MAPK/ERK predict interacting Signaling molecular in FNMTC networks significant to our input-data and to evaluate genes with a central role in FNMTC (Figure4, Table S1D). SinceWe the conducted IPA network a analysisnetwork includesanalysis paths using with the intermediate IPA software regulators to predict that involveinteracting more molecular than one networkslink, a comprehensive significant to our picture input-data of the possibleand to evalua gene interactionste genes with was a central generated. role in The FNMTC networks (Figure were 4, Tableranked S1D). according Since the to scoresIPA network that were analysis generated includes by considering paths with the intermediate number of focus regulators genes (inputthat involve data) more than one link, a comprehensive picture of the possible gene interactions was generated. The networks were ranked according to scores that were generated by considering the number of focus

Biomolecules 2019, 9, 605 14 of 20 Biomolecules 2019, 9, x 17 of 23 and the size of the network to approximate the relevance of the network to the original list of focus genes (input data) and the size of the network to approximate the relevance of the network to the genes. We focused on the three highest scoring networks, which had scores ranging from 33 to 51 original list of focus genes. We focused on the three highest scoring networks, which had scores ranging (Table S1D). from 33 to 51 (Table S1D). In coherence with the pathway analysis, the network analysis reinforces the importance of In coherence with the pathway analysis, the network analysis reinforces the importance of central central perpetrators of GPCR and RTK mediated signaling (AKT, ERK1/2: Networks 1 & 3) and their perpetrators of GPCR and RTK mediated signaling (AKT, ERK1/2: Networks 1 & 3) and their κ downstreamdownstream e ffeffectorsectors (NF (NFκB,B, CREB: CREB: Network Network 2).2). Furthermore,Furthermore, NetworkNetwork 3 encompassesencompasses a number of of genesgenes related related to to thyroid thyroid metabolism metabolism includingincluding TG from our our prioritized prioritized shortlist. shortlist.

FigureFigure 4. 4.The The top top three three molecular molecular networks networks identified identified by Ingenuity by Ingenuity Pathway AnalysisPathway (IPA): Analysis (a) Network (IPA): 1.(a Protein) Network Synthesis, 1. Protein Cardiovascular Synthesis, Cardiovascular System Development System andDevelopment Function, and Cellular Function, Assembly Cellular and Organization;Assembly and (b) Organization; Network 2. Cell (b) Morphology, Network 2. Cellular Cell Morphology, Assembly and Cellular Organization, Assembly Cellular and DevelopmentOrganization, and Cellular (c) Network Development 3. Endocrine and System (c) Disorders,Network Metabolic3. Endocrine Disease, System Organismal Disorders, Injury andMetabolic Abnormalities. Disease, Genes Organismal from our Injury input-data and thatAbno werermalities. prioritized Genes based from on pedigreeour input-data segregation that andwere PHRED-like prioritized CADD based scores on pedigree are shown segregat in peach.ion and Our PHRED-like top coding and CADD non-coding scores candidatesare shown are in highlightedpeach. Our in darktop orange.coding Interactions and non-coding of central cand genesidates of the networkare highlighted are highlighted in dark in blue. orange. Interactions of central genes of the network are highlighted in blue. 3.5. Overlapping Pathways in Familial Non-Medullary Thyroid Cancer 3.5.Since Overlapping GPCR andPathways RTK mediatedin Familialsignaling Non-Medullary were highlightedThyroid Cancer in both pathway and network analyses, we propose a pathway to facilitate a general understanding of FNMTC at a molecular level (Figure5).

Biomolecules 2019, 9, x 18 of 23

BiomoleculesSince2019 GPCR, 9, 605and RTK mediated signaling were highlighted in both pathway and network analyses,15 of 20 we propose a pathway to facilitate a general understanding of FNMTC at a molecular level (Figure 5).

FigureFigure 5.5. Proposed model forfor thethe mostmost importantimportant molecularmolecular mechanismsmechanismsin inFNMTC. FNMTC. Genes Genes from ourfrom input-data our input-data are highlighted are highlighted in orange andin orange genes corresponding and genes corresponding to variants prioritized to variants using the FCVPPv2prioritized are using highlighted the FCVPPv in red.2 are highlighted in red.

ActivationActivation of of GPCR GPCR receptors receptors can can activate activate MAPK MAPK/ERK/ERK signaling signaling as well as well as PI3K/AKT as PI3K/AKT signaling signaling via one of the four subclasses of G-proteins (Gαs, Gαi/o, Gαq/11, and Gα12/13). Dimerization of receptor- via one of the four subclasses of G-proteins (Gαs,Gαi/o,Gαq/11, and Gα12/13). Dimerization of receptor-tyrosinetyrosine kinase (RTK) kinase receptors (RTK) receptorscan be induced can be by induced growth byfactors growth such factors as EGFR such and as GDNF, EGFR andwhich GDNF, whichresultsresults in the in phosphorylation the phosphorylation and subsequent and subsequent activation activation of the of recept the receptoror monomers. monomers. Receptor Receptor activation is linked to downstream signal transduction pathways like the MAPK signaling cascade and activation is linked to downstream signal transduction pathways like the MAPK signaling cascade the PI3K/AKT system via adaptor proteins. Genes from our dataset that were present in these pathways and the PI3K/AKT system via adaptor proteins. Genes from our dataset that were present in these as activators or regulators are highlighted in Figure 5. pathways as activators or regulators are highlighted in Figure5. 4. Discussion 4. Discussion The high heritability of thyroid cancer can be attributed to both rare, high-penetrance mutations and common,The high low-penetrance heritability of thyroidvariants cancer [4,13]. canThe beformer attributed is best toidentified both rare, by studying high-penetrance families with mutations a andMendelian common, pattern low-penetrance of inheritance variants of the disease [4,13]. Thein question. former isWe best used identified this principle by studying in our study families and with aidentified Mendelian 31 patternexonic and ofinheritance 39 non-coding ofthe rare disease potentially in question. pathogenic We variants used this segregating principle with in our the study anddisease identified in five PTC-prone 31 exonic andfamilies. 39 non-coding rare potentially pathogenic variants segregating with the diseaseScientific in five and PTC-prone technological families. advancements in genomics have allowed WGS to become the state-of- the-artScientific tool not andonly technologicalfor the identification advancements of driver mutations in genomics in tumors have but allowed also for WGS the identification to become the state-of-the-artof novel cancer toolpredisposing not only genes for the in identificationMendelian diseases. of driver The former mutations has led in tumorsto improvements but also forin the identificationpersonalized medicine, of novel wherein cancer ther predisposingapeutic approaches genes inare Mendelian based on targeting diseases. dysregulated The former pathways has led to improvements in personalized medicine, wherein therapeutic approaches are based on targeting dysregulated pathways specific to the affected individual. There are also some reports of WGS being successfully used to implicate rare, high-penetrance germline variants in cancer, for example POT1 mutations in familial melanoma [39] and POLE and POLD1 mutations in colorectal adenomas and carcinomas [40]. Identification of cancer-predisposing mutations is a critical step in cancer risk Biomolecules 2019, 9, 605 16 of 20 assessment and can help in cancer screening and prevention strategies. Furthermore, the implication of predisposition genes and their respective pathways may facilitate development of targeted therapy. However, one has to be critical in reporting novel variants before appropriate functional validation and evaluation of their penetrance in a large cohort of families. The importance of this step is exemplified by controversial findings regarding the implication of HABP2 G534E in familial NMTC [41]. Some of the genes shortlisted based on FCVPPv2 have already been identified in other cancers. These include CHEK2 mutations in breast cancers and also in a variety of other cancers including thyroid cancer [28], EWSR1 in Ewing sarcoma [29], RET in hereditary medullary thyroid carcinoma, NRP1 in breast cancer [30] and germline POT1 variants in malignant melanoma [31]. Moreover, it is interesting to note that the expression of NRP2, an important paralog of the NRP1 gene, has been correlated to lymph node metastasis of human PTC and is required in the VEGF-C/NRP2 mediated invasion and migration of thyroid cancer cells [42]. The upstream variant in the DAPL1 gene is shown to affect the binding sites of MAZR and Sp1 by SNPnexus and Segway. MAZR1, also known as PATZ1, has been shown to be downregulated and delocalized in thyroid cancer cell lines derived from papillary, follicular and anaplastic thyroid carcinomas [43]. Another study has demonstrated the role of PATZ1 as a tumor suppressor in thyroid follicular epithelial cells and its involvement in the dedifferentiation of thyroid cancer [44]. Other genes of interest shortlisted based on the pipeline (PNPLA8, PTGIR, RET, GNB2 and POT1) were involved in the enrichment of MAPK/ERK and PI3K/AKT pathways. The MAPK pathway is the most frequently mutated signaling pathway in human cancer and is thus considered one of the most promising targets for cancer therapy. This pathway plays a central role in the induction of biological responses such as cell proliferation, differentiation, growth, migration and apoptosis [45]. Initiated by an extracellular mitogenic stimulus that leads to the activation of RTK or GPCR, the MAPK/ERK pathway leads to the phosphorylation and subsequent translocation of ERK into the nucleus. ERK activation plays a central role in the induction of cell cycle entry and the suppression of negative regulators of the cell cycle [46]. Although MEK1 and MEK2 can be activated by multiple MAP kinase kinase kinases (MAP3Ks) as well as by RAF, they serve as sole activators of ERK1/2 and thus as gatekeepers of the MAPK cascade [47]. Overexpression or aberrant activation of RTKs or their immediate downstream targets (PI3K, RAS and SRC) can result in the upregulation of the MAPK/ERK signaling pathway [48]. A common somatic mutation in this pathway is BRAFV600E, which has been implicated in melanoma [49], thyroid and colorectal cancer [50] and hairy cell leukemia [51]. The importance of the PI3K/AKT pathway in thyroid cancer was first recognized when patients suffering from Cowden’s syndrome caused by a germline mutation in the PTEN gene were found to have FTC [52]. PI3K activation phosphorylates and activates AKT which can have numerous downstream effects via activation or inhibition of multiple proteins that are involved in , proliferation, motility, adhesion, angiogenesis, metabolism and apoptosis. Furthermore, our findings are in line with recent studies on PTC tissues and PTC cell lines have implicated activation of MAPK/ERK and PI3K/AKT pathways in thyroid carcinogenesis [53–55]. Interestingly, somatic alterations that lead to the activation of the MAPK pathway as well as of the PI3K/AKT pathway are common in aggressive thyroid cancers, such as metastatic or recurrent PTC/FTC and ATC [56]. The targeting of downstream RAS effectors has already been shown to be a promising approach, however patients treated with RAF or MEK inhibitors frequently develop drug resistance [47]. Targeting the downstream ERK kinase, which is also known as the gatekeeper of the MAPK cascade, can overcome the acquired drug resistance induced by upstream kinase inhibitors [57]. In this context, it is also important to note the similarity between our proposed model for the molecular mechanisms in FNMTC and the reported molecular mechanisms in non-familial NMTC. It is known that patients with familial NMTC may have a more aggressive form of the disease, with larger tumors in younger patients and increased rates of extra-thyroid extension and lymph node metastasis. This suggests that FNMTC should be explored further to gain a better understanding of the cause of increased aggressiveness. However, none of the variants were identified in more than one family. As the phenotypes of our Biomolecules 2019, 9, 605 17 of 20 families differed (as described in Figure1), it is likely that also the mutations causing the disease in the families also are different. We analyzed only 5 families and no other WGS data on FNMTC are available, thus restricting the possibility to confirm the variants in larger data sets. Functional analysis of promising candidates highlighted in this study may shed some light to the mechanisms underlying this phenomenon. Interpreting WGS data and selecting one out of millions of genetic variants as the cause of hereditary cancer is a daunting task and highlights the importance of the use of a standardized protocol like the FCVPPv2. We were able to prioritize 31 exonic and 39 non-coding potential cancer-predisposing variants using our family-based pipeline from which we hope to pinpoint one candidate gene for each family. The final selection and implication of one candidate gene predisposing to cancer in each family is beyond the scope of this paper as it will involve further steps including population screening and functional studies. In the present study, we decided to focus on the analysis of pathways that are enriched in familial NMTC to see how the variants prioritized using our pipeline fit into the general pathway analysis results. The IPA analysis of all genes already presented us with valuable data and there was a high involvement of genes prioritized using our pipeline in the top diseases and bio functions, canonical pathways and networks generated by IPA. Although IPA could give us a general idea of molecular pathways affected in the studied families, it is important to keep in mind that the analysis was conducted at a gene level and not at a variant level. The evaluation at a variant level is largely dependent on the pipeline and its subsequent steps as mentioned above. We have already successfully implemented this pipeline to identify DICER1 as a candidate predisposing gene in familial Hodgkin lymphoma [58] and are confident that our pipeline can be applied to the NMTC families in a similar manner.

5. Conclusions In conclusion, WGS data analysis of five NMTC-prone families allowed us to prioritize 31 exonic and 39 non-coding variants from which we subsequently hope to identify one candidate gene per family. Furthermore, we were able to identify pathways and networks significant to our dataset, including important tumorigenic pathways such as MAPK/ERK and PI3K/AKT signaling pathways. The implication of previously reported tumorigenic signaling pathways and the presence of known tumor suppressor or oncogenes in these affected pathways show that the pathogenesis of FNMTC is in concordance with characteristic molecular mechanisms of cancer. The next steps will include selecting one candidate gene per family and validating it with the help of population screening and functional studies. We hope that our results can facilitate personalized therapy in the studied families and contribute to the screening of other individuals at risk of developing NMTC.

Supplementary Materials: The following are available online at http://www.mdpi.com/2218-273X/9/10/605/s1, Figure S1: Variant Distribution, Table S1: Ingenuity Pathway Analysis Results, Table S2: Gene List from Top 20 Canonical Pathways. Author Contributions: Conceptualization, O.R.B.; A.F. and K.H.; Data curation, A.S.; A.K.; O.R.B. and S.G.; Formal analysis, A.S. and O.R.B.; Funding acquisition, K.H.; Investigation, A.S. and O.R.B.; Methodology, A.S.; S.G. and E.B.; Project administration O.R.B.; K.H. and A.F.; Resources, E.B. and A.F.; Software, A.K.; Supervision, O.R.B. and A.F.; Validation, S.G.; Writing—original draft, A.S. and O.R.B.; review and editing, K.H. and A.F. Funding: This research received no external funding. Acknowledgments: We thank the Genomics and Proteomics Core Facility (GPCF) of the German Cancer Research Center (DKFZ), for providing excellent library preparation and sequencing services. We also thank the Omics IT and Data management Core Facility (ODCF) of the DKFZ for the whole genome sequencing data management. Conflicts of Interest: The authors declare no conflict of interest. Biomolecules 2019, 9, 605 18 of 20

References

1. Bray, F.; Ferlay, J.; Soerjomataram, I.; Siegel, R.L.; Torre, L.A.; Jemal, A. Global cancer statistics 2018: Globocan estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2018, 68, 394–424. [CrossRef][PubMed] 2. Lubina, A.; Cohen, O.; Barchana, M.; Liphshiz, I.; Vered, I.; Sadetzki, S.; Karasik, A. Time trends of incidence rates of thyroid cancer in israel: What might explain the sharp increase. Thyroid 2006, 16, 1033–1040. [CrossRef][PubMed] 3. Malandrino, P.; Scollo, C.; Marturano, I.; Russo, M.; Tavarelli, M.; Attard, M.; Richiusa, P.; Violi, M.A.; Dardanoni, G.; Vigneri, R.; et al. Descriptive epidemiology of human thyroid cancer: Experience from a regional registry and the “volcanic factor”. Front. Endocrinol. 2013, 4, 65. [CrossRef][PubMed] 4. Frank, C.; Sundquist, J.; Yu, H.; Hemminki, A.; Hemminki, K. Concordant and discordant familial cancer: Familial risks, proportions and population impact. Int. J. Cancer 2017, 140, 1510–1516. [CrossRef][PubMed] 5. Grossman, R.F.; Tu, S.H.; Duh, Q.Y.; Siperstein, A.E.; Novosolov, F.; Clark, O.H. Familial nonmedullary thyroid cancer. An emerging entity that warrants aggressive treatment. Arch. Surg. 1995, 130, 892–897. [CrossRef][PubMed] 6. Peiling Yang, S.; Ngeow, J. Familial non-medullary thyroid cancer: Unraveling the genetic maze. Endocrine-related cancer 2016, 23, R577–r595. [CrossRef] 7. Gara, S.K.; Jia, L.; Merino, M.J.; Agarwal, S.K.; Zhang, L.; Cam, M.; Patel, D.; Kebebew, E. Germline habp2 mutation causing familial nonmedullary thyroid cancer. N. Engl. J. Med. 2015, 373, 448–455. [CrossRef] 8. He, H.; Bronisz, A.; Liyanarachchi, S.; Nagy,R.; Li, W.; Huang, Y.; Akagi, K.; Saji, M.; Kula, D.; Wojcicka, A.; et al. Srgap1 is a candidate gene for papillary thyroid carcinoma susceptibility. J. Clin. Endocrinol. Metab. 2013, 98, E973–E980. [CrossRef] 9. Ngan, E.S.; Lang, B.H.; Liu, T.; Shum, C.K.; So, M.T.; Lau, D.K.; Leon, T.Y.; Cherny, S.S.; Tsai, S.Y.; Lo, C.Y.; et al. A germline mutation (a339v) in thyroid transcription factor-1 (titf-1/nkx2.1) in patients with multinodular goiter and papillary thyroid carcinoma. J. Natl. Cancer Inst. 2009, 101, 162–175. [CrossRef] 10. Pereira, J.S.; da Silva, J.G.; Tomaz, R.A.; Pinto, A.E.; Bugalho, M.J.; Leite, V.; Cavaco, B.M. Identification of a novel germline foxe1 variant in patients with familial non-medullary thyroid carcinoma (fnmtc). Endocrine 2015, 49, 204–214. [CrossRef] 11. Tomsic, J.; He, H.; Akagi, K.; Liyanarachchi, S.; Pan, Q.; Bertani, B.; Nagy, R.; Symer, D.E.; Blencowe, B.J.; Chapelle, A.d.l. A germline mutation in srrm2, a splicing factor gene, is implicated in papillary thyroid carcinoma predisposition. Sci. Reports 2015, 5, 10566. [CrossRef][PubMed] 12. Capezzone, M.; Cantara, S.; Marchisotta, S.; Filetti, S.; De Santi, M.M.; Rossi, B.; Ronga, G.; Durante, C.; Pacini, F. Short , telomerase reverse transcriptase gene amplification, and increased telomerase activity in the blood of familial papillary thyroid cancer patients. J. Clin. Endocrinol. Metab. 2008, 93, 3950–3957. [CrossRef] [PubMed] 13. Bauer, A.J. Clinical behavior and genetics of nonsyndromic, familial nonmedullary thyroid cancer. Front. Horm. Res. 2013, 41, 141–148. [PubMed] 14. Kumar, A.; Bandapalli, O.R.; Paramasivam, N.; Giangiobbe, S.; Diquigiovanni, C.; Bonora, E.; Eils, R.; Schlesner, M.; Hemminki, K.; Forsti, A. Familial cancer variant prioritization pipeline version 2 (fcvppv2) applied to a papillary thyroid cancer family. Sci. Rep. 2018, 8, 11635. [CrossRef][PubMed] 15. Kircher, M.; Witten, D.M.; Jain, P.; O’Roak, B.J.; Cooper, G.M.; Shendure, J. A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet. 2014, 46, 310–315. [CrossRef][PubMed] 16. Liu, X.; Wu, C.; Li, C.; Boerwinkle, E. Dbnsfp v3.0: A one-stop database of functional predictions and annotations for human nonsynonymous and splice-site snvs. Hum. Mutat. 2016, 37, 235–241. [CrossRef] [PubMed] 17. Agarwal, V.; Bell, G.W.; Nam, J.W.; Bartel, D.P. Predicting effective target sites in mammalian mrnas. eLife 2015, 4.[CrossRef] 18. Dayem Ullah, A.Z.; Oscanoa, J.; Wang, J.; Nagano, A.; Lemoine, N.R.; Chelala, C. Snpnexus: Assessing the functional relevance of genetic variation to facilitate the promise of precision medicine. Nucleic Acids Res. 2018, 46, W109–W113. [CrossRef] Biomolecules 2019, 9, 605 19 of 20

19. Andersson, R.; Gebhard, C.; Miguel-Escalada, I.; Hoof, I.; Bornholdt, J.; Boyd, M.; Chen, Y.; Zhao, X.; Schmidl, C.; Suzuki, T.; et al. An atlas of active enhancers across human cell types and tissues. Nature 2014, 507, 455. [CrossRef] 20. Lizio, M.; Harshbarger, J.; Shimoji, H.; Severin, J.; Kasukawa, T.; Sahin, S.; Abugessaisa, I.; Fukuda, S.; Hori, F.; Ishikawa-Kato, S.; et al. Gateways to the fantom5 promoter level mammalian expression atlas. Genome Biol. 2015, 16, 22. [CrossRef] 21. Li, H.; Durbin, R. Fast and accurate short read alignment with burrows-wheeler transform. Bioinformatics 2009, 25, 1754–1760. [CrossRef][PubMed] 22. Zhang, B.; Wang, F.; Su, J.; Shang, S.; Zhang, S.; Li, S.; Wang, X.; Wei, Y.; Liu, H.; Zhang, Y.; et al. Sea: A super-enhancer archive. Nucleic Acids Res. 2015, 44, D172–D179. 23. Khan, A.; Zhang, X. Dbsuper: A database of super-enhancers in mouse and human genome. Nucleic Acids Res. 2016, 44, D164–D171. [CrossRef][PubMed] 24. Ernst, J.; Kellis, M. Chromhmm: Automating chromatin-state discovery and characterization. Nat. Methods 2012, 9, 215. [CrossRef][PubMed] 25. Hoffman, M.M.; Buske, O.J.; Wang, J.; Weng, Z.; Bilmes, J.A.; Noble, W.S. Unsupervised pattern discovery in human chromatin structure through genomic segmentation. Nat. Methods 2012, 9, 473. [CrossRef][PubMed] 26. Betel, D.; Wilson, M.; Gabow, A.; Marks, D.S.; Sander, C. The microrna.Org resource: Targets and expression. Nucleic Acids Res. 2008, 36, D149–D153. [CrossRef][PubMed] 27. Robinson, J.T.; Thorvaldsdottir, H.; Wenger, A.M.; Zehir, A.; Mesirov, J.P. Variant review with the integrative genomics viewer. Cancer Res. 2017, 77, e31–e34. [CrossRef] 28. Zannini, L.; Delia, D.; Buscemi, G. Chk2 kinase in the DNA damage response and beyond. J. Mol. Cell Biol. 2014, 6, 442–457. [CrossRef] 29. Noujaim, J.; Jones, R.L.; Swansbury, J.; Gonzalez, D.; Benson, C.; Judson, I.; Fisher, C.; Thway, K. The spectrum of ewsr1-rearranged neoplasms at a tertiary sarcoma centre; assessing 772 tumour specimens and the value of current ancillary molecular diagnostic modalities. Br. J. Cancer 2017, 116, 669. [CrossRef] 30. Hu, C.; Jiang, X. Role of nrp-1 in vegf-vegfr2-independent tumorigenesis. Target. Oncol. 2016, 11, 501–505. [CrossRef] 31. Robles-Espinoza, C.D.; Harland, M.; Ramsay, A.J.; Aoude, L.G.; Quesada, V.; Ding, Z.; Pooley, K.A.; Pritchard, A.L.; Tiffen, J.C.; Petljak, M.; et al. Pot1 loss-of-function variants predispose to familial melanoma. Nat. Genet. 2014, 46, 478–481. [CrossRef][PubMed] 32. Effraimidis, G.; Wiersinga, W.M. Mechanisms in endocrinology: Autoimmune thyroid disease: Old and new players. Eur. J. Endocrinol. 2014, 170, R241–R252. [CrossRef][PubMed] 33. O’Leary, N.A.; Wright, M.W.; Brister, J.R.; Ciufo, S.; Haddad, D.; McVeigh, R.; Rajput, B.; Robbertse, B.; Smith-White, B.; Ako-Adjei, D.; et al. Reference sequence (refseq) database at ncbi: Current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 2016, 44, D733–D745. [CrossRef][PubMed] 34. Mezghrani, A.; Courageot, J.; Mani, J.C.; Pugniere, M.; Bastiani, P.; Miquelis, R. Protein-disulfide isomerase (pdi) in frtl5 cells. Ph-dependent thyroglobulin/pdi interactions determine a novel pdi function in the post- of thyrocytes. J. Biol. Chem. 2000, 275, 1920–1929. [CrossRef][PubMed] 35. Su, G.H.; Bansal, R.; Murphy, K.M.; Montgomery, E.; Yeo, C.J.; Hruban, R.H.; Kern, S.E. Acvr1b (alk4, activin receptor type 1b) gene mutations in pancreatic carcinoma. Proc. Natl. Acad. Sci. USA 2001, 98, 3254–3257. [CrossRef][PubMed] 36. Zhang, W.; He, J.; Du, Y.; Gao, X.-H.; Liu, Y.; Liu, Q.-Z.; Chang, W.-J.; Cao, G.-W.; Fu, C.-G. Upregulation of nemo-like kinase is an independent prognostic factor in colorectal cancer. World J. Gastroenterol. 2015, 21, 8836–8847. [CrossRef][PubMed] 37. Mitchell, A.L.; Attwood, T.K.; Babbitt, P.C.; Blum, M.; Bork, P.; Bridge, A.; Brown, S.D.; Chang, H.Y.; El-Gebali, S.; Fraser, M.I.; et al. Interpro in 2019: Improving coverage, classification and access to protein sequence annotations. Nucleic Acids Res. 2019, 47, D351–D360. [CrossRef][PubMed] 38. Stelzer, G.; Rosen, N.; Plaschkes, I.; Zimmerman, S.; Twik, M.; Fishilevich, S.; Stein, T.I.; Nudel, R.; Lieder, I.; Mazor, Y.; et al. The suite: From gene data mining to disease genome sequence analyses. Curr. Protoc. Biol. 2016, 54, 1.30.31–31.30.33. 39. Wong, K.; Robles-Espinoza, C.D.; Rodriguez, D.; Rudat, S.S.; Puig, S.; Potrony, M.; Wong, C.C.; Hewinson, J.; Aguilera, P.; Puig-Butille, J.A.; et al. Association of the pot1 germline missense variant p.I78t with familial melanoma. JAMA Dermatol. 2019, 155, 604–609. [CrossRef][PubMed] Biomolecules 2019, 9, 605 20 of 20

40. Palles, C.; Cazier, J.B.; Howarth, K.M.; Domingo, E.; Jones, A.M.; Broderick, P.; Kemp, Z.; Spain, S.L.; Guarino, E.; Salguero, I.; et al. Germline mutations affecting the proofreading domains of pole and pold1 predispose to colorectal adenomas and carcinomas. Nat. Genet. 2013, 45, 136–144. [CrossRef][PubMed] 41. Ngeow, J.; Eng, C. Habp2 in familial non-medullary thyroid cancer: Will the real mutation please stand up? J. Natl. Cancer Inst. 2016, 108, djw013. [CrossRef][PubMed] 42. Tu, D.G.; Chang, W.W.; Jan, M.S.; Tu, C.W.; Lu, Y.C.; Tai, C.K. Promotion of metastasis of thyroid cancer cells via nrp-2-mediated induction. Oncol. Lett. 2016, 12, 4224–4230. [CrossRef][PubMed] 43. Chiappetta, G.; Valentino, T.; Vitiello, M.; Pasquinelli, R.; Monaco, M.; Palma, G.; Sepe, R.; Luciano, A.; Pallante, P.; Palmieri, D.; et al. Patz1 acts as a tumor suppressor in thyroid cancer via targeting p53-dependent genes involved in emt and cell migration. Oncotarget 2015, 6, 5310–5323. [CrossRef][PubMed] 44. Iesato, A.; Nakamura, T.; Izumi, H.; Uehara, T.; Ito, K.I. Patz1 knockdown enhances malignant phenotype in thyroid epithelial follicular cells and thyroid cancer cells. Oncotarget 2017, 8, 82754–82772. [CrossRef] [PubMed] 45. Morrison, D.K. Map kinase pathways. Cold Spring Harb. Perspect. Biol. 2012, 4, a011254. [CrossRef][PubMed] 46. Chambard, J.C.; Lefloch, R.; Pouyssegur, J.; Lenormand, P. Erk implication in cell cycle regulation. Biochim. Biophys. Acta 2007, 1773, 1299–1310. [CrossRef][PubMed] 47. Liu, F.; Yang, X.; Geng, M.; Huang, M. Targeting erk, an achilles’ heel of the mapk pathway, in cancer therapy. Acta Pharm. Sin. B 2018, 8, 552–562. [CrossRef] 48. Burotto, M.; Chiou, V.L.; Lee, J.M.; Kohn, E.C. The mapk pathway across different malignancies: A new perspective. Cancer 2014, 120, 3446–3456. [CrossRef] 49. Fedorenko, I.V.; Paraiso, K.H.T.; Smalley, K.S.M. Acquired and intrinsic braf inhibitor resistance in braf v600e mutant melanoma. Biochem. Pharmacol. 2011, 82, 201–209. [CrossRef] 50. Davies, H.; Bignell, G.R.; Cox, C.; Stephens, P.; Edkins, S.; Clegg, S.; Teague, J.; Woffendin, H.; Garnett, M.J.; Bottomley, W.; et al. Mutations of the braf gene in human cancer. Nature 2002, 417, 949. [CrossRef] 51. Tiacci, E.; Trifonov, V.; Schiavoni, G.; Holmes, A.; Kern, W.; Martelli, M.P.; Pucciarini, A.; Bigerna, B.; Pacini, R.; Wells, V.A.; et al. Braf mutations in hairy-cell leukemia. N. Engl. J. Med. 2011, 364, 2305–2315. [CrossRef] [PubMed] 52. Benvenga, S.; Koch, C.A. Molecular pathways associated with aggressiveness of papillary thyroid cancer. Curr. Genom. 2014, 15, 162–170. [CrossRef][PubMed] 53. Zhang, J.; Du, Y.; Zhang, X.; Li, M.; Li, X. Downregulation of bancr promotes aggressiveness in papillary thyroid cancer via the mapk and pi3k pathways. J. Cancer 2018, 9, 1318–1328. [CrossRef][PubMed] 54. Liu, Z.; Zhang, J.; Gao, J.; Li, Y. Microrna-4728 mediated regulation of mapk oncogenic signaling in papillary thyroid carcinoma. Saudi J. Biol. Sci. 2018, 25, 986–990. [CrossRef][PubMed] 55. Yu, S.T.; Zhong, Q.; Chen, R.H.; Han, P.; Li, S.B.; Zhang, H.; Yuan, L.; Xia, T.L.; Zeng, M.S.; Huang, X.M. Crlf1 promotes malignant phenotypes of papillary thyroid carcinoma by activating the mapk/erk and pi3k/akt pathways. Cell Death Dis. 2018, 9, 371. [CrossRef] 56. Xing, M. Genetic alterations in the phosphatidylinositol-3 kinase/akt pathway in thyroid cancer. Thyroid 2010, 20, 697–706. [CrossRef] 57. Xue, Y.; Martelotto, L.; Baslan, T.; Vides, A.; Solomon, M.; Mai, T.T.; Chaudhary, N.; Riely, G.J.; Li, B.T.; Scott, K.; et al. An approach to suppress the evolution of resistance in brafv600e-mutant cancer. Nat. Med. 2017, 23, 929. [CrossRef] 58. Bandapalli, O.R.; Paramasivam, N.; Giangiobbe, S.; Kumar, A.; Benisch, W.; Engert, A.; Witzens-Harig, M.; Schlesner, M.; Hemminki, K.; Forsti, A. Whole genome sequencing reveals dicer1 as a candidate predisposing gene in familial hodgkin lymphoma. Int. J. Cancer 2018, 143, 2076–2078. [CrossRef]

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).