Andrea Vo

Genome 465

January 23, 2019

Problem Set 1

Family 1

1) Family 1 is likely to have an autosomal recessive mode of inheritance. The phenotype skips many generations, and both a male and females are affected, which makes dominant or X-linked inheritance unlikely. The affected members are more likely to be homozygous than compound heterozygous because the parents, 1.01 and 1.02, are consanguineous.

2) All candidates fit the autosomal recessive mode of inheritance with homozygosity. Each candidate variant was found by filtering for variants that are homozygous in the affected persons and heterozygous in the parents. The candidates were then narrowed to variants with high PPH2 and GERP scores, excluding variants with a frequency above 1% in 1000 exomes.

Gene       Protein  Type of Mutation  Chromosome  Coordinate   Reference  Variant  Mode of Inheritance

RNF123     S857C    missense          3           49,749,985   C          G        Autosomal Recessive (A.R.)
  Involved in cancer genetics (PMID: 29903916)

QARS       F286L    missense          3           49,138,806   G          T        A.R.
  Causes neurodegenerative disorder with developmental delay and seizures (PMID: 28056632)

SPNS2      R313H    missense|SPL      17          4,436,274    G          A        A.R.
  Involved in bone homeostasis, cancer, immune responses, and vascular development (PMID: 30196234)

ACTR8      A427S    missense          3           53,906,434   C          A        A.R.
  Involved in cell division, DNA, transcription, and translation (UniProtKB - Q9H981)

SERPINA11  G331V    missense          14          94,909,488   C          A        A.R.
  Involved in endopeptidase regulation (UniProtKB - Q86U17)

TCTA       Q54H     missense          3           49,450,021   G          C        A.R.
  Involved with leukemia cancer genetics (UniProtKB - P57738)
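
The filter described in part 2 can be sketched in Python as shown here. This is a minimal sketch rather than the actual workflow used for this problem set: the record layout, the field names, the toy variants (GENE_A to GENE_C), and the score cutoffs are all assumptions made for illustration. It also assumes that a frequency above 1% excludes a variant and that a blank frequency counts as rare.

```python
# Toy records; field names, genes, and values are hypothetical, not the real exome data.
VARIANTS = [
    {"gene": "GENE_A", "pph2": 0.99, "gerp": 3.0, "freq": None,
     "genotypes": {"1.01": "het", "1.02": "het", "1.03": "hom"}},
    {"gene": "GENE_B", "pph2": 0.95, "gerp": 4.5, "freq": 0.20,
     "genotypes": {"1.01": "het", "1.02": "het", "1.03": "hom"}},
    {"gene": "GENE_C", "pph2": 0.97, "gerp": 5.1, "freq": None,
     "genotypes": {"1.01": "het", "1.02": None, "1.03": "het"}},
]


def fits_recessive_homozygous(variant, parents, affected):
    """True if every parent is heterozygous and every affected person is homozygous."""
    calls = variant["genotypes"]
    return (all(calls.get(p) == "het" for p in parents)
            and all(calls.get(a) == "hom" for a in affected))


def passes_scores(variant, min_pph2=0.9, min_gerp=2.0, max_freq=0.01):
    """Assumed cutoffs: high PPH2 and GERP, and a frequency of at most 1% (or blank)."""
    if variant["pph2"] is None or variant["gerp"] is None:
        return False
    rare = variant["freq"] is None or variant["freq"] <= max_freq
    return variant["pph2"] >= min_pph2 and variant["gerp"] >= min_gerp and rare


def recessive_candidates(variants, parents, affected):
    """Genes whose variant fits the inheritance pattern and passes the score filter."""
    return [v["gene"] for v in variants
            if fits_recessive_homozygous(v, parents, affected) and passes_scores(v)]


candidates = recessive_candidates(VARIANTS, parents=["1.01", "1.02"], affected=["1.03"])
print(candidates)
```

In this toy data, GENE_B is dropped for its high frequency and GENE_C because it does not fit the homozygous pattern.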

3) The most promising candidate is the QARS gene. All candidate variants had relatively high GERP scores, indicating conservation among vertebrates at the location of the mutation. The QARS variant has the lowest GERP score of the candidates, 3.0, but a high PPH2 score of 0.990. These values suggest that the mutation would have a significant impact on the protein, making QARS a strong candidate for the disease-causing gene. Furthermore, past studies indicate that QARS is responsible for disease phenotypes similar to those of Family 1. QARS is expressed in the developing fetal brain, and mutations have been linked to progressive microcephaly, cerebral-cerebellar atrophy, intractable seizures, and intellectual disability, as well as other traits (PMID: 28056632). These symptoms match the characteristics of the disease in Family 1, as patient 1.03 has suffered from both seizures and mental retardation since birth. Finally, the QARS variant is homozygous in the affected members, which matches the autosomal recessive mode of inheritance determined from the pedigree.

4) Experiments should be conducted to further confirm the candidate gene and explore the causal variant. QARS codes for the protein Glutamine--tRNA ligase, which is responsible for molecular functions such as ATP binding and controlling protein kinase activity (UniProtKB - P47897). One potential experiment is to replicate the mutation in human model cells and measure protein levels to observe whether the gene is still functional. A second experiment is to run biochemical tests for any disruptive effects on processes such as ATP binding and protein kinase activity.

Family 2

1) Family 2 likely has an autosomal recessive mode of inheritance, as the phenotype skips many generations and there is no obvious sex bias among affected members. The parents of the affected member (2.01, 2.02) are consanguineous, which suggests that their child (2.03) is homozygous for the causal variant. Confidence in this inheritance pattern would increase if there were more affected members showing no sex bias.

2) Candidate genes were found by filtering for variants that are heterozygous in 2.01 and 2.02 and homozygous in 2.03. An initial list of candidate genes was selected for higher PPH2 and GERP values, but none of these genes was supported by promising studies. The best candidate on that list was TMEM184A, which showed potential involvement with two symptoms but no strong link to mtDNA Depletion Syndrome. Going back to the original list based on inheritance pattern, variants with a blank GERP score were checked for frameshift or deletion type. This provided two candidates that could not be ruled out by GERP score.

Gene       Protein     Type of Mutation  Chromosome  Coordinate   Reference  Variant  Mode of Inheritance

TMEM184A   S317F       missense          7           1,587,440    G          A        A.R.
  Plays a role in body weight regulation (PMID: 28811369) and vascular cells (PMID: 28936181)

REV1       327_328del  delIF             2           100,055,293  GAA        -        A.R.
  Involved in DNA repair (UniProtKB - Q9UBZ9)

FBXL4      T441fs      delFS|SPL         6           99,328,495   T          -        A.R.
  Plays a role in mitochondrial DNA depletion syndrome (PMID: 28383868)
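
The second pass described in part 2, in which variants with a blank GERP score are kept only if they are frameshifts or deletions, can be sketched as follows. This is an illustrative sketch under stated assumptions: the record layout, the toy genes, and the GERP cutoff of 2.0 are hypothetical, and the only thing taken from the table is the convention that deletion types start with "del" (as in delIF and delFS).

```python
# Toy records; genes and values are hypothetical, not the real exome data.
VARIANTS = [
    {"gene": "GENE_A", "type": "missense", "gerp": 5.0},
    {"gene": "GENE_B", "type": "missense", "gerp": None},
    {"gene": "GENE_C", "type": "delFS", "gerp": None},
    {"gene": "GENE_D", "type": "delIF", "gerp": None},
    {"gene": "GENE_E", "type": "missense", "gerp": 0.4},
]


def keep_variant(variant, min_gerp=2.0):
    """Keep scored variants above the (assumed) cutoff; keep unscored ones only if deletions."""
    if variant["gerp"] is None:
        # A blank score cannot rule a deletion or frameshift in or out, so it stays.
        return variant["type"].startswith("del")
    return variant["gerp"] >= min_gerp


kept = [v["gene"] for v in VARIANTS if keep_variant(v)]
print(kept)
```

Treating a blank score as "unknown" rather than "low" is the design choice that lets deletion variants stay on the list.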

3) The best candidate gene was FBXL4. The evidence for TMEM184A was weak: its role in body weight regulation could relate to failure to gain weight, and its involvement in vascular cells could potentially relate to anemia and neutropenia, but neither link was convincing. Checking for frameshift and deletion mutations pointed to FBXL4 as the clear candidate. Its PPH2 and GERP scores are blank, which was initially misleading: because the variant is a deletion, there is no substitution for which to calculate a GERP score. With this in mind, studies directly implicate FBXL4 in Mitochondrial DNA Depletion Syndrome and list symptoms seen in Family 2, such as hypertonia, developmental delay, and growth failure (PMID: 28383868). This is backed up by other sources, including UniProt (UniProtKB - Q9UKA2) and another study on Mitochondrial DNA Depletion Syndrome (PMID: 30361041). The strong evidence and the agreement with the mode of inheritance make FBXL4 a strong choice.

4) Although past studies support FBXL4 as a causal gene, further experiments could be done to test this. One proposed experiment is to recreate the deletion in model cells and observe any effect on expression of F-box/LRR-repeat protein 4, the protein for which the gene codes (UniProtKB - Q9UKA2). A second test that could be done with model cells is to mutate the FBXL4 gene and observe any changes in mitochondrial DNA, which would suggest involvement in mtDNA Depletion Syndrome.

Family 3

1) Family 3 likely has a recessive inheritance pattern because both parents of the affected patients are unaffected. X-linked recessive inheritance can be ruled out because both affected offspring are female: each daughter inherits an X chromosome from her unaffected father, and the wild-type allele on it would mask a recessive variant. Because the parents are not consanguineous, the patients’ genotypes are likely compound heterozygous rather than homozygous.

2) Only one candidate gene was found by filtering for a compound heterozygous genotype in the affected members. Both affected members (3.01 and 3.06) and the mother (3.03) were filtered for heterozygosity. The father (3.04) was filtered for blank values, as compound heterozygosity implies that the parents are carriers of different variants. Variants with blank PPH2 and GERP scores were filtered out as a starting point to narrow the list, and only one gene on the list was found to have multiple variants in the whole exome data.

Gene          Protein  Type of Mutation  Chromosome  Coordinate   Reference  Variant  Mode of Inheritance

C10orf2/TWNK  W441G    missense          10          102,749,478  T          G        A.R.

C10orf2/TWNK  V53I     missense          10          102,750,227  G          A        A.R.
  Involved in Perrault syndrome (PMID: 28178980)
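
The compound heterozygous search in part 2 can be sketched as a per-gene grouping. This sketch is a slightly stricter variation on the filter described above, not a record of it: it tracks which parent carries each variant and keeps only genes with at least one maternal and one paternal variant shared by both affected daughters. The record layout and the toy genes are hypothetical; the person IDs are the ones used in the text.

```python
from collections import defaultdict

# Toy records; genes and genotype calls are hypothetical, not the real exome data.
VARIANTS = [
    # GENE_A: one variant from each parent, both present in both affected daughters.
    {"gene": "GENE_A", "genotypes": {"3.01": "het", "3.06": "het", "3.03": "het", "3.04": None}},
    {"gene": "GENE_A", "genotypes": {"3.01": "het", "3.06": "het", "3.03": None, "3.04": "het"}},
    # GENE_B: only a maternal variant, so no second hit.
    {"gene": "GENE_B", "genotypes": {"3.01": "het", "3.06": "het", "3.03": "het", "3.04": None}},
    # GENE_C: both parents carry the same variant, so parental origin is ambiguous.
    {"gene": "GENE_C", "genotypes": {"3.01": "het", "3.06": "het", "3.03": "het", "3.04": "het"}},
]


def carrier_parent(variant, mother, father, affected):
    """Return 'mother' or 'father' if exactly one parent carries the variant, else None."""
    calls = variant["genotypes"]
    if not all(calls.get(a) == "het" for a in affected):
        return None
    if calls.get(mother) == "het" and calls.get(father) is None:
        return "mother"
    if calls.get(father) == "het" and calls.get(mother) is None:
        return "father"
    return None


def compound_het_genes(variants, mother, father, affected):
    """Genes with at least one maternal and one paternal variant in all affected members."""
    origins = defaultdict(set)
    for v in variants:
        origin = carrier_parent(v, mother, father, affected)
        if origin is not None:
            origins[v["gene"]].add(origin)
    return sorted(g for g, seen in origins.items() if seen == {"mother", "father"})


genes = compound_het_genes(VARIANTS, mother="3.03", father="3.04", affected=["3.01", "3.06"])
print(genes)
```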

3) The candidate gene was C10orf2, also called TWNK. TWNK is a likely candidate because it is the only gene on the narrowed list with two variants in the data set. This supports the assumption that the patients are compound heterozygous, as each parent is a carrier of a different mutation. Furthermore, both TWNK variants fall at highly conserved positions. The variant carried by the mother (3.03) has a GERP value of 5.8 and a PPH2 score of 0.965, and the variant carried by the father (3.04) has a GERP value of 4.8 and a PPH2 score of 0.637. The symptoms of Family 3 suggest that the affected members have an autosomal recessive illness called Perrault syndrome, which is characterized by hearing loss and ovarian dysgenesis (PMID: 25254289). Additional studies have also indicated that variants of the TWNK gene are involved in causing Perrault syndrome (PMID: 28178980).

4) TWNK is thought to be involved with mitochondrial DNA metabolism and maintenance (UniProtKB - Q96RR1). To further test this gene, a potential experiment is to mutate the gene in model cells and observe any damage to the mitochondrial DNA. Another option is to mutate TWNK in model animals to try to replicate the phenotype. Physically observable symptoms, such as ovarian dysgenesis, would further support TWNK as a candidate gene.

Family 4

1) Family 4 has an autosomal dominant mode of inheritance. Hyperopia does not skip any generations, and X-linked inheritance can be ruled out as not all the daughters are affected. Because the mother is unaffected and not all of the children are affected, it can be determined that all affected members of the family are heterozygous.

2) Candidate genes for Family 4 were found by filtering all affected members for heterozygosity and all unaffected members as blank, because they do not have the causal variant. From this list, candidates were narrowed to a short list by omitting variants without a GERP score and variants that are silent.

Gene       Protein  Type of Mutation  Chromosome  Coordinate   Reference  Variant  Mode of Inheritance

TRMT13     L180I    missense          1           100,606,444  C          A        Autosomal Dominant (A.D.)
  Codes for tRNA methylase (UniProtKB - Q9NUP7) and no evidence suggested any linked conditions

PPM1J      G488R    missense          1           113,252,841  C          T        A.D.
  Codes for protein phosphatase (UniProtKB - Q5JR12) and is associated with kidney development (PMID: 27920155)

TTN        R3549H   missense          2           179,616,481  C          T        A.D.
  Involved in assembly and function of striated muscle in vertebrates (UniProtKB - Q8WZ42)

FRMD4B     Q832E    missense          3           69,230,407   G          C        A.D.
  Involved with photoreceptor dysplasia (PMID: 29947801), celiac disease (PMID: 2499984), and heart failure (PMID: 20124441)

CLSTN2     R600H    missense          3           140,275,479  G          A        A.D.
  Involved in learning and memory (PMID: 28647593, PMID: 25080189)

MFN1       M299K    missense          3           179,085,369  T          A        A.D.
  Involved in visual development, myopia (PMID: 27609161)

DLC1       D1320E   missense          8           12,947,875   G          T        A.D.
  Plays a role in cell migration and differentiation (UniProtKB - Q96QB1)
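
The dominant filter from part 2 can be sketched in the same style. Again this is a hedged illustration, not the actual procedure: the person IDs (4.01 to 4.04), the toy genes, and the field names are hypothetical, since the pedigree IDs for Family 4 are not listed in this write-up.

```python
# Hypothetical person IDs; which family members are affected is assumed for illustration.
AFFECTED = ["4.01", "4.03"]
UNAFFECTED = ["4.02", "4.04"]

# Toy records; genes and values are hypothetical, not the real exome data.
VARIANTS = [
    {"gene": "GENE_A", "type": "missense", "gerp": 5.2,
     "genotypes": {"4.01": "het", "4.02": None, "4.03": "het", "4.04": None}},
    {"gene": "GENE_B", "type": "silent", "gerp": 4.0,
     "genotypes": {"4.01": "het", "4.02": None, "4.03": "het", "4.04": None}},
    {"gene": "GENE_C", "type": "missense", "gerp": None,
     "genotypes": {"4.01": "het", "4.02": None, "4.03": "het", "4.04": None}},
    {"gene": "GENE_D", "type": "missense", "gerp": 4.8,
     "genotypes": {"4.01": "het", "4.02": "het", "4.03": "het", "4.04": None}},
]


def fits_dominant(variant, affected, unaffected):
    """True if every affected person is heterozygous and every unaffected person is blank."""
    calls = variant["genotypes"]
    return (all(calls.get(p) == "het" for p in affected)
            and all(calls.get(p) is None for p in unaffected))


def on_short_list(variant):
    """Drop variants without a GERP score and silent variants."""
    return variant["gerp"] is not None and variant["type"] != "silent"


short_list = [v["gene"] for v in VARIANTS
              if fits_dominant(v, AFFECTED, UNAFFECTED) and on_short_list(v)]
print(short_list)
```

In this toy data, GENE_B is dropped as silent, GENE_C for its blank GERP score, and GENE_D because an unaffected member carries the variant.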

3) MFN1 was the strongest candidate gene. Its PPH2 and GERP scores were both high, at 0.903 and 5.2 respectively. The MFN1 variant was blank for all frequencies in the databases, which suggests rarity but could also indicate that it has not been tested. Regardless, this gene has the strongest evidence on the list. Although Family 4 is affected by hyperopia and MFN1 has been linked to myopia, that link indicates that MFN1 is involved in visual development (PMID: 27609161). The two conditions are “opposites” of each other, so MFN1’s role in causing myopia supports the hypothesis that it may also be involved in hyperopia. Because no other candidate gene had stronger evidence, MFN1 remains the best candidate gene for hyperopia in Family 4.

4) The MFN1 gene codes for the protein Mitofusin-1 (UniProtKB - Q8IWA4). Replicating the variant in model cells and testing protein levels would help confirm whether the mutation affects the function of the MFN1 gene. A second option is to replicate the mutation in model animals and observe any physical effects, such as changes in eye shape, that cause hyperopia.