
Efficient analysis of mouse genome sequences reveal many nonsense variants

Sophie Steeland (a,b,1), Steven Timmermans (a,b,1), Sara Van Ryckeghem (a,b), Paco Hulpiau (a,b), Yvan Saeys (a,c), Marc Van Montagu (d,e,f,2), Roosmarijn E. Vandenbroucke (a,b,3), and Claude Libert (a,b,2,3)

(a) Inflammation Research Center, Flanders Institute for Biotechnology (VIB), 9052 Ghent, Belgium; (b) Department of Biomedical Molecular Biology, Ghent University, 9052 Ghent, Belgium; (c) Department of Internal Medicine, Ghent University, 9052 Ghent, Belgium; (d) Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium; (e) Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; and (f) International Plant Biotechnology Outreach, VIB, Ghent, Belgium

Contributed by Marc Van Montagu, March 30, 2016 (sent for review December 31, 2015; reviewed by Bruce Beutler, Stefano Bruscoli, Stefan Rose-John, and Klaus Schulze-Osthoff)

GENETICS

Genetic polymorphisms in coding genes play an important role when using mouse inbred strains as research models. They have been shown to influence research results, explain phenotypical differences between inbred strains, and increase the amount of interesting variants present in the many available inbred lines. SPRET/Ei is an inbred strain derived from Mus spretus that has ∼1% sequence difference with the C57BL/6J reference genome. We obtained a listing of all SNPs and insertions/deletions (indels) present in SPRET/Ei from the Mouse Genomes Project (Wellcome Trust Sanger Institute) and processed these data to obtain an overview of all transcripts having nonsynonymous coding sequence variants. We identified 8,883 unique variants affecting 10,096 different transcripts from 6,328 protein-coding genes, which is about 28% of all coding genes. Because only a subset of these variants results in drastic changes in proteins, we focused on variations that are nonsense mutations that ultimately resulted in a gain of a stop codon. These genes were identified by in silico changing the C57BL/6J coding sequences to the SPRET/Ei sequences, converting them to amino acid (AA) sequences, and comparing the AA sequences. All variants and transcripts affected were also stored in a database, which can be browsed using a SPRET/Ei M. spretus variants web tool (www.spretus.org), including a manual. We validated the tool by demonstrating the loss of function of three proteins predicted to be severely truncated, namely Fas, IRAK2, and IFNγR1.

bioinformatics | genomics | mouse | mutations | inflammation

Significance

Genome sequences have been generated for tens of inbred mouse strains. We here describe a user-friendly bioinformatics tool for revealing the sequence variations in protein-coding genes in one or more inbred strains and predicting which proteins are significantly different compared with those of the reference genome of C57BL/6J inbred mice. The analysis of truncated proteins in the strain SPRET/Ei, as an example, results in a user-friendly PDF file. This information is essential for interpreting biological data obtained using a given inbred strain, and confirms that inbred lines are a rich source of variant alleles of protein-coding genes, including knockout alleles.

Author contributions: Y.S., M.V.M., and C.L. designed research; S.S., S.T., S.V.R., and P.H. performed research; P.H. and Y.S. contributed new reagents/analytic tools; S.T., Y.S., R.E.V., and C.L. analyzed data; and M.V.M. and C.L. wrote the paper.

Reviewers: B.B., UT Southwestern Medical Center; S.B., University of Perugia; S.R.-J., Christian-Albrechts-Universität zu Kiel; and K.S.-O., University of Tübingen.

The authors declare no conflict of interest.

(1) S.S. and S.T. contributed equally to this work.

(2) To whom correspondence may be addressed. Email: [email protected] or [email protected].

(3) R.E.V. and C.L. contributed equally to this work.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1605076113/-/DCSupplemental.

The normal function of a given protein in physiology, disease, or development is generally understood by studying the phenotype of a variant form of this protein. This principle is based on the general rule in genetics that we must “study the abnormal to understand the normal.” Hence, introducing and detecting genetic variance are essential. A plethora of technologies have been developed to generate mutations in the genomes of a selected group of model organisms. The mouse is undoubtedly the favorite mammalian model species and, also in mice, techniques have been established that allow mutagenesis of the genome, yielding gain- or loss-of-function mutations, missense mutations, nonsense mutations, knockin alleles, deletions, and so forth (1). By and large, the available technologies use embryonic stem cells and early embryos and require specific training and tools. Additionally, these technologies are expensive, slow, have low efficiency, and often show off-target effects. Even the recent revolutionary clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas9) technology has limitations and off-target effects (2). Besides these targeting techniques, random mutagenesis, for example using the chemical mutagen ethylnitrosourea (ENU), has also increased the number of available mutant mouse strains to many thousands, and many of these have shown their value for the mouse community (3). Besides the problems of generating mouse mutants, there are also issues with phenotyping, keeping mouse colonies alive, archiving them, and distributing mutant strains to interested users (4).

Since Clarence Little showed in the early 20th century that the principle of inbreeding also applies to mice, several hundred inbred mouse strains have been generated (5). Some of these strains display specific phenotypes that are the result of a mutant gene, and in several cases have formed the basis for identifying this mutant allele or opening a new field of research. A well-known example is C3H/HeJ mice, which resist inflammation induced by lipopolysaccharide (LPS) because of a missense mutation in the Tlr4 gene, which codes for Toll-like receptor 4 (TLR4) (5). In another example, MRL/lpr mice were found to develop lupus erythematosus because of an insertion in Fas, the gene coding for the death receptor Fas (first apoptotic signal) (6). Such examples illustrate that the available inbred mouse strains may form a potentially very rich source of spontaneously arisen variants of alleles, and that this rich source remains insufficiently used.

Since the availability of deep-sequencing techniques, whole-genome sequences of several mouse strains have been generated. The Wellcome Trust Sanger Institute has decided to obtain high-quality sequences of 36 commonly used inbred mouse strains and make the data available to the scientific community. We decided to study the sequences of the protein-coding genes of the mouse inbred strain SPRET/Ei, and compared the sequences with C57BL/6J, the standard mouse strain. SPRET/Ei is a relatively

recent inbred strain, derived from Mus spretus, a species of mice closely related to Mus musculus, the species that had been used to generate C57BL/6J and many more strains. Because of the evolutionary distance of about 1.5 million years between M. spretus and M. musculus, a rich degree of genetic polymorphism is found between SPRET/Ei and C57BL/6J, so that 1 bp every 100 bp is different (7). Moreover, SPRET/Ei mice have attracted interest because they display interesting and robust phenotypes. For instance, they are long-lived (8), and resist the development of multiple forms of cancer or lethal inflammation induced by high doses of LPS or of tumor necrosis factor (TNF) (9–11).

In this study, we used the variant data of the Mouse Genomes Project made available by the Wellcome Trust Sanger Institute consisting of information about SNPs and insertions/deletions (indels). More specifically, the data of the M. spretus-derived inbred strain SPRET/Ei were used in this study. We made a list of all protein-coding genes that have nonsilent mutations compared with the C57BL/6J reference genome. We determined the final combined outcome, per transcript, of all mutations present and stored all results in a database that is easily accessible. We validated the tool by demonstrating the loss of function of three proteins predicted to be severely truncated, namely Fas, IRAK2 (interleukin-1 receptor-associated kinase 2), and IFNγR1 (interferon-gamma receptor 1). This tool can be applied to any inbred strain, the data of which are made available in public databases.

www.pnas.org/cgi/doi/10.1073/pnas.1605076113 PNAS Early Edition | 1 of 6 Downloaded by guest on September 25, 2021

Results

Comparative Genomics of the SPRET/Ei and C57BL/6J Reference Genomes. To investigate the genes most likely involved in the extreme phenotypes of the SPRET/Ei mice and to obtain a list of all deviant proteins compared with C57BL/6J, we used the genomic sequences of the C57BL/6J and SPRET/Ei strains released by the Wellcome Trust Sanger Institute (www.sanger.ac.uk/science/data/mouse-genomes-project). We filtered for all SPRET/Ei sequence variants that were predicted to affect protein-coding sequences compared with C57BL/6J. This yielded 8,883 sequence variants, of which 89 were found to be heterozygous. Of these heterozygous variants, 30 have two different variations, one per allele, both different from the reference sequence. If the latter variations are counted as two unique events, the total number of differences found between C57BL/6J and SPRET/Ei coding sequences increases to 8,913. These 8,913 variations are composed of 3,711 SNPs and 5,202 indels. We retained the variants that have a predictable effect on the final protein product for downstream analysis, namely SNPs affecting a stop codon, insertions or deletions in an exon, and missense SNPs. Our focus in particular was on variations that lead or could lead to a premature termination of translation: stop-gain SNPs and frameshift indels (Fig. 1). The effect of these variants on the final protein product can be very large if they occur early in the transcript. All of the stop-gain SNPs and about 40% of the frameshift-causing indels resulted in an early termination of the encoded protein. These truncations range from one amino acid to almost complete deletion. There were a total of 1,848 transcripts leading to a final protein product that was truncated to some degree. In Fig. 2A, we display the distribution of the relative size of the truncations of SPRET/Ei proteins. Because the probability that an early termination will lead to loss of function increases with the size of the deleted portion, transcripts were divided into groups according to the amount of sequence lost. Among the 1,848 transcripts affected by an early stop there were 1,149 transcripts (originating from 865 genes) where the introduction of a premature stop codon reduced the length of the final protein product by 20% or more. For 457 transcripts (355 genes), the truncation size exceeds 75% of the normal C57BL/6J reference protein length (Fig. 2B).

Fig. 1. Overview of the distribution of variants affecting the coding sequences of SPRET/Ei genes. Fifty-one percent of all events are in-frame indels. Stop-gain and frameshift mutations (lifted slices) make up 42% of all events; these were also the variations we focused on further. Stop-loss and missense variants made up the last 7% of all variants present.

Stop-Gain Variants. The 865 genes having one or more transcripts with a large truncation were subjected to a more detailed analysis. A functional gene ontology (GO) enrichment analysis revealed an almost 10-fold enrichment of genes related to response to pheromone (P = 1.24e-10) compared with what would be expected by chance. The responsible genes almost all belong to the olfactory receptor family of genes. We also detected a significant enrichment for genes involved in immune system process (1.7-fold; P = 0.0038). For example, the following genes in SPRET/Ei led to significantly shorter proteins compared with C57BL/6J: Ccl26, Cxcr1, Ackr2, Xcl1, Trem2, Fas, Il17re, Irak2, Cd244, Cr1l, Ifngr1, Lilrb4, Gstt2, A4galt, and Il1f9/Il36g. The presence of nonfunctional or only partially functional genes related to the immune system may help explain previously observed effects, such as resistance to LPS (9). An overview of all proteins that are truncated in SPRET/Ei compared with C57BL/6J is found in Dataset S1. Based on the size of the truncation and the function of the proteins in inflammation, we selected three genes of which the protein is significantly truncated in SPRET/Ei to validate the predictive value of our tool. The genes are Fas (in SPRET/Ei, encoding a protein of 60 rather than 327 amino acids), Irak2 (459 rather than 622 amino acids), and Ifngr1 (190 rather than 477 amino acids).

Data Availability. The data for all 8,883 SNP/indel variants can be obtained per gene from the M. spretus variants web tool at www.spretus.org. Searching by gene name returns the data in the table format described previously. Affected transcripts can be found on the gene details page. Here a PDF report can also be downloaded including essential information per transcript such as a grouped GO annotation, the size of the truncation (if any), and a sequence alignment of the affected positions.

Truncation of Fas Protein Completely Protects SPRET/Ei Mice Against Jo2-Induced Lethality. To test the functionality of the predicted deviant Fas protein in SPRET/Ei mice, we used the agonistic anti-Fas antibody Jo2. Jo2 binds to Fas and induces a lethal hepatitis (12). Intraperitoneal injection of Jo2 rapidly killed C57BL/6J mice within 6.5 h, but the SPRET/Ei mice were resistant against this lethality (Fig. 3A). Moreover, SPRET/Ei mice did not even show a drop in body temperature after injection (Fig. 3B). To assess the degree of liver injury, the mice were killed 6 h after injection for the analysis of their liver and serum. This was the moment the C57BL/6J mice became moribund. As shown in Fig. 3C, aspartate transaminase (AST) and alanine transaminase (ALT) levels in serum, which are measures for hepatotoxicity, were significantly increased in Jo2-treated C57BL/6J mice, whereas there was no increase measurable in

SPRET/Ei mice compared with untreated SPRET/Ei mice. Histological examination of liver tissue of Jo2-treated mice showed no signs of liver damage in SPRET/Ei mice compared with C57BL/6J mice. Hematoxylin and eosin staining showed prominent apoptotic lesions in C57BL/6J mice in contrast to the livers of SPRET/Ei mice, which were completely preserved without any signs of hepatic apoptosis (Fig. 3D). Together, these findings indicate that SPRET/Ei mice are completely protected against Fas-induced hepatotoxicity.

IRAK2 and IFNγR1 Truncation Leads to Aberrant Response to IL1β and IFNγ in SPRET/Ei Mice. The genomic comparison between C57BL/6J and SPRET/Ei mice also revealed premature stop mutations leading to truncated IRAK2 and IFNγR1. IRAK2 functions as an adaptor protein that becomes associated with the interleukin 1 receptor (IL1R) upon IL1β stimulation and is reported to participate in the up-regulation of NF-κB induced by IL1β (13). IFNγR1, on the other hand, is the ligand-binding subunit that heterodimerizes with the signal-transducing subunit IFNγR2 when the ligand IFNγ is present to form a functional signaling receptor (14, 15). To determine the consequences of truncated IRAK2, C57BL/6J and SPRET/Ei mice were injected with IL1β, and spleen and liver were isolated to assess the NF-κB response in those organs after injection. As illustrated in Fig. 4 A and B, both in liver and spleen, the NF-κB response (illustrated by Il6 and Tnfaip3 expression) was abolished in SPRET/Ei mice compared with C57BL/6J mice, which showed full expression of these genes. Truncated IRAK2 also impacted the systemic inflammation. Although IL6 levels were increased in IL1β-injected SPRET/Ei mice compared with PBS-injected mice, levels were significantly lower than those in IL1β-injected C57BL/6J mice (Fig. 4C). To investigate the effects of IFNγR1 abrogation, mice were injected with its ligand, IFNγ. We investigated the expression of typical IFN-stimulated genes (ISGs) 2 h after injection of IFNγ. In spleen, we could not detect any up-regulation of the tested ISGs 2 h after injection, neither in C57BL/6J nor in SPRET/Ei. By contrast, in liver, we detected a clear induction of Il12rb1 and Il23r in IFNγ-injected C57BL/6J mice that was not present in SPRET/Ei mice (Fig. 4D). Also, systemically, aberrant IFNγ signaling led to reduced IL6 levels in SPRET/Ei mice that were significantly lower than those in C57BL/6J mice (Fig. 4E). Based on these results, we can conclude that IL1β and IFNγ signaling, mediated by IRAK2 and IFNγR1, respectively, are abolished in SPRET/Ei mice due to the predicted presence of stop mutations in those genes.

Discussion

Efficient and user-friendly analysis of whole-genome sequence data is both a necessity and a challenge. Obviously, genome data contain a treasure of information, and there are several good reasons why the detailed study of the sequence of the protein-coding genes and their functional annotation of the most popular mouse strains should be easily accessible. A first reason concerns the fact that mouse strains may contain abnormal alleles, which can compromise the interpretation of the phenotype that arises from the targeted mutation of interest. Most knockout (KO) alleles have been generated in ES cells with a 129 genetic background, whereas most scientists want to study their knockout allele in a strain that is better-standardized, for example, C57BL/6J. Thus, scientists backcross the mutant allele from the donor strain, for at least 5 (but preferentially 10) generations into the acceptor strain, C57BL/6J. However, they usually fail to realize that a significant part of the donor genome (129) still surrounds the mutant allele, called passenger DNA, and this may contain deviant alleles of the donor strain. As we recently published (16), a notorious example is the caspase 1 KO mouse, generated in 129 ES cells, which in turn appeared to contain a defective caspase 11 gene closely linked to the caspase 1 gene (Casp1), and thus scientists have been studying caspase 1 phenotypes in mice that appeared to be caspase 11-deficient as well, even after successive backcrossing (17).

A second important reason for knowing the functionality of all protein-coding alleles of a mouse strain is to provide an explanation for the different penetrance or phenotype observed for a targeted allele when studied in different genetic backgrounds. Some spectacular cases have been reported, which are highly likely to derive from the activities of genes in the genetic background, namely by modifier alleles, and the identification of these alleles may be of extreme importance (18).

Fig. 2. Deletion size of stop-gain variants in SPRET/Ei mice. (A) Histogram representing the number of transcripts (a total of 1,848 transcripts that are truncated in SPRET/Ei) and how large the effect is on protein length. The length of the protein in SPRET/Ei is divided by the length of the C57BL/6J protein. Every bar represents a bin with a width of 0.1 (or 10%). The bins go from 0 to 0.1, meaning that the SPRET/Ei protein is at most 10% of the length of the C57BL/6J protein (e.g., 100 AA→10 AA), to 0.9–1, indicating a reduction of the SPRET/Ei protein length to 90% or more of the C57BL/6J length (e.g., 100 AA→90 AA). (B) Distribution of 1,848 transcripts (from 1,410 genes) according to truncation size. Six hundred ninety-nine of the transcripts (38%) truncated in SPRET/Ei produce a protein that is at most 20% shorter than the C57BL/6J protein. About the same number of transcripts (692) code for a protein that is 20–75% shorter in SPRET/Ei than in C57BL/6J. Finally, the remaining 457 transcripts code for a protein that has an extreme truncation, with a loss of at least 75% of the WT AA sequence.
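The in silico procedure summarized in the abstract (substitute the SPRET/Ei base into the C57BL/6J coding sequence, translate, and compare protein lengths) and the truncation grouping of Fig. 2B can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The toy coding sequence, the function names, and the exact bin edges (less than 20%, 20% to under 75%, 75% or more lost) are assumptions made for illustration; only the three pairs of protein lengths come from the text.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_snp(cds: str, pos: int, ref: str, alt: str) -> str:
    """Return the CDS with the base at 0-based `pos` changed from ref to alt."""
    if cds[pos] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {cds[pos]}")
    return cds[:pos] + alt + cds[pos + 1:]


def translate(cds: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def fraction_lost(ref_len: int, alt_len: int) -> float:
    """Fraction of the reference protein length missing in the variant."""
    return 1 - alt_len / ref_len


def truncation_group(lost: float) -> str:
    """Assumed bin edges: <20%, 20% to <75%, and >=75% of the protein lost."""
    if lost < 0.20:
        return "<20%"
    if lost < 0.75:
        return "20-75%"
    return ">=75%"


# Hypothetical toy CDS encoding M-K-Q-W followed by a stop codon.
ref_cds = "ATGAAACAGTGGTAA"
alt_cds = apply_snp(ref_cds, 6, "C", "T")  # CAG -> TAG: a stop-gain SNP
ref_protein = translate(ref_cds)
alt_protein = translate(alt_cds)
toy_lost = fraction_lost(len(ref_protein), len(alt_protein))

# Protein lengths reported in the text: (C57BL/6J, SPRET/Ei) in amino acids.
reported = {"Fas": (327, 60), "Irak2": (622, 459), "Ifngr1": (477, 190)}
summary = {
    gene: (round(100 * fraction_lost(ref, alt), 1),
           truncation_group(fraction_lost(ref, alt)))
    for gene, (ref, alt) in reported.items()
}
```

With the reported lengths, the computed losses agree with the 81.7%, 26.2%, and 60.2% quoted in the Discussion for Fas, IRAK2, and IFNγR1. Indels and transcripts hit by several variants would need more than this single-SNP helper.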

Fig. 3. SPRET/Ei mice are resistant to Jo2-induced lethality and hepatitis. C57BL/6J and SPRET/Ei mice were i.p. injected with 15 μg of Jo2. (A) Lethality of C57BL/6J and SPRET/Ei mice after Jo2 injection. (B) Rectal body temperature of Jo2-injected mice 6 h after injection. (C) AST (Left) and ALT (Right) levels were measured in sera of treated and untreated mice using a Hitachi kit. (D) Representative H&E liver sections. The right-hand panel for each strain zooms in on the left-hand panel. Scale bars represent 100 μm. Bars represent mean ± SEM; *, 0.01 ≤ P < 0.05; ***, 0.0001 ≤ P < 0.001; ****, P < 0.0001; ns, not significant; n = 5–17. Black, C57BL/6J; gray, SPRET/Ei.

Finally, in the search for a saturation of the mouse genome with mutant alleles of protein-coding genes, the mouse community has already generated many thousands of mutant mice using a variety of technologies. High-throughput mutagenesis programs, such as those based on chemical mutagenesis, have led to many thousands of mutants, usually also several per gene. However, there is still room for more mutant alleles. Also in this sense, a comprehensive overview of the sequences of all genes (also non–protein-coding) of the major inbred mouse strains that have been sequenced would be a very interesting piece of information.

The Wellcome Trust Sanger Institute has obtained high-quality genome sequences of 36 inbred mouse strains (19) (www.sanger.ac.uk/science/data/mouse-genomes-project). We here describe a tool, available online, where all genes of the SPRET/Ei strain having variations resulting in changes to the final protein product can be found. We have downloaded and processed the data from the Wellcome Trust Sanger Institute and compiled a list of all variants affecting the coding sequence and all genes affected. Although several tools exist to browse variant data at dbSNP (www.ncbi.nlm.nih.gov/projects/SNP/), Wellcome Trust Sanger Institute (www.sanger.ac.uk), Mouse Genome Informatics (MGI) (www.informatics.jax.org/), and others, ours has several advantages. None of the tools we found takes the effect of the variants to the protein level to identify potentially nonfunctional proteins. The data we used can be browsed on the website from the Mouse Genomes Project (Wellcome Trust Sanger Institute) itself, but our tool is less complex to use, which was one of the reasons we made our own www.spretus.org website. In addition, we also used the most recent data, which do not need to be validated, as our goal was to identify possible nonfunctional proteins in SPRET/Ei. The advantage of using the latest data is illustrated by comparing our findings for the Fas gene with the results obtained with the tool from MGI. This tool uses SNP data from 2013, which explains why it failed to find any SNPs in the SPRET/Ei Fas gene, whereas we clearly show a loss of function. All our results were stored in a database that can be searched by the public (www.spretus.org). In addition, we also determined the final result of all mutations affecting a transcript, as many transcripts are affected by multiple events. We found that SPRET/Ei, a mouse strain known to contain a very significant number of genetic polymorphisms compared with C57BL/6J (7), indeed expresses a wealth of genes coding for proteins that differ in amino acid (AA) sequence or length. Many proteins, for example, are significantly shorter than the C57BL/6J ortholog, because their genes contain a premature stop codon in the reading frame. This is the case in no fewer than 1,848 proteins, of which 457 proteins are more than 75% shorter than the C57BL/6J version. To validate our method, we picked three examples of such stop-gained proteins and tested whether they were indeed nonfunctional in SPRET/Ei mice. These three examples are coding for the death receptor Fas (Fas), the inflammation adaptor molecule IRAK2 (Irak2), and the major IFN-gamma receptor (Ifngr1). These three proteins are all involved in inflammation and immunity and lacked 81.7%, 26.2%, and 60.2%, respectively, of the canonical AA sequence due to the nonsense mutation. In all cases at least one protein domain, as defined in the UniProt database (www.uniprot.org), of the protein was affected (partial or complete deletion).

Fas (CD95) is a member of the TNF-R family, a group of type I transmembrane proteins (20). Fas contains a “death domain” in its cytoplasmic region, which is essential for its induction of apoptosis. Fas ligand (FasL) is the only known physiological ligand of Fas. Fas/FasL signaling plays a critical role in lymphocyte homeostasis (21). Mice with mutations in Fas (lpr mice) or FasL (gld) show massive lymphadenopathy and systemic autoimmunity and glomerulonephrosis. lpr mice are a useful model for studying

Fig. 4. SPRET/Ei mice have altered response to IL1β and IFNγ. C57BL/6J and SPRET/Ei mice were i.p. injected with 1 mg of IL1β or 5,000 IU IFNγ to activate their pathways via IRAK2 or IFNγR1, respectively. Liver, spleen, and blood were isolated 2 h after injection. (A and B) Quantitative (q)PCR analysis of the relevant NF-κB–dependent genes Il6 and Tnfaip3 in spleen (A) and liver (B). (C and E) Using Luminex technology, IL6 levels were measured. qPCR data are normalized to the stable housekeeping genes (HKGs) and represent fold induction normalized to the PBS condition. (D) qPCR analysis of relevant interferon-stimulated gene expression of Nos1, Il12rb1, and Il23r in the liver. For primer sequences, see Table S1. Bars represent mean ± SEM; *, 0.01 ≤ P < 0.05; **, 0.001 ≤ P < 0.01; ***, 0.0001 ≤ P < 0.001; ****, P < 0.0001; n = 5–17. Black, C57BL/6J; gray, SPRET/Ei.

systemic lupus erythematosus (22–24). Jo2 is an anti-mouse Fas agonistic monoclonal antibody that leads, upon injection in mice, to acute, lethal, massive apoptotic hepatitis (12, 25). Because SPRET/Ei mice lack most of the Fas protein sequence, as found using the Spretus tool, we indeed found that, in contrast to C57BL/6J mice, they completely resisted Jo2-induced fulminant liver failure. This was reflected in the complete absence of AST and ALT elevation in serum, and also histopathological analysis of the liver clearly showed that SPRET/Ei mice were protected against necrosis induced by Jo2. SPRET/Ei is not the only mouse inbred line that has no expression of Fas. Also, MRL/lpr mice are devoid of Fas expression, due to a transposon insertion in the Fas gene (22). Because of the failure of lymphocytes to be killed by apoptosis after an immune response, these mice develop an autoimmune pathology characterized by swollen lymph nodes, anti-DNA antibodies in the urine, and premature death. SPRET/Ei mice do not develop such a lupus phenotype, and are long-lived. There is no clear reason for this difference with MRL/lpr, and we speculate that other gene differences, forming the basis of the obvious antiinflammatory status of SPRET/Ei and their lack of cell proliferation in numerous tumor models, protect them from the development of the lupus phenotype.

Additionally, we tested the functionality of the predicted genes Irak2 and Ifngr1 in SPRET/Ei mice. Irak2 KO mice did not show an overt phenotype, but showed impaired NF-κB signaling following TLR stimulation (26). Ifngr1 KO mice also had no overt defects in their immune system or anomalies; however, they were more susceptible to certain viral infections or infection with intracellular bacteria (27). Also, loss of function of human IFNγR1 has been reported and associated with severe susceptibility to poorly virulent mycobacteria that can be fatal in childhood (14), and the severity of this sensitivity was correlated with the degree of deficiency.

The lack of functionality of IRAK2 was tested by injecting IL1β into SPRET/Ei mice, and we studied gene induction in the spleen and liver that results from activation of IL1 receptor followed by IRAK2 activation and gene induction of the NF-κB–dependent genes Il6 and Tnfaip3. As expected from a deficiency in Irak2, decreased induction of these genes and of IL6 appearance in the serum was observed. The partial response of SPRET/Ei in the absence of functional IRAK2 is compatible with the fact that IRAK1 and IRAK2 are cooperative but redundant and that IRAK1 is especially important in the early response to IL1 and IRAK2 to later responses (26). No other defects were found in SPRET/Ei proteins that mediate the IL1 response.

Equally, lack of response was found when IFNγR1 was stimulated after injection of IFNγ into SPRET/Ei mice, and induction of the IFNγ-stimulated response genes Il12rb1 and Il23r was studied, as well as IL6 release in serum of mice. Again, we observed an impaired IFNγ response in SPRET/Ei mice compared with C57BL/6J mice. This impaired response may be (partly) responsible for the poor response of SPRET/Ei mice in models of arthritis (7). The response to IFN described here was not completely absent, however, which may be due to IFNγR2 activities, either directly responding to IFNγ or via a soluble IFNγR1, which in theory could be produced in SPRET/Ei, as the protein in this strain lacks all intracellular domains and the transmembrane domain.

These examples validate the predicted deficiency of SPRET/Ei in the function of Fas, IRAK2, and IFNγR1. The tool described here is also useful for finding candidate deviant genes that may be linked to particular chromosomal loci (for an overview, see Dataset S1) and phenotypes described for the SPRET/Ei strain. For example, the resistance of SPRET/Ei mice was linked to a locus (110–140 Mb) on the X chromosome, which, according to the current analysis, could be due not only to the gene Tsc22d3, as described by Pinheiro et al. (10), but

also to Pim2, an inflammatory protein that has lost 84% of its protein sequence in SPRET/Ei (Dataset S1). The Mouse Phenome Database, generated and hosted by the Jackson Laboratory (MPD; phenome.jax.org), shows a number of obvious phenotypes in SPRET/Ei mice when taking C57BL/6J as a reference. Some of the gene variations that we identified in the SPRET/Ei genome can be directly correlated to SPRET/Ei phenotypes. We give two examples here. First, the MPD shows that the number of white blood cells (WBCs) is lower in SPRET/Ei mice than in C57BL/6J mice. When we select all genes that code for truncated proteins in SPRET/Ei and perform an Ingenuity Pathway Analysis (IPA analysis; www.ingenuity.com), the IPA predicts the number of various WBC types (neutrophils, lymphocytes) to be indeed negatively affected in SPRET/Ei mice. Second, the MPD also shows differences in blood cholesterol level, with SPRET/Ei mice having a lower LDL level. Indeed, based solely on the list of truncated proteins, IPA predicts the level of LDL in SPRET/Ei mice to be lower. In conclusion, using the tools we describe here, lists can be easily generated containing genes that are expected to be defective in a given sequenced mouse strain.

Materials and Methods

For mice, reagents, and experimental procedures, see SI Materials and Methods. Animal experiments were approved by the institutional ethics committee for animal welfare of the Faculty of Science, Ghent University.

Development of an M. spretus Variants Web Tool. It is important that information about the SPRET/Ei coding sequence variants can be accessed with ease. To this end, we developed an M. spretus variants web tool, which is available to the public at www.spretus.org, to allow the data to be browsed in a user-friendly way. A brief description of the tool is provided, and a manual can be found at www.spretus.org by clicking on “examples” and on “here.” The manual is also presented in SI Spretus.org Tool Manual, and a brief outline of the procedure is shown in Fig. S1. The M. spretus variants web tool allows searches in three different ways: by gene name, location, or GO term. An entered keyword is matched against all official gene symbols or GO annotations in the database, depending on the search type. In the case of a gene search, all perfect and partial gene name matches are returned, a GO search returns all terms found by a wildcard-based matching, and a region search returns all genes with a SNP/indel in the specified region. The results are presented to the user in a table listing the variant location, Ensembl ID, gene name, genomic location of the variant, reference nucleotide sequence (C57BL/6J), alternative nucleotide sequence (SPRET/Ei), and depth of sequencing, which can be used as a measure of reliability. The variant location is a link to the Mouse Genomes Project genome browser, where the variant can be visualized. Clicking the Ensembl ID shows a page with more detailed information for that gene. This page also includes a per-transcript overview of all transcripts affected by a SNP or indel, listing the sequence, position in the coding sequence (CDS) and protein, a quality score, and direct consequences of the variation. There is also an option to download a report for a gene as a PDF. This report is generated by a Perl script. It shows the combined effect of all variants on the transcripts of the gene. In case a new stop codon is introduced into the transcript, the region around this new stop will be included as a pairwise sequence alignment in the report.

Bioinformatics Analysis of SNP and Indel Data. Our analyses were performed Statistics. GO enrichment analysis was performed using the GOslim sets for using Mouse Genomes Project data (28, 29), v5 release (REL-1505- molecular function, biological process, and cellular component, with all SNPs_Indels), from the Wellcome Trust Sanger Institute (ftp://ftp-mouse. genes in the genome serving as the background. Functional annotations sanger.ac.uk/REL-1505-SNPs_Indels/). The data in the variant files for indels (GOSlim) for proteins were acquired from BioMart (www.Biomart.org). We (mgp.v5.merged.indels.dbSNP142.normed.vcf.gz)andSNPs(mgp.v5.merged. corrected these analyses for multiple testing with the Bonferroni method. snps_all.dbSNP142.vcf.gz) were filtered to obtain all sequence variants present All GO terms with a P value ≤ 0.05 (after Bonferroni correction) were in the SPRET/Ei strain compared with C57BL/6J. This list was further filtered so retained as significantly enriched. The experimental mouse data are pre- that only variants affecting protein-coding genes were retained. The filtering sented as mean ± SEM. Data were analyzed with an unpaired t test, unless was done based on the Sequence Ontology (SO) assigned by the Wellcome mentioned differently. Significance levels were calculated for differences Trust Sanger Institute. The variants with the following SO annotations between treated and untreated SPRET/Ei and C57BL/6J, as indicated (*, 0.01 ≤ were retained: stop_gained, stop_lost, inframe_insertion, in-frame_deletion, P < 0.05; **, 0.001 ≤ P < 0.01; ***, 0.001 ≤ P 0.0001; ****, P < 0.0001). frameshift_variant, splice_donor_variant, splice_acceptor_variant, and coding_ sequence_variant. We used the Ensembl annotation of the mouse genome ACKNOWLEDGMENTS. We thank Joke Vanden Berghe for technical assis- to annotate the data on the gene and transcript level. Finally, we retained tance related to breeding the mice. 
This work was supported by the Fund for 8,883 variants affecting 10,096 unique transcripts from 6,328 different Scientific Research, Flanders (FWO), Belgian Interuniversity Attraction Poles genes. For completion, intronic variants are also included in the database, Program, Agency for Innovation by Science and Technology (IWT), and VIB, but they were not included in any downstream analysis. Ghent University.
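
As an illustration of the SO-based filtering step described under Bioinformatics Analysis of SNP and Indel Data, the following minimal Python sketch keeps only records whose consequence annotation contains one of the retained SO terms and then counts unique variants and transcripts. It is not the pipeline used in this study. The `Variant` record, the `&`-joined consequence string, the helper names, and the toy records (including the transcript identifiers) are assumptions made for illustration; real Mouse Genomes Project VCF files would first have to be parsed into such records.

```python
from dataclasses import dataclass

# SO terms listed in the Methods text as retained for the analysis.
RETAINED_SO_TERMS = frozenset({
    "stop_gained",
    "stop_lost",
    "inframe_insertion",
    "inframe_deletion",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
    "coding_sequence_variant",
})


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified variant-per-transcript record."""
    chrom: str
    pos: int
    ref: str
    alt: str
    transcript_id: str
    # Assumption: several SO terms for one record are joined by "&".
    consequence: str


def is_retained(variant: Variant) -> bool:
    """True if at least one SO term of the record is in the retained set."""
    return any(term in RETAINED_SO_TERMS
               for term in variant.consequence.split("&"))


def filter_variants(variants):
    """Keep only records carrying a retained SO annotation."""
    return [v for v in variants if is_retained(v)]


def summarize(variants):
    """Return (number of unique variants, number of unique transcripts)."""
    unique_variants = {(v.chrom, v.pos, v.ref, v.alt) for v in variants}
    unique_transcripts = {v.transcript_id for v in variants}
    return len(unique_variants), len(unique_transcripts)


# Toy records, invented purely for illustration (not real data).
toy_records = [
    Variant("X", 100, "C", "T", "TX1", "stop_gained"),
    Variant("X", 100, "C", "T", "TX2", "stop_gained&splice_region_variant"),
    Variant("1", 200, "A", "G", "TX3", "synonymous_variant"),
    Variant("2", 300, "AT", "A", "TX4", "frameshift_variant"),
    Variant("3", 400, "G", "A", "TX5", "intron_variant"),
]

kept = filter_variants(toy_records)
print(summarize(kept))  # (2, 3): two unique variants affecting three transcripts
```

Counting variants by genomic position and allele, separately from transcripts, mirrors the distinction made in the text between the number of variants and the number of affected transcripts, because one variant can affect several transcripts of the same gene.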

1. Shalem O, Sanjana NE, Zhang F (2015) High-throughput functional genomics using CRISPR-Cas9. Nat Rev Genet 16(5):299–311.
2. Pelletier S, Gingras S, Green DR (2015) Mouse genome engineering via CRISPR-Cas9 for study of immune function. Immunity 42(1):18–27.
3. Cook MC, Vinuesa CG, Goodnow CC (2006) ENU-mutagenesis: Insight into immune function and pathology. Curr Opin Immunol 18(5):627–633.
4. Fuchs H, et al. (2011) Mouse phenotyping. Methods 53(2):120–135.
5. Poltorak A, et al. (1998) Defective LPS signaling in C3H/HeJ and C57BL/10ScCr mice: Mutations in Tlr4 gene. Science 282(5396):2085–2088.
6. Jeltsch-David H, Muller S (2014) Neuropsychiatric systemic lupus erythematosus and cognitive dysfunction: The MRL-lpr mouse strain as a model. Autoimmun Rev 13(9):963–973.
7. Dejager L, Libert C, Montagutelli X (2009) Thirty years of Mus spretus: A promising future. Trends Genet 25(5):234–241.
8. Vanhooren V, et al. (2007) N-glycomic changes in serum proteins during human aging. Rejuvenation Res 10(4):521–531a.
9. Mahieu T, et al. (2006) The wild-derived inbred mouse strain SPRET/Ei is resistant to LPS and defective in IFN-beta production. Proc Natl Acad Sci USA 103(7):2292–2297.
10. Pinheiro I, et al. (2013) LPS resistance of SPRET/Ei mice is mediated by Gilz, encoded by the Tsc22d3 gene on the X chromosome. EMBO Mol Med 5(3):456–470.
11. Puimège L, et al. (2015) Glucocorticoid-induced microRNA-511 protects against TNF by down-regulating TNFR1. EMBO Mol Med 7(8):1004–1017.
12. Ogasawara J, et al. (1993) Lethal effect of the anti-Fas antibody in mice. Nature 364(6440):806–809.
13. Weber A, Wasiliew P, Kracht M (2010) Interleukin-1 (IL-1) pathway. Sci Signal 3(105):cm1.
14. Schroder K, Hertzog PJ, Ravasi T, Hume DA (2004) Interferon-gamma: An overview of signals, mechanisms and functions. J Leukoc Biol 75(2):163–189.
15. Bach EA, Aguet M, Schreiber RD (1997) The IFN gamma receptor: A paradigm for cytokine receptor signaling. Annu Rev Immunol 15:563–591.
16. Vanden Berghe T, et al. (2015) Passenger mutations confound interpretation of all genetically modified congenic mice. Immunity 43(1):200–209.
17. Kayagaki N, et al. (2011) Non-canonical inflammasome activation targets caspase-11. Nature 479(7371):117–121.
18. Nadeau JH (2001) Modifier genes in mice and humans. Nat Rev Genet 2(3):165–174.
19. Yalcin B, Adams DJ, Flint J, Keane TM (2012) Next-generation sequencing of experimental mouse strains. Mamm Genome 23(9–10):490–498.
20. Ashkenazi A, Dixit VM (1998) Death receptors: Signaling and modulation. Science 281(5381):1305–1308.
21. Nagata S (1997) Apoptosis by death factor. Cell 88(3):355–365.
22. Tarkowski A, Jonsson R, Sanchez R, Klareskog L, Koopman WJ (1988) Features of renal vasculitis in autoimmune MRL lpr/lpr mice: Phenotypes and functional properties of infiltrating cells. Clin Exp Immunol 72(1):91–97.
23. Nagata S (1994) Mutations in the Fas antigen gene in lpr mice. Semin Immunol 6(1):3–8.
24. Adachi M, et al. (1996) Enhanced and accelerated lymphoproliferation in Fas-null mice. Proc Natl Acad Sci USA 93(5):2131–2136.
25. Van Molle W, et al. (1999) Activation of caspases in lethal experimental hepatitis and prevention by acute phase proteins. J Immunol 163(10):5235–5241.
26. Kawagoe T, et al. (2008) Sequential control of Toll-like receptor-dependent responses by IRAK1 and IRAK2. Nat Immunol 9(6):684–691.
27. Huang S, et al. (1993) Immune response in mice that lack the interferon-gamma receptor. Science 259(5102):1742–1745.
28. Keane TM, et al. (2011) Mouse genomic variation and its effect on phenotypes and gene regulation. Nature 477(7364):289–294.
29. Yalcin B, et al. (2011) Sequence-based characterization of structural variation in the mouse genome. Nature 477(7364):326–329.

www.pnas.org/cgi/doi/10.1073/pnas.1605076113