Nested Retrotransposition in the East Asian Mouse Genome Causes the Classical Nonagouti Mutation
ARTICLE https://doi.org/10.1038/s42003-019-0539-7 OPEN

Akira Tanave1,3, Yuji Imai1 & Tsuyoshi Koide1,2

Black coat color (nonagouti) is a widespread classical mutation in laboratory mouse strains. The intronic insertion of endogenous retrovirus VL30 in the nonagouti (a) allele of agouti gene was previously reported as the cause of the nonagouti phenotype. Here, we report agouti mouse strains from East Asia that carry the VL30 insertion, indicating that VL30 alone does not cause the nonagouti phenotype. We find that a rare type of endogenous retrovirus, β4, was integrated into the VL30 region at the a allele through nested retrotransposition, causing abnormal splicing. Targeted complete deletion of the β4 element restores agouti gene expression and agouti coat color, whereas deletion of β4 except for a single long terminal repeat results in black-and-tan coat color. Phylogenetic analyses show that the a allele and the β4 retrovirus originated from an East Asian mouse lineage most likely related to Japanese fancy mice. These findings reveal the causal mechanism and historic origin of the classical nonagouti mutation.

1 Mouse Genomics Resource Laboratory, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan. 2 Department of Genetics, SOKENDAI (The Graduate University for Advanced Studies), 1111 Yata, Mishima, Shizuoka 411-8540, Japan. 3 Present address: Laboratory for Mouse Genetic Engineering, RIKEN Center for Biosystems Dynamics Research, 1–3 Yamadaoka, Suita, Osaka 565-0871, Japan. Correspondence and requests for materials should be addressed to T.K. (email: [email protected])

COMMUNICATIONS BIOLOGY | (2019) 2:283 | https://doi.org/10.1038/s42003-019-0539-7 | www.nature.com/commsbio

Coat color mutations are common in laboratory animals.
The reference laboratory mouse strain C57BL/6 (B6) carries the nonagouti (a) mutation at the agouti locus. The coloration of each hair in the agouti (A) mouse, the so-called wild type, shows unique hair pigmentation banded in a pattern of black-yellow-black from the root to the tip, whereas nonagouti (a/a) mice exhibit simple black hairs. From the early days of genetic studies after the rediscovery of Mendel’s law, many researchers have been interested in the inheritance of the various alleles at the agouti locus, including the a allele, because of its ease of detection, high penetrance, and large number of available genetic resources1,2. More than 100 mutations at the agouti locus have been reported in the Mouse Genome Informatics (MGI) version 6.12 database, and the a allele is broadly distributed in a variety of inbred strains listed in the International Mouse Strain Resource (IMSR). These nonagouti mutants have been used for a variety of studies, such as analyses of the mechanism of pigmentation, melanosome function and its mechanism of gene regulation, and associations between coat color and behaviors1,3–6. Therefore, the nonagouti mutation remains useful in many research fields.

The a allele is characterized by a hypomorphic mutation of the agouti gene, which encodes a paracrine signaling molecule called agouti signaling protein (ASIP or ASP), and the reduced ASIP is not sufficient for switching from the synthesis of eumelanin (black or brown) pigment to that of pheomelanin (yellow) pigment. Thus, eumelanin is the predominant pigment in the melanocytes within the hair follicle in B6 mice1,3. The agouti gene, located on chromosome 2, is composed of seven exons. A previous genetic study proposed that an insertion of 11 kb in the intron lying downstream of the agouti exon 1C causes the nonagouti phenotype in a/a mice4. The structure of the inserted sequence is complex and was estimated to consist of 5.5 kb of virus-like 30S (VL30) sequence, belonging to one of the murine endogenous retroviruses, as well as incorporating 5.5 kb of additional unknown sequence internally. The VL30 insertion could be mutagenic to the agouti gene, as such sequences can act as a target of epigenetic silencing7, but the actual molecular mechanism underlying the decline in agouti gene expression remains unknown. Furthermore, the molecular nature and evolutionary process responsible for the internal, unknown sequence within the VL30 element remain to be clarified.

Here, we characterized the molecular structure of the 11 kb sequence inserted into the agouti gene. We found that the internal sequence is a rare type of endogenous retrovirus, β4, which was integrated into the VL30 sequence through nested retrotransposition. Deletion of the internal β4 in a nonagouti strain using the CRISPR/Cas9 method caused reversion to the wild-type agouti phenotype, whereas deletion of the β4 except for a single long terminal repeat (LTR) sequence resulted in a black-and-tan coat color. These results clearly demonstrate that the nested retrotransposition of the β4 is the true cause behind the classical nonagouti mutation. Furthermore, we found that the retrotransposition of β4 was present in Japanese fancy mice, through which it was likely introduced into many laboratory strains. This study provides a clear scenario concerning the origin of one of the most famous classical mouse mutations.

Results

Genomic structure of the nonagouti allele. We first characterized in detail the structure of the inserted sequence in the a allele based on the B6 reference genome sequence (Fig. 1a–c). We found that the inserted sequence consists of 5357 bp of VL30 sequence in an antisense orientation relative to the direction of agouti gene transcription, interrupted by 9344 bp of additional sequence (hereafter referred to as the unknown element) internally at position 2914 in the VL30 sequence. The length of the unknown element within VL30 was thus longer than that estimated in a previous report4.

Fig. 1 Structure of inserted sequence in the a allele. a, b Schematic representation of the genomic structure of the a allele of the agouti gene (a) and the inserted sequence (b) in the B6 mouse. The B6 reference genome sequence (14,705 bp in length on chr2:155,014,951–155,029,655 in GRCm38/mm10 assembly) was used to analyze the inserted sequence. c Comparison of the inserted sequence in the a allele (horizontal axis) and the reference VL30 sequence (X17124.1; vertical axis). Scale bar = 20 kb. LTR, long terminal repeat; DTR, direct terminal repeat
[Figure 1 graphic not reproduced. Labels recovered from the graphic: exons 1A, 1A′, 1B/1C, 2, 3, 4 (panel a); VL30, Unknown, VL30 with 3′ LTR, 5′ DTR, 3′ DTR, 5′ LTR and marked positions 1, 2914, 12,251, and 14,701 bp (panel b).]

Nested insertion of an endogenous retrovirus within VL30. The presence of a 4-bp direct duplication, also called a target site duplication, flanking the VL30 element is thought to be evidence of a retroviral insertion in the a allele of the agouti gene (Fig. 2a)4. Similarly, we identified a 6-bp direct duplication of a part of the VL30 sequence on either side of the unknown element (Fig. 2a), implying that the unknown element was inserted into the VL30 sequence through a mechanism similar to the integration of retroviral sequence8. In support of this idea, we found a completely identical direct terminal repeat of 522 bp at both ends lying just inside the 6-bp direct duplication sequence. A sequence (GenBank ID: AB242436) homologous to the unknown element has been reported and derives from the inserted sequence in the piebald (s) allele of the endothelin receptor type B (Ednrb) gene (Ednrbs) in Japanese fancy mouse strain JF1/Ms (JF1) (Fig. 2b)9. In order to predict the functional significance of the direct terminal repeats in the unknown element, we searched for homologous RNA sequences within the National Center for Biotechnology Information (NCBI) Mus musculus Annotation Release 106 using BLASTN (version 2.8.0+). Among these RNA sequences, we found eight genes that contain sequence homologous to the direct terminal repeat (C6, Fuca1, Gimap3, Nyx and Serpina3a mRNAs; Mir680-1, Mir680-2, and Mir680-3 microRNAs). Based on available expressed sequence tags and full-length cDNAs from mouse and experimental evidence from the Ednrbs gene8, we identified a transcription start site and a termination site on the sense strand of the homologous sequences (Supplementary Fig. 1). For instance, the direct terminal repeats in the unknown element were homologous to a 518-bp sequence composed of the first exon and the flanking downstream regions of the Serpina3a gene (93.1% identity), a 518-bp sequence composed of the last exon and the flanking upstream regions of the C6 gene (87.4% identity), and a 507-bp sequence composed of the last exon and the flanking upstream regions of the abnormal transcript of the Ednrbs gene (96.0% identity).
It is known that the
[Figure 2 graphic not reproduced (panels a–e). Labels recovered from the graphic include: A and a alleles; VL30, Unknown, Ref VL30; 5′ and 3′ LTR; 5′ and 3′ DTR (522 bp each); 16 nt PBS; PPT; Lys-tRNA; ATG; gag, pro, pol, env; Gm14226; unknown element positions 1 to 9338 bp; dot-plot axes in kb; a phylogenetic tree of retroviral sequences with scale 0.5 subs./site.]
Fig.