https://www.alphaknockout.com

Mouse Mlc1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Mlc1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mlc1 gene (NCBI Reference Sequence: NM_133241; Ensembl: ENSMUSG00000035805) is located on mouse chromosome 15. Twelve exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000042594). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Mlc1 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-456F18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a null allele exhibit myelin vacuolization that progresses with age, and show alterations in glial cell and oligodendrocyte physiology.

Exon 2 contains the ATG start codon, so the cKO region covers the start of the coding region. The knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 518 bp, and the size of intron 2 for 3'-loxP site insertion is 1206 bp. The size of the effective cKO region is ~713 bp. No other known gene lies within the cKO region.
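The conditional design described above can be illustrated with a small sketch of Cre-mediated excision. It assumes the canonical 34 bp loxP sequence and two sites in the same orientation; the function names and the short flanking sequences are hypothetical placeholders, not real Mlc1 sequence or the vendor's actual vector design.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def build_floxed_allele(upstream: str, cko_region: str, downstream: str) -> str:
    """Return a toy targeted allele with the cKO region flanked by two loxP sites."""
    return upstream + LOXP + cko_region + LOXP + downstream


def cre_excise(allele: str) -> str:
    """Delete the segment between the first two loxP sites, leaving one site behind.

    Assumes both sites are in the same orientation. If fewer than two loxP
    sites are present, the allele is returned unchanged.
    """
    first = allele.find(LOXP)
    if first == -1:
        return allele
    second = allele.find(LOXP, first + len(LOXP))
    if second == -1:
        return allele
    return allele[:first] + allele[second:]


# Placeholder sequences (NOT real Mlc1 sequence), chosen only for readability.
intron1 = "GGGTTTGGG"
exon2_region = "ATGACCCGGGAG"
intron2 = "CCCAAACCC"

targeted = build_floxed_allele(intron1, exon2_region, intron2)
recombined = cre_excise(targeted)
print(recombined == intron1 + LOXP + intron2)
```

In this toy model the recombined allele keeps both flanking intron fragments and a single loxP site, which corresponds to the constitutive KO allele obtained after Cre recombination.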


Overview of the Targeting Strategy

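[Figure: schematic of the targeting strategy, showing exons 1, 2, 3 and 12 of mouse Mlc1. Panels: wildtype allele (with the 5' and 3' gRNA regions marked); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Mlc1; cKO region; loxP site.]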


Overview of the Dot Plot

[Figure: dot plot of Sequence 12 aligned with itself (window size: 10 bp), showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
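A self-alignment dot plot of this kind can be approximated by collecting every pair of positions whose windows match exactly, either directly or as a reverse complement. The sketch below assumes exact matching with a 10 bp window; the tool actually used for the report is not stated, and the function names and toy sequence are hypothetical.

```python
from collections import defaultdict
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Return (forward, reverse) lists of matching window start positions.

    forward: pairs (i, j), i < j, where the windows at i and j are identical.
    reverse: pairs (i, j), i <= j, where the window at i equals the reverse
             complement of the window at j.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, starts in positions.items():
        forward.extend((i, j) for i in starts for j in starts if i < j)
        partner_starts = positions.get(reverse_complement(kmer), [])
        reverse.extend((i, j) for i in starts for j in partner_starts if i <= j)
    return sorted(forward), sorted(reverse)


# Toy sequence (not Mlc1): a 10 bp unit repeated after a 4 bp spacer.
toy = "ACGTTGCAAG" + "TTTT" + "ACGTTGCAAG"
forward_hits, reverse_hits = self_dot_plot(toy, window=10)
print(forward_hits, reverse_hits)
```

Off-diagonal runs of forward hits would indicate direct repeats, while reverse hits would indicate inverted repeats. A sequence with neither shows only the trivial main diagonal, which this sketch omits by requiring i < j for forward matches.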

Overview of the GC Content Distribution

[Figure: GC content distribution along Sequence 12 (window size: 300 bp).]

Summary: Full Length(7195bp) | A(23.81% 1713) | C(21.97% 1581) | T(28.45% 2047) | G(25.77% 1854)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high-GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
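The base counts in the summary line are enough to check the total length and derive the overall GC fraction. The sketch below does that arithmetic and also shows one plausible way to compute a sliding-window GC profile; the window function and its `step` parameter are assumptions for illustration, not the method used to draw the plot.

```python
from typing import Dict, List

# Base counts copied from the summary line of the report.
BASE_COUNTS: Dict[str, int] = {"A": 1713, "C": 1581, "T": 2047, "G": 1854}


def gc_fraction_from_counts(counts: Dict[str, int]) -> float:
    """Overall GC fraction computed from per-base counts."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC fraction of each full-length window sliding along an uppercase sequence."""
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


total_length = sum(BASE_COUNTS.values())
overall_gc = gc_fraction_from_counts(BASE_COUNTS)
print(total_length, f"{overall_gc:.2%}")
```

The four counts sum to 7195, matching the stated full length, and the overall GC content works out to about 47.74% (3435 of 7195 bases).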


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr15  -        88978244   88981243    3000
YourSeq    516    830  1456   3000     92.2%  chr5   +       147862552  147875750   13199
YourSeq    185    547  1893   3000     86.6%  chr1   -       135904789  136265071  360283
YourSeq    131    523   723   3000     80.7%  chr11  +        32528918   32529103     186
YourSeq    126   1640  1892   3000     80.4%  chr18  +        35152174   35152369     196
YourSeq    116    532   690   3000     87.3%  chr11  -        90488963   90489122     160
YourSeq    112    522   702   3000     81.2%  chr15  +       102164151  102164320     170
YourSeq    111    533   689   3000     85.9%  chr9   +        64723141   64723294     154
YourSeq    110    511   678   3000     86.2%  chr3   +       108864979  108865149     171
YourSeq    108    512   694   3000     83.3%  chr2   -       181866388  181866573     186
YourSeq    107    547   708   3000     86.3%  chr8   +       121624465  121624629     165
YourSeq    106    547   713   3000     82.1%  chr3   +       104425198  104425350     153
YourSeq    105   1090  1228   3000     84.9%  chr6   +       134949718  134949849     132
YourSeq    102   1656  1872   3000     89.1%  chr12  -        83711827   84063508  351682
YourSeq    100   1734  2049   3000     91.7%  chr11  +        76690326   76690649     324
YourSeq     96   1664  1872   3000     78.7%  chr5   +       114713923  114714112     190
YourSeq     95   1749  1903   3000     79.9%  chr3   +       138016300  138016425     126
YourSeq     94    547   681   3000     85.2%  chr1   +         9710115    9710261     147
YourSeq     93    547   679   3000     85.7%  chr5   -       110852933  110853081     149
YourSeq     93   1738  1895   3000     82.5%  chr5   -        73049522   73049652     131

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr15  -        88974549   88977548    3000
YourSeq    948   1579  2572   3000     98.1%  chr15  -        88975187   88976264    1078
YourSeq    859   1285  2173   3000     98.4%  chr15  -        88974998   88975886     889
YourSeq    831   1705  2572   3000     98.3%  chr15  -        88975397   88976264     868
YourSeq    761   1786  2572   3000     98.4%  chr15  -        88975481   88976267     787
YourSeq    753   1269  2047   3000     98.4%  chr15  -        88974998   88975776     779
YourSeq    735   1467  2215   3000     99.1%  chr15  -        88975166   88975914     749
YourSeq    662   1412  2089   3000     98.9%  chr15  -        88975124   88975801     678
YourSeq    583   1677  2277   3000     98.6%  chr15  -        88975398   88975998     601
YourSeq    550   1831  2572   3000     96.5%  chr15  -        88975355   88976054     700
YourSeq    470   1436  1921   3000     98.4%  chr15  -        88975166   88975651     486
YourSeq    433   1980  2425   3000     98.7%  chr15  -        88975796   88976283     488
YourSeq    405   2148  2572   3000     98.1%  chr15  -        88975859   88976283     425
YourSeq    402   1266  1687   3000     97.7%  chr15  -        88974980   88975401     422
YourSeq    349   1266  1632   3000     97.6%  chr15  -        88974993   88975359     367
YourSeq    322   2206  2572   3000     98.0%  chr15  -        88975817   88976267     451
YourSeq    133   2039  2555   3000     85.4%  chr8   +        76619502   76619974     473
YourSeq    125   1955  2555   3000     81.7%  chr8   +        76619502   76619926     425
YourSeq    124   1367  2051   3000     80.7%  chr8   +        76619502   76619926     425
YourSeq    104   1661  2511   3000     78.7%  chr8   +        76619502   76620042     541

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
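One way to read the BLAT tables is to set aside the query's match to its own locus and compare the remaining scores with the query size. The sketch below parses the first three upstream rows from the report. The `min_score_fraction` cutoff of 0.5 is a hypothetical threshold chosen for illustration (the report does not state its criterion for "significant"), and the class and function names are likewise assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity, e.g. 92.2
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one row: QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN."""
    f = row.split()
    return BlatHit(
        score=int(f[1]), q_start=int(f[2]), q_end=int(f[3]), q_size=int(f[4]),
        identity=float(f[5].rstrip("%")), chrom=f[6], strand=f[7],
        t_start=int(f[8]), t_end=int(f[9]), span=int(f[10]),
    )


def is_self_hit(hit: BlatHit) -> bool:
    """The query matching its own locus: full-length score at 100% identity."""
    return hit.score == hit.q_size and hit.identity == 100.0


def flag_off_targets(hits: List[BlatHit], min_score_fraction: float = 0.5) -> List[BlatHit]:
    """Return non-self hits scoring at least min_score_fraction of the query size.

    The 0.5 default is a hypothetical cutoff, not the report's criterion.
    """
    return [h for h in hits
            if not is_self_hit(h) and h.score >= min_score_fraction * h.q_size]


# First three rows of the upstream table, copied from the report.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr15 - 88978244 88981243 3000",
    "YourSeq 516 830 1456 3000 92.2% chr5 + 147862552 147875750 13199",
    "YourSeq 185 547 1893 3000 86.6% chr1 - 135904789 136265071 360283",
]
hits = [parse_hit(r) for r in rows]
flagged = flag_off_targets(hits)
best_other = max((h for h in hits if not is_self_hit(h)), key=lambda h: h.score)
print(len(flagged), best_other.chrom, best_other.score)
```

For these rows the best non-self hit is the chr5 match with a score of 516 out of a 3000 bp query, so nothing is flagged under the assumed cutoff.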


Gene and information: Mlc1 megalencephalic leukoencephalopathy with subcortical cysts 1 homolog (human) [ Mus musculus (house mouse) ] Gene ID: 170790, updated on 28-Sep-2019

Gene summary

Official Symbol: Mlc1 (provided by MGI)
Official Full Name: megalencephalic leukoencephalopathy with subcortical cysts 1 homolog (human) (provided by MGI)
Primary source: MGI:MGI:2157910
See related: Ensembl:ENSMUSG00000035805
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: VL; LVM; MLC; WKL1; AW048630; BB074274; Kiaa0027-hp
Expression: Biased expression in frontal lobe adult (RPKM 39.9), cortex adult (RPKM 36.9) and 6 other tissues
Orthologs: human

Genomic context

Location: 15; 15 E3

Exon count: 13

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (88955884..88982693, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (88786314..88808983, complement)

[Figure: genomic context of Mlc1 on Chromosome 15 - NC_000081.6.]


Transcript information: This gene has 2 transcripts

Gene: Mlc1 ENSMUSG00000035805

Description: megalencephalic leukoencephalopathy with subcortical cysts 1 homolog (human) [Source:MGI Symbol;Acc:MGI:2157910]
Gene Synonyms: Kiaa0027-hp, WKL1
Location: Chromosome 15: 88,955,884-88,979,007 reverse strand. GRCm38:CM001008.2
About this gene: This gene has 2 transcripts (splice variants), 185 orthologues, is a member of 1 Ensembl protein family and is associated with 17 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Mlc1-201  ENSMUST00000042594.12  2783  382aa    ENSMUSP00000047667.6  Protein coding  CCDS37173  Q8VHK5   TSL:1 GENCODE basic APPRIS P2
Mlc1-202  ENSMUST00000109368.1   2801  388aa    ENSMUSP00000104993.1  Protein coding  -          E9QP87   TSL:1 GENCODE basic APPRIS ALT2

[Figure: Ensembl genome browser view of a 43.12 kb region of chromosome 15 (88.95-88.98 Mb). Forward strand: Mov10l1-201 (protein coding), Mov10l1-203 (protein coding), Mov10l1-205 (retained intron). Contig: AC119959.8. Reverse strand: Mlc1-201 and Mlc1-202 (protein coding), Ttll8-201 (protein coding), Ttll8-203 and Ttll8-204 (nonsense mediated decay), Gm8702-201 (processed pseudogene). Regulatory Build legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; pseudogene).]


Transcript: ENSMUST00000042594

[Figure: protein-level view of Mlc1-201 (protein coding; reverse strand, 23.12 kb) for ENSMUSP00000047... Tracks: Transmembrane heli..., MobiDB lite, Low complexity (Seg), PANTHER (Membrane protein MLC1), All sequence SNPs/i... (sequence variants from dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 382 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
