https://www.alphaknockout.com

Mouse Mlst8 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Mlst8 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mlst8 gene (NCBI Reference Sequence: NM_019988; Ensembl: ENSMUSG00000024142) is located on mouse chromosome 17. Nine exons have been identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 9 (Transcript: ENSMUST00000070888). Exons 5~6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Mlst8 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP24-162O18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a null mutation exhibit lethality around E10.5 and abnormal yolk sac vasculature, brain development and heart development.

Exon 5 starts at about 35.28% of the coding region. Knocking out exons 5~6 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion is 357 bp, and the size of intron 6 for 3'-loxP site insertion is 731 bp. The size of the effective cKO region is ~753 bp. The cKO region does not contain any other known gene.
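The frameshift statement rests on a general rule: removing coding sequence whose total length is not a multiple of three shifts the downstream reading frame. The sketch below illustrates that rule only. The two exon lengths are hypothetical placeholders, since this report does not list the sizes of exons 5 and 6; the 326 aa protein length is taken from the transcript table in this report, and the function names are illustrative.

```python
def causes_frameshift(deleted_coding_lengths):
    """Return True if deleting these coding segments (bp) shifts the reading frame."""
    return sum(deleted_coding_lengths) % 3 != 0


def approx_residue(fraction_of_cds, protein_length_aa):
    """Approximate residue number reached at a given fraction of the coding region."""
    return round(fraction_of_cds * protein_length_aa)


# HYPOTHETICAL exon lengths in bp, for illustration only (not from this report).
example_exon5_bp = 100
example_exon6_bp = 121

print(causes_frameshift([example_exon5_bp, example_exon6_bp]))

# 35.28% of a 326 aa protein: roughly where exon 5 begins in the protein.
print(approx_residue(0.3528, 326))
```

With the real coding lengths of exons 5 and 6 substituted for the placeholders, the first call would confirm or refute the frameshift for the actual deletion.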


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination).]

Legends: Homology arm; cKO region; loxP site; Exon of mouse Mlst8; Exon of mouse Bricd5
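The relationship between the targeted allele and the constitutive KO allele can be sketched as a string operation: Cre recombination between two loxP sites in the same orientation removes the sequence between them and leaves a single loxP site. This is a minimal model, not the project's actual vector design. The arm and cKO sequences are toy placeholders, the function names are illustrative, and the loxP sites are placed directly at the segment boundaries rather than at the real intronic insertion points.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def build_targeted_allele(arm5, cko, arm3, loxp=LOXP):
    """Floxed allele: 5' arm, loxP, cKO region, loxP, 3' arm."""
    return arm5 + loxp + cko + loxp + arm3


def cre_excise(allele, loxp=LOXP):
    """Model Cre recombination between two directly repeated loxP sites.

    Everything from the first site up to (not including) the second site
    is dropped, so exactly one loxP site remains.
    """
    first = allele.find(loxp)
    second = allele.find(loxp, first + len(loxp)) if first != -1 else -1
    if first == -1 or second == -1:
        return allele  # fewer than two sites: nothing to excise
    return allele[:first] + allele[second:]


# Toy placeholder sequences, not the real Mlst8 sequence.
targeted = build_targeted_allele("AAAA", "CCCCGGGG", "TTTT")
ko = cre_excise(targeted)
print(len(targeted) - len(ko))  # length of the cKO region plus one loxP site
```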


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
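A self dot plot of this kind can be sketched as follows, assuming exact matching of fixed-length windows (10 bp, as in the plot above). The function names are illustrative and this is not the tool that produced the figure; real dot plot software may tolerate mismatches.

```python
from collections import defaultdict


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) lists of (i, j) window matches.

    forward: window at i equals window at j, with j > i (off the main diagonal).
    reverse: window at i equals the reverse complement of the window at j.
    """
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(len(seq) - window + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy sequence: a 10 bp unit repeated twice in tandem.
forward, reverse = self_dot_plot("ACGGTCATTG" * 2, window=10)
print(forward)
```

Runs of forward matches lying on a line parallel to the main diagonal are the signature of tandem repeats; an empty or sparse result corresponds to the "no significant tandem repeat" conclusion.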

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(7181bp) | A(22.89% 1644) | C(25.0% 1795) | T(25.21% 1810) | G(26.9% 1932)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
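The overall GC content follows directly from the base counts in the summary line, and the distribution plot corresponds to the same calculation applied to a sliding window. The sketch below uses the counts given above and a 300 bp default window; the function names and the step parameter are illustrative assumptions.

```python
def gc_percent_from_counts(a, c, g, t):
    """Overall GC percentage from base counts."""
    return 100.0 * (g + c) / (a + c + g + t)


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each window, indexed by window start position."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts from the summary line (total 7181 bp).
overall = gc_percent_from_counts(a=1644, c=1795, g=1932, t=1810)
print(round(overall, 2))  # 51.9

# Toy sequence with a small window to show the per-window output.
print(sliding_gc("GGCCAATT", window=4, step=2))  # [100.0, 50.0, 0.0]
```

An overall value near 52% is unremarkable on its own; the note above concerns local windows with much higher GC content, which only the sliding-window profile reveals.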


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr17  -       24477684   24480683   3000
YourSeq  185    80      605   3000   83.8%     chr5   +       37361933   37362483   551
YourSeq  155    292     702   3000   92.4%     chr2   +       120305825  120306329  505
YourSeq  153    104     599   3000   85.1%     chr18  +       88092125   88092591   467
YourSeq  153    97      604   3000   84.4%     chr14  +       45200664   45201211   548
YourSeq  150    96      466   3000   88.7%     chr3   +       123051904  123052312  409
YourSeq  147    96      482   3000   85.0%     chr9   +       33462375   33462793   419
YourSeq  144    179     611   3000   86.0%     chr1   -       78926700   78927150   451
YourSeq  140    97      490   3000   87.6%     chr5   -       29513709   29514118   410
YourSeq  137    81      612   3000   88.8%     chr5   +       72933537   72934090   554
YourSeq  134    80      484   3000   86.5%     chr1   -       19512116   19512555   440
YourSeq  132    123     493   3000   86.7%     chr6   -       71758582   71758988   407
YourSeq  132    109     481   3000   89.3%     chr1   +       5243373    5243789    417
YourSeq  130    176     482   3000   88.4%     chrX   +       61069813   61070132   320
YourSeq  130    96      571   3000   85.1%     chr1   +       51839591   51840031   441
YourSeq  126    97      471   3000   88.1%     chr15  -       36258632   36259034   403
YourSeq  126    103     610   3000   83.1%     chr16  +       8231444    8232005    562
YourSeq  125    226     599   3000   85.4%     chr6   +       44464419   44464807   389
YourSeq  117    97      483   3000   91.7%     chr11  -       24554636   24555065   430
YourSeq  112    292     494   3000   90.5%     chr2   -       3825092    3825314    223

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
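One way to read these results programmatically is to parse each row and compare the secondary hits with the self-hit. The sketch below uses three rows from the upstream search, with the "browser details" link text stripped. The field names and the score-to-query-size ratio are illustrative choices, not a BLAT-defined significance test.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated BLAT summary row (query name first)."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr17 - 24477684 24480683 3000",
    "YourSeq 185 80 605 3000 83.8% chr5 + 37361933 37362483 551",
    "YourSeq 155 292 702 3000 92.4% chr2 + 120305825 120306329 505",
]
hits = [parse_hit(r) for r in rows]

# SPAN is the length of the target interval, end points included.
assert all(h.t_end - h.t_start + 1 == h.span for h in hits)

# Hits other than the full-length self-match.
secondary = [h for h in hits if h.score < h.q_size]
best_secondary_fraction = max(h.score for h in secondary) / hits[0].q_size
print(round(best_secondary_fraction, 3))
```

The best secondary hit scores only a small fraction of the 3000 bp query, which is consistent with the note that no significant similarity is found.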

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr17  -       24473931   24476930   3000
YourSeq  26     2937    2969  3000   86.7%     chr2   +       125230195  125230226  32
YourSeq  23     2248    2270  3000   100.0%    chr9   -       46150311   46150333   23
YourSeq  22     2045    2067  3000   100.0%    chr1   -       18215402   18215426   25
YourSeq  22     2770    2791  3000   100.0%    chr1   -       12652823   12652844   22
YourSeq  20     1982    2001  3000   100.0%    chr1   +       31019345   31019364   20

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
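The chr17 coordinates of the two self-hits can be cross-checked against the stated size of the effective cKO region. The sketch below assumes the 3000 bp upstream and downstream queries directly flank that region; the helper name is illustrative.

```python
def gap_between(lower, upper):
    """Number of bases strictly between two closed intervals (start, end)."""
    return upper[0] - lower[1] - 1


# chr17 coordinates of the full-length self-hits in the two BLAT searches.
upstream_flank = (24477684, 24480683)    # "up" search
downstream_flank = (24473931, 24476930)  # "down" search

# Mlst8 is on the reverse strand, so the downstream flank has the lower coordinates.
cko_bp = gap_between(downstream_flank, upstream_flank)
print(cko_bp)  # 753
```

The 753 bp gap agrees with the ~753 bp effective cKO region given in the strategy summary.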


Gene and information:

Mlst8 MTOR associated protein, LST8 homolog (S. cerevisiae) [Mus musculus (house mouse)]
Gene ID: 56716, updated on 24-Oct-2019

Gene summary

Official Symbol: Mlst8 (provided by MGI)
Official Full Name: MTOR associated protein, LST8 homolog (S. cerevisiae) (provided by MGI)
Primary source: MGI:MGI:1929514
See related: Ensembl:ENSMUSG00000024142
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gbl; AA409454; AI505104; AI851821; 0610033N12Rik
Expression: Ubiquitous expression in thymus adult (RPKM 14.9), large intestine adult (RPKM 14.2) and 28 other tissues
Orthologs: human

Genomic context

Location: 17; 17 A3.3

Exon count: 9

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (24473550..24479114, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (24610496..24616023, complement)

[Figure: genomic context of Mlst8, Chromosome 17 - NC_000083.6]


Transcript information: This gene has 10 transcripts

Gene: Mlst8 ENSMUSG00000024142

Description: MTOR associated protein, LST8 homolog (S. cerevisiae) [Source:MGI Symbol;Acc:MGI:1929514]
Gene Synonyms: 0610033N12Rik, Gbl, mLST8
Location: Chromosome 17: 24,473,551-24,479,078 reverse strand. GRCm38:CM001010.2
About this gene: This gene has 10 transcripts (splice variants), 198 orthologues, is a member of 1 Ensembl protein family and is associated with 11 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Mlst8-208  ENSMUST00000234686.1   3855  326aa       ENSMUSP00000157137.1  Protein coding           CCDS28484  Q9DCJ1      GENCODE basic, APPRIS P1
Mlst8-201  ENSMUST00000070888.13  3358  326aa       ENSMUSP00000065004.6  Protein coding           CCDS28484  Q9DCJ1      TSL:1, GENCODE basic, APPRIS P1
Mlst8-202  ENSMUST00000179163.2   3340  326aa       ENSMUSP00000136287.1  Protein coding           CCDS28484  Q9DCJ1      TSL:1, GENCODE basic, APPRIS P1
Mlst8-205  ENSMUST00000234335.1   1574  326aa       ENSMUSP00000157301.1  Protein coding           CCDS28484  Q9DCJ1      GENCODE basic, APPRIS P1
Mlst8-210  ENSMUST00000234941.1   3223  271aa       ENSMUSP00000157107.1  Protein coding           -          A0A3Q4L2V5  GENCODE basic
Mlst8-207  ENSMUST00000234543.1   3127  260aa       ENSMUSP00000157225.1  Protein coding           -          A0A3Q4EC26  GENCODE basic
Mlst8-204  ENSMUST00000234147.1   850   238aa       ENSMUSP00000157174.1  Protein coding           -          A0A3Q4L2Y8  CDS 3' incomplete
Mlst8-206  ENSMUST00000234516.1   786   56aa        ENSMUSP00000157327.1  Nonsense mediated decay  -          A0A3Q4EC81  -
Mlst8-203  ENSMUST00000234032.1   668   No protein  -                     Retained intron          -          -           -
Mlst8-209  ENSMUST00000234892.1   519   No protein  -                     Retained intron          -          -           -


[Figure: Ensembl genome browser view of a 25.53 kb region of chromosome 17 (24.465 Mb to 24.485 Mb).
Forward strand: Pgp-201 (protein coding), Pgp-202 (retained intron), Bricd5-201, Bricd5-202 and Bricd5-203 (protein coding), Caskin1-201 and Caskin1-204 (protein coding), Caskin1-203 (retained intron), Gm50062-201 (lncRNA).
Contig: AC154237.1.
Reverse strand: E4f1-202 (nonsense mediated decay), E4f1-203 (lncRNA), Mlst8-201, Mlst8-202, Mlst8-204, Mlst8-205, Mlst8-207, Mlst8-208 and Mlst8-210 (protein coding), Mlst8-203 and Mlst8-209 (retained intron), Mlst8-206 (nonsense mediated decay).
Regulation legend: CTCF, Enhancer, Promoter, Promoter Flank.
Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000070888

[Figure: Mlst8-201 (protein coding), reverse strand, 5.53 kb, with domain and variant annotation of protein ENSMUSP00000065... on a scale bar of 0 to 326 residues.]

Superfamily: Quinoprotein alcohol dehydrogenase-like superfamily
SMART: WD40 repeat
Prints: G-protein beta WD-40 repeat
Pfam: WD40 repeat
PROSITE profiles: WD40 repeat; WD40-repeat-containing domain
PROSITE patterns: WD40 repeat, conserved site
PANTHER: PTHR19842:SF0; Target of rapamycin complex subunit LST8
Gene3D: WD40/YVTN repeat-like-containing domain superfamily
CDD: cd00200
Sequence variants (dbSNP and all other sources): missense variant, synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
