https://www.alphaknockout.com

Mouse Mms19 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Mms19 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mms19 gene (NCBI Reference Sequence: NM_028152; Ensembl: ENSMUSG00000025159) is located on mouse chromosome 19. It has 31 exons, with the ATG start codon in exon 1 and the TGA stop codon in exon 31 (Transcript: ENSMUST00000171561). Exon 3 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Mms19 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-287N7 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 5.33% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 3388 bp, and the size of intron 3 for 3'-loxP site insertion is 1810 bp. The size of the effective cKO region is ~601 bp. The cKO region does not contain any other known gene.
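The frameshift statement rests on simple arithmetic: deleting an exon shifts the reading frame whenever its coding length is not a multiple of 3. The sketch below illustrates that check and converts the 5.33% figure into an approximate CDS position. The report does not state the length of exon 3 (the ~601 bp figure is the effective cKO region, not the exon), so the exon length used here is a hypothetical placeholder. The CDS length is an assumption derived from the 1031 aa protein listed for ENSMUST00000171561 plus one stop codon.

```python
def causes_frameshift(exon_coding_len: int) -> bool:
    """Deleting an exon shifts the frame unless its coding length is a multiple of 3."""
    return exon_coding_len % 3 != 0


def cds_offset(fraction: float, cds_len: int) -> int:
    """Approximate CDS position (bp) corresponding to a fraction of the coding region."""
    return round(fraction * cds_len)


# Assumption: CDS length = 1031 codons + 1 stop codon (ENSMUST00000171561).
CDS_LEN = (1031 + 1) * 3

# Hypothetical exon length, for illustration only; not taken from the report.
EXAMPLE_EXON_LEN = 100

print(causes_frameshift(EXAMPLE_EXON_LEN))
print(cds_offset(0.0533, CDS_LEN))
```

Under these assumptions, exon 3 would begin roughly 165 bp into the coding sequence, so a frameshift at that point would disrupt nearly the whole protein.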


Overview of the Targeting Strategy

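[Figure: schematic of the targeting strategy, showing the wildtype allele (exons 1, 3, 4 and 31 labelled, with 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Mms19, homology arm, cKO region, loxP site.]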
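The step from the targeted allele to the constitutive KO allele can be sketched as a string operation: Cre recombination between two loxP sites in the same orientation removes the intervening segment and leaves a single loxP site. This is a minimal illustration, not the vendor's design tool. The loxP sequence is the commonly cited 34 bp wild-type site, and the toy allele is invented for the example.

```python
# Commonly cited wild-type loxP site: 13 bp arm + 8 bp spacer + 13 bp arm.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Remove the segment between the first two same-orientation loxP sites.

    One loxP site is retained. The allele is returned unchanged if fewer
    than two sites are found.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[: first + len(site)] + allele[second + len(site):]


# Toy floxed allele (hypothetical sequence): upstream / loxP / "exon" / loxP / downstream.
floxed = "AAAA" + LOXP + "CCCGGG" + LOXP + "TTTT"
print(cre_excise(floxed))
```

The function only models direct repeats on one strand; inverted loxP sites, which cause inversion rather than excision, are out of scope here.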


Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot showing forward and reverse-complement matches. Axis label: Sequence 12.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
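A self-alignment dot plot of this kind can be approximated by recording every pair of positions whose 10 bp windows match, either directly or as reverse complements. The report does not say which tool produced the plot, so the following is a generic word-match sketch with hypothetical function names. Only the window size of 10 bp is taken from the report.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window-match coordinates.

    forward: seq[i:i+window] == seq[j:j+window], excluding the main diagonal.
    reverse: seq[i:i+window] == reverse complement of seq[j:j+window].
    """
    # Index every window by its start positions for fast lookup.
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j != i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy sequence with a direct repeat (hypothetical data).
unit = "ACGGTCATCGAA"
fwd, rev = self_dot_plot(unit + "TTT" + unit, window=10)
print(len(fwd), len(rev))
```

Off-diagonal forward dots running parallel to the main diagonal indicate direct (tandem or dispersed) repeats; reverse dots indicate inverted repeats.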

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full length: 7101 bp | A: 1771 (24.94%) | C: 1600 (22.53%) | T: 2085 (29.36%) | G: 1645 (23.17%)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
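The base counts in the summary imply an overall GC content of about 45.7%, and the plot itself corresponds to a sliding-window calculation. The sketch below shows both, using the 300 bp window from the report. The function names and the step size are assumptions, and the short demo sequence is invented.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each full window along the sequence."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts as reported in the summary line.
counts = {"A": 1771, "C": 1600, "T": 2085, "G": 1645}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(total, round(100 * overall_gc, 2))

# Tiny demo with a hypothetical sequence and a 4 bp window.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

A window whose GC fraction is far above the overall value would be the kind of region the note is screening for; the report does not state the cut-off it applied.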


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  -       41966565   41969564   3000
YourSeq  249    762    2565  3000   89.5%     chr2   +       130727567  130836633  109067
YourSeq  216    2102   3000  3000   83.9%     chr1   +       34796472   34796985   514
YourSeq  185    759    2519  3000   88.2%     chr11  +       5177770    5392502    214733
YourSeq  163    2393   2902  3000   84.4%     chr4   -       152278345  152278699  355
YourSeq  161    2006   2564  3000   81.1%     chr11  +       79000041   79000351   311
YourSeq  151    2387   2888  3000   93.7%     chr8   +       34758317   34758860   544
YourSeq  150    2387   2571  3000   88.9%     chr2   -       11213607   11213786   180
YourSeq  149    2387   2569  3000   88.6%     chr5   +       149323504  149323679  176
YourSeq  146    2392   2571  3000   88.0%     chr16  -       95963764   95963937   174
YourSeq  145    2393   3000  3000   77.8%     chr14  -       122220390  122220666  277
YourSeq  145    2387   2564  3000   90.1%     chr10  +       85472221   85472396   176
YourSeq  144    2381   2564  3000   91.3%     chr4   +       134499961  134500157  197
YourSeq  144    2387   2570  3000   87.1%     chr14  +       64684605   64684782   178
YourSeq  142    2387   2570  3000   86.0%     chr1   +       143735346  143735523  178
YourSeq  141    2387   2561  3000   87.6%     chr2   -       154493693  154493861  169
YourSeq  141    2392   2570  3000   89.1%     chr10  -       76059466   76059638   173
YourSeq  139    2387   2563  3000   86.6%     chr7   -       43057907   43058077   171
YourSeq  139    2387   2571  3000   85.0%     chr15  -       55416756   55416934   179
YourSeq  139    2390   2586  3000   88.2%     chr11  -       116243670  116243865  196

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  -       41962964   41965963   3000
YourSeq  244    416    784   3000   92.0%     chr11  +       82552848   82553207   360
YourSeq  240    444    1141  3000   91.2%     chr4   -       147815716  147816338  623
YourSeq  238    450    786   3000   89.8%     chr4   +       108409978  108410312  335
YourSeq  229    447    1121  3000   88.3%     chr10  +       7706734    7707153    420
YourSeq  220    444    1120  3000   87.9%     chr18  +       46619209   46619661   453
YourSeq  215    456    1119  3000   96.6%     chrX   +       60636291   60636956   666
YourSeq  202    452    1102  3000   96.4%     chr9   -       110284057  110284739  683
YourSeq  199    447    674   3000   93.9%     chr17  +       28949607   28949824   218
YourSeq  197    155    640   3000   90.7%     chr18  -       10297709   10298025   317
YourSeq  196    450    662   3000   96.6%     chr7   -       101254268  101254478  211
YourSeq  196    451    781   3000   97.6%     chr5   -       121562949  121563421  473
YourSeq  195    438    640   3000   98.6%     chr5   +       28383160   28383443   284
YourSeq  194    444    640   3000   99.5%     chr1   -       173383772  173383969  198
YourSeq  193    416    640   3000   97.1%     chr1   +       179702705  179702943  239
YourSeq  192    447    642   3000   99.0%     chr17  -       24098895   24099090   196
YourSeq  192    459    1098  3000   88.0%     chr16  -       4005031    4005357    327
YourSeq  192    447    640   3000   99.5%     chr3   +       20009948   20010141   194
YourSeq  191    450    644   3000   99.0%     chr7   -       16292675   16292869   195
YourSeq  191    444    640   3000   99.0%     chr6   -       100164864  100165065  202

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
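The BLAT rows above can be screened programmatically. The sketch below parses rows in the layout shown, separates each query's perfect self-hit from other hits, and flags any other hit above a score cut-off. The report does not state what it treats as "significant", so the `MIN_SCORE` threshold and all function names are hypothetical. As a side check, the gap between the two chr19 self-hits works out to 601 bp, which appears to match the ~601 bp effective cKO region.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT summary row."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:  # tolerate UCSC link text
        fields = fields[2:]
    (_query, score, q_start, q_end, q_size,
     identity, chrom, strand, t_start, t_end, span) = fields
    return BlatHit(int(score), int(q_start), int(q_end), int(q_size),
                   float(identity.rstrip("%")), chrom, strand,
                   int(t_start), int(t_end), int(span))


def is_self_hit(hit: BlatHit) -> bool:
    """A full-length, 100% identical hit is taken to be the query's own locus."""
    return hit.score == hit.q_size and hit.identity == 100.0


def strong_off_target_hits(hits, min_score):
    return [h for h in hits if not is_self_hit(h) and h.score >= min_score]


# First two rows of each table in the report.
up_hits = [parse_hit(r) for r in (
    "YourSeq 3000 1 3000 3000 100.0% chr19 - 41966565 41969564 3000",
    "YourSeq 249 762 2565 3000 89.5% chr2 + 130727567 130836633 109067",
)]
down_hits = [parse_hit(r) for r in (
    "YourSeq 3000 1 3000 3000 100.0% chr19 - 41962964 41965963 3000",
    "YourSeq 244 416 784 3000 92.0% chr11 + 82552848 82553207 360",
)]

MIN_SCORE = 1000  # hypothetical cut-off; not stated in the report
flagged = strong_off_target_hits(up_hits + down_hits, MIN_SCORE)

# Bases lying strictly between the downstream and upstream self-hits.
gap = up_hits[0].t_start - down_hits[0].t_end - 1
print(flagged, gap)
```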


Gene and information: Mms19, MMS19 cytosolic iron-sulfur assembly component [Mus musculus (house mouse)]
Gene ID: 72199, updated on 6-Sep-2019

Gene summary

Official Symbol: Mms19 (provided by MGI)
Official Full Name: MMS19 cytosolic iron-sulfur assembly component (provided by MGI)
Primary source: MGI:MGI:1919449
See related: Ensembl:ENSMUSG00000025159
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: C79368; C86341; Mms19l; AI316855; 2410001K24Rik; 2610042O15Rik
Expression: Ubiquitous expression in thymus adult (RPKM 24.9), ovary adult (RPKM 20.0) and 28 other tissues

Genomic context

Location: 19; 19 C3

Exon count: 32

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  19   NC_000085.6 (41943707..41981207, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    19   NC_000085.5 (42018197..42055626, complement)

Chromosome 19 - NC_000085.6


Transcript information: This gene has 18 transcripts

Gene: Mms19 ENSMUSG00000025159

Description: MMS19 cytosolic iron-sulfur assembly component [Source:MGI Symbol;Acc:MGI:1919449]
Gene Synonyms: 2610042O15Rik, C86341, Mms19, Mms19l
Location: Chromosome 19: 41,941,086-41,981,157 reverse strand. GRCm38:CM001012.2
About this gene: This gene has 18 transcripts (splice variants), 197 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Mms19-217  ENSMUST00000171561.7  6607  1031aa      ENSMUSP00000130900.1  Protein coding           CCDS29818  Q9D071   TSL:1, GENCODE basic, APPRIS P2
Mms19-201  ENSMUST00000026168.8  3162  988aa       ENSMUSP00000026168.2  Protein coding           -          Q9D071   TSL:5, GENCODE basic, APPRIS ALT2
Mms19-202  ENSMUST00000163287.7  3002  885aa       ENSMUSP00000128653.1  Protein coding           -          F7C9N6   CDS 5' incomplete, TSL:5
Mms19-208  ENSMUST00000167820.1  859   286aa       ENSMUSP00000130399.1  Protein coding           -          F6RGK4   CDS 5' and 3' incomplete, TSL:3
Mms19-212  ENSMUST00000169765.1  387   129aa       ENSMUSP00000133075.1  Protein coding           -          F6TJQ8   CDS 5' and 3' incomplete, TSL:5
Mms19-209  ENSMUST00000167927.7  7875  357aa       ENSMUSP00000132483.1  Nonsense mediated decay  -          E9PW47   TSL:2
Mms19-204  ENSMUST00000164776.7  3412  75aa        ENSMUSP00000129478.1  Nonsense mediated decay  -          E9PUY8   TSL:5
Mms19-203  ENSMUST00000163398.7  3378  75aa        ENSMUSP00000126864.1  Nonsense mediated decay  -          E9PUY8   TSL:5
Mms19-213  ENSMUST00000169775.7  2630  174aa       ENSMUSP00000128234.1  Nonsense mediated decay  -          E9Q5T0   TSL:1
Mms19-210  ENSMUST00000168484.7  2368  357aa       ENSMUSP00000126881.1  Nonsense mediated decay  -          E9PW47   TSL:1
Mms19-206  ENSMUST00000166090.7  1859  501aa       ENSMUSP00000131219.1  Nonsense mediated decay  -          F7A0X7   CDS 5' incomplete, TSL:5
Mms19-216  ENSMUST00000170209.7  3780  No protein  -                     Retained intron          -          -        TSL:1
Mms19-218  ENSMUST00000171755.7  2810  No protein  -                     Retained intron          -          -        TSL:1
Mms19-207  ENSMUST00000166517.1  807   No protein  -                     Retained intron          -          -        TSL:5
Mms19-205  ENSMUST00000165043.1  599   No protein  -                     Retained intron          -          -        TSL:3
Mms19-214  ENSMUST00000169779.1  432   No protein  -                     Retained intron          -          -        TSL:2
Mms19-215  ENSMUST00000169933.1  240   No protein  -                     Retained intron          -          -        TSL:3
Mms19-211  ENSMUST00000168737.1  141   No protein  -                     Retained intron          -          -        TSL:3

[Figure: Ensembl genome browser view of the Mms19 locus, chromosome 19, 41.94 Mb to 41.99 Mb (60.07 kb). Forward strand genes (Comprehensive set): Zdhhc16 (transcripts 201 to 211), Gm23113-201 (snRNA), Ubtd1-201. Contig: AC140193.3. Reverse strand genes (Comprehensive set): Exosc1 (transcripts 201 to 204) and the Mms19 transcripts. Regulatory Build track legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000171561

[Figure: protein domain view of Mms19-217 (protein coding; reverse strand, 40.07 kb), translation ENSMUSP00000130..., scale bar 0 to 1031. Tracks: Low complexity (Seg); Superfamily: Armadillo-type fold; Pfam: MMS19, N-terminal and MMS19, C-terminal; PANTHER: DNA repair/transcription protein MET18/MMS19; Gene3D: Armadillo-like helical; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
