https://www.alphaknockout.com

Mouse Mrpl30 Knockout Project (CRISPR/Cas9)

Objective: To create a Mrpl30 knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mrpl30 gene (NCBI Reference Sequence: NM_027098; Ensembl: ENSMUSG00000026087) is located on mouse chromosome 1. Six exons are identified, with the ATG start codon in exon 3 and the TGA stop codon in exon 6 (Transcript: ENSMUST00000027256). Exons 3~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts from about 0.21% of the coding region. Exons 3~6 cover 100.0% of the coding region. The size of the effective KO region is ~4297 bp. The KO region does not contain any other known gene.
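One plausible reading of the 0.21% figure is that the ATG occupies the first base of the coding sequence, expressed as a fraction of the total coding length. The sketch below checks that arithmetic under a stated assumption: the coding length is taken as 160 codons (the protein length listed for ENSMUST00000027256) plus one stop codon. The function name is hypothetical and the report does not state how the value was derived.

```python
# Assumption: CDS length = 160 aa protein * 3 bp + 3 bp stop codon.
# This is an illustrative reconstruction, not the report's stated method.
PROTEIN_LENGTH_AA = 160
cds_length_bp = PROTEIN_LENGTH_AA * 3 + 3


def cds_position_percent(position_bp: int, cds_length: int) -> float:
    """Express a 1-based CDS position as a percentage of the CDS length."""
    if not 1 <= position_bp <= cds_length:
        raise ValueError("position must lie within the CDS")
    return 100.0 * position_bp / cds_length


# The ATG begins at CDS position 1.
start_percent = cds_position_percent(1, cds_length_bp)
print(f"CDS length: {cds_length_bp} bp, start at {start_percent:.2f}% of the CDS")
```

Under this assumption the start codon sits at roughly 0.21% of the coding region, which is consistent with the note above.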


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1 to 6; two gRNA regions are marked. Legend: exon of mouse Mrpl30; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence aligned with itself; forward and reverse-complement matches are shown.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
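The self-alignment described in the note can be sketched as an exact-match word comparison. The example below is a minimal illustration, not the tool used for this report: it records every pair of positions whose 15 bp words are identical (forward strand only; the reverse-complement comparison is omitted) and flags pairs that lie close together as possible tandem repeats. The function names and the `max_offset` threshold are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 15):
    """Return sorted (i, j) pairs, i < j, whose window-length words match exactly."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


def has_tandem_repeat(seq: str, window: int = 15, max_offset: int = 100) -> bool:
    """Flag off-diagonal hits whose two copies start within max_offset bp."""
    return any(j - i <= max_offset for i, j in self_dot_plot(seq, window))


# Toy sequences (not taken from the Mrpl30 locus).
unique_seq = "ATGCGTACCTGAAGTCCATGGCTTAACGGATCTTGCAAGT"
repeat_seq = "ACGGTCATTGCAAGCTTAGG" * 3  # a 20 bp unit repeated three times

print(has_tandem_repeat(unique_seq))
print(has_tandem_repeat(repeat_seq))
```

A region with no off-diagonal hits, like the first toy sequence, is the situation the note describes as suitable for PCR screening.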

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence aligned with itself; forward and reverse-complement matches are shown.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(25.0% 500) | C(19.85% 397) | T(31.4% 628) | G(23.75% 475)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(29.2% 584) | C(22.15% 443) | T(25.5% 510) | G(23.15% 463)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
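The overall GC content of each flanking region follows directly from the base counts in the two summaries. The sketch below computes it, and also shows one way a sliding-window GC profile with a 300 bp window could be calculated. The function names and the `step` parameter are hypothetical, and the report does not say which tool produced the plots.

```python
def gc_percent_from_counts(a: int, c: int, t: int, g: int) -> float:
    """Overall GC content (%) from per-base counts."""
    total = a + c + t + g
    return 100.0 * (g + c) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC content (%) of each full window along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append(100.0 * gc / window)
    return profile


# Base counts copied from the two summaries above.
upstream_gc = gc_percent_from_counts(a=500, c=397, t=628, g=475)
downstream_gc = gc_percent_from_counts(a=584, c=443, t=510, g=463)
print(f"upstream GC: {upstream_gc:.2f}%")
print(f"downstream GC: {downstream_gc:.2f}%")
```

Both regions come out between 43% and 46% GC overall, in line with the notes that no significant high GC-content region is found.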


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   +       37891966   37893965   2000
YourSeq  96     663    1001  2000   81.9%     chr18  +       10751118   10751420   303
YourSeq  94     485    808   2000   76.8%     chr14  -       78717602   78717738   137
YourSeq  91     660    806   2000   85.0%     chr18  +       56687082   56687206   125
YourSeq  90     658    808   2000   82.2%     chr18  +       43047291   43047422   132
YourSeq  89     664    809   2000   85.9%     chr8   -       126836690  126836815  126
YourSeq  89     658    809   2000   82.8%     chr11  +       52289895   52290031   137
YourSeq  85     977    1069  2000   95.7%     chrX   -       104382534  104382626  93
YourSeq  85     658    807   2000   80.8%     chr18  -       21387874   21388004   131
YourSeq  83     672    808   2000   83.6%     chr5   +       131410852  131410971  120
YourSeq  83     658    801   2000   80.8%     chr17  +       57117958   57118082   125
YourSeq  83     670    808   2000   82.2%     chr10  +       75237155   75237272   118
YourSeq  81     664    804   2000   82.7%     chr5   -       45003793   45003918   126
YourSeq  81     658    796   2000   83.4%     chr2   -       162336905  162337026  122
YourSeq  81     670    809   2000   84.7%     chr3   +       127390426  127390547  122
YourSeq  80     586    801   2000   91.9%     chr8   -       46993743   46993959   217
YourSeq  80     658    808   2000   81.5%     chr4   -       106277817  106277948  132
YourSeq  80     648    799   2000   80.3%     chr10  +       30857331   30857458   128
YourSeq  79     660    809   2000   82.8%     chr3   -       100302861  100302991  131
YourSeq  79     658    809   2000   81.5%     chr17  -       83635891   83636023   133

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   +       37898263   37900262   2000
YourSeq  212    507    1111  2000   86.9%     chr8   -       111671866  111672184  319
YourSeq  201    523    1107  2000   96.8%     chr1   -       153517922  153518546  625
YourSeq  200    507    1102  2000   87.0%     chr1   -       156694127  156694558  432
YourSeq  197    507    1102  2000   95.0%     chr19  -       16757715   16758366   652
YourSeq  192    507    1107  2000   96.7%     chr1   +       63100834   63332860   232027
YourSeq  185    525    1107  2000   95.6%     chr17  +       87409373   87409996   624
YourSeq  184    507    1096  2000   84.7%     chr1   +       152547495  152547936  442
YourSeq  177    563    1107  2000   95.9%     chr11  -       85251836   85252407   572
YourSeq  174    601    1107  2000   89.1%     chr14  +       60687839   60688141   303
YourSeq  171    507    1106  2000   87.5%     chr3   -       90327737   90328223   487
YourSeq  170    603    1107  2000   91.3%     chr11  -       74955772   74956244   473
YourSeq  168    563    1107  2000   85.8%     chr11  +       98572960   98573205   246
YourSeq  165    575    1105  2000   86.4%     chr5   -       121727012  121727443  432
YourSeq  164    944    1107  2000   100.0%    chr19  +       46299854   46300017   164
YourSeq  162    929    1108  2000   93.7%     chr1   -       136405885  136406059  175
YourSeq  161    945    1107  2000   99.4%     chr4   -       145326290  145326452  163
YourSeq  161    945    1107  2000   99.4%     chr4   +       141585990  141586152  163
YourSeq  160    943    1107  2000   98.8%     chr11  -       103879732  103879897  166
YourSeq  160    941    1106  2000   98.2%     chr1   -       94414798   94414963   166

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
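The top hit in each BLAT table is the query matching its own genomic position on chr1, so the two tables also give the coordinates of the flanking windows. As a sanity check, and assuming the upstream window ends immediately before the start codon and the downstream window begins immediately after the stop codon, the interval between them can be compared with the stated KO region size of ~4297 bp. The helper names below are hypothetical.

```python
# Genomic coordinates (chr1, + strand) of the 100.0% identity self-hits
# reported in the upstream and downstream BLAT tables.
UPSTREAM_WINDOW = (37891966, 37893965)
DOWNSTREAM_WINDOW = (37898263, 37900262)


def span(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive interval."""
    return end - start + 1


def interval_between(upstream: tuple, downstream: tuple) -> tuple:
    """Return the inclusive interval lying strictly between two windows."""
    start = upstream[1] + 1
    end = downstream[0] - 1
    if end < start:
        raise ValueError("windows overlap or are adjacent")
    return start, end


# Assumption: this interval runs from the start codon to the stop codon.
ko_start, ko_end = interval_between(UPSTREAM_WINDOW, DOWNSTREAM_WINDOW)
ko_size = span(ko_start, ko_end)
print(f"chr1:{ko_start}-{ko_end} ({ko_size} bp)")
```

Under that assumption the interval is 4297 bp, which agrees with the KO region size given in the strategy summary.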


Gene information: Mrpl30 mitochondrial ribosomal protein L30 [ Mus musculus (house mouse) ]
Gene ID: 107734, updated on 24-Oct-2019

Gene summary

Official Symbol: Mrpl30 (provided by MGI)
Official Full Name: mitochondrial ribosomal protein L30 (provided by MGI)
Primary source: MGI:MGI:1333820
See related: Ensembl:ENSMUSG00000026087
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: L30mt; Rpml28; MRP-L28; MRP-L30; 2310001L22Rik
Expression: Ubiquitous expression in heart adult (RPKM 15.4), liver E14 (RPKM 13.9) and 28 other tissues
Orthologs: human

Genomic context

Location: 1; 1 B
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (37890477..37898338)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (37947398..37955178)

Chromosome 1 - NC_000067.6


Transcript information: This gene has 7 transcripts

Gene: Mrpl30 ENSMUSG00000026087

Description: mitochondrial ribosomal protein L30 [Source:MGI Symbol;Acc:MGI:1333820]
Gene Synonyms: 2310001L22Rik, MRP-L28, Rpml28
Location: Chromosome 1: 37,890,477-37,898,535 forward strand. GRCm38:CM000994.2
About this gene: This gene has 7 transcripts (splice variants), 111 orthologues and is a member of 2 Ensembl protein families.

Transcripts

Name        Transcript ID          bp    Protein     Protein ID            Biotype          CCDS       UniProt     Flags
Mrpl30-201  ENSMUST00000027256.11  969   160aa       ENSMUSP00000027256.5  Protein coding   CCDS14896  Q9D7N6      TSL:1 GENCODE basic APPRIS P1
Mrpl30-205  ENSMUST00000193673.5   662   160aa       ENSMUSP00000141654.1  Protein coding   CCDS14896  Q9D7N6      TSL:2 GENCODE basic APPRIS P1
Mrpl30-207  ENSMUST00000195373.1   587   160aa       ENSMUSP00000141693.1  Protein coding   CCDS14896  Q9D7N6      TSL:2 GENCODE basic APPRIS P1
Mrpl30-202  ENSMUST00000160082.2   1757  6aa         ENSMUSP00000141883.1  Protein coding   -          -           CDS 3' incomplete TSL:3
Mrpl30-206  ENSMUST00000194857.5   501   128aa       ENSMUSP00000142168.1  Protein coding   -          A0A0A6YXW4  CDS 3' incomplete TSL:1
Mrpl30-204  ENSMUST00000162513.1   644   No protein  -                     Retained intron  -          -           TSL:2
Mrpl30-203  ENSMUST00000161694.1   356   No protein  -                     Retained intron  -          -           TSL:2


[Figure: Ensembl region view, 28.06 kb, chromosome 1 from 37.885 Mb to 37.905 Mb. Forward strand transcripts: Mrpl30-201, Mrpl30-205, Mrpl30-202, Mrpl30-206 and Mrpl30-207 (protein coding); Mrpl30-203 and Mrpl30-204 (retained intron). Contig: AC119189.8. Reverse strand transcripts: Mitd1-201 and Mitd1-205 (protein coding), Mitd1-202 (lncRNA), Mitd1-204 (retained intron), Lyg2-201 (protein coding). A Regulatory Build track is included.]

Regulation Legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site

Gene Legend:
Protein Coding: Ensembl protein coding; merged Ensembl/Havana
Non-Protein Coding: RNA gene; processed transcript


Transcript: ENSMUST00000027256

[Figure: transcript view of Mrpl30-201 (protein coding), 8.06 kb, forward strand, with protein ENSMUSP00000027... and a scale bar from 0 to 160.]

Protein domains shown:
Superfamily: Ribosomal protein L30, ferredoxin-like fold domain superfamily
Pfam: Ribosomal protein L30, ferredoxin-like fold domain
PANTHER: Ribosomal protein L30, bacterial-type
Gene3D: Ribosomal protein L30, ferredoxin-like fold domain superfamily
CDD: Ribosomal protein L30, bacterial-type

Sequence variants (dbSNP and all other sources) are also shown. Variant Legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
