https://www.alphaknockout.com

Mouse Ndufb4 Knockout Project (CRISPR/Cas9)

Objective: To create an Ndufb4 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ndufb4 gene (NCBI Reference Sequence: NM_026610; Ensembl: ENSMUSG00000022820) is located on mouse chromosome 16. Three exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 3 (Transcript: ENSMUST00000023514). Exons 1~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 starts from about 0.26% of the coding region. Exons 1~3 cover 100.0% of the coding region. The size of the effective KO region is ~6698 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

Figure: wildtype allele drawn 5' to 3', showing exons 1, 2 and 3 of mouse Ndufb4 and two gRNA regions. Legend: exon of mouse Ndufb4; knockout region.


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
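The self-alignment described in the note can be illustrated with a minimal Python sketch. It assumes exact matching of 15 bp windows on the forward strand only; the report does not state which tool or scoring was used for the plot, and the function names and toy sequences here are hypothetical. A tandem repeat appears as many dots sharing the same off-diagonal offset.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the windows starting
    at i and j are identical (forward-strand exact matches only)."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def offset_counts(dots):
    """Count dots per diagonal offset (j - i); a tandem repeat with
    period p produces many dots at offset p and its multiples."""
    counts = defaultdict(int)
    for i, j in dots:
        counts[j - i] += 1
    return dict(counts)


# Toy sequences (hypothetical, not taken from the Ndufb4 locus).
tandem = "ACGGTCATTGCA" * 4            # 12 bp unit repeated four times
unique = "ACGTTGCAAGGCTTACCGATAGGCT"   # no repeated 15 bp window

tandem_offsets = offset_counts(self_dot_plot(tandem))
unique_dots = self_dot_plot(unique)
print(tandem_offsets, unique_dots)
```

An empty list of off-diagonal dots corresponds to the "no significant tandem repeat" outcome reported for the upstream region.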

Overview of the Dot Plot (down)

Window size: 15 bp

Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

Summary: Full Length (2000 bp) | A (25.0%, 500) | C (21.65%, 433) | T (29.1%, 582) | G (24.25%, 485)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
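As a rough check of the GC analysis, the overall GC percentage follows directly from the base counts in the summary lines of this report (upstream: C 433, G 485; downstream: C 460, G 512; both over 2000 bp). The Python sketch below also includes a sliding-window helper using the 300 bp window size stated above. The function names and the step size are assumptions, not the vendor's implementation.

```python
def gc_percent_from_counts(a, c, t, g):
    """Overall GC content (%) from per-base counts."""
    total = a + c + t + g
    return 100.0 * (g + c) / total


def gc_windows(seq, window=300, step=1):
    """GC content (%) of each window along seq."""
    seq = seq.upper()
    values = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        values.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return values


# Base counts copied from the two summary lines in this report.
up_gc = gc_percent_from_counts(a=500, c=433, t=582, g=485)
down_gc = gc_percent_from_counts(a=549, c=460, t=479, g=512)
print(round(up_gc, 2), round(down_gc, 2))

# Toy input for the windowed version (hypothetical sequence).
toy = gc_windows("GGCCAATT", window=4, step=2)
print(toy)
```

Both regions come out below 50% GC overall (about 45.9% upstream and 48.6% downstream), which agrees with the notes that no significant high GC-content region is found.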

Overview of the GC Content Distribution (down)

Window size: 300 bp

Summary: Full Length (2000 bp) | A (27.45%, 549) | C (23.0%, 460) | T (23.95%, 479) | G (25.6%, 512)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr16  -       37654360   37656359   2000
YourSeq  248    889    1402  2000   87.6%     chr9   -       57429769   57430335   567
YourSeq  234    897    1400  2000   87.8%     chr4   +       39682659   39683231   573
YourSeq  230    899    1308  2000   88.4%     chr18  +       24333957   24334408   452
YourSeq  218    897    1401  2000   85.5%     chr13  +       44939148   44939692   545
YourSeq  217    907    1354  2000   90.5%     chr1   -       86061125   86150917   89793
YourSeq  215    895    1409  2000   86.8%     chr1   -       35981614   35982161   548
YourSeq  210    910    1409  2000   89.5%     chr3   -       79394084   79394637   554
YourSeq  210    897    1297  2000   90.2%     chr3   -       58078678   58079110   433
YourSeq  208    895    1354  2000   90.8%     chrX   -       163024750  163233302  208553
YourSeq  208    909    1412  2000   83.0%     chr12  -       106169552  106170046  495
YourSeq  206    899    1407  2000   87.4%     chr3   -       69506244   69506771   528
YourSeq  206    895    1407  2000   88.2%     chr2   -       32122110   32122640   531
YourSeq  203    895    1409  2000   87.2%     chr1   +       75342245   75342793   549
YourSeq  201    910    1407  2000   86.3%     chr2   -       84056876   84057402   527
YourSeq  198    904    1402  2000   88.2%     chr19  +       28872676   28873256   581
YourSeq  198    907    1411  2000   89.3%     chr14  +       61325897   61326463   567
YourSeq  193    923    1354  2000   86.8%     chr7   -       144078362  144078808  447
YourSeq  185    996    1410  2000   89.5%     chr5   -       149925262  150357136  431875
YourSeq  184    996    1410  2000   87.7%     chr15  +       76993167   76993712   546

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.
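For readers who want to inspect the hit list programmatically, the following Python sketch parses whitespace-separated rows in the column order shown above and separates the full-length self hit from the remaining hits. The `BlatHit` class and helper names are hypothetical, and the report does not state what criterion it uses for "significant" similarity, so the sketch only ranks the secondary hits by score.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_rows(text):
    """Parse rows in the column order of the table; skip header lines."""
    hits = []
    for line in text.strip().splitlines():
        f = line.split()
        if len(f) != 11 or not f[1].isdigit():
            continue  # header, separator, or malformed line
        hits.append(BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                            float(f[5].rstrip("%")), f[6], f[7],
                            int(f[8]), int(f[9]), int(f[10])))
    return hits


def secondary_hits(hits):
    """Drop hits covering the whole query at 100% identity (the self hit)."""
    return [h for h in hits
            if not (h.identity == 100.0 and h.q_end - h.q_start + 1 == h.q_size)]


# First three rows of the upstream table in this report.
rows = """
QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN
YourSeq 2000 1 2000 2000 100.0% chr16 - 37654360 37656359 2000
YourSeq 248 889 1402 2000 87.6% chr9 - 57429769 57430335 567
YourSeq 234 897 1400 2000 87.8% chr4 + 39682659 39683231 573
"""
hits = parse_blat_rows(rows)
others = secondary_hits(hits)
best = max(others, key=lambda h: h.score)
print(best.chrom, best.score, best.q_end - best.q_start + 1)
```

In this excerpt the best secondary hit is on chr9 and covers 514 bp of the 2000 bp query at 87.6% identity.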

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr16  -       37645660   37647659   2000
YourSeq  56     1452   1510  2000   98.4%     chr1   -       93834108   93834176   69
YourSeq  56     1448   1508  2000   96.8%     chr16  +       18425129   18425191   63
YourSeq  55     1452   1508  2000   98.3%     chr15  -       59578240   59578296   57
YourSeq  55     1452   1508  2000   98.3%     chr9   +       59545631   59545687   57
YourSeq  55     1452   1508  2000   98.3%     chr19  +       23037920   23037976   57
YourSeq  55     1452   1508  2000   98.3%     chr16  +       4580746    4580802    57
YourSeq  54     1452   1510  2000   96.7%     chr2   +       13075110   13075170   61
YourSeq  54     1      60    2000   89.5%     chr19  +       25033529   25033585   57
YourSeq  53     1452   1506  2000   98.2%     chr9   -       110510058  110510112  55
YourSeq  53     1452   1504  2000   100.0%    chr9   -       74753100   74753152   53
YourSeq  53     1452   1508  2000   96.5%     chr18  +       17714331   17714387   57
YourSeq  52     1452   1510  2000   95.0%     chrX   -       136880271  136880331  61
YourSeq  51     1452   1508  2000   94.8%     chrX   -       104429545  104429601  57
YourSeq  51     1452   1508  2000   94.8%     chr6   +       13558666   13558722   57
YourSeq  50     1      57    2000   94.7%     chr5   -       26859911   26859969   59
YourSeq  49     1452   1504  2000   96.3%     chr3   -       38315760   38315812   53
YourSeq  48     1      52    2000   96.2%     chr15  -       73809458   73809509   52
YourSeq  44     1453   1496  2000   100.0%    chr17  -       36986358   36986401   44
YourSeq  42     1      49    2000   93.8%     chr7   -       100738003  100738053  51

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.


Gene and information: Ndufb4 NADH:ubiquinone oxidoreductase subunit B4 [ Mus musculus (house mouse) ] Gene ID: 68194, updated on 10-Oct-2019

Gene summary

Official Symbol: Ndufb4 (provided by MGI)
Official Full Name: NADH:ubiquinone oxidoreductase subunit B4 (provided by MGI)
Primary source: MGI:MGI:1915444
See related: Ensembl:ENSMUSG00000022820
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 0610006N12Rik; 1300010H20Rik
Expression: Ubiquitous expression in kidney adult (RPKM 303.2), heart adult (RPKM 262.0) and 28 other tissues
Orthologs: human

Genomic context

Location: 16; 16 B3
Exon count: 3

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 16 | NC_000082.6 (37647602..37654368, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 16 | NC_000082.5 (37647688..37654454, complement)

Figure: genomic context of Ndufb4 on Chromosome 16 - NC_000082.6


Transcript information: This gene has 3 transcripts

Gene: Ndufb4 ENSMUSG00000022820

Description: NADH:ubiquinone oxidoreductase subunit B4 [Source:MGI Symbol;Acc:MGI:1915444]
Gene Synonyms: 0610006N12Rik, 1300010H20Rik
Location: Chromosome 16: 37,647,170-37,654,453 reverse strand. GRCm38:CM001009.2
About this gene: This gene has 3 transcripts (splice variants), 234 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Ndufb4-201 | ENSMUST00000023514.3 | 971 | 129aa | ENSMUSP00000023514.3 | Protein coding | CCDS28161 | Q9CQC7 | TSL:1 GENCODE basic APPRIS P1
Ndufb4-203 | ENSMUST00000231881.1 | 933 | No protein | - | Retained intron | - | - | -
Ndufb4-202 | ENSMUST00000135019.1 | 465 | No protein | - | Retained intron | - | - | TSL:2

Figure: Ensembl genome browser view of a 27.28 kb region (37.64Mb to 37.66Mb) with forward and reverse strands. Tracks: Contigs (AC129539.11); Genes, comprehensive set (Ndufb4-201 protein coding; Ndufb4-202 retained intron; Ndufb4-203 retained intron); Regulatory Build. Regulation legend: CTCF; Enhancer; Promoter; Promoter Flank. Gene legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (processed transcript).


Transcript: ENSMUST00000023514

Figure: Ndufb4-201 (protein coding) on the reverse strand, 7.28 kb, with protein feature tracks for ENSMUSP00000023...: transmembrane helices; PDB-ENSP mappings; Pfam (NADH:ubiquinone oxidoreductase, subunit NDUFB4); PANTHER (NADH:ubiquinone oxidoreductase, subunit NDUFB4; PTHR15469:SF1); sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant. Scale bar: 0 to 129.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
