https://www.alphaknockout.com

Mouse Ndufb5 Knockout Project (CRISPR/Cas9)

Objective: To create an Ndufb5 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ndufb5 gene (NCBI Reference Sequence: NM_025316; Ensembl: ENSMUSG00000027673) is located on mouse chromosome 3. Six exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 6 (Transcript: ENSMUST00000127477). Exons 2~5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 22.05% of the coding region. Exons 2~5 cover 57.32% of the coding region. The size of the effective KO region is ~3714 bp. The KO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1 to 6 and two gRNA regions marked. Legend: exon of mouse Ndufb5; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot. Legend: Forward; Reverse Complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
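The self-alignment described in the note can be sketched in a few lines of Python. This is a minimal illustration only: it assumes exact word matching with the 15 bp window, which may differ from the tool actually used for the report, and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq, window=15):
    """List off-diagonal dots of a sequence compared with itself.

    Each dot is (i, j, strand): the window starting at i matches the
    window starting at j exactly ("+") or as a reverse complement ("-").
    Assumption: exact matching; real dot plot tools may allow mismatches.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    # Index every window by its sequence so matches are found by lookup.
    positions = {}
    for i in range(n_windows):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for i in range(n_windows):
        word = seq[i:i + window]
        for j in positions.get(word, []):
            if j > i:  # skip the trivial main diagonal and mirrored dots
                hits.append((i, j, "+"))
        for j in positions.get(reverse_complement(word), []):
            if j >= i:  # report each reverse-complement pair once
                hits.append((i, j, "-"))
    return hits
```

A region free of tandem or inverted repeats at this window size returns few or no dots; long runs of dots on a diagonal parallel to the main one would indicate a repeat.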

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot. Legend: Forward; Reverse Complement. Axis label: Sequence 12.]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(22.55% 451) | C(23.55% 471) | T(29.2% 584) | G(24.7% 494)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(21.95% 439) | C(22.05% 441) | T(31.5% 630) | G(24.5% 490)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
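The overall GC content of each flank follows directly from the base counts in the two summaries, and a windowed profile can be computed the same way. The sketch below is illustrative: the base counts are taken from the summaries above, while the function names and the `step` parameter are assumptions, and the report does not state what threshold it treats as "high GC".

```python
def gc_fraction_from_counts(counts):
    """GC fraction from a dict of base counts with keys A, C, G, T."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total if total else 0.0


def sliding_gc(seq, window=300, step=1):
    """Return (start, GC fraction) for each full window along seq.

    window=300 matches the window size quoted in the report; step is a
    hypothetical parameter, since the report does not give one.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append((start, (chunk.count("G") + chunk.count("C")) / window))
    return profile


# Base counts copied from the two summary lines (2000 bp each).
upstream_counts = {"A": 451, "C": 471, "T": 584, "G": 494}
downstream_counts = {"A": 439, "C": 441, "T": 630, "G": 490}

upstream_gc = gc_fraction_from_counts(upstream_counts)      # 965 / 2000
downstream_gc = gc_fraction_from_counts(downstream_counts)  # 931 / 2000
```

Both flanks come out below 50% GC overall (48.25% upstream, 46.55% downstream), which is consistent with the notes that no high GC-content region was found.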

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr3   +       32742855   32744854   2000
YourSeq  90     1      154   2000   88.8%     chr18  +       74859119   74859277   159
YourSeq  89     1      124   2000   90.2%     chr10  +       122671440  122671586  147
YourSeq  81     23     172   2000   92.7%     chr6   -       148665668  148665843  176
YourSeq  81     44     154   2000   84.6%     chr3   +       52939439   52939548   110
YourSeq  79     44     146   2000   86.3%     chr15  -       40128667   40128768   102
YourSeq  79     41     142   2000   89.3%     chr17  +       28285686   28285792   107
YourSeq  75     35     135   2000   92.3%     chrX   -       75244520   75244620   101
YourSeq  75     7      124   2000   87.3%     chr17  -       56750757   56750889   133
YourSeq  75     44     136   2000   90.9%     chr10  -       66482537   66482628   92
YourSeq  75     41     154   2000   87.8%     chr13  +       103695346  103695457  112
YourSeq  73     5      136   2000   83.7%     chr2   -       179434404  179434535  132
YourSeq  73     36     138   2000   86.3%     chr12  -       55294569   55294667   99
YourSeq  73     10     120   2000   92.0%     chr8   +       14712287   14712411   125
YourSeq  73     23     119   2000   85.5%     chr17  +       31055322   31055417   96
YourSeq  72     39     136   2000   86.8%     chr10  +       60720307   60720404   98
YourSeq  72     11     136   2000   88.5%     chr1   +       127105084  127105213  130
YourSeq  71     44     136   2000   88.2%     chr8   -       85623556   85623648   93
YourSeq  70     36     133   2000   85.8%     chr17  -       29615966   29616063   98
YourSeq  70     41     124   2000   89.2%     chr1   +       119883740  119883822  83

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
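One way to check such a result table programmatically is to parse each row and compare the secondary hits against a coverage and identity cut-off. The sketch below is a hedged illustration: the report does not state what it counts as "significant", so the `min_coverage` and `min_identity` defaults are hypothetical, as are the type and function names. Only the first three rows of the table are used as sample input.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one result row; any leading link text is ignored."""
    f = row.split()[-11:]  # query name plus the ten data columns
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def query_coverage(hit):
    """Fraction of the query covered by the aligned query interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


def significant_secondary_hits(hits, min_coverage=0.5, min_identity=90.0):
    """Hits other than the top-scoring one that pass both cut-offs.

    The cut-offs are illustrative assumptions, not the report's criteria.
    """
    best = max(hits, key=lambda h: h.score)
    return [h for h in hits
            if h is not best
            and query_coverage(h) >= min_coverage
            and h.identity >= min_identity]


rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr3 + 32742855 32744854 2000",
    "YourSeq 90 1 154 2000 88.8% chr18 + 74859119 74859277 159",
    "YourSeq 89 1 124 2000 90.2% chr10 + 122671440 122671586 147",
]
hits = [parse_hit(r) for r in rows]
secondary = significant_secondary_hits(hits)
```

With these sample rows the secondary hits cover well under 10% of the 2000 bp query, so none passes the illustrative cut-offs; only the self-match on chr3 spans the full query.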

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr3   +       32748569   32750568   2000
YourSeq  60     144    679   2000   65.9%     chr14  -       62482770   62483018   249
YourSeq  52     645    797   2000   73.9%     chr2   +       165527175  165527296  122
YourSeq  52     645    799   2000   92.1%     chr17  +       49752222   49752407   186
YourSeq  47     643    794   2000   91.4%     chr2   -       92079475   92079628   154
YourSeq  44     618    783   2000   62.5%     chr11  -       86138325   86138381   57
YourSeq  42     645    791   2000   68.7%     chr11  -       57630377   57630490   114
YourSeq  42     664    797   2000   95.9%     chr5   +       131020153  131020293  141
YourSeq  41     765    854   2000   87.3%     chr2   -       3647675    3647766    92
YourSeq  41     645    703   2000   77.0%     chr4   +       142743894  142743946  53
YourSeq  41     225    331   2000   90.2%     chr2   +       20583836   20583973   138
YourSeq  41     975    1023  2000   97.7%     chr12  +       68645612   68645665   54
YourSeq  36     618    778   2000   62.0%     chr16  -       30306742   30306790   49
YourSeq  36     167    642   2000   50.0%     chr16  +       77630405   77630480   76
YourSeq  35     271    333   2000   92.7%     chr15  -       81854311   81854402   92
YourSeq  35     646    792   2000   92.7%     chr6   +       92758731   92758881   151
YourSeq  35     565    685   2000   87.3%     chr16  +       33976517   33976639   123
YourSeq  33     765    800   2000   97.3%     chr14  -       118744940  118744976  37
YourSeq  31     219    288   2000   97.0%     chr1   -       76840778   76840847   70
YourSeq  31     617    649   2000   97.0%     chr18  +       73831329   73831361   33

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
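The chr3 self-matches in the two BLAT tables also allow a quick cross-check of the stated KO region size. The sketch below assumes that the coordinates are 1-based and inclusive (consistent with each 2000 bp query spanning exactly 2000 positions) and that the two flanks directly abut the deleted interval, which the report does not state explicitly. The function and constant names are hypothetical.

```python
def bases_between(left_end, right_start):
    """Count bases strictly between two 1-based inclusive coordinates."""
    return right_start - left_end - 1


# chr3 coordinates of the top hit in each BLAT table.
UP_START, UP_END = 32742855, 32744854
DOWN_START, DOWN_END = 32748569, 32750568

# Interval left between the flanks: 32744855..32748568.
ko_region_bp = bases_between(UP_END, DOWN_START)
```

Under those assumptions the interval between the flanks is 3714 bp, which agrees with the effective KO region size of ~3714 bp given in the strategy summary.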

Gene and information: Ndufb5 NADH:ubiquinone oxidoreductase subunit B5 [Mus musculus (house mouse)]

Gene ID: 66046, updated on 12-Aug-2019

Gene summary

Official Symbol: Ndufb5 (provided by MGI)
Official Full Name: NADH:ubiquinone oxidoreductase subunit B5 (provided by MGI)
Primary source: MGI:MGI:1913296
See related: Ensembl:ENSMUSG00000027673
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: SGDH; CI-SGDH; AU015782; 0610007D05Rik
Expression: Ubiquitous expression in heart adult (RPKM 156.3), adrenal adult (RPKM 105.9) and 28 other tissues
Orthologs: human

Genomic context

Location: 3; 3 A3
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (32737057..32751559)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (32635985..32650481)

Chromosome 3 - NC_000069.6

Transcript information: This gene has 8 transcripts

Gene: Ndufb5 ENSMUSG00000027673

Description: NADH:ubiquinone oxidoreductase subunit B5 [Source:MGI Symbol;Acc:MGI:1913296]
Gene Synonyms: 0610007D05Rik
Location: 32,736,990-32,751,566 forward strand. GRCm38:CM000996.2
About this gene: This gene has 8 transcripts (splice variants), 208 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Ndufb5-204  ENSMUST00000127477.7   1079  189aa       ENSMUSP00000114963.1  Protein coding           CCDS17300  Q9CQH3   TSL:1, GENCODE basic, APPRIS P1
Ndufb5-206  ENSMUST00000154257.7   910   135aa       ENSMUSP00000117240.1  Protein coding           -          F6Y6V5   CDS 5' incomplete, TSL:2
Ndufb5-203  ENSMUST00000122290.1   625   119aa       ENSMUSP00000113602.1  Protein coding           -          D3Z568   TSL:2, GENCODE basic
Ndufb5-205  ENSMUST00000139593.7   583   172aa       ENSMUSP00000115088.1  Protein coding           -          D3YX99   CDS 3' incomplete, TSL:3
Ndufb5-202  ENSMUST00000121778.7   574   181aa       ENSMUSP00000113169.1  Protein coding           -          D3Z6W9   TSL:3, GENCODE basic
Ndufb5-201  ENSMUST00000029217.11  799   63aa        ENSMUSP00000029217.5  Nonsense mediated decay  -          F8WI84   CDS 5' incomplete, TSL:3
Ndufb5-207  ENSMUST00000156174.1   353   63aa        ENSMUSP00000123596.1  Nonsense mediated decay  -          F6VXF9   CDS 5' incomplete, TSL:5
Ndufb5-208  ENSMUST00000195565.1   3127  No protein  -                     Retained intron          -          -        TSL:NA

[Figure: genome browser view of a 34.58 kb region, 32.73Mb to 32.76Mb.
Forward strand (Comprehensive set): Ndufb5-204, Ndufb5-202, Ndufb5-203, Ndufb5-206, Ndufb5-205 (protein coding); Ndufb5-208 (retained intron); Ndufb5-201, Ndufb5-207 (nonsense mediated decay).
Contigs: AC116736.14.
Reverse strand (Comprehensive set): Mrpl47-201 (protein coding); Mrpl47-202 (lncRNA); Mrpl47-203 (retained intron).
Other tracks: Regulatory Build.
Regulation legend: CTCF; Promoter; Promoter Flank.
Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]

Transcript: ENSMUST00000127477

[Figure: transcript view of Ndufb5-204 (protein coding), 14.58 kb, forward strand.
Protein tracks for ENSMUSP00000114...: Transmembrane heli...; PDB-ENSP mappings; Coiled-coils (Ncoils); Pfam: NADH:ubiquinone oxidoreductase, NDUFB5/SGDH subunit; PANTHER: NADH:ubiquinone oxidoreductase, NDUFB5/SGDH subunit.
Variant tracks: All sequence SNPs/i...; Sequence variants (dbSNP and all other sources).
Variant legend: missense variant; synonymous variant.
Scale bar: 0 to 189.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
