https://www.alphaknockout.com

Mouse Ndufab1 Knockout Project (CRISPR/Cas9)

Objective: To create a Ndufab1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ndufab1 gene (NCBI Reference Sequence: NM_028177; Ensembl: ENSMUSG00000030869) is located on mouse chromosome 7. Five exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 4 (Transcript: ENSMUST00000033157). Exons 2~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 36.11% of the coding region. Exons 2~4 cover 64.1% of the coding region. The size of the effective KO region is ~5146 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1 to 5 and two gRNA regions marked. Legend: exon of mouse Ndufab1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self dot plot (axis label: Sequence 12). Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
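The self-alignment described in the note can be sketched as an exact word match: every pair of positions whose 15 bp windows are identical becomes one dot. This is a minimal illustration only, not the tool used to produce the report. It checks forward matches only, and the function names and the example sequence are hypothetical.

```python
def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs with i < j whose window-length words are identical.

    Forward-strand exact matches only; reverse-complement matches are not checked.
    """
    seq = seq.upper()
    positions = {}
    # Index every word of length `window` by its start position.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    # Any word seen at two or more positions contributes off-diagonal dots.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


def has_repeat(seq, window=15):
    """True if at least one window-length word occurs more than once."""
    return bool(self_dot_plot(seq, window))


if __name__ == "__main__":
    unit = "ACGTACGGTCCATGA"          # hypothetical 15 bp unit
    example = unit + "GGGG" + unit    # the unit occurs twice
    print(self_dot_plot(example))
    print(has_repeat("ACGTACGGTCCATGATTGCA"))
```

Tandem repeats show up as runs of dots parallel to the main diagonal; a region with no such runs gives an empty list here.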

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self dot plot (axis label: Sequence 12). Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence (axis label: Sequence 12).]

Summary: Full Length(2000bp) | A(24.8% 496) | C(21.6% 432) | T(28.7% 574) | G(24.9% 498)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
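The GC analysis can be sketched as a base count over the full sequence plus a sliding 300 bp window. This is an illustrative sketch, not the tool used for the report. The step size is an assumption (the report gives only the window size), and the function names are hypothetical. The base counts in the example are the ones listed in the two Summary lines.

```python
from collections import Counter


def base_composition(seq):
    """Return {base: (count, percent)} for A, C, G and T."""
    seq = seq.upper()
    counts = Counter(seq)
    n = len(seq)
    if n == 0:
        return {b: (0, 0.0) for b in "ACGT"}
    return {b: (counts.get(b, 0), round(100.0 * counts.get(b, 0) / n, 1))
            for b in "ACGT"}


def gc_percent(counts):
    """GC percentage from a dict of base counts such as {'A': 496, ...}."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def gc_windows(seq, window=300, step=1):
    """Return (start, GC percent) for each window; `step` is an assumed parameter."""
    seq = seq.upper()
    out = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        out.append((start, 100.0 * gc / window))
    return out


if __name__ == "__main__":
    upstream = {"A": 496, "C": 432, "T": 574, "G": 498}    # counts from the report
    downstream = {"A": 478, "C": 472, "T": 644, "G": 406}  # counts from the report
    print(round(gc_percent(upstream), 1), round(gc_percent(downstream), 1))
```

With the reported counts, overall GC works out to 46.5% upstream and 43.9% downstream. The per-window values need the sequence itself, which is not included in this report.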

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence (axis label: Sequence 12).]

Summary: Full Length(2000bp) | A(23.9% 478) | C(23.6% 472) | T(32.2% 644) | G(20.3% 406)

Note: The 2000 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   -       122096732  122098731  2000
YourSeq  78     1263   1349  2000   95.4%     chr7   -       122097078  122097315  238
YourSeq  67     1263   1337  2000   91.9%     chr7   -       122097052  122097125  74
YourSeq  63     1613   1680  2000   97.1%     chr7   -       122097395  122097463  69
YourSeq  58     660    1805  2000   93.9%     chr10  -       61515656   61631505   115850
YourSeq  52     1728   1855  2000   80.7%     chr8   -       109616866  109616984  119
YourSeq  46     1374   1614  2000   65.4%     chr5   -       116141761  116141827  67
YourSeq  38     1753   1855  2000   93.2%     chr5   +       64218809   64218912   104
YourSeq  34     1645   1680  2000   97.3%     chr7   -       122097434  122097469  36
YourSeq  34     1789   1836  2000   94.8%     chr10  +       82646191   82646239   49
YourSeq  33     1816   1853  2000   97.2%     chr9   -       123456239  123456277  39
YourSeq  32     548    585   2000   92.2%     chr10  +       52855975   52856012   38
YourSeq  31     1814   1855  2000   97.0%     chr6   +       38289596   38289638   43
YourSeq  31     656    691   2000   94.2%     chr10  +       90321446   90321481   36
YourSeq  28     1821   1855  2000   93.8%     chr7   -       141607131  141607166  36
YourSeq  27     543    586   2000   81.4%     chr12  -       100788269  100788318  50
YourSeq  26     560    585   2000   100.0%    chr14  -       67046166   67046191   26
YourSeq  26     558    585   2000   96.5%     chr11  +       102640596  102640623  28
YourSeq  26     1759   1802  2000   86.7%     chr11  +       57594011   57594053   43
YourSeq  24     451    479   2000   81.5%     chr10  -       60765875   60765901   27

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
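The BLAT rows can be read programmatically to separate the self-hit from secondary hits. The sketch below is hypothetical: the field order follows the table header, the `BlatHit` type and function names are illustrative, and `min_score` is an assumed cutoff, since the report does not state the threshold it uses for "significant similarity".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row):
    """Parse one whitespace-separated BLAT result row into a BlatHit."""
    fields = row.split()
    # Drop the leading "browser details" link text if it is still present.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def secondary_hits(hits, min_score):
    """Hits other than the top-scoring one with score >= min_score (assumed cutoff)."""
    best = max(hits, key=lambda h: h.score)
    return [h for h in hits if h is not best and h.score >= min_score]


if __name__ == "__main__":
    rows = [
        "YourSeq 2000 1 2000 2000 100.0% chr7 - 122096732 122098731 2000",
        "YourSeq 78 1263 1349 2000 95.4% chr7 - 122097078 122097315 238",
    ]
    hits = [parse_blat_row(r) for r in rows]
    print(secondary_hits(hits, min_score=50))
```

The top-scoring row is the query matching its own genomic location, so it is excluded before the cutoff is applied.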

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   -       122089597  122091596  2000
YourSeq  294    803    1296  2000   90.0%     chrX   +       152024541  152024875  335
YourSeq  288    883    1294  2000   92.7%     chr14  -       18868323   18868866   544
YourSeq  285    804    1296  2000   89.1%     chr6   +       57704469   57704805   337
YourSeq  283    938    1296  2000   90.3%     chr13  +       20488264   20488591   328
YourSeq  282    932    1296  2000   91.5%     chr12  +       84968204   84968835   632
YourSeq  281    973    1296  2000   93.9%     chr2   -       78989818   78990150   333
YourSeq  281    943    1295  2000   90.8%     chrX   +       100383878  100384215  338
YourSeq  281    957    1289  2000   92.9%     chr3   +       54139522   54139849   328
YourSeq  280    785    1287  2000   89.9%     chr1   -       18757670   18758004   335
YourSeq  280    947    1297  2000   92.6%     chrX   +       97950655   97951173   519
YourSeq  280    806    1296  2000   88.6%     chr8   +       128585100  128585460  361
YourSeq  280    939    1289  2000   90.9%     chr4   +       48374590   48374913   324
YourSeq  279    959    1296  2000   91.7%     chr13  +       104722005  104722336  332
YourSeq  278    948    1296  2000   92.1%     chr2   -       26719866   26720212   347
YourSeq  278    975    1296  2000   93.8%     chr9   +       96190311   96190636   326
YourSeq  277    959    1289  2000   91.2%     chr14  -       118920119  118920437  319
YourSeq  277    943    1286  2000   91.3%     chr13  -       6129872    6130214    343
YourSeq  274    973    1295  2000   94.0%     chr1   -       9316997    9317321    325
YourSeq  274    962    1296  2000   92.6%     chrX   +       159543956  159544301  346

Note: The 2000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.


Gene and information: Ndufab1 NADH:ubiquinone oxidoreductase subunit AB1 [Mus musculus (house mouse)]
Gene ID: 70316, updated on 24-Oct-2019

Gene summary

Official Symbol: Ndufab1 (provided by MGI)
Official Full Name: NADH:ubiquinone oxidoreductase subunit AB1 (provided by MGI)
Primary source: MGI:MGI:1917566
See related: Ensembl:ENSMUSG00000030869
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: ACP; 8kDa; SDAP; CI-SDAP; 2210401F17Rik; 2310039H15Rik; 2610003B19Rik; 9130423F15Rik
Expression: Broad expression in adrenal adult (RPKM 768.6), duodenum adult (RPKM 586.0) and 16 other tissues
Orthologs: human

Genomic context

Location: 7; 7 F2
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (122086815..122101848, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (129231558..129245362, complement)

[Figure: genomic context, Chromosome 7 - NC_000073.6.]


Transcript information: This gene has 10 transcripts

Gene: Ndufab1 ENSMUSG00000030869

Description: NADH:ubiquinone oxidoreductase subunit AB1 [Source:MGI Symbol;Acc:MGI:1917566]
Gene Synonyms: 2210401F17Rik, 2310039H15Rik, 2610003B19Rik, 8kDa, 9130423F15Rik
Location: Chromosome 7: 122,085,403-122,101,886 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 10 transcripts (splice variants), 228 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt         Flags
Ndufab1-201  ENSMUST00000033157.9  3479  156aa       ENSMUSP00000033157.3  Protein coding   CCDS21809  Q569N0, Q9CR21  TSL:1, GENCODE basic, APPRIS P1
Ndufab1-203  ENSMUST00000123296.7  1416  156aa       ENSMUSP00000116177.1  Protein coding   CCDS21809  Q569N0, Q9CR21  TSL:1, GENCODE basic, APPRIS P1
Ndufab1-202  ENSMUST00000106471.8  552   128aa       ENSMUSP00000102079.2  Protein coding   -          F8WJ64          CDS 5' incomplete, TSL:3
Ndufab1-206  ENSMUST00000139456.1  452   127aa       ENSMUSP00000114756.1  Protein coding   -          F6ZFT1          CDS 5' incomplete, TSL:2
Ndufab1-204  ENSMUST00000130857.1  1653  No protein  -                     Retained intron  -          -               TSL:1
Ndufab1-210  ENSMUST00000153173.7  755   No protein  -                     Retained intron  -          -               TSL:2
Ndufab1-208  ENSMUST00000146022.1  443   No protein  -                     Retained intron  -          -               TSL:2
Ndufab1-205  ENSMUST00000130904.1  397   No protein  -                     Retained intron  -          -               TSL:2
Ndufab1-209  ENSMUST00000146964.1  351   No protein  -                     Retained intron  -          -               TSL:3
Ndufab1-207  ENSMUST00000145863.1  261   No protein  -                     lncRNA           -          -               TSL:5


[Figure: Ensembl genome browser view of a 36.48 kb region on chromosome 7 (122.08 Mb to 122.11 Mb). Forward strand: Ubfd1-201 (protein coding), Ubfd1-204 (retained intron), Ubfd1-205 (nonsense mediated decay), Gm44986-201 (TEC). Contig: AC124379.3. Reverse strand: Ndufab1-201, -202, -203 and -206 (protein coding); Ndufab1-204, -205, -208, -209 and -210 (retained intron); Palb2-201, -202, -203 and -204 (protein coding); Palb2-206 (nonsense mediated decay); Palb2-208 (retained intron). Additional track: Regulatory Build. Regulation legend: CTCF, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000033157

[Figure: protein domain view of Ndufab1-201 (protein coding; reverse strand, 16.48 kb), scale bar 0 to 156. Annotation tracks for ENSMUSP00000033...: PDB-ENSP mappings; Low complexity (Seg); TIGRFAM: Acyl carrier protein (ACP); Superfamily: ACP-like superfamily; Pfam: Phosphopantetheine binding ACP domain; PROSITE profiles: Phosphopantetheine binding ACP domain; PROSITE patterns: Phosphopantetheine attachment site; PANTHER: PTHR20863, PTHR20863:SF5; HAMAP: Acyl carrier protein (ACP); Gene3D: ACP-like superfamily. Variant track: sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
