https://www.alphaknockout.com

Mouse Ndufaf1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Ndufaf1 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:

The Ndufaf1 gene (NCBI Reference Sequence: NM_027175; Ensembl: ENSMUSG00000027305) is located on mouse chromosome 2.
5 exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 5 (Transcript: ENSMUST00000028768).
Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Ndufaf1 gene.
To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-22A15 as template.
Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production.
The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 2 starts from about 100% of the coding region.
The knockout of exon 2 will result in a frameshift of the gene.
The size of intron 1 for 5'-loxP site insertion: 1949 bp, and the size of intron 2 for 3'-loxP site insertion: 1542 bp.
The size of the effective cKO region: ~1082 bp.
The cKO region does not contain any other known gene.
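The conversion of the targeted (floxed) allele into the constitutive KO allele can be sketched as a toy model. This is only an illustration, not the project's protocol: the feature labels and the function name cre_excise are hypothetical, and the model assumes two loxP sites in the same orientation, so that Cre removes the segment between them and leaves a single loxP site behind.

```python
def cre_excise(allele):
    """Return the allele after Cre recombination between the first two loxP sites.

    `allele` is an ordered list of feature labels (illustrative names only).
    Same-orientation loxP sites are assumed, so the floxed segment is deleted
    and one loxP site remains.
    """
    lox_positions = [i for i, feature in enumerate(allele) if feature == "loxP"]
    if len(lox_positions) < 2:
        return list(allele)  # fewer than two sites: nothing to recombine
    first, second = lox_positions[0], lox_positions[1]
    # Keep everything up to and including the first loxP, then skip past the second.
    return allele[: first + 1] + allele[second + 1 :]


# Hypothetical layout of the targeted allele: loxP sites in intron 1 and intron 2.
targeted_allele = [
    "exon 1", "loxP", "cKO region (exon 2)", "loxP", "exon 3", "exon 4", "exon 5",
]
ko_allele = cre_excise(targeted_allele)
print(ko_allele)
```

In this model the wildtype allele (no loxP sites) is returned unchanged, which mirrors the fact that Cre has no target there.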


Overview of the Targeting Strategy

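[Figure: schematic of the targeting strategy, drawn 5' to 3' with numbered exons. Rows: wildtype allele (with the two gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: homology arm, exon of mouse Ndufaf1, cKO region, loxP site.]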


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
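As a rough illustration of what such a self-comparison computes, the sketch below lists every pair of positions whose windows match exactly, in the forward direction and against the reverse complement. It is an assumption-laden toy: the function names and the test sequence are hypothetical, only exact matches are counted, and the actual tool and scoring used for this report are not stated in the source.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def dot_plot_matches(seq, window=10):
    """Return (forward, reverse) sets of (i, j) start positions.

    forward: the window at i equals the window at j, with i != j
             (the trivial main diagonal is left out).
    reverse: the window at i equals the reverse complement of the window at j.
    """
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = set(), set()
    for j in range(len(seq) - window + 1):
        word = seq[j:j + window]
        for i in index.get(word, []):
            if i != j:
                forward.add((i, j))
        for i in index.get(reverse_complement(word), []):
            reverse.add((i, j))
    return forward, reverse


# Toy sequence with one direct repeat of 7 bp; not taken from the report.
forward, reverse = dot_plot_matches("GATTACAGATTACA", window=7)
print(sorted(forward), sorted(reverse))
```

Off-diagonal runs of forward matches would indicate direct or tandem repeats, which is the pattern the note above reports as absent.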

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7582bp) | A(26.56% 2014) | C(20.36% 1544) | T(29.69% 2251) | G(23.38% 1773)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
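The overall GC content follows directly from the base counts in the summary line. A minimal sketch of that arithmetic, using only the four counts reported above (the variable names are illustrative):

```python
# Base counts copied from the summary line of the report.
counts = {"A": 2014, "C": 1544, "T": 2251, "G": 1773}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total

print(f"length = {total} bp, GC = {gc_percent:.2f}%")
```

The counts sum to the stated full length of 7582 bp, and G plus C together make up roughly 43.75% of the sequence.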


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       119660834  119663833  3000
YourSeq  315    2304   2998  3000   94.2%     chr10  -       128325052  128332787  7736
YourSeq  291    2665   3000  3000   94.0%     chr18  +       64642743   64643078   336
YourSeq  290    2651   3000  3000   93.5%     chr10  +       40429912   40430290   379
YourSeq  282    2660   2998  3000   94.7%     chr7   -       126162187  126162771  585
YourSeq  275    2665   2998  3000   90.7%     chr2   +       167021147  167021472  326
YourSeq  274    2659   3000  3000   92.7%     chr13  +       44824604   44825102   499
YourSeq  274    2654   2985  3000   92.6%     chr11  +       103932196  103932571  376
YourSeq  262    2659   2971  3000   94.3%     chrX   +       101188648  101189220  573
YourSeq  262    2674   2986  3000   94.9%     chrX   +       9515892    9516516    625
YourSeq  259    2692   3000  3000   95.2%     chr19  -       41463291   41463772   482
YourSeq  255    2675   2986  3000   93.0%     chr4   -       123580711  123581367  657
YourSeq  252    2672   2986  3000   94.4%     chr12  +       8666818    8667301    484
YourSeq  246    2683   2986  3000   91.9%     chrX   -       41333584   41333890   307
YourSeq  243    2651   2982  3000   90.7%     chr15  -       100734610  100735213  604
YourSeq  243    2647   2942  3000   92.7%     chr17  +       56274834   56275172   339
YourSeq  242    2675   2998  3000   92.1%     chr7   -       100983704  100984147  444
YourSeq  230    2683   2984  3000   94.0%     chr11  +       94525314   94525620   307
YourSeq  227    2659   2986  3000   92.9%     chr11  -       75691220   75691824   605
YourSeq  223    2667   2944  3000   93.8%     chr17  -       28993081   28993781   701

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       119656752  119659751  3000
YourSeq  192    1853   2132  3000   90.8%     chr5   +       37748820   37749128   309
YourSeq  183    1857   2138  3000   89.5%     chr10  +       95110322   95110629   308
YourSeq  174    1858   2125  3000   91.1%     chr9   +       120037891  120038183  293
YourSeq  172    1857   2094  3000   93.1%     chr11  -       69726120   69726364   245
YourSeq  170    1890   2100  3000   92.2%     chr5   -       112943918  113084046  140129
YourSeq  165    1931   2209  3000   89.2%     chr5   -       113207543  113207843  301
YourSeq  164    1857   2132  3000   91.5%     chr18  +       37812128   37812409   282
YourSeq  161    1912   2132  3000   90.5%     chr5   -       149387993  149393555  5563
YourSeq  160    1910   2125  3000   92.2%     chr5   -       147104906  147404818  299913
YourSeq  160    1888   2132  3000   89.7%     chr16  +       23190173   23190428   256
YourSeq  158    1857   2095  3000   87.3%     chrX   -       7952072    7952313    242
YourSeq  156    1872   2074  3000   90.7%     chr6   -       134825110  134825313  204
YourSeq  156    1857   2102  3000   82.2%     chr11  -       79068196   79068438   243
YourSeq  156    1834   2096  3000   84.9%     chr10  -       111126890  111127141  252
YourSeq  155    1872   2132  3000   89.4%     chr11  +       82864227   82864509   283
YourSeq  154    1909   2132  3000   91.0%     chr5   -       117751583  117751833  251
YourSeq  154    1857   2132  3000   91.1%     chrX   +       51241751   51242250   500
YourSeq  154    1857   2081  3000   90.7%     chr2   +       72465349   72465582   234
YourSeq  154    1889   2215  3000   88.2%     chr16  +       92990619   92990968   350

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
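The top hit in each BLAT table gives the chr2 coordinates of the queried 3000 bp section, so the interval between the two sections can be checked against the stated cKO region size. The sketch below assumes that the upstream and downstream sections directly flank the cKO region, which the source does not state explicitly; the variable and function names are illustrative. Because the gene lies on the reverse strand, the upstream section has the higher coordinates.

```python
# Top (100% identity) hits from the two BLAT tables, as 1-based inclusive
# chr2 coordinates.
upstream = (119660834, 119663833)
downstream = (119656752, 119659751)


def span(start, end):
    """Length in bp of a 1-based inclusive interval."""
    return end - start + 1


# Interval lying between the two queried sections (assumed to be the cKO region).
gap_start = downstream[1] + 1
gap_end = upstream[0] - 1
gap_length = span(gap_start, gap_end)

print(span(*upstream), span(*downstream), gap_length)
```

Under that assumption the interval between the two sections is 1082 bp, in line with the effective cKO region size of ~1082 bp given in the strategy notes.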


Gene and information:

Ndufaf1 NADH:ubiquinone oxidoreductase complex assembly factor 1 [Mus musculus (house mouse)]
Gene ID: 69702, updated on 24-Oct-2019

Gene summary

Official Symbol: Ndufaf1 (provided by MGI)
Official Full Name: NADH:ubiquinone oxidoreductase complex assembly factor 1 (provided by MGI)
Primary source: MGI:MGI:1916952
See related: Ensembl:ENSMUSG00000027305
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CIA30; CGI-65; 2410001M24Rik
Expression: Ubiquitous expression in liver E14 (RPKM 7.1), heart adult (RPKM 6.2) and 25 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 E5

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (119655446..119662839, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (119481187..119488534, complement)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 5 transcripts

Gene: Ndufaf1 ENSMUSG00000027305

Description: NADH:ubiquinone oxidoreductase complex assembly factor 1 [Source:MGI Symbol;Acc:MGI:1916952]
Gene Synonyms: 2410001M24Rik, CGI-65, CIA30
Location: Chromosome 2: 119,655,446-119,662,827 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 5 transcripts (splice variants), 202 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt     Flags
Ndufaf1-201  ENSMUST00000028768.1  1451  330aa       ENSMUSP00000028768.1  Protein coding  CCDS38207  A0A0R4J081  TSL:1 GENCODE basic APPRIS P2
Ndufaf1-202  ENSMUST00000110801.7  1478  328aa       ENSMUSP00000106425.1  Protein coding  -          A2AQ17      TSL:1 GENCODE basic APPRIS ALT2
Ndufaf1-203  ENSMUST00000110802.7  1308  328aa       ENSMUSP00000106426.1  Protein coding  -          A2AQ17      TSL:1 GENCODE basic APPRIS ALT2
Ndufaf1-205  ENSMUST00000154127.1  622   No protein  -                     lncRNA          -          -           TSL:2
Ndufaf1-204  ENSMUST00000131596.1  291   No protein  -                     lncRNA          -          -           TSL:3

[Figure: Ensembl genome browser view of a 27.38 kb region of chromosome 2 (119.65 Mb to 119.67 Mb). Forward strand: Nusap1-201 and Nusap1-202 (protein coding), Gm14383-201 (processed pseudogene). Contig: AL844536.13. Reverse strand: Ndufaf1-201, Ndufaf1-202 and Ndufaf1-203 (protein coding), Ndufaf1-204 and Ndufaf1-205 (lncRNA), H3f3c-201 (processed pseudogene). Regulatory Build legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana), non-protein coding (RNA gene, pseudogene).]


Transcript: ENSMUST00000028768

[Figure: transcript Ndufaf1-201 (protein coding), reverse strand, 7.35 kb, with protein annotation tracks for ENSMUSP00000028...: MobiDB lite; Superfamily (Galactose-binding-like domain superfamily); Pfam (NADH:ubiquinone oxidoreductase intermediate-associated protein 30); PANTHER (PTHR13194:SF23; Complex I intermediate-associated protein 30, mitochondrial); Gene3D (2.60.120.430). Sequence variants track (dbSNP and all other sources) with missense and synonymous variants. Scale bar: 0 to 330.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
