http://www.alphaknockout.com/

Mouse Ndufa4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Ndufa4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ndufa4 gene (NCBI Reference Sequence: NM_010886; Ensembl: ENSMUSG00000029632) is located on mouse chromosome 6. Four exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 4 (Transcript: ENSMUST00000204978). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Ndufa4 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-322J22 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: The knockout of exon 2 will result in a frameshift of the gene and covers 36.18% of the coding region. The size of intron 1 for 5'-loxP site insertion is 1215 bp, and the size of intron 2 for 3'-loxP site insertion is 757 bp. The size of the effective cKO region is ~1149 bp. This strategy is designed based on genetic information in existing databases. Due to the complexity of biological processes, not all risks of loxP insertion to gene transcription, RNA splicing and translation can be predicted at the existing technological level.
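The frameshift statement in the note can be cross-checked with simple arithmetic. The sketch below assumes that the 36.18% figure is taken relative to the 246 coding bases of the 82-aa protein listed for ENSMUST00000204978 (stop codon excluded). That denominator is an assumption made for illustration; the report does not state how the percentage was computed.

```python
# Cross-check of the frameshift note. The CDS denominator is an assumption.
PROTEIN_LENGTH_AA = 82                  # protein length listed for Ndufa4-205
CDS_LENGTH_BP = PROTEIN_LENGTH_AA * 3   # 246 bp, stop codon excluded (assumed)
REPORTED_FRACTION = 0.3618              # "36.18% of the coding region"


def coding_bases_removed(cds_length_bp: int, fraction: float) -> int:
    """Nearest whole number of coding bases matching the reported fraction."""
    return round(cds_length_bp * fraction)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame unless its length is a multiple of 3."""
    return deleted_coding_bp % 3 != 0


deleted = coding_bases_removed(CDS_LENGTH_BP, REPORTED_FRACTION)
print(deleted, causes_frameshift(deleted))
```

Under this reading, exon 2 contributes 89 coding bases, which is not a multiple of 3 and is therefore consistent with the stated frameshift.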


Overview of the Targeting Strategy

[Figure: schematic, drawn 5' to 3', of the wildtype allele (exons 1 to 4, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Ndufa4; cKO region; loxP site.]
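The conversion of the targeted allele into the constitutive KO allele can be illustrated as a string operation: Cre recombination between two loxP sites in the same orientation removes the intervening segment and leaves a single loxP site behind. The following is a minimal sketch under that model. The flanking and exon sequences are hypothetical placeholders rather than Ndufa4 sequence, and `cre_excise` is an illustrative helper name.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Return the allele after excision between two same-orientation sites.

    The floxed segment and one copy of the site are removed, so a single
    site remains at the junction.
    """
    first = allele.find(site)
    second = allele.find(site, first + len(site)) if first != -1 else -1
    if first == -1 or second == -1:
        raise ValueError("allele must contain two loxP sites in the same orientation")
    return allele[:first] + allele[second:]


# Toy sequences (hypothetical placeholders, not Ndufa4 sequence).
left, exon2, right = "GGGCCC", "ATGAAATTTGG", "TTTAAA"
targeted = left + LOXP + exon2 + LOXP + right
knockout = cre_excise(targeted)
print(len(targeted), len(knockout))
```

The sketch only handles directly repeated sites on one strand; sites in opposite orientation would cause inversion rather than deletion and are not modeled here.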


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
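A self dot plot of this kind can be sketched by indexing every 10 bp word of the sequence and reporting the positions where the same word recurs, on the forward strand or in the reverse complement. This is a minimal illustration using exact word matches; the tool actually used for the report is not stated, and the function names are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_index(seq: str, k: int) -> dict:
    """Map each k-mer to the list of start positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def self_dot_plot(seq: str, k: int = 10):
    """Return (forward, reverse) lists of (i, j) dots for exact k-mer matches.

    Forward dots pair two positions in seq and skip the trivial diagonal i == j.
    Reverse dots pair position i in seq with position j in its reverse complement.
    """
    index = kmer_index(seq, k)
    forward = [(i, j) for pos in index.values() for i in pos for j in pos if i != j]
    rc = reverse_complement(seq)
    reverse = [
        (i, j)
        for j in range(len(rc) - k + 1)
        for i in index.get(rc[j:j + k], [])
    ]
    return sorted(forward), sorted(reverse)


# Toy input: a 10 bp unit repeated in tandem, followed by filler.
forward_dots, reverse_dots = self_dot_plot("ACGTTGCAAG" * 2 + "CCCCC", k=10)
print(forward_dots)
```

Off-diagonal forward dots, such as the pair produced by the toy tandem repeat, are the signal that the note reports as absent for the real sequence.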

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length (7089 bp) | A (27.42%, 1944) | C (20.23%, 1434) | G (22.05%, 1563) | T (30.30%, 2148)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so this targeting vector may be difficult to construct.
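The base counts in the summary line give the overall GC content directly, and a sliding window shows how a per-region profile like the plotted one could be derived. The sketch below is illustrative only: the helper names and the window step are assumptions, and the actual sequence is not reproduced here.

```python
BASE_COUNTS = {"A": 1944, "C": 1434, "G": 1563, "T": 2148}  # from the summary line


def gc_percent_from_counts(counts: dict) -> float:
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1):
    """Yield (start, GC percent) for each full window along seq."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, 100.0 * (chunk.count("G") + chunk.count("C")) / window


total_length = sum(BASE_COUNTS.values())
overall_gc = gc_percent_from_counts(BASE_COUNTS)
print(total_length, round(overall_gc, 2))
```

The counts sum to the stated 7089 bp and give an overall GC content of about 42.28%. The high-GC regions mentioned in the note are local features, so they would appear only in the windowed profile, not in this overall figure.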


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr6   -        11906359   11909358  3000
YourSeq    133   1875  2035   3000     91.4%  chrX   +         5984653    5984813   161
YourSeq    127   1884  2036   3000     88.6%  chr8   -        78778176   78778324   149
YourSeq    118   2684  2818   3000     94.1%  chr12  +         3412188    3412323   136
YourSeq    118   2684  2828   3000     91.1%  chr11  +        51761108   51761253   146
YourSeq    116   2686  2818   3000     94.0%  chr9   -       108161539  108161672   134
YourSeq    116   2684  2829   3000     92.1%  chr11  +        59954202   59954671   470
YourSeq    115   2684  2829   3000     89.8%  chr16  +        34859728   34859874   147
YourSeq    115   2684  2828   3000     93.3%  chr11  +        97644493   97644639   147
YourSeq    114   2684  2829   3000     92.0%  chr1   +        84916238   84916385   148
YourSeq    112   2684  2818   3000     91.9%  chr11  -        84839153   84839288   136
YourSeq    112   2684  2818   3000     91.9%  chr11  -        53388482   53388617   136
YourSeq    112   2685  2839   3000     92.4%  chr8   +        34987134   34987372   239
YourSeq    112   2684  2829   3000     91.8%  chr3   +        10179289   10179436   148
YourSeq    112   2684  2818   3000     91.9%  chr11  +        84552772   84552907   136
YourSeq    111   2684  2819   3000     93.1%  chr11  -        86126307   86126445   139
YourSeq    110   2684  2818   3000     91.2%  chr3   -       130982449  130982584   136
YourSeq    110   2684  2818   3000     93.6%  chr2   -        57192025   57192159   135
YourSeq    110   2689  2829   3000     89.4%  chr1   +         6830884    6831025   142
YourSeq    109   2683  2832   3000     90.4%  chr11  -        20512941   20513092   152

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr6   -        11902770   11905769  3000
YourSeq     90   1730  2127   3000     75.3%  chr15  -        73459062   73459332   271
YourSeq     72     70   564   3000     74.5%  chr8   -        78778062   78778430   369
YourSeq     72   1717  1812   3000     90.0%  chr11  +        64497865   64497991   127
YourSeq     72   1722  1890   3000     89.2%  chr1   +        64752964   64753143   180
YourSeq     71   1730  2098   3000     72.0%  chr11  +        85940486   85940650   165
YourSeq     70   1722  1812   3000     91.7%  chr2   -       174148397  174148513   117
YourSeq     70   1725  1812   3000     91.0%  chr9   +        70184776   70184892   117
YourSeq     69   1722  1812   3000     89.7%  chr3   +        94828176   94828295   120
YourSeq     64   1730  1813   3000     93.3%  chr11  +        49147983   49148091   109
YourSeq     63   1725  1812   3000     91.0%  chr5   -       146362499  146362617   119
YourSeq     63   1722  1804   3000     89.9%  chr17  -         3670998    3671109   112
YourSeq     63   1722  1813   3000     89.9%  chr6   +        87024238   87024360   123
YourSeq     63   1725  1813   3000     88.9%  chr10  +        62802042   62802161   120
YourSeq     62   1730  1813   3000     88.2%  chr10  -        93894013   93894095    83
YourSeq     59   1722  1796   3000     89.4%  chr15  -        80040090   80040164    75
YourSeq     59   1722  1796   3000     86.5%  chr1   -       187849551  187849624    74
YourSeq     59   1730  1812   3000     88.4%  chr5   +        72208541   72208623    83
YourSeq     59   1730  1804   3000     90.7%  chr2   +       113638351  113638454   104
YourSeq     59   1730  1813   3000     88.4%  chr10  +        76240138   76240223    86

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
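Rows in the BLAT tables can be parsed and screened programmatically. The sketch below reads whitespace-separated rows in the column order of the table header (QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN) and flags hits other than the full-length self match. The score and identity cutoffs are hypothetical values chosen for illustration; the report does not state the thresholds behind "no significant similarity".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one row laid out as in the table header."""
    fields = row.split()
    # Rows copied from the browser may carry leading "browser details" link text.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def off_target_hits(hits, min_score=500, min_identity=95.0):
    """Non-self hits passing both cutoffs (cutoff values are hypothetical)."""
    return [h for h in hits
            if h.score < h.q_size
            and h.score >= min_score
            and h.identity >= min_identity]


# First three rows of the upstream search, copied from the table.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr6 - 11906359 11909358 3000",
    "browser details YourSeq 133 1875 2035 3000 91.4% chrX + 5984653 5984813 161",
    "YourSeq 127 1884 2036 3000 88.6% chr8 - 78778176 78778324 149",
]
hits = [parse_hit(r) for r in rows]
flagged = off_target_hits(hits)
print(flagged)
```

Treating any hit whose score equals the query size as the self match is a simplification that works for these tables, where the only 3000-score hit is the chr6 source locus.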

Gene and protein information:

Ndufa4 Ndufa4, mitochondrial complex associated [ Mus musculus (house mouse) ]

Gene ID: 17992, updated on 12-Aug-2019

Gene summary

Official Symbol: Ndufa4 (provided by MGI)
Official Full Name: Ndufa4, mitochondrial complex associated (provided by MGI)
Primary source: MGI:MGI:107686
See related: Ensembl:ENSMUSG00000029632
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: MLRQ
Expression: Ubiquitous expression in heart adult (RPKM 799.9), kidney adult (RPKM 524.7) and 27 other tissues
Orthologs: human, all

Genomic context

Location: 6; 6 A1

Exon count: 4

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (11900372..11907450, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (11850373..11857446, complement)

Chromosome 6 - NC_000072.6


Transcript information: This gene has 5 transcripts

Gene: Ndufa4 ENSMUSG00000029632

Description: Ndufa4, mitochondrial complex associated [Source:MGI Symbol;Acc:MGI:107686]
Gene Synonyms: MLRQ
Location: Chromosome 6: 11,900,292-11,907,497 reverse strand. GRCm38:CM000999.2
About this gene: This gene has 5 transcripts (splice variants), 335 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Ndufa4-205  ENSMUST00000204978.2  610   82aa        ENSMUSP00000144932.1  Protein coding   CCDS39427  Q62425      TSL:1 GENCODE basic APPRIS P1
Ndufa4-204  ENSMUST00000204714.1  452   82aa        ENSMUSP00000145413.1  Protein coding   CCDS39427  Q62425      TSL:3 GENCODE basic APPRIS P1
Ndufa4-201  ENSMUST00000031637.7  468   49aa        ENSMUSP00000031637.6  Protein coding   -          A0A0N4SVQ1  TSL:3 GENCODE basic
Ndufa4-203  ENSMUST00000204084.2  409   49aa        ENSMUSP00000145197.1  Protein coding   -          A0A0N4SVQ1  TSL:3 GENCODE basic
Ndufa4-202  ENSMUST00000203801.1  1232  No protein  -                     Retained intron  -          -           TSL:2

[Figure: Ensembl genome browser view of a 27.21 kb region of chromosome 6 (11.895 Mb to 11.915 Mb). Forward strand: Phf14-205 (protein coding), Phf14-208 (nonsense mediated decay). Contig: AC153640.6. Reverse strand: Ndufa4-205, Ndufa4-203, Ndufa4-201 and Ndufa4-204 (protein coding) and Ndufa4-202 (retained intron). A Regulatory Build track is included. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript). Regulation legend: CTCF; promoter; promoter flank.]


Transcript: ENSMUST00000204978

[Figure: Ensembl protein view of Ndufa4-205 (protein coding; reverse strand, 7.19 kb). Tracks: protein ENSMUSP00000144... with a transmembrane helix; Pfam: NADH-ubiquinone reductase complex 1 MLRQ subunit; PANTHER: PTHR14256:SF4 (NADH-ubiquinone reductase complex 1 MLRQ subunit); sequence variants (dbSNP and all other sources). Variant legend: synonymous variant. Scale bar: amino acid positions 0 to 82.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
