https://www.alphaknockout.com

Mouse Naa15 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Naa15 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Naa15 gene (NCBI Reference Sequence: NM_053089; Ensembl: ENSMUSG00000063273) is located on mouse chromosome 3. Twenty exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 20 (Transcript: ENSMUST00000029303). Exons 4~5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Naa15 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-407I18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 4 starts at about 9.44% of the coding region. Knockout of exons 4~5 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion is 3462 bp, and the size of intron 5 for 3'-loxP site insertion is 927 bp. The size of the effective cKO region is ~1465 bp. The cKO region does not contain any other known gene.
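The frameshift statement rests on a simple rule: removing exons whose combined coding length is not a multiple of 3 shifts the reading frame of everything downstream. The Python sketch below illustrates that check and also estimates where exon 4 begins in the coding sequence from the 9.44% figure. The exon lengths in the example are hypothetical placeholders, since this report does not list the actual lengths of exons 4 and 5. The coding length is assumed to be the 865 aa reported for ENSMUST00000029303 plus one stop codon.

```python
def causes_frameshift(deleted_exon_lengths):
    """Return True if deleting exons of these coding lengths (bp) shifts the frame."""
    return sum(deleted_exon_lengths) % 3 != 0


def approx_cds_offset(protein_length_aa, fraction_of_cds):
    """Estimate a coding-sequence position (bp) from a fractional offset.

    Assumption: the coding region is (protein length + 1 stop codon) * 3 bp.
    """
    cds_length_bp = (protein_length_aa + 1) * 3
    return round(cds_length_bp * fraction_of_cds)


# Hypothetical exon lengths, for illustration only (not the real Naa15 values).
example_lengths = [100, 150]
print(causes_frameshift(example_lengths))   # 250 bp is not a multiple of 3

# Approximate coding position at which exon 4 starts, from the 9.44% figure.
print(approx_cds_offset(865, 0.0944))
```

Under these assumptions exon 4 begins roughly 245 bp into the coding sequence, which is why a frameshift at this point is expected to disrupt most of the protein.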

Overview of the Targeting Strategy

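[Figure: schematic of the targeting strategy. Panels: wildtype allele (exons 1, 4, 5, 6 and 20 shown, with a gRNA region on each side of exons 4~5); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Naa15; homology arm; cKO region; loxP site.]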

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
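A self-comparison of this kind can be approximated by indexing every window of the sequence and reporting windows that occur more than once. The sketch below is a minimal illustration, not the tool used to produce the dot plot. It uses the 10 bp window size stated above, covers forward matches only (reverse-complement matches are not handled), and the function name and toy sequence are hypothetical.

```python
from collections import defaultdict


def repeated_windows(seq, window=10):
    """Map each window-length substring occurring more than once
    to the list of its 0-based start positions (forward strand only)."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


# Toy sequence: the same 10 bp unit appears twice, separated by a 3 bp spacer.
toy = "ACGTACGTAC" + "GGG" + "ACGTACGTAC"
print(repeated_windows(toy, window=10))
```

Each repeated window corresponds to an off-diagonal dot in a dot plot; runs of such dots parallel to the main diagonal indicate repeats.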

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(7965bp) | A(28.64% 2281) | C(16.28% 1297) | T(34.6% 2756) | G(20.48% 1631)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
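The overall GC content implied by the summary can be checked directly from the base counts, and a windowed profile like the one plotted can be computed with a sliding window. The sketch below assumes a plain uppercase or lowercase DNA string as input. The function names are illustrative, and the 300 bp default mirrors the window size stated above.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window of the given size, advancing by step."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts taken from the summary line of this report.
counts = {"A": 2281, "C": 1297, "T": 2756, "G": 1631}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(total, round(100 * overall_gc, 2))
```

The counts sum to the stated full length of 7965 bp and give an overall GC content of about 36.76%.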

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   +       51438657   51441656   3000
YourSeq  280    2129   2914  3000   85.4%     chr13  +       59698239   59698803   565
YourSeq  265    2128   2921  3000   86.3%     chr11  -       106447215  106447964  750
YourSeq  249    2625   2918  3000   92.9%     chr11  -       19857153   19886150   28998
YourSeq  246    2179   2918  3000   85.2%     chr18  -       42238275   42238794   520
YourSeq  239    2623   2918  3000   92.9%     chr1   -       86513304   86513600   297
YourSeq  236    2625   2918  3000   91.3%     chr18  -       62612540   62612836   297
YourSeq  236    2624   2917  3000   92.7%     chr13  -       114065687  114065978  292
YourSeq  235    2626   2918  3000   91.4%     chr4   -       58921792   58922082   291
YourSeq  235    2625   2926  3000   90.5%     chr2   +       142896398  142896703  306
YourSeq  234    2624   2918  3000   92.2%     chr8   -       42834484   42834791   308
YourSeq  233    2624   2918  3000   90.9%     chr15  +       99679926   99680220   295
YourSeq  232    2624   2918  3000   91.1%     chr8   +       11358744   11359037   294
YourSeq  230    2625   2923  3000   90.3%     chr12  +       100158117  100158413  297
YourSeq  229    2624   2918  3000   91.3%     chr5   -       44026949   44027242   294
YourSeq  229    2624   2918  3000   92.7%     chr15  -       11173372   11173677   306
YourSeq  229    2625   2918  3000   90.6%     chr3   +       32055111   32055402   292
YourSeq  229    2624   2918  3000   88.9%     chr11  +       104276922  104277210  289
YourSeq  229    2628   2918  3000   91.4%     chr11  +       53212058   53212344   287
YourSeq  228    2625   2919  3000   90.9%     chr19  +       58384319   58384611   293

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   +       51443122   51446121   3000
YourSeq  60     2228   2310  3000   92.9%     chr15  -       81818746   81818834   89
YourSeq  59     2228   2310  3000   95.4%     chr9   -       16929446   16929534   89
YourSeq  59     2191   2312  3000   90.3%     chr12  -       110592906  110593029  124
YourSeq  58     2216   2310  3000   88.2%     chr9   -       38791676   38791776   101
YourSeq  58     2182   2300  3000   92.7%     chr17  +       82887271   82887391   121
YourSeq  57     2228   2310  3000   93.9%     chr7   -       90613756   90613844   89
YourSeq  57     2228   2299  3000   92.6%     chr15  -       98957057   98957136   80
YourSeq  56     2237   2313  3000   93.8%     chr13  -       48623940   48624022   83
YourSeq  55     2228   2313  3000   92.4%     chr7   +       138888516  138888607  92
YourSeq  55     2237   2310  3000   95.1%     chr14  +       58568157   58568236   80
YourSeq  55     2237   2310  3000   93.7%     chr11  +       96136871   96136954   84
YourSeq  53     2238   2304  3000   83.9%     chr11  +       75935317   75935378   62
YourSeq  52     2237   2312  3000   90.7%     chr13  -       103477126  103477207  82
YourSeq  52     2228   2299  3000   93.4%     chr5   +       108633306  108633383  78
YourSeq  52     2228   2299  3000   93.4%     chr13  +       20004966   20005043   78
YourSeq  50     2228   2300  3000   94.7%     chr1   -       123121413  123121491  79
YourSeq  49     2228   2300  3000   94.6%     chr17  -       93867511   93867589   79
YourSeq  48     2240   2300  3000   92.9%     chr17  +       69544367   69544431   65
YourSeq  48     2268   2322  3000   94.6%     chr13  +       69350000   69350055   56

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
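The top hit in each BLAT table places the flanking 3000 bp sections on chr3, and the gap between them can be compared with the stated cKO region size. The sketch below assumes the coordinates are 1-based and inclusive as displayed, and that the two flanking sections directly abut the cKO region. The function name is illustrative.

```python
def flanked_interval(left_end, right_start):
    """Return (start, end, length) of the interval lying strictly between
    a left flank ending at left_end and a right flank starting at right_start.
    Assumes 1-based inclusive coordinates."""
    start = left_end + 1
    end = right_start - 1
    return start, end, end - start + 1


# chr3 coordinates of the 100% identity hits in the two BLAT tables.
upstream_end = 51441656      # end of the upstream 3000 bp section
downstream_start = 51443122  # start of the downstream 3000 bp section

cko_start, cko_end, cko_length = flanked_interval(upstream_end, downstream_start)
print(cko_start, cko_end, cko_length)
```

Under these assumptions the interval is chr3:51441657-51443121, a length of 1465 bp, which agrees with the effective cKO region size of ~1465 bp given in the strategy summary.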

Gene and information:

Naa15 N(alpha)-acetyltransferase 15, NatA auxiliary subunit [Mus musculus (house mouse)]
Gene ID: 74838, updated on 12-Aug-2019

Gene summary

Official Symbol: Naa15 (provided by MGI)
Official Full Name: N(alpha)-acetyltransferase 15, NatA auxiliary subunit (provided by MGI)
Primary source: MGI:MGI:1922088
See related: Ensembl:ENSMUSG00000063273
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Narg1; mNAT1; ASTBDN; Tbdn-1; 6330400I15; 5730450D16Rik
Expression: Broad expression in CNS E11.5 (RPKM 15.4), placenta adult (RPKM 10.3) and 22 other tissues

Genomic context

Location: 3; 3 C

Exon count: 20

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (51416016..51475985)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (51219938..51279907)

Chromosome 3 - NC_000069.6

Transcript information: This gene has 9 transcripts

Gene: Naa15 ENSMUSG00000063273

Description: N(alpha)-acetyltransferase 15, NatA auxiliary subunit [Source:MGI Symbol;Acc:MGI:1922088]
Gene Synonyms: 5730450D16Rik, ASTBDN, Narg1, Tbdn-1, mNAT1, tubedown
Location: Chromosome 3: 51,415,148-51,476,507 forward strand. GRCm38:CM000996.2
About this gene: This gene has 9 transcripts (splice variants), 258 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Naa15-201  ENSMUST00000029303.12  6090  865aa       ENSMUSP00000029303.7  Protein coding   CCDS17340  G3X8Y3      TSL:1, GENCODE basic, APPRIS P1
Naa15-205  ENSMUST00000193266.5   7480  815aa       ENSMUSP00000141433.1  Protein coding   -          A0A0A6YW80  TSL:1, GENCODE basic
Naa15-202  ENSMUST00000192197.1   712   237aa       ENSMUSP00000141886.1  Protein coding   -          A0A0A6YX86  CDS 5' and 3' incomplete, TSL:3
Naa15-203  ENSMUST00000192419.5   637   126aa       ENSMUSP00000141965.1  Protein coding   -          A0A0A6YXF4  CDS 3' incomplete, TSL:5
Naa15-206  ENSMUST00000193267.1   3073  No protein  -                     Retained intron  -          -           TSL:NA
Naa15-209  ENSMUST00000195430.1   2975  No protein  -                     Retained intron  -          -           TSL:NA
Naa15-208  ENSMUST00000194685.1   2319  No protein  -                     Retained intron  -          -           TSL:NA
Naa15-207  ENSMUST00000193694.1   667   No protein  -                     Retained intron  -          -           TSL:3
Naa15-204  ENSMUST00000192523.5   629   No protein  -                     lncRNA           -          -           TSL:3

[Figure: Ensembl genome browser view of an 81.36 kb region. Forward strand genes (comprehensive set): Naa15-201, Naa15-202, Naa15-203 and Naa15-205 (protein coding); Naa15-204 (lncRNA); Naa15-206, Naa15-207, Naa15-208 and Naa15-209 (retained intron); Rab33b-201 and Rab33b-202 (protein coding); Rab33b-203 (lncRNA); Gm37646-201 (TEC). Contigs: AC105966.11, AC102860.13. Reverse strand genes: Ndufc1-201 and Ndufc1-204 (protein coding); Ndufc1-202 (lncRNA); Ndufc1-203 (retained intron); Gm38160-201 (TEC). Regulatory Build legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]

Transcript: ENSMUST00000029303

[Figure: Ensembl protein summary for Naa15-201 (protein coding; ENSMUSP00000029...), spanning 59.97 kb on the forward strand. Tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily: Tetratricopeptide-like helical domain superfamily; SMART: Tetratricopeptide repeat; Pfam: N-terminal acetyltransferase A, auxiliary subunit, Tetratricopeptide repeat; PROSITE profiles: Tetratricopeptide repeat, Tetratricopeptide repeat-containing domain; PIRSF: N-terminal acetyltransferase A, auxiliary subunit; PANTHER: PTHR22767:SF6, PTHR22767; Gene3D: 1.25.40.1040, 1.25.40.1010; sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant. Scale bar: 0 to 865.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
