https://www.alphaknockout.com

Mouse Naaa Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Naaa conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Naaa gene (NCBI Reference Sequence: NM_025972; Ensembl: ENSMUSG00000029413) is located on Mouse chromosome 5. Eleven exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 10 (Transcript: ENSMUST00000113102). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the Mouse Naaa gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-422C17 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 35.64% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 4348 bp, and the size of intron 3 for 3'-loxP site insertion is 4311 bp. The size of the effective cKO region is ~627 bp. The cKO region does not contain any other known gene.
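The frameshift statement in the note rests on a simple rule: removing a coding exon shifts the reading frame when its length is not a multiple of 3. The sketch below illustrates that rule only. The report does not give the length of exon 3, so the lengths used here are hypothetical, and the function name is an assumption made for illustration.

```python
def causes_frameshift(exon_length_bp: int) -> bool:
    """Return True if deleting a fully coding exon of this length shifts the frame.

    Assumes the whole exon lies inside the coding sequence.
    """
    return exon_length_bp % 3 != 0


# Hypothetical lengths for illustration; these are NOT values from this report.
for length in (120, 121):
    status = "frameshift" if causes_frameshift(length) else "in-frame deletion"
    print(f"{length} bp exon: {status}")
```

An exon that only partly overlaps the coding sequence would need the coding portion of its length instead of the full exon length.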


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels: Wildtype allele (exon labels 1, 3 and 11; 5' gRNA region and 3' gRNA region marked), Targeting vector, Targeted allele, Constitutive KO allele (after Cre recombination). Legend: exon of mouse Naaa, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself. Series: Forward, Reverse Complement.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
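As a rough illustration of how a self dot plot exposes tandem repeats, the sketch below records every pair of positions whose 10 bp words match exactly and then counts matches per diagonal offset. It is a simplified stand-in and not the tool used to produce the plot in this report. The function names and the toy sequence are assumptions, and reverse-complement matches are not handled.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, whose window-length words match exactly."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def repeat_offsets(dots):
    """Count dots per diagonal offset (j - i)."""
    counts = defaultdict(int)
    for i, j in dots:
        counts[j - i] += 1
    return dict(counts)


# Toy sequence (hypothetical): a 12 bp unit repeated three times plus a unique tail.
toy = "ACGTACGTTGCA" * 3 + "GGGTTTAAACCC"
offsets = repeat_offsets(self_dot_plot(toy, window=10))
print(offsets)
```

A long run of dots on a single off-diagonal offset corresponds to a line parallel to the main diagonal in the plot, which is the usual signature of a tandem repeat with that period.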

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7127bp) | A(27.14% 1934) | C(21.2% 1511) | T(28.13% 2005) | G(23.53% 1677)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
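The percentages in the Summary line can be re-derived from the base counts it reports, and the same counts give the overall GC content. The sketch below does that arithmetic and adds a simple sliding-window GC function with the 300 bp window named above. The function name and the step parameter are assumptions, and the sequence itself is not included in this report, so the windowed function is shown on a toy input only.

```python
# Base counts copied from the Summary line of this report.
counts = {"A": 1934, "C": 1511, "T": 2005, "G": 1677}
total = sum(counts.values())
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)
print(total, percent, gc_percent)


def windowed_gc(seq: str, window: int = 300, step: int = 50):
    """Return (start, GC percent) for each window; step is an assumed parameter."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100 * gc / window))
    return result


# Toy input (hypothetical): 300 bp of pure GC followed by 300 bp of pure AT.
print(windowed_gc("GC" * 150 + "AT" * 150, window=300, step=300))
```

The counts sum to the stated full length of 7127 bp, and the overall GC content works out to about 44.73%.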


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       92272821   92275820   3000
YourSeq  233    2317   2628  3000   89.8%     chr15  +       84042117   84042472   356
YourSeq  231    2314   2610  3000   91.4%     chr7   -       41565185   41565487   303
YourSeq  226    2318   2627  3000   90.1%     chr11  -       20191037   20191361   325
YourSeq  226    2315   2610  3000   91.9%     chr1   +       134452669  134453001  333
YourSeq  220    2315   2628  3000   90.5%     chr12  -       71108040   71108383   344
YourSeq  220    2314   2628  3000   86.5%     chr2   +       32114738   32115042   305
YourSeq  219    2315   2630  3000   90.1%     chr1   +       86289305   86649455   360151
YourSeq  218    2314   2628  3000   86.1%     chr14  +       66968427   66968733   307
YourSeq  217    2315   2628  3000   86.3%     chr3   +       135668447  135668748  302
YourSeq  216    2308   2628  3000   88.9%     chr14  +       73068954   73069300   347
YourSeq  215    2318   2610  3000   87.8%     chr7   +       25871087   25871381   295
YourSeq  215    2314   2610  3000   89.1%     chr2   +       60513080   60513403   324
YourSeq  214    2315   2628  3000   87.9%     chr3   +       102795801  102796148  348
YourSeq  212    2308   2610  3000   85.7%     chr3   -       36283431   36283735   305
YourSeq  212    2314   2628  3000   87.5%     chr11  -       83267529   83267839   311
YourSeq  212    2317   2628  3000   88.5%     chr16  +       93635202   93635520   319
YourSeq  211    2313   2605  3000   86.6%     chr5   -       134450219  134450509  291
YourSeq  211    2323   2628  3000   83.8%     chr3   +       95831567   95831860   294
YourSeq  210    2318   2628  3000   86.6%     chr7   -       28017675   28017977   303

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
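One way to read a BLAT table like the one above is to compare each secondary hit with the self hit by score and by the fraction of the query it covers. The sketch below parses three rows copied from the upstream table and reports those two figures. The field names, the `Hit` type and the helper functions are assumptions made for illustration, and the report does not state what cutoff it uses to call a hit significant, so no threshold is applied.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> Hit:
    """Parse one whitespace-separated BLAT row (QUERY column is skipped)."""
    f = line.split()
    return Hit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
               float(f[5].rstrip("%")), f[6], f[7],
               int(f[8]), int(f[9]), int(f[10]))


def query_coverage(hit: Hit) -> float:
    """Fraction of the query sequence covered by the hit."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream table in this report.
ROWS = """
YourSeq 3000 1 3000 3000 100.0% chr5 - 92272821 92275820 3000
YourSeq 233 2317 2628 3000 89.8% chr15 + 84042117 84042472 356
YourSeq 231 2314 2610 3000 91.4% chr7 - 41565185 41565487 303
"""

hits = sorted((parse_hit(r) for r in ROWS.strip().splitlines()),
              key=lambda h: h.score, reverse=True)
self_hit, others = hits[0], hits[1:]
for h in others:
    print(h.chrom, h.score, f"{query_coverage(h):.1%}", f"{h.identity}%")
```

In these rows the secondary hits cover only about a tenth of the 3000 bp query, while the self hit on chr5 covers all of it.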

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       92269194   92272193   3000
YourSeq  403    2043   3000  3000   92.6%     chr4   -       86884453   87074964   190512
YourSeq  253    2039   2660  3000   87.2%     chr3   -       36577982   36578322   341
YourSeq  239    2039   2663  3000   88.9%     chr5   +       123677744  123678221  478
YourSeq  236    2039   2652  3000   90.0%     chr1   -       132093804  132094398  595
YourSeq  235    2047   2394  3000   93.7%     chr6   +       34397535   34397892   358
YourSeq  234    2039   2652  3000   85.9%     chr11  +       57579606   57579936   331
YourSeq  233    2039   2660  3000   86.1%     chr9   -       66096683   66097148   466
YourSeq  233    2039   2385  3000   94.7%     chr8   +       13327327   13327847   521
YourSeq  227    2037   2663  3000   94.6%     chr4   +       149671043  149671681  639
YourSeq  220    2039   2394  3000   92.1%     chr10  -       61487658   61488295   638
YourSeq  208    2039   2639  3000   93.7%     chr3   +       127446547  127447170  624
YourSeq  205    2039   2600  3000   94.4%     chr11  +       119581970  119582606  637
YourSeq  203    2037   2656  3000   87.9%     chr15  +       100044330  100044714  385
YourSeq  201    2039   2342  3000   95.5%     chr16  +       37781919   37782555   637
YourSeq  200    2063   2623  3000   94.0%     chr9   +       120111755  120112366  612
YourSeq  198    2063   2408  3000   93.1%     chr6   +       52287982   52288464   483
YourSeq  195    1593   2209  3000   85.8%     chr19  -       10977023   10977521   499
YourSeq  192    1600   2562  3000   84.9%     chr4   -       133509692  133510168  477
YourSeq  192    2039   2322  3000   90.9%     chr2   +       167998815  167999053  239

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
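The top hits of the two BLAT searches give the chr5 coordinates of the upstream and downstream 3000 bp sections. Under the assumption that these two sections directly flank the cKO region (the report does not say so explicitly), the interval between them can be computed as below. The variable and function names are hypothetical.

```python
# Coordinates of the 100% identity hits on chr5, copied from the two BLAT tables.
# The gene is on the reverse strand, so the upstream section has the higher coordinates.
upstream = (92272821, 92275820)
downstream = (92269194, 92272193)


def gap_between(lower: tuple, higher: tuple):
    """Return (start, end, length) of the interval between two flanking sections."""
    start = lower[1] + 1
    end = higher[0] - 1
    return start, end, end - start + 1


start, end, length = gap_between(downstream, upstream)
print(f"chr5:{start}-{end} ({length} bp)")
```

The resulting length, 627 bp, agrees with the effective cKO region size of ~627 bp given in the strategy note.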


Gene and protein information:

Naaa N-acylethanolamine acid amidase [Mus musculus (house mouse)]
Gene ID: 67111, updated on 19-Oct-2019

Gene summary

Official Symbol: Naaa (provided by MGI)
Official Full Name: N-acylethanolamine acid amidase (provided by MGI)
Primary source: MGI:MGI:1914361
See related: Ensembl:ENSMUSG00000029413
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Asahl; 2210023K21Rik; 3830414F09Rik
Expression: Ubiquitous expression in mammary gland adult (RPKM 35.5), lung adult (RPKM 32.3) and 25 other tissues

Genomic context

Location: 5; 5 E2

Exon count: 11

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (92257660..92278181, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (92686686..92707207, complement)

[Figure: genomic context, Chromosome 5 - NC_000071.6]


Transcript information: This gene has 5 transcripts

Gene: Naaa ENSMUSG00000029413

Description: N-acylethanolamine acid amidase [Source:MGI Symbol;Acc:MGI:1914361]
Gene Synonyms: 2210023K21Rik, 3830414F09Rik, Asahl
Location: Chromosome 5: 92,257,659-92,278,170 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 5 transcripts (splice variants), 195 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Naaa-201  ENSMUST00000113102.9  2414  362aa       ENSMUSP00000108726.3  Protein coding   CCDS19428  Q9D7V9   TSL:1, GENCODE basic, APPRIS P3
Naaa-202  ENSMUST00000159345.7  1629  360aa       ENSMUSP00000124582.1  Protein coding   CCDS51545  G3XA18   TSL:1, GENCODE basic, APPRIS ALT2
Naaa-205  ENSMUST00000175656.1  652   217aa       ENSMUSP00000135610.1  Protein coding   -          H3BL13   CDS 5' and 3' incomplete, TSL:3
Naaa-203  ENSMUST00000159732.1  1632  No protein  -                     Retained intron  -          -        TSL:1
Naaa-204  ENSMUST00000161212.1  1519  No protein  -                     Retained intron  -          -        TSL:1

[Figure: Ensembl genome browser view of a 40.51 kb region, 92.25Mb to 92.28Mb, with forward and reverse strand axes. Tracks: Contigs (AC125542.4, AC122365.4); genes (Comprehensive set...): Ppef2-201 and Ppef2-202 (protein coding), Naaa-201, Naaa-202 and Naaa-205 (protein coding), Naaa-203 and Naaa-204 (retained intron), Sdad1-201 and Sdad1-203 (protein coding); Regulatory Build. Regulation legend: CTCF, Open Chromatin, Promoter, Promoter Flank. Gene legend: Protein Coding (merged Ensembl/Havana, Ensembl protein coding); Non-Protein Coding (processed transcript).]


Transcript: ENSMUST00000113102

[Figure: Ensembl protein summary for Naaa-201 (protein coding), reverse strand, 20.51 kb; protein ENSMUSP00000108... Tracks: Transmembrane heli...; PDB-ENSP mappings; Low complexity (Seg); Cleavage site (Sign...; Pfam (Acid ceramidase, N-terminal; Choloylglycine hydrolase/NAAA C-terminal); PIRSF (Acid ceramidase-like); PANTHER (PTHR28583:SF4, PTHR28583); Gene3D (3.60.60.10); CDD (cd01903); All sequence SNPs/i... (sequence variants, dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 362.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
