https://www.alphaknockout.com

Mouse Nasp Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Nasp conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Nasp gene (NCBI Reference Sequence: NM_016777; Ensembl: ENSMUSG00000028693) is located on mouse chromosome 4. Sixteen exons have been identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 16 (Transcript: ENSMUST00000030456). Exons 6 to 7 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Nasp gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP24-72F14 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a null mutation display embryonic lethality before implantation.

Exon 6 starts at about 12.94% of the coding region. Knocking out exons 6 to 7 will cause a frameshift of the gene. Intron 5, used for 5'-loxP site insertion, is 2248 bp; intron 7, used for 3'-loxP site insertion, is 4639 bp. The effective cKO region is ~2179 bp and does not contain any other known gene.
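The position and frameshift statements above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the coding length is taken from the 773 aa Nasp-201 protein listed in the transcript table (stop codon excluded), and the function names are hypothetical. The report does not list individual exon lengths, so the frameshift helper is shown with made-up lengths rather than the real sizes of exons 6 and 7.

```python
def cds_length_bp(protein_aa: int) -> int:
    """Coding length in bp for a protein of the given length (stop codon excluded)."""
    return protein_aa * 3


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# 773 aa is the Nasp-201 protein length given in the transcript table.
cds_bp = cds_length_bp(773)

# Approximate coding position at which exon 6 begins (12.94% of the coding region).
exon6_offset_bp = round(0.1294 * cds_bp)

# Hypothetical deletion lengths, for illustration only.
example_in_frame = causes_frameshift(99)
example_out_of_frame = causes_frameshift(100)
```

Under these assumptions exon 6 begins roughly 300 bp into the coding sequence, so a frameshift from that point would affect most of the protein.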


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Tracks: wildtype allele (5' to 3', exons 1, 5, 6, 7 and 16 labelled, two gRNA regions marked); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Nasp; homology arm; cKO region; loxP site.]
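The step from the targeted allele to the constitutive KO allele can be illustrated with a toy model of Cre recombination: two loxP sites in the same orientation are resolved into a single loxP site, and the sequence between them is removed. This is a minimal sketch, not the vendor's method; the function name and the flanking toy sequences are hypothetical, and only the canonical 34 bp loxP sequence is taken as given.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_recombine(allele: str, loxp: str = LOXP) -> str:
    """Excise the sequence between the first two same-orientation loxP sites.

    One loxP site remains after excision. The allele is returned unchanged
    when fewer than two sites are found.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything up to and including the first site, then resume after the second.
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Toy floxed allele: "CCCCGGGG" stands in for the cKO region (hypothetical sequence).
floxed = "AAAA" + LOXP + "CCCCGGGG" + LOXP + "TTTT"
recombined = cre_recombine(floxed)
```

In the real allele the segment between the loxP sites is the ~2179 bp cKO region containing exons 6 and 7.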


Overview of the Dot Plot

[Figure: self-alignment dot plot. Window size: 10 bp. Matches shown: forward and reverse complement. Axis label: Sequence 12.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
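A self-alignment of this kind can be approximated by indexing every window-length word in the sequence and reporting positions that share the same word. The sketch below is a simplified stand-in for the dot plot tool used in the report, which is not named: it assumes exact matches only, covers the forward strand only, and uses a hypothetical function name and a toy sequence.

```python
from collections import defaultdict


def self_dot_matches(seq: str, window: int = 10) -> list[tuple[int, int]]:
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical.

    These are the off-diagonal dots of a forward-strand self dot plot.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    hits = []
    for idxs in positions.values():
        for a in range(len(idxs)):
            for b in range(a + 1, len(idxs)):
                hits.append((idxs[a], idxs[b]))
    return sorted(hits)


# Toy sequence: a 10 bp unit repeated once with a 5 bp spacer in between.
unit = "ACGTTGCAAG"
toy = unit + "TTTTT" + unit
toy_hits = self_dot_matches(toy, window=10)
```

A long diagonal run of such hits at a short, constant offset would indicate a tandem repeat; isolated hits, as in the toy case, would not.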

Overview of the GC Content Distribution

[Figure: GC content distribution plot. Window size: 300 bp. Axis label: Sequence 12.]

Summary: Full Length(8679bp) | A(28.15% 2443) | C(17.94% 1557) | T(33.4% 2899) | G(20.51% 1780)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
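The overall GC content follows directly from the base counts in the summary line, and a windowed profile can be computed the same way. The sketch below uses the reported counts; the `windowed_gc` helper, its `step` parameter and the toy sequence are assumptions for illustration, since the report states only the 300 bp window size.

```python
# Base counts from the summary line (full length 8679 bp).
counts = {"A": 2443, "C": 1557, "T": 2899, "G": 1780}

total_bp = sum(counts.values())
gc_percent = (counts["G"] + counts["C"]) / total_bp * 100


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage for each window of the given size, moving by `step` bases."""
    seq = seq.upper()
    result = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        result.append((chunk.count("G") + chunk.count("C")) / window * 100)
    return result


# Toy sequence: 300 bp of pure GC followed by 300 bp of pure AT.
toy_profile = windowed_gc("GC" * 150 + "AT" * 150, window=300, step=300)
```

The counts add up to the stated full length, and the overall GC content works out to about 38.45%.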


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM                 STRAND  START      END        SPAN
-------------------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr4                  -       116612332  116615331  3000
YourSeq  311    240    689   3000   88.2%     chrX                  -       70317449   70317985   537
YourSeq  274    206    699   3000   87.8%     chr2                  +       44640559   44640958   400
YourSeq  189    205    648   3000   89.9%     chrX                  +       143072794  143073137  344
YourSeq  187    284    523   3000   90.5%     chrX                  -       70317628   70317864   237
YourSeq  179    90     474   3000   93.5%     chr16                 +       65254793   65255420   628
YourSeq  145    287    544   3000   85.9%     chr9                  -       59178961   59179203   243
YourSeq  140    295    474   3000   95.0%     chr16                 +       65255056   65255376   321
YourSeq  135    394    689   3000   94.9%     chr17                 +       53619309   53619787   479
YourSeq  135    287    450   3000   91.3%     chr17                 +       48377963   48378116   154
YourSeq  130    1385   1536  3000   92.3%     chr13                 +       24440932   24441078   147
YourSeq  129    351    625   3000   91.1%     chr1                  -       85449904   85450267   364
YourSeq  128    1395   1538  3000   92.2%     chr8                  -       70132189   70132329   141
YourSeq  128    1385   1534  3000   90.8%     chr5_JH584297_random  -       84311      84452      142
YourSeq  128    1385   1534  3000   90.8%     chr5_JH584296_random  -       17324      17465      142
YourSeq  128    1395   1540  3000   92.5%     chr15                 -       80142618   80142762   145
YourSeq  127    1386   1534  3000   90.7%     chr5_GL456354_random  -       59852      59992      141
YourSeq  126    1393   1540  3000   94.4%     chr4                  -       125280011  125280158  148
YourSeq  126    274    464   3000   92.0%     chr11                 -       23523246   23523512   267
YourSeq  124    1395   1538  3000   90.9%     chr13                 +       59134160   59134301   142

Note: The 3000 bp section upstream of Exon 6 is BLAT searched against the genome. Apart from the expected self-match on chr4, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr4   -       116607153  116610152  3000
YourSeq  189    2096   2836  3000   87.7%     chr1   -       75161215   75161737   523
YourSeq  170    2298   2942  3000   91.4%     chr18  -       56546438   56671521   125084
YourSeq  165    2096   2415  3000   92.4%     chr2   +       180295012  180295609  598
YourSeq  162    2097   2423  3000   91.5%     chr16  -       4990170    4990505    336
YourSeq  155    2106   2922  3000   82.9%     chr10  +       62256149   62256737   589
YourSeq  147    2292   2908  3000   84.6%     chr11  -       97073439   97073717   279
YourSeq  145    2378   2942  3000   91.5%     chr11  +       107504917  107505551  635
YourSeq  139    2082   2423  3000   85.1%     chr8   -       117073861  117074126  266
YourSeq  136    2267   2424  3000   91.4%     chr6   +       90050756   90050908   153
YourSeq  136    2096   2419  3000   85.4%     chr12  +       83607062   83607232   171
YourSeq  133    2179   2424  3000   94.7%     chr4   -       107242619  107243145  527
YourSeq  133    2096   2423  3000   85.2%     chr4   +       129767121  129767292  172
YourSeq  133    2291   2469  3000   93.5%     chr12  +       83572439   83572653   215
YourSeq  133    2270   2423  3000   95.3%     chr10  +       111491230  111491393  164
YourSeq  132    2102   2454  3000   83.8%     chr4   -       84513550   84513807   258
YourSeq  132    2290   2443  3000   95.9%     chr13  +       96100334   96100488   155
YourSeq  131    2222   2415  3000   94.6%     chr10  +       79810878   79811463   586
YourSeq  131    2285   2424  3000   97.2%     chr1   +       38506039   38506181   143
YourSeq  130    2290   2792  3000   81.4%     chr1   -       179488903  179489060  158

Note: The 3000 bp section downstream of Exon 7 is BLAT searched against the genome. Apart from the expected self-match on chr4, no significant similarity is found.
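The two 100% self-matches on chr4 can be used as a consistency check on the design. Nasp lies on the reverse strand, so the upstream query maps to the higher genomic coordinates. The sketch below assumes that the two 3000 bp queries directly flank the cKO region; the variable and function names are hypothetical.

```python
# chr4 coordinates (1-based, inclusive) of the two 100% self-matches in the BLAT tables.
upstream_hit = (116612332, 116615331)    # 3000 bp upstream of exon 6
downstream_hit = (116607153, 116610152)  # 3000 bp downstream of exon 7


def span_bp(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


# Bases lying strictly between the two flanks.
gap_bp = upstream_hit[0] - downstream_hit[1] - 1
```

Under this assumption the gap is 2179 bp, which agrees with the stated size of the effective cKO region.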


Gene and information:

Nasp nuclear autoantigenic sperm protein (histone-binding) [Mus musculus (house mouse)]
Gene ID: 50927, updated on 12-Aug-2019

Gene summary

Official Symbol: Nasp (provided by MGI)
Official Full Name: nuclear autoantigenic sperm protein (histone-binding) (provided by MGI)
Primary source: MGI:MGI:1355328
See related: Ensembl:ENSMUSG00000028693
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Epcs32; Nasp-T; AI131596; AI317140; D4Ertd767e; 5033430J04Rik
Expression: Biased expression in CNS E11.5 (RPKM 97.7), testis adult (RPKM 45.2) and 8 other tissues

Genomic context

Location: 4 D1; 4 53.24 cM

Exon count: 16

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (116601052..116628676, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (116273657..116300556, complement)

Chromosome 4 - NC_000070.6


Transcript information: This gene has 8 transcripts

Gene: Nasp ENSMUSG00000028693

Description: nuclear autoantigenic sperm protein (histone-binding) [Source:MGI Symbol;Acc:MGI:1355328]
Gene Synonyms: 5033430J04Rik, D4Ertd767e, Epcs32, Nasp-T
Location: Chromosome 4: 116,601,052-116,627,941 reverse strand. GRCm38:CM000997.2
About this gene: This gene has 8 transcripts (splice variants), 266 orthologues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Nasp-201  ENSMUST00000030456.13  2520  773aa       ENSMUSP00000030456.7  Protein coding  CCDS18513  B1AU75   TSL:1 GENCODE basic
Nasp-202  ENSMUST00000030457.11  2118  448aa       ENSMUSP00000030457.5  Protein coding  CCDS38852  B1AU76   TSL:1 GENCODE basic APPRIS P1
Nasp-203  ENSMUST00000081182.4   1461  421aa       ENSMUSP00000079946.4  Protein coding  CCDS71451  Q99MD9   TSL:1 GENCODE basic
Nasp-205  ENSMUST00000134038.1   3153  No protein  -                     lncRNA          -          -        TSL:1
Nasp-208  ENSMUST00000155398.7   1216  No protein  -                     lncRNA          -          -        TSL:2
Nasp-207  ENSMUST00000154811.7   1017  No protein  -                     lncRNA          -          -        TSL:2
Nasp-204  ENSMUST00000130363.1   905   No protein  -                     lncRNA          -          -        TSL:5
Nasp-206  ENSMUST00000148436.1   560   No protein  -                     lncRNA          -          -        TSL:2


[Figure: Ensembl genome browser view of the Nasp locus, 116.60 Mb to 116.63 Mb (46.89 kb). Forward strand: Gpbp1l1-201 and Gpbp1l1-202 (protein coding); Ccdc17-201 (protein coding); Ccdc17-202, Ccdc17-203, Ccdc17-204 and Ccdc17-205 (lncRNA). Contigs: AL669953.7, AL831786.7. Reverse strand: C530005A16Rik-201 (lncRNA); Nasp-201, Nasp-202 and Nasp-203 (protein coding); Nasp-204, Nasp-205, Nasp-206, Nasp-207 and Nasp-208 (lncRNA); Akr1a1-201 (protein coding); Akr1a1-202 and Akr1a1-203 (lncRNA). Regulatory Build legend: CTCF; open chromatin; promoter; promoter flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene).]


Transcript: ENSMUST00000030456

[Figure: Nasp-201 (protein coding), reverse strand, 26.33 kb, with protein feature tracks for ENSMUSP00000030...: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily: Tetratricopeptide-like helical domain superfamily; SMART: Tetratricopeptide repeat; Pfam: Tetratricopeptide, SHNi-TPR domain; PROSITE profiles: Tetratricopeptide repeat-containing domain, Tetratricopeptide repeat; PANTHER: PTHR15081; Gene3D: Tetratricopeptide-like helical domain superfamily. Variant track: sequence variants (dbSNP and all other sources). Variant legend: stop gained; missense variant; synonymous variant. Scale bar: 0 to 773 (amino acids).]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
