https://www.alphaknockout.com

Mouse Npat Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Npat conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Npat gene (NCBI Reference Sequence: NM_001081152; Ensembl: ENSMUSG00000033054) is located on mouse chromosome 9. Eighteen exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 18 (Transcript: ENSMUST00000035850). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Npat gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-407K6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygous null embryos fail to proceed from the morula to the blastocyst stage and have an uncompacted appearance.

Exon 2 starts at about 0.89% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 7979 bp, and the size of intron 2 for 3'-loxP site insertion is 3623 bp. The size of the effective cKO region is ~619 bp. The cKO region does not contain any other known gene.
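As a rough check on the 0.89% figure, the offset can be converted into base pairs. The sketch below assumes the coding region is 1420 codons plus one stop codon (the 1420 aa protein length is the one listed for Npat-201 in the transcript table); the function names are illustrative and are not part of any pipeline used for this project. The frameshift helper only encodes the general rule that removing a number of coding bases not divisible by 3 shifts the reading frame.

```python
def cds_length_bp(protein_aa: int, include_stop: bool = True) -> int:
    """Coding-sequence length in bp for a protein of the given length."""
    return 3 * (protein_aa + (1 if include_stop else 0))


def cds_offset_bp(percent: float, cds_bp: int) -> float:
    """Convert a position given as a percentage of the CDS into bp."""
    return percent / 100.0 * cds_bp


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# Assumption: 1420 aa protein (Npat-201) plus a stop codon.
cds_bp = cds_length_bp(1420)
offset = cds_offset_bp(0.89, cds_bp)
print(f"CDS length: {cds_bp} bp; exon 2 begins roughly {offset:.0f} bp into the CDS")
```

Under these assumptions only a few dozen coding bases lie upstream of exon 2, which is consistent with placing the cKO region early in the gene.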


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1, 2 and 18 shown, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Npat; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

[Figure: dot plot of Sequence 12 aligned against itself, window size 10 bp; forward and reverse-complement matches are shown.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
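The self-alignment described in the note can be illustrated with a small word-match dot plot. This is a minimal sketch, assuming exact matches of 10 bp words on both strands; the tool actually used to draw the plot is not stated in this report, and `dot_plot_matches` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def dot_plot_matches(seq: str, window: int = 10) -> List[Tuple[int, int, str]]:
    """Return (i, j, strand) for every pair of positions whose
    window-length words match exactly, on the forward ('+') or
    reverse-complement ('-') strand. The main diagonal is skipped."""
    seq = seq.upper()
    index: Dict[str, List[int]] = {}
    last = len(seq) - window + 1
    for i in range(last):
        index.setdefault(seq[i:i + window], []).append(i)

    matches: List[Tuple[int, int, str]] = []
    for i in range(last):
        word = seq[i:i + window]
        for j in index.get(word, []):
            if j != i:  # trivial self-match on the main diagonal
                matches.append((i, j, "+"))
        for j in index.get(reverse_complement(word), []):
            matches.append((i, j, "-"))
    return matches


# Toy example: a 10 bp unit repeated in tandem gives off-diagonal '+' dots.
toy = "ACGTTGCATC" * 2 + "TTTTT"
forward_dots = [(i, j) for i, j, s in dot_plot_matches(toy) if s == "+"]
print(forward_dots)
```

In a plot of this kind, tandem repeats appear as runs of off-diagonal dots parallel to the main diagonal, which is the pattern the note reports as absent.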

Overview of the GC Content Distribution

[Figure: GC content distribution along Sequence 12, window size 300 bp.]

Summary: Full length 7119 bp | A 30.40% (2164) | C 17.67% (1258) | T 33.24% (2366) | G 18.70% (1331)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
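The base counts in the summary can be cross-checked, and the windowed GC calculation illustrated, with a short sketch. It assumes a simple sliding window of fixed size (300 bp in the report); the step size and function names are illustrative choices, not details taken from the report.

```python
from typing import Dict, List


def base_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Percentage of each base given absolute counts."""
    total = sum(counts.values())
    return {base: 100.0 * n / total for base, n in counts.items()}


def gc_windows(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC percentage of each window of the given size along the sequence."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / window)
    return values


# Counts as reported in the summary line.
counts = {"A": 2164, "C": 1258, "T": 2366, "G": 1331}
pct = base_percentages(counts)
overall_gc = pct["G"] + pct["C"]
print(sum(counts.values()), {b: round(p, 2) for b, p in pct.items()})
print(f"overall GC: {overall_gc:.2f}%")
```

The four counts sum to the stated full length of 7119 bp, and the overall GC content works out to roughly 36.4%.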


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr9   +       53541919   53544918   3000
YourSeq  206    884     1387  3000   90.2%     chr17  -       56243482   56243734   253
YourSeq  201    884     1555  3000   88.2%     chr7   +       45943210   45943568   359
YourSeq  197    886     1222  3000   92.1%     chr9   -       57575333   57575547   215
YourSeq  193    884     1553  3000   87.7%     chr19  +       7219271    7219639    369
YourSeq  193    884     1078  3000   99.5%     chr19  +       6603050    6603244    195
YourSeq  192    884     1393  3000   88.8%     chr8   +       42407841   42408096   256
YourSeq  192    890     1541  3000   89.2%     chr2   +       121330136  121330460  325
YourSeq  190    884     1075  3000   99.5%     chr7   +       19604796   19604987   192
YourSeq  189    884     1074  3000   99.5%     chr16  +       18295514   18295704   191
YourSeq  189    884     1074  3000   99.5%     chr15  +       50930679   50930869   191
YourSeq  189    884     1075  3000   99.5%     chr11  +       72726893   72727086   194
YourSeq  188    884     1073  3000   99.5%     chr5   -       105047686  105047875  190
YourSeq  188    884     1073  3000   99.5%     chr19  -       45829879   45830068   190
YourSeq  188    884     1073  3000   99.5%     chr17  -       32837128   32837317   190
YourSeq  188    884     1073  3000   99.5%     chr14  -       70118097   70118286   190
YourSeq  188    884     1073  3000   99.5%     chr14  -       18587976   18588165   190
YourSeq  188    884     1073  3000   99.5%     chr10  -       80324243   80324432   190
YourSeq  188    884     1073  3000   99.5%     chr10  -       5665395    5665584    190
YourSeq  188    884     1073  3000   99.5%     chr5   +       139917509  139917698  190

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the expected full-length match on chr9, no significant similarity is found.
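Rows of a BLAT result listing like the one above can be parsed and summarised by how much of the query each hit covers. This is a minimal sketch, assuming whitespace-separated rows whose last ten fields are score, query start, query end, query size, identity, chromosome, strand, target start, target end and span; `BlatHit` and `parse_blat_row` are hypothetical names, and no significance threshold used by the vendor is implied.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query spanned by this hit (1-based, inclusive)."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_blat_row(row: str) -> BlatHit:
    """Parse one result row; leading label fields (e.g. the query name) are ignored."""
    f = row.split()[-10:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


# Three rows copied from the upstream search results.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr9 + 53541919 53544918 3000",
    "YourSeq 206 884 1387 3000 90.2% chr17 - 56243482 56243734 253",
    "YourSeq 193 884 1078 3000 99.5% chr19 + 6603050 6603244 195",
]
hits = [parse_blat_row(r) for r in rows]
full_length = [h for h in hits if h.query_coverage == 1.0 and h.identity == 100.0]
for h in hits:
    print(h.chrom, h.strand, f"coverage={h.query_coverage:.1%}", f"identity={h.identity}%")
```

Only the chr9 row covers the whole 3000 bp query at 100% identity; the other rows shown span a small fraction of it.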

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr9   +       53545538   53548537   3000
YourSeq  206    201     493   3000   88.5%     chr11  +       70826138   70826750   613
YourSeq  178    166     494   3000   89.1%     chrX   +       56813515   56813848   334
YourSeq  129    139     327   3000   86.9%     chr15  -       42264566   42264888   323
YourSeq  129    169     332   3000   90.7%     chr12  -       54751579   54751745   167
YourSeq  126    169     333   3000   91.0%     chr12  -       80126167   80126336   170
YourSeq  125    166     336   3000   88.5%     chr9   +       50580915   50581092   178
YourSeq  125    162     336   3000   87.1%     chr19  +       30402878   30403063   186
YourSeq  121    170     339   3000   87.2%     chr16  -       44131403   44131579   177
YourSeq  120    169     335   3000   87.2%     chr5   -       23466186   23466360   175
YourSeq  120    169     337   3000   86.4%     chr2   +       120180478  120180667  190
YourSeq  118    184     339   3000   90.0%     chr16  -       17266922   17267081   160
YourSeq  115    166     330   3000   85.8%     chr2   -       121488467  121488632  166
YourSeq  115    169     327   3000   88.4%     chr18  -       84687875   84688032   158
YourSeq  115    173     330   3000   91.4%     chr13  -       108032593  108032752  160
YourSeq  115    173     335   3000   87.1%     chr10  +       75268501   75268671   171
YourSeq  114    182     335   3000   90.9%     chr12  -       43656140   43656301   162
YourSeq  114    181     335   3000   88.0%     chr7   +       16977575   16977734   160
YourSeq  114    169     335   3000   85.8%     chr18  +       76340157   76340332   176
YourSeq  112    169     337   3000   86.4%     chr14  -       75755517   75755691   175

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. Apart from the expected full-length match on chr9, no significant similarity is found.
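The two full-length chr9 hits (upstream and downstream) bracket the cKO region, so its size can be cross-checked from their coordinates. A minimal sketch, assuming the BLAT target coordinates are 1-based and inclusive (as the 3000 bp span of each full-length hit suggests); the variable names are illustrative.

```python
def interval_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval."""
    return end - start + 1


# chr9 coordinates of the full-length hits in the two BLAT listings.
up_start, up_end = 53541919, 53544918      # 3000 bp upstream section
down_start, down_end = 53545538, 53548537  # 3000 bp downstream section

# The cKO region is whatever lies between the two flanking sections.
cko_start = up_end + 1
cko_end = down_start - 1
cko_len = interval_length(cko_start, cko_end)

print(f"cKO region: chr9:{cko_start}-{cko_end} ({cko_len} bp)")
```

The gap between the two sections is 619 bp, which agrees with the effective cKO region size of ~619 bp stated in the strategy summary.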


Gene and information:

Npat nuclear protein in the AT region [Mus musculus (house mouse)]
Gene ID: 244879, updated on 12-Aug-2019

Gene summary

Official Symbol: Npat (provided by MGI)
Official Full Name: nuclear protein in the AT region (provided by MGI)
Primary source: MGI:MGI:107605
See related: Ensembl:ENSMUSG00000033054
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI427561; BB112559; 6820401K01
Expression: Ubiquitous expression in CNS E11.5 (RPKM 4.3), placenta adult (RPKM 4.3) and 26 other tissues
Orthologs: human

Genomic context

Location: 9 A5.3; 9 29.12 cM

Exon count: 20

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  9    NC_000075.6 (53536889..53575627)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    9    NC_000075.5 (53345152..53383732)

[Figure: genomic context of Npat on chromosome 9 (NC_000075.6).]


Transcript information: This gene has 2 transcripts

Gene: Npat ENSMUSG00000033054

Description: nuclear protein in the AT region [Source:MGI Symbol;Acc:MGI:107605]
Location: Chromosome 9: 53,537,047-53,574,342 forward strand. GRCm38:CM001002.2
About this gene: This gene has 2 transcripts (splice variants), 192 orthologues, is a member of 1 Ensembl protein family and is associated with 30 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Npat-201  ENSMUST00000035850.7  6062  1420aa      ENSMUSP00000048709.7  Protein coding   CCDS40637  Q8BMA5   TSL:1 GENCODE basic APPRIS P1
Npat-202  ENSMUST00000148336.1  554   No protein  -                     Retained intron  -          -        TSL:3

[Figure: Ensembl genome browser view of the Npat locus, 53.53 Mb to 53.58 Mb (57.30 kb; forward and reverse strands). Tracks: transcripts Npat-201 (protein coding) and Npat-202 (retained intron); contigs AC079869.22 and AC156640.2; neighbouring genes Atm (Atm-201, Atm-205 and Atm-206 protein coding; Atm-204 retained intron) and Acat1 (Acat1-201 protein coding; Acat1-203 lncRNA); Regulatory Build. Regulation legend: CTCF; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000035850

[Figure: protein-domain view of Npat-201 (protein coding; 37.30 kb, forward strand), translation ENSMUSP00000048... Tracks: MobiDB lite; low complexity (Seg); SMART (LIS1 homology motif); Pfam (Protein NPAT, C-terminal domain); PROSITE profiles (LIS1 homology motif); PANTHER (PTHR15087:SF14, PTHR15087); sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant. Scale bar: 0 to 1420.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
