https://www.alphaknockout.com

Mouse Pjvk Knockout Project (CRISPR/Cas9)

Objective: To create a Pjvk knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pjvk gene (NCBI Reference Sequence: NM_001080711.2; Ensembl: ENSMUSG00000075267) is located on mouse chromosome 2. Six exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 6 (Transcript: ENSMUST00000099986). Exons 1~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a point mutation display increased auditory thresholds.

Exon 1 starts from about 0.09% of the coding region. Exons 1~6 cover 100.0% of the coding region. The size of the effective KO region is ~8121 bp. The KO region does not contain any other known gene.
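As a rough cross-check of the ~8121 bp figure, the size can be recomputed from the chr2 coordinates of the two 2000 bp flanking sequences listed in the BLAT Search Results of this report. The sketch below assumes the effective KO region is exactly the interval lying between those two flanks; the report does not state this explicitly, and the function name is hypothetical.

```python
def region_between(upstream_end: int, downstream_start: int) -> int:
    """Number of bases strictly between two 1-based, inclusive coordinates."""
    if downstream_start <= upstream_end:
        raise ValueError("downstream_start must lie after upstream_end")
    return downstream_start - upstream_end - 1


# Coordinates copied from the top (100.0% identity) BLAT hits in this report.
UPSTREAM_FLANK_END = 76650422      # last base of the 2000 bp upstream section
DOWNSTREAM_FLANK_START = 76658544  # first base of the 2000 bp downstream section

ko_size = region_between(UPSTREAM_FLANK_END, DOWNSTREAM_FLANK_START)
print(ko_size)  # 8121
```

Under that assumption the interval matches the stated size of the effective KO region.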


Overview of the Targeting Strategy

[Figure: Wildtype allele, drawn 5' to 3', showing exons 1 to 6 of mouse Pjvk, two gRNA regions, and the knockout region.]

Legends: Exon of mouse Pjvk; Knockout region


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: Dot plot matrix of the sequence aligned against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
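The self-alignment idea can be illustrated with a minimal sketch. It only reports exact forward-strand matches of window-length words and ignores reverse-complement matches, which is a simplification; the tool used to produce the dot plots in this report is not named, and the function name below is hypothetical.

```python
def self_dot_plot(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return sorted (i, j) pairs with i < j whose window-length words are identical.

    Each pair is an off-diagonal dot; runs of dots parallel to the main
    diagonal are the signature of a direct or tandem repeat.
    """
    seq = seq.upper()
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy example with a short window: "ACGT" repeats with a period of 4.
print(self_dot_plot("ACGTACGTAC", window=4))  # [(0, 4), (1, 5), (2, 6)]
```

An empty result for a 2000 bp flank at a 15 bp window would correspond to a dot plot with no off-diagonal signal.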

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: Dot plot matrix of the sequence aligned against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up) Window size: 300 bp


Summary: Full Length(2000bp) | A(26.45% 529) | C(19.35% 387) | T(32.25% 645) | G(21.95% 439)

Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
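The GC analysis can be sketched as follows. The overall GC fraction is recomputed from the base counts in the upstream summary line, and a sliding-window helper shows how a 300 bp window profile could be derived from a sequence. The function names and the step size are assumptions for illustration, not the report's actual tool.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC fraction of each full window along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts from the upstream summary line of this report (2000 bp in total).
counts = {"A": 529, "C": 387, "T": 645, "G": 439}
overall_gc = (counts["G"] + counts["C"]) / sum(counts.values())
print(f"{overall_gc:.2%}")  # 41.30%

# Toy sequence with a short window to show the profile shape.
print(sliding_gc("GGCCAATT", window=4, step=2))  # [1.0, 0.5, 0.0]
```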

Overview of the GC Content Distribution (down) Window size: 300 bp


Summary: Full Length(2000bp) | A(27.25% 545) | C(19.9% 398) | T(29.9% 598) | G(22.95% 459)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr2   +       76648423   76650422   2000
YourSeq  330    985     1333  2000   97.5%     chr12  -       93565619   93565968   350
YourSeq  91     359     486   2000   85.3%     chr4   +       108901760  108901881  122
YourSeq  86     378     534   2000   91.4%     chr19  -       44736396   44736597   202
YourSeq  85     359     535   2000   89.8%     chr19  -       26915396   26915603   208
YourSeq  85     359     522   2000   85.9%     chr4   +       120844157  120844324  168
YourSeq  82     359     492   2000   92.8%     chr3   -       101885189  101885545  357
YourSeq  82     359     528   2000   92.0%     chr19  -       24911607   24911827   221
YourSeq  82     359     480   2000   82.2%     chr19  -       22751246   22751353   108
YourSeq  82     353     850   2000   75.0%     chr6   +       38375485   38375669   185
YourSeq  80     372     486   2000   92.7%     chr9   -       87063312   87177679   114368
YourSeq  80     366     486   2000   83.6%     chr5   +       141551858  141551963  106
YourSeq  80     359     486   2000   80.8%     chr3   +       122463718  122463844  127
YourSeq  79     378     522   2000   91.6%     chr2   -       105458131  105458321  191
YourSeq  79     372     486   2000   81.7%     chr2   -       105449827  105449927  101
YourSeq  79     371     486   2000   83.6%     chr10  +       94303064   94303165   102
YourSeq  78     364     486   2000   86.2%     chrX   -       56699374   56699492   119
YourSeq  78     358     486   2000   81.8%     chr11  +       83907010   83907132   123
YourSeq  77     372     486   2000   86.4%     chr10  +       111449655  111449761  107
YourSeq  76     359     484   2000   81.5%     chr2   +       129654526  129654636  111

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr2   +       76658544   76660543   2000
YourSeq  427    255     1226  2000   84.3%     chr12  +       31379650   31380676   1027
YourSeq  414    276     1316  2000   81.9%     chr17  +       71239437   71240397   961
YourSeq  391    406     1300  2000   87.5%     chrX   +       159517475  159529111  11637
YourSeq  385    295     1128  2000   85.2%     chr7   +       92769535   92770420   886
YourSeq  383    363     1066  2000   87.7%     chr7   -       18261514   18262229   716
YourSeq  383    278     1306  2000   88.0%     chr2   +       70541399   70542639   1241
YourSeq  371    272     1049  2000   87.6%     chr9   +       23017386   23018240   855
YourSeq  370    281     1044  2000   85.8%     chr8   -       61960837   61961581   745
YourSeq  344    330     1056  2000   81.4%     chr7   +       39435684   39436384   701
YourSeq  338    273     956   2000   89.4%     chr1   +       128301788  128437632  135845
YourSeq  334    357     1295  2000   83.9%     chr14  -       51899137   51899997   861
YourSeq  333    271     977   2000   87.9%     chr3   +       53489909   53490696   788
YourSeq  329    274     977   2000   85.9%     chr2   +       151619624  151620404  781
YourSeq  325    266     967   2000   87.0%     chr10  +       51100913   51101653   741
YourSeq  324    322     1075  2000   86.9%     chr2   +       181242584  181243356  773
YourSeq  321    265     944   2000   86.0%     chr12  -       56278822   56279519   698
YourSeq  320    278     1308  2000   83.0%     chr3   +       57732328   57733134   807
YourSeq  312    273     1350  2000   84.0%     chr1   -       137988990  137989941  952
YourSeq  308    310     1049  2000   88.3%     chr6   -       147068049  147068843  795

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
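For readers who want to inspect the BLAT hits programmatically, the rows above can be parsed with a short sketch like the one below. The `BlatHit` class, the function names, and the query-coverage measure are assumptions made for illustration; the report does not state which criterion it uses to call a similarity significant.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity, e.g. 97.5
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one result row; any leading label text (e.g. 'YourSeq') is ignored."""
    fields = row.split()[-10:]  # the last ten columns carry the data
    if len(fields) != 10:
        raise ValueError(f"expected at least 10 columns, got {len(fields)}")
    score, q_start, q_end, q_size, identity, chrom, strand, t_start, t_end, span = fields
    return BlatHit(
        int(score), int(q_start), int(q_end), int(q_size),
        float(identity.rstrip("%")), chrom, strand,
        int(t_start), int(t_end), int(span),
    )


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query sequence covered by the aligned query interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# Second-best upstream hit from this report.
hit = parse_blat_row("YourSeq 330 985 1333 2000 97.5% chr12 - 93565619 93565968 350")
print(hit.chrom, hit.identity, round(query_coverage(hit), 4))  # chr12 97.5 0.1745
```

Taking the last ten whitespace-separated fields lets the same parser handle rows with or without leading link text such as "browser details".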


Gene and information: Pjvk pejvakin [ Mus musculus (house mouse) ] Gene ID: 381375, updated on 26-Jun-2020

Gene summary

Official Symbol: Pjvk (provided by MGI)
Official Full Name: pejvakin (provided by MGI)
Primary source: MGI:MGI:2685847
See related: Ensembl:ENSMUSG00000075267
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Dfnb59; Gm1001
Expression: Low expression observed in reference dataset
Orthologs: human, all

Genomic context

Location: 2; 2 C3
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108.20200622        current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (76650273..76658554)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (76488330..76496613)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 3 transcripts

Gene: Pjvk ENSMUSG00000075267

Description: pejvakin [Source:MGI Symbol;Acc:MGI:2685847]
Gene Synonyms: Dfnb59, LOC381375, pejvakin
Location: 76,648,476-76,658,556 forward strand. GRCm38:CM000995.2
About this gene: This gene has 3 transcripts (splice variants), 201 orthologues, 8 paralogues, is a member of 1 Ensembl protein family and is associated with 5 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Pjvk-201  ENSMUST00000099986.2  1219  352aa    ENSMUSP00000097566.2  Protein coding  CCDS38153  Q0ZLH2   TSL:1 GENCODE basic APPRIS P1
Pjvk-202  ENSMUST00000144817.7  724   190aa    ENSMUSP00000119264.1  Protein coding  -          B0R0L9   CDS 3' incomplete TSL:3
Pjvk-203  ENSMUST00000153471.1  363   73aa     ENSMUSP00000114409.1  Protein coding  -          B0R0L8   CDS 3' incomplete TSL:3

[Figure: Genome browser view of a 30.08 kb region (76.64 Mb to 76.66 Mb). Forward strand: Pjvk-203, Pjvk-202 and Pjvk-201 (protein coding). Contigs: AL928867.9. Reverse strand: Prkra-201 (protein coding), Prkra-202 (processed transcript), Fkbp7-201 (protein coding), Fkbp7-203 (processed transcript), Fkbp7-204 (nonsense mediated decay). A Regulatory Build track is also shown.]

Regulation Legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank
Gene Legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (processed transcript)


Transcript: ENSMUST00000099986

[Figure: Protein view of Pjvk-201 (protein coding; 8.28 kb, forward strand), translation ENSMUSP00000097.... Tracks: Low complexity (Seg); Pfam: Gasdermin, pore forming domain; PANTHER: PTHR16399:SF10; Gasdermin; Sequence variants (dbSNP and all other sources). Variant Legend: synonymous variant. Scale bar: 0 to 352.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
