https://www.alphaknockout.com

Mouse Pdzk1ip1 Knockout Project (CRISPR/Cas9)

Objective: To create a Pdzk1ip1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pdzk1ip1 gene (NCBI Reference Sequence: NM_001164557; Ensembl: ENSMUSG00000028716) is located on mouse chromosome 4. Four exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000171877). Exons 1~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 starts from about 0.18% of the coding region. Exons 1~4 cover 100.0% of the coding region. The size of the effective KO region is ~4601 bp. The KO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing exons 1, 2, 3 and 4 of mouse Pdzk1ip1 with two gRNA regions marked. Legend: exon of mouse Pdzk1ip1; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of the start codon is aligned with itself to detect tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
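The self-alignment described in the note can be sketched in a few lines of Python. This is a minimal illustration, not the tool used to produce the report: the names `reverse_complement` and `dot_plot_matches` are hypothetical, and exact matching of fixed-length words is an assumption (the report states only the window size of 15 bp).

```python
from typing import Dict, List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def dot_plot_matches(
    seq: str, window: int = 15
) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Compare a sequence with itself using exact words of length `window`.

    Returns two lists of (i, j) start positions (0-based):
    forward matches off the main diagonal, and reverse-complement matches.
    """
    seq = seq.upper()
    # Index every word by its start positions.
    index: Dict[str, List[int]] = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward: List[Tuple[int, int]] = []
    reverse: List[Tuple[int, int]] = []
    for i in range(len(seq) - window + 1):
        word = seq[i:i + window]
        # Same word elsewhere in the sequence: a direct repeat.
        forward.extend((i, j) for j in index.get(word, []) if j != i)
        # Reverse complement of the word present: an inverted repeat.
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


if __name__ == "__main__":
    # Toy sequence with the word ACGGTC at positions 0 and 10.
    fwd, rev = dot_plot_matches("ACGGTCATTTACGGTC", window=6)
    print(fwd)  # [(0, 10), (10, 0)]
    print(rev)  # []
```

In such a plot, runs of forward matches on a diagonal offset from the main diagonal indicate direct (including tandem) repeats, while reverse-complement matches indicate inverted repeats.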

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of the stop codon is aligned with itself to detect tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(23.65% 473) | C(24.7% 494) | T(25.4% 508) | G(26.25% 525)

Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(22.15% 443) | C(24.55% 491) | T(29.1% 582) | G(24.2% 484)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
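The base counts in the two summaries can be turned into overall GC percentages, and a sliding-window profile can be computed along the same lines. The sketch below is illustrative only: the function names are hypothetical, the `step` parameter is an assumption (the report states only the 300 bp window size), and the flanking sequences themselves are not reproduced in this report, so only the published counts are used.

```python
from typing import Dict, List

# Base counts copied from the two "Summary" lines of this report.
UPSTREAM_COUNTS = {"A": 473, "C": 494, "T": 508, "G": 525}
DOWNSTREAM_COUNTS = {"A": 443, "C": 491, "T": 582, "G": 484}


def gc_percent_from_counts(counts: Dict[str, int]) -> float:
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def gc_windows(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC percentage of each window of length `window`, moved by `step` bases."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / window)
    return values


if __name__ == "__main__":
    print(round(gc_percent_from_counts(UPSTREAM_COUNTS), 2))    # 50.95
    print(round(gc_percent_from_counts(DOWNSTREAM_COUNTS), 2))  # 48.75
    # Toy sequence; a real profile would use the 2000 bp flank and window=300.
    print(gc_windows("GGCCAATT", window=4, step=2))  # [100.0, 50.0, 0.0]
```

Both flanks come out close to 50% GC overall, which is consistent with the notes that no region of significantly high GC content is found.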

BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-------------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr4   +       115086946  115088945  2000
YourSeq  150    152     1364  2000   89.1%     chr4   -       115087097  115088309  1213
YourSeq  31     460     495   2000   97.2%     chr1   +       48602393   48602446   54
YourSeq  27     469     496   2000   100.0%    chrX   -       107347226  107347258  33
YourSeq  27     469     496   2000   100.0%    chr6   -       40924240   40924272   33
YourSeq  27     469     496   2000   100.0%    chr6   -       8072895    8072927    33
YourSeq  27     469     496   2000   100.0%    chrX   +       114789744  114789776  33
YourSeq  27     469     496   2000   100.0%    chrX   +       106785845  106785877  33
YourSeq  27     469     496   2000   100.0%    chr10  +       48561455   48561487   33
YourSeq  25     470     497   2000   96.3%     chr1   +       37937434   37937467   34
YourSeq  23     474     496   2000   100.0%    chr4   +       94233231   94233253   23
YourSeq  23     471     496   2000   96.0%     chr1   +       47553941   47553972   32
YourSeq  22     475     496   2000   100.0%    chr4   -       18214951   18214972   22
YourSeq  22     859     884   2000   92.4%     chr1   -       19418119   19418144   26
YourSeq  22     475     496   2000   100.0%    chr9   +       88426145   88426166   22
YourSeq  22     475     496   2000   100.0%    chr10  +       106165151  106165172  22
YourSeq  21     476     496   2000   100.0%    chr6   +       41723276   41723296   21
YourSeq  21     1088    1108  2000   100.0%    chr1   +       72353967   72353987   21
YourSeq  20     475     494   2000   100.0%    chr1   -       49096359   49096378   20

Note: The 2000 bp section upstream of the start codon is searched against the genome with BLAT. No significant similarity is found.
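One way to read a BLAT result list like the one above is to separate hits that overlap the query's own locus from hits elsewhere in the genome. The sketch below is an assumption about how such a check could be done, not the procedure used for this report: the `BlatHit` class and the helper functions are hypothetical, and the report does not state what score counts as "significant". The four sample rows are copied from the upstream table.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_line(line: str) -> BlatHit:
    """Parse one whitespace-separated result row into a BlatHit."""
    fields = line.split()
    # Raw UCSC rows start with "browser details <query>"; cleaned rows
    # start with the query name only. Drop either prefix.
    if fields[:2] == ["browser", "details"]:
        fields = fields[3:]
    else:
        fields = fields[1:]
    return BlatHit(
        score=int(fields[0]), q_start=int(fields[1]), q_end=int(fields[2]),
        q_size=int(fields[3]), identity=float(fields[4].rstrip("%")),
        chrom=fields[5], strand=fields[6], t_start=int(fields[7]),
        t_end=int(fields[8]), span=int(fields[9]),
    )


def overlaps(a: BlatHit, b: BlatHit) -> bool:
    """True if two hits share a chromosome and their genomic intervals overlap."""
    return a.chrom == b.chrom and a.t_start <= b.t_end and b.t_start <= a.t_end


def off_locus_hits(hits: List[BlatHit]) -> List[BlatHit]:
    """Hits that do not overlap the top-scoring hit (taken as the query's own locus)."""
    best = max(hits, key=lambda h: h.score)
    return [h for h in hits if not overlaps(h, best)]


ROWS = """\
YourSeq 2000 1 2000 2000 100.0% chr4 + 115086946 115088945 2000
YourSeq 150 152 1364 2000 89.1% chr4 - 115087097 115088309 1213
YourSeq 31 460 495 2000 97.2% chr1 + 48602393 48602446 54
YourSeq 27 469 496 2000 100.0% chrX - 107347226 107347258 33
"""

if __name__ == "__main__":
    hits = [parse_blat_line(row) for row in ROWS.splitlines()]
    for hit in off_locus_hits(hits):
        print(hit.chrom, hit.score, hit.span)
```

In these sample rows, the second hit (score 150) lies inside the query's own chr4 interval on the opposite strand, so it is grouped with the self-match; the remaining hits elsewhere in the genome are short and low-scoring.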

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-------------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr4   +       115093547  115095546  2000
YourSeq  98     518     716   2000   98.1%     chr1   +       4811564    4811941    378
YourSeq  74     530     603   2000   100.0%    chr2   +       147510991  147511064  74
YourSeq  31     104     155   2000   94.5%     chr9   -       54817724   54817776   53
YourSeq  25     153     182   2000   96.3%     chr2   +       37523350   37523381   32
YourSeq  25     1686    1715  2000   85.2%     chr1   +       3700715    3700742    28
YourSeq  23     1692    1715  2000   100.0%    chrX   -       91435882   91435907   26
YourSeq  23     1692    1715  2000   100.0%    chr17  -       74168814   74168839   26
YourSeq  23     1686    1708  2000   100.0%    chr14  +       105877061  105877083  23
YourSeq  22     1613    1634  2000   100.0%    chr10  +       60746810   60746831   22
YourSeq  21     798     818   2000   100.0%    chr12  +       41927806   41927826   21
YourSeq  20     1099    1118  2000   100.0%    chr1   +       13780031   13780050   20

Note: The 2000 bp section downstream of the stop codon is searched against the genome with BLAT. No significant similarity is found.
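The top BLAT hits for the two flanks give their genomic coordinates, which allows a quick consistency check against the stated KO region size of ~4601 bp. The sketch below rests on assumptions that are not spelled out in the report: that the coordinates are 1-based and inclusive, and that the upstream flank ends immediately before the start codon while the downstream flank begins immediately after the stop codon, as the notes suggest. The names are hypothetical.

```python
# Genomic intervals of the self-matching (score 2000) BLAT hits on chr4,
# copied from the two result tables in this report.
UPSTREAM_FLANK = (115086946, 115088945)
DOWNSTREAM_FLANK = (115093547, 115095546)


def interval_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def region_between(left: tuple, right: tuple) -> tuple:
    """Interval lying strictly between two non-overlapping intervals."""
    return (left[1] + 1, right[0] - 1)


if __name__ == "__main__":
    print(interval_length(*UPSTREAM_FLANK))    # 2000
    print(interval_length(*DOWNSTREAM_FLANK))  # 2000
    start, end = region_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK)
    print(start, end)                   # 115088946 115093546
    print(interval_length(start, end))  # 4601
```

Under these assumptions, the interval between the two flanks spans 4601 bp, which agrees with the effective KO region size given in the strategy summary.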

Gene and information:

Pdzk1ip1 PDZK1 interacting protein 1 [Mus musculus (house mouse)]
Gene ID: 67182, updated on 21-Oct-2019

Gene summary

Official Symbol: Pdzk1ip1 (provided by MGI)
Official Full Name: PDZK1 interacting protein 1 (provided by MGI)
Primary source: MGI:MGI:1914432
See related: Ensembl:ENSMUSG00000028716
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Map17; AU046042; 0610007F13Rik; 2700030M23Rik
Expression: Biased expression in kidney adult (RPKM 657.0), colon adult (RPKM 64.0) and 3 other tissues
Orthologs: human

Genomic context

Location: 4; 4 D1
Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (115088708..115093894)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (114761313..114766499)

[Figure: genomic context, Chromosome 4 - NC_000070.6]

Transcript information: This gene has 8 transcripts

Gene: Pdzk1ip1 ENSMUSG00000028716

Description: PDZK1 interacting protein 1 [Source:MGI Symbol;Acc:MGI:1914432]
Gene Synonyms: 0610007F13Rik, 2700030M23Rik, Map17
Location: Chromosome 4: 115,088,708-115,093,899 forward strand. GRCm38:CM000997.2
About this gene: This gene has 8 transcripts (splice variants), 123 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name          Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Pdzk1ip1-207  ENSMUST00000171877.7  1129  181aa       ENSMUSP00000128118.1  Protein coding  CCDS51272  G3UW41   TSL:1, GENCODE basic, APPRIS ALT2
Pdzk1ip1-201  ENSMUST00000030488.2  1119  114aa       ENSMUSP00000030488.2  Protein coding  CCDS18487  Q9CQH0   TSL:1, GENCODE basic, APPRIS P3
Pdzk1ip1-202  ENSMUST00000106548.8  887   114aa       ENSMUSP00000102158.2  Protein coding  CCDS18487  Q9CQH0   TSL:1, GENCODE basic, APPRIS P3
Pdzk1ip1-208  ENSMUST00000177647.7  826   114aa       ENSMUSP00000136049.1  Protein coding  CCDS18487  Q9CQH0   TSL:3, GENCODE basic, APPRIS P3
Pdzk1ip1-205  ENSMUST00000146578.1  804   No protein  -                     lncRNA          -          -        TSL:1
Pdzk1ip1-204  ENSMUST00000139710.7  670   No protein  -                     lncRNA          -          -        TSL:2
Pdzk1ip1-203  ENSMUST00000124491.1  375   No protein  -                     lncRNA          -          -        TSL:2
Pdzk1ip1-206  ENSMUST00000149245.1  361   No protein  -                     lncRNA          -          -        TSL:3

[Figure: Ensembl gene view of a 25.19 kb region on chromosome 4 (115.08 Mb to 115.10 Mb), forward strand. Protein coding transcripts: Pdzk1ip1-208, Pdzk1ip1-207, Pdzk1ip1-201, Pdzk1ip1-202. lncRNA transcripts: Pdzk1ip1-203, Pdzk1ip1-204, Pdzk1ip1-206, Pdzk1ip1-205. Contigs: AL670035.15, AL645473.12. Regulatory Build legend: CTCF; Open Chromatin; Promoter; Promoter Flank. Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene).]

Transcript: ENSMUST00000171877

[Figure: Ensembl protein view of Pdzk1ip1-207 (protein coding), 5.19 kb on the forward strand. Tracks: transmembrane helices; low complexity (Seg); Pfam: PDZK1-interacting protein 1/SMIM24; PANTHER: PDZK1-interacting protein 1, PDZK1-interacting protein 1/SMIM24; sequence variants (dbSNP and all other sources). Variant legend: frameshift variant; missense variant; synonymous variant. Scale bar: 0 to 181 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
