https://www.alphaknockout.com

Mouse Pacsin1 Knockout Project (CRISPR/Cas9)

Objective: To create a Pacsin1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pacsin1 gene (NCBI Reference Sequence: NM_011861; Ensembl: ENSMUSG00000040276) is located on mouse chromosome 17. Ten exons have been identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 10 (Transcript: ENSMUST00000232437). Exons 2~10 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygotes for a gene-trapped allele show altered type I interferon responses in plasmacytoid dendritic cells. Homozygotes for a null allele show impaired synaptic vesicle formation, synaptic transmission and neuronal network activity, and develop generalized seizures with tonic-clonic convulsions.

Exon 2 starts at about 0.08% of the coding region. Exons 2~10 cover 100.0% of the coding region. The size of the effective KO region is ~6645 bp. The KO region contains no other known gene.


Overview of the Targeting Strategy

Figure: Wildtype allele (5' to 3') showing exons 1 to 10 of mouse Pacsin1, with the two gRNA regions and the knockout region marked.


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: Dot plot of the upstream sequence against itself, showing forward and reverse-complement matches.

Note: The 2000 bp region upstream of the start codon was aligned with itself to check for tandem repeats. No significant tandem repeat was found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
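The self-alignment described in the note can be sketched in Python as follows. This is a minimal illustration, not the tool used to produce the report: it assumes exact matching of 15 bp words, and the function names (revcomp, self_dot_plot) are hypothetical. Each returned pair (i, j) corresponds to one dot in the plot; a tandem repeat would show up as a run of forward dots parallel to the main diagonal.

```python
from collections import defaultdict

def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))

def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window-sized words.

    Returns two lists of (i, j) start positions:
      forward: the word at i also occurs at j (only j > i is kept, so the
               trivial main diagonal is skipped)
      reverse: the reverse complement of the word at i occurs at j
    Exact matching is an assumption of this sketch.
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1
    # Index every word by its start positions.
    index = defaultdict(list)
    for i in range(n_words):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_words):
        word = seq[i:i + window]
        for j in index[word]:
            if j > i:
                forward.append((i, j))
        for j in index.get(revcomp(word), []):
            reverse.append((i, j))
    return forward, reverse

# Toy input: a 15 bp unit repeated after a 5 bp spacer.
unit = "ACGGTCATTGCAGTA"
fwd, rev = self_dot_plot(unit + "CCCCC" + unit, window=15)
print(len(fwd), "forward matches;", len(rev), "reverse-complement matches")
```

In practice one would run this on the 2000 bp flanking sequence and look for many forward matches at a constant offset j - i, which is what a tandem repeat produces.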

Overview of the Dot Plot (down)

Window size: 15 bp

Figure: Dot plot of the downstream sequence against itself, showing forward and reverse-complement matches.

Note: The 2000 bp region downstream of the stop codon was aligned with itself to check for tandem repeats. No significant tandem repeat was found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

Figure: GC content distribution along the upstream sequence.

Summary: Full Length(2000bp) | A(21.75% 435) | C(25.6% 512) | T(21.4% 428) | G(31.25% 625)

Note: The 2000 bp region upstream of the start codon was analyzed to determine the GC content. No significant high-GC-content region was found, so this region is suitable for PCR screening or sequencing analysis.
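The GC analysis can be illustrated with a short Python sketch. It is not the tool used for the report: the function names and the step size are assumptions, and the report does not state the threshold it uses for a "high GC-content region". The base counts are copied from the upstream summary line, which gives an overall GC content of (512 + 625) / 2000 = 56.85%.

```python
def base_composition(seq):
    """Return {base: (count, percent)} for A, C, G and T."""
    seq = seq.upper()
    n = len(seq)
    return {b: (seq.count(b), 100.0 * seq.count(b) / n) for b in "ACGT"}

def gc_percent_from_counts(counts):
    """Overall GC percentage from a {base: count} dictionary."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total

def gc_windows(seq, window=300, step=1):
    """GC percentage in each sliding window as (start, gc_percent) pairs.

    window=300 matches the window size stated in the report;
    step is an assumption of this sketch.
    """
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100.0 * gc / window))
    return result

# Base counts reported for the 2000 bp upstream region.
REPORTED_UP = {"A": 435, "C": 512, "T": 428, "G": 625}
print("total bases:", sum(REPORTED_UP.values()))
print("overall GC %:", round(gc_percent_from_counts(REPORTED_UP), 2))
```

Applied to the actual flanking sequence, gc_windows would give the values plotted in the figure; windows with a very high GC percentage are the ones likely to complicate PCR.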

Overview of the GC Content Distribution (down)

Window size: 300 bp

Figure: GC content distribution along the downstream sequence.

Summary: Full Length(2000bp) | A(21.3% 426) | C(32.0% 640) | T(21.85% 437) | G(24.85% 497)

Note: The 2000 bp region downstream of the stop codon was analyzed to determine the GC content. No significant high-GC-content region was found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr17  +       27699889   27701888   2000
YourSeq  96     82      227   2000   92.9%     chr17  +       27699638   27699900   263
YourSeq  70     1       77    2000   92.0%     chr17  +       27699824   27699898   75
YourSeq  55     246     396   2000   92.2%     chr17  -       13811382   13811542   161
YourSeq  54     245     413   2000   70.6%     chr14  +       52416102   52416257   156
YourSeq  53     245     396   2000   93.6%     chr10  +       7073421    7073611    191
YourSeq  50     274     403   2000   75.6%     chr13  +       96678238   96678400   163
YourSeq  47     248     413   2000   78.6%     chr13  -       109177920  109178107  188
YourSeq  46     246     389   2000   94.4%     chr1   -       26219395   26219573   179
YourSeq  44     298     391   2000   76.2%     chr9   -       73750670   73750765   96
YourSeq  44     288     348   2000   86.9%     chr12  -       40049769   40049830   62
YourSeq  43     309     403   2000   67.1%     chr16  -       90120590   90120677   88
YourSeq  42     335     411   2000   80.9%     chr16  -       37873713   37873790   78
YourSeq  41     349     413   2000   81.6%     chr9   +       111441337  111441401  65
YourSeq  40     255     391   2000   91.7%     chr1   -       126142569  126142706  138
YourSeq  40     255     404   2000   90.0%     chr5   +       104943486  104943636  151
YourSeq  38     314     389   2000   75.0%     chr13  -       102068249  102068324  76
YourSeq  38     314     389   2000   75.0%     chr12  -       112197390  112197465  76
YourSeq  38     248     389   2000   75.4%     chr14  +       102462426  102462592  167
YourSeq  37     309     399   2000   70.4%     chr10  -       58822628   58822718   91

Note: The 2000 bp region upstream of the start codon was searched against the genome with BLAT. No significant similarity was found.
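One way to screen such a hit list is sketched below in Python. The report does not say what counts as "significant", so the cut-off used here (hit score as a fraction of the query size, min_fraction) is purely a hypothetical choice, as are the names BlatHit, parse_hit and secondary_hits. The three example rows are taken from the upstream results.

```python
from typing import NamedTuple

class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

def parse_hit(line):
    """Parse one result row; leading non-numeric tokens (e.g. 'YourSeq') are skipped."""
    fields = line.split()
    while fields and not fields[0].isdigit():
        fields.pop(0)
    return BlatHit(
        int(fields[0]), int(fields[1]), int(fields[2]), int(fields[3]),
        float(fields[4].rstrip("%")), fields[5], fields[6],
        int(fields[7]), int(fields[8]), int(fields[9]),
    )

def secondary_hits(hits, min_fraction=0.1):
    """Hits other than the full-length self match that score at least
    min_fraction * query size. The threshold is a hypothetical choice."""
    selected = []
    for h in hits:
        is_self = h.score == h.q_size and h.identity == 100.0
        if not is_self and h.score >= min_fraction * h.q_size:
            selected.append(h)
    return selected

ROWS = [
    "YourSeq 2000 1 2000 2000 100.0% chr17 + 27699889 27701888 2000",
    "YourSeq 96 82 227 2000 92.9% chr17 + 27699638 27699900 263",
    "YourSeq 70 1 77 2000 92.0% chr17 + 27699824 27699898 75",
]
hits = [parse_hit(row) for row in ROWS]
print("secondary hits above cut-off:", len(secondary_hits(hits)))
```

With this cut-off, only a second locus matching a substantial part of the 2000 bp query would be flagged; short, low-scoring matches like those listed above are ignored.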

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr17  +       27708534   27710533   2000
YourSeq  50     130     335   2000   94.7%     chr3   -       51228023   51228454   432
YourSeq  34     129     439   2000   51.8%     chr17  +       80004879   80005012   134
YourSeq  31     398     436   2000   97.0%     chr12  -       58745430   58745474   45
YourSeq  29     1159    1190  2000   96.9%     chr10  -       56110164   56110209   46
YourSeq  25     1161    1190  2000   93.2%     chr7   -       72688856   72688886   31
YourSeq  24     1154    1177  2000   100.0%    chr11  -       75340958   75340981   24
YourSeq  24     398     422   2000   100.0%    chr5   +       142699102  142699128  27

Note: The 2000 bp region downstream of the stop codon was searched against the genome with BLAT. No significant similarity was found.


Gene and protein information: Pacsin1 protein kinase C and casein kinase substrate in neurons 1 [ Mus musculus (house mouse) ] Gene ID: 23969, updated on 12-Aug-2019

Gene summary

Official Symbol: Pacsin1 (provided by MGI)
Official Full Name: protein kinase C and casein kinase substrate in neurons 1 (provided by MGI)
Primary source: MGI:MGI:1345181
See related: Ensembl:ENSMUSG00000040276
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: H74; syndapin; mKIAA1379; A830061D09Rik
Expression: Biased expression in frontal lobe adult (RPKM 48.4), cerebellum adult (RPKM 46.5) and 8 other tissues
Orthologs: human

Genomic context

Location: 17; 17 A3.3
Exon count: 11

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 17 | NC_000083.6 (27655591..27711118)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 17 | NC_000083.5 (27792627..27848051)

Figure: Genomic context of Pacsin1 on chromosome 17 (NC_000083.6).


Transcript information: This gene has 11 transcripts

Gene: Pacsin1 ENSMUSG00000040276

Description: protein kinase C and casein kinase substrate in neurons 1 [Source:MGI Symbol;Acc:MGI:1345181]
Gene Synonyms: A830061D09Rik, Syndapin I
Location: Chromosome 17: 27,655,509-27,711,482 forward strand. GRCm38:CM001010.2
About this gene: This gene has 11 transcripts (splice variants), 251 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 19 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Pacsin1-211 | ENSMUST00000232437.1 | 4566 | 441aa | ENSMUSP00000155999.1 | Protein coding | CCDS28567 | Q543Y7, Q61644 | GENCODE basic, APPRIS P1
Pacsin1-208 | ENSMUST00000231669.1 | 4209 | 441aa | ENSMUSP00000156003.1 | Protein coding | CCDS28567 | Q543Y7, Q61644 | GENCODE basic, APPRIS P1
Pacsin1-201 | ENSMUST00000045896.10 | 4176 | 441aa | ENSMUSP00000044168.3 | Protein coding | CCDS28567 | Q543Y7, Q61644 | TSL:1, GENCODE basic, APPRIS P1
Pacsin1-204 | ENSMUST00000114873.7 | 1857 | 441aa | ENSMUSP00000110523.1 | Protein coding | CCDS28567 | Q543Y7, Q61644 | TSL:1, GENCODE basic, APPRIS P1
Pacsin1-202 | ENSMUST00000097360.2 | 1800 | 441aa | ENSMUSP00000094973.2 | Protein coding | CCDS28567 | Q543Y7, Q61644 | TSL:1, GENCODE basic, APPRIS P1
Pacsin1-203 | ENSMUST00000114872.8 | 972 | 320aa | ENSMUSP00000110522.2 | Protein coding | - | A0A384DVB1 | CDS 5' incomplete, TSL:1
Pacsin1-206 | ENSMUST00000231236.1 | 672 | 109aa | ENSMUSP00000155877.1 | Protein coding | - | A0A338P6K4 | CDS 3' incomplete
Pacsin1-209 | ENSMUST00000231854.1 | 1786 | No protein | - | Retained intron | - | - | -
Pacsin1-210 | ENSMUST00000232225.1 | 697 | No protein | - | Retained intron | - | - | -
Pacsin1-207 | ENSMUST00000231350.1 | 1368 | No protein | - | lncRNA | - | - | -
Pacsin1-205 | ENSMUST00000155259.1 | 552 | No protein | - | lncRNA | - | - | TSL:2


Figure: Ensembl view of the Pacsin1 locus (75.97 kb; chromosome 17, 27.66Mb to 27.72Mb; contig AC131800.4).
Forward strand: Pacsin1-201, -202, -203, -204, -206, -208 and -211 (protein coding); Pacsin1-205 and -207 (lncRNA); Pacsin1-209 and -210 (retained intron).
Reverse strand: Spdef-201, -202, -204 and -205 (protein coding); Spdef-203 and -206 (retained intron).
Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site.
Gene legend: Protein coding (Ensembl protein coding; merged Ensembl/Havana); Non-protein coding (RNA gene; processed transcript).


Transcript: ENSMUST00000232437

Figure: Protein domain view of Pacsin1-211 (protein coding; 55.90 kb, forward strand; translation ENSMUSP00000155..., scale 0 to 441 aa).
Tracks: PDB-ENSP mappings; MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils).
Superfamily: AH/BAR domain superfamily; SH3-like domain superfamily.
SMART: FCH domain; SH3 domain.
Prints: SH3 domain.
Pfam: FCH domain; SH3 domain.
PROSITE profiles: F-BAR domain; SH3 domain.
PANTHER: PTHR23065; Protein kinase C and casein kinase substrate in neurons protein 1.
Gene3D: AH/BAR domain superfamily; 2.30.30.40.
CDD: PACSIN1, F-BAR; PACSIN1/PACSIN2, SH3 domain.
Sequence variants (dbSNP and all other sources): frameshift variant, missense variant, synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
