https://www.alphaknockout.com

Mouse Cpeb1 Knockout Project (CRISPR/Cas9)

Objective: To create a Cpeb1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cpeb1 gene (NCBI Reference Sequence: NM_001252525; Ensembl: ENSMUSG00000025586) is located on mouse chromosome 7. Twelve exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000178892). Exon 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a null allele are viable and overtly normal but display a developmental arrest of both female and male germ cells at the pachytene stage, defective synaptonemal complex formation, and impaired neuronal synaptic plasticity.

Exon 2 starts at about 0.95% of the coding region and covers 10.38% of it. The size of the effective KO region is ~175 bp. The KO region does not contain any other known gene.
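The percentages above can be cross-checked with a little arithmetic. The sketch below assumes the coding region is 562 codons x 3 = 1686 bp, taken from the 562 aa protein length listed for ENSMUST00000178892 with the stop codon excluded. That length and the helper names are assumptions for illustration, not values stated in this report.

```python
# Assumption: CDS length derived from the 562 aa protein, stop codon excluded.
PROTEIN_LENGTH_AA = 562
CDS_LENGTH_BP = PROTEIN_LENGTH_AA * 3  # 1686 bp
KO_REGION_BP = 175  # effective KO region size given in the report


def percent_of_cds(length_bp: int, cds_length_bp: int = CDS_LENGTH_BP) -> float:
    """Return length_bp as a percentage of the coding region (2 decimals)."""
    return round(100 * length_bp / cds_length_bp, 2)


def causes_frameshift(deleted_bp: int) -> bool:
    """A deletion shifts the reading frame when its length is not a multiple of 3."""
    return deleted_bp % 3 != 0


coverage = percent_of_cds(KO_REGION_BP)
frameshift = causes_frameshift(KO_REGION_BP)
print(f"KO region covers {coverage}% of the assumed CDS; frameshift: {frameshift}")
```

Under that assumption the 175 bp region reproduces the stated 10.38%, and because 175 is not a multiple of 3, removing the whole exon would be expected to shift the reading frame of the downstream exons.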


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele, 5' to 3', showing exons 1, 2 and 12 with two gRNA regions marked. Legend: exon of mouse Cpeb1; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: self-alignment dot plot of the upstream section. Legend: Forward, Reverse, Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
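The self-alignment described in this note can be sketched as an exact-match dot plot with a 15 bp window. The report does not name the tool it used, so the function below is only an illustration of the idea: `self_dot_plot` and both toy sequences are hypothetical, and a real dot plot tool may also score reverse and complement matches or tolerate mismatches.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return off-diagonal pairs (i, j), i < j, whose windows are identical."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        # Every pair of start positions sharing a window is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Toy sequences, not taken from the report.
repeat_seq = "ACGTTGCA" * 4 + "GGATCCTTAAGC"   # contains an 8 bp tandem repeat
unique_seq = "ACGTTGCAGGATCCTTAAGCTAGCCATG"    # no repeated 15-mers
repeat_dots = self_dot_plot(repeat_seq, window=15)
unique_dots = self_dot_plot(unique_seq, window=15)
print(len(repeat_dots), len(unique_dots))
```

Tandem repeats show up as runs of dots parallel to the main diagonal. A sequence with no off-diagonal dots, like the second toy example, corresponds to the "no significant tandem repeat" outcome reported here.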

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: self-alignment dot plot of the downstream section. Legend: Forward, Reverse, Complement.]

Note: The 2000 bp section downstream of Exon 2 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution plot of the upstream section.]

Summary: Full Length(2000bp) | A(28.15% 563) | C(17.15% 343) | T(35.55% 711) | G(19.15% 383)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
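A windowed GC scan of the kind summarized in this note can be sketched as follows. The 300 bp window comes from the report. The 50 bp step, the 0.7 cutoff for "high GC", the function names and the toy sequence are all assumptions, since the report does not define what it treats as significant.

```python
def windowed_gc(seq: str, window: int = 300, step: int = 50) -> list[float]:
    """GC fraction of each window of the given size, moving by step bases."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append((chunk.count("G") + chunk.count("C")) / window)
    return values


def high_gc_windows(values: list[float], threshold: float = 0.7) -> list[int]:
    """Indices of windows whose GC fraction reaches the (assumed) threshold."""
    return [i for i, value in enumerate(values) if value >= threshold]


# Toy 2000 bp sequence with a 400 bp GC-only island; not the real flank.
toy_seq = "AT" * 500 + "GC" * 200 + "AT" * 300
gc_values = windowed_gc(toy_seq)
flagged = high_gc_windows(gc_values)
print(len(gc_values), flagged)
```

For a flank like the ones described here, the expectation would be an empty `flagged` list. The toy island is included only to show what a flagged region looks like.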

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution plot of the downstream section.]

Summary: Full Length(2000bp) | A(30.5% 610) | C(18.05% 361) | T(30.25% 605) | G(21.2% 424)

Note: The 2000 bp section downstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
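The overall GC content of each flank follows directly from the base counts in the two Summary lines. The sketch below only restates that arithmetic. The function name is hypothetical, and the counts are copied from the summaries above.

```python
def gc_percent(counts: dict[str, int]) -> float:
    """Overall GC percentage from per-base counts, rounded to 2 decimals."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


# Base counts copied from the upstream and downstream Summary lines.
upstream_counts = {"A": 563, "C": 343, "T": 711, "G": 383}
downstream_counts = {"A": 610, "C": 361, "T": 605, "G": 424}

upstream_gc = gc_percent(upstream_counts)
downstream_gc = gc_percent(downstream_counts)
print(f"upstream GC: {upstream_gc}%, downstream GC: {downstream_gc}%")
```

Both flanks sum to 2000 bases and come out below 40% GC overall, which is consistent with the notes that no high GC-content region was found.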


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   -       81436378   81438377   2000
YourSeq  43     1619   1668  2000   94.0%     chr12  -       4767122    4767181    60
YourSeq  38     1382   1636  2000   93.2%     chr14  -       26660119   26660619   501
YourSeq  35     385    495   2000   94.8%     chr4   -       6430021    6430132    112
YourSeq  29     1371   1401  2000   96.8%     chr9   -       14114209   14114239   31
YourSeq  28     1345   1372  2000   100.0%    chr4   -       136731017  136731044  28
YourSeq  28     1      30    2000   96.7%     chr4   +       123757362  123757391  30
YourSeq  28     1371   1399  2000   100.0%    chr11  +       82653527   82653558   32
YourSeq  26     1345   1372  2000   96.5%     chr2   +       168858006  168858033  28
YourSeq  26     1344   1369  2000   100.0%    chr11  +       62161610   62161635   26
YourSeq  25     1388   1414  2000   96.3%     chrX   -       86595121   86595147   27
YourSeq  25     1343   1372  2000   85.2%     chr2   +       166123621  166123648  28
YourSeq  25     1378   1402  2000   100.0%    chr17  +       32259781   32259805   25
YourSeq  24     1623   1647  2000   100.0%    chr1   -       16149892   16149926   35
YourSeq  24     1170   1195  2000   96.2%     chr5   +       88702961   88702986   26
YourSeq  24     1347   1370  2000   100.0%    chr19  +       29619680   29619703   24
YourSeq  22     44     66    2000   100.0%    chr7   -       137562776  137562800  25
YourSeq  22     1618   1639  2000   100.0%    chr6   -       47587051   47587072   22
YourSeq  22     1349   1370  2000   100.0%    chr4   -       116134816  116134837  22
YourSeq  22     43     65    2000   100.0%    chr3   -       30220306   30220329   24

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the expected self-match on chr7, no significant similarity is found.
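One way to read a BLAT hit list like the one above is to separate the full-length self-match from the secondary hits and see how short the latter are. The sketch below parses whitespace-separated rows in the column order QUERY through SPAN. The `BlatHit` type, the field names and the `parse_hit` helper are hypothetical, and only the first three upstream rows are included for brevity.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one row; a leading 'browser details' link label is ignored if present."""
    f = line.split()
    if f[:2] == ["browser", "details"]:
        f = f[2:]
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


# First three rows of the upstream search, copied from the report.
rows = """\
YourSeq 2000 1 2000 2000 100.0% chr7 - 81436378 81438377 2000
YourSeq 43 1619 1668 2000 94.0% chr12 - 4767122 4767181 60
YourSeq 38 1382 1636 2000 93.2% chr14 - 26660119 26660619 501
"""
hits = [parse_hit(line) for line in rows.splitlines()]
self_hit = max(hits, key=lambda h: h.score)
secondary = [h for h in hits if h is not self_hit]
best_secondary_fraction = max(h.score for h in secondary) / self_hit.q_size
print(self_hit.chrom, len(secondary), best_secondary_fraction)
```

In these rows the best secondary score is 43 against a 2000 bp query, a small fraction of the query length, which fits the note's conclusion that nothing else in the genome resembles the flank.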

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   -       81434203   81436202   2000
YourSeq  57     1760   1938  2000   91.2%     chr10  +       18078050   18078263   214
YourSeq  54     510    602   2000   84.7%     chr8   -       107144297  107144390  94
YourSeq  51     507    569   2000   90.5%     chr9   -       115974678  115974740  63
YourSeq  51     516    705   2000   90.7%     chr7   -       97859777   97859966   190
YourSeq  49     509    569   2000   90.2%     chr6   -       88089456   88089516   61
YourSeq  48     508    575   2000   85.3%     chr6   -       34671001   34671068   68
YourSeq  47     508    602   2000   81.4%     chr12  -       8383365    8383642    278
YourSeq  46     573    706   2000   85.2%     chr3   -       130948000  130948130  131
YourSeq  45     1617   1818  2000   88.3%     chr16  +       91577744   91577943   200
YourSeq  44     510    602   2000   74.5%     chr11  +       61893837   61893899   63
YourSeq  43     1617   1680  2000   86.8%     chr11  -       20664451   20664513   63
YourSeq  43     1617   1821  2000   70.4%     chr1   +       60819170   60819342   173
YourSeq  42     509    569   2000   85.3%     chr9   -       106549801  106549862  62
YourSeq  42     509    624   2000   90.4%     chr2   -       153534191  153534306  116
YourSeq  42     507    563   2000   92.0%     chr4   +       148631941  148632177  237
YourSeq  42     510    569   2000   85.0%     chr11  +       98292659   98292718   60
YourSeq  41     508    602   2000   90.7%     chr9   -       9698099    9698191    93
YourSeq  41     509    557   2000   91.9%     chr11  +       59652541   59652589   49
YourSeq  40     501    602   2000   95.5%     chr3   -       57747927   57748037   111

Note: The 2000 bp section downstream of Exon 2 is BLAT searched against the genome. Apart from the expected self-match on chr7, no significant similarity is found.
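The two full-length self-matches on chr7 bracket the targeted exon, so the interval between them can be derived from the BLAT coordinates. The sketch below assumes the coordinates are 1-based and inclusive, which agrees with the reported spans of 2000, and the function name is hypothetical. Because Cpeb1 lies on the reverse strand, the upstream flank has the higher coordinates.

```python
# Self-match coordinates copied from the two BLAT tables (chr7, minus strand).
UP_FLANK = (81436378, 81438377)    # 2000 bp upstream of exon 2
DOWN_FLANK = (81434203, 81436202)  # 2000 bp downstream of exon 2


def region_between(flank_a: tuple[int, int], flank_b: tuple[int, int]) -> tuple[int, int, int]:
    """Return (start, end, length) of the 1-based inclusive gap between two flanks."""
    low, high = sorted([flank_a, flank_b])
    start = low[1] + 1
    end = high[0] - 1
    return start, end, end - start + 1


ko_start, ko_end, ko_length = region_between(UP_FLANK, DOWN_FLANK)
print(f"chr7:{ko_start}-{ko_end} ({ko_length} bp)")
```

The gap works out to 175 bp, matching the effective KO region size given in the strategy summary. The assembly is presumably GRCm38, the one cited in the gene information, although the BLAT tables themselves do not say so.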


Gene and information: Cpeb1 cytoplasmic polyadenylation element binding protein 1 [Mus musculus (house mouse)] Gene ID: 12877, updated on 3-Sep-2019

Gene summary

Official Symbol: Cpeb1 (provided by MGI)
Official Full Name: cytoplasmic polyadenylation element binding protein 1 (provided by MGI)
Primary source: MGI:MGI:108442
See related: Ensembl:ENSMUSG00000025586
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cpeb; mCPEB; Cpe-bp1; mCpeb-1; AU024112
Expression: Broad expression in cortex adult (RPKM 9.8), frontal lobe adult (RPKM 9.6) and 15 other tissues

Genomic context

Location: 7; 7 D3
Exon count: 16

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (81347026..81458001, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (88492329..88599562, complement)

[Figure: genomic context, Chromosome 7 - NC_000073.6]


Transcript information: This gene has 9 transcripts

Gene: Cpeb1 ENSMUSG00000025586

Description: cytoplasmic polyadenylation element binding protein 1 [Source:MGI Symbol;Acc:MGI:108442]
Location: Chromosome 7: 81,347,026-81,455,465 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 9 transcripts (splice variants), 216 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 13 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Cpeb1-201  ENSMUST00000098331.9  3111  561aa       ENSMUSP00000095936.3  Protein coding   CCDS40007  P70166   TSL:1 GENCODE basic APPRIS P3
Cpeb1-209  ENSMUST00000178892.2  1777  562aa       ENSMUSP00000137079.1  Protein coding   CCDS57557  Q059Z2   TSL:1 GENCODE basic APPRIS ALT1
Cpeb1-203  ENSMUST00000130310.7  2969  551aa       ENSMUSP00000120139.1  Protein coding   -          F6X7Z3   CDS 5' incomplete TSL:5
Cpeb1-207  ENSMUST00000152319.1  718   No protein  -                     Retained intron  -          -        TSL:3
Cpeb1-204  ENSMUST00000135177.7  821   No protein  -                     lncRNA           -          -        TSL:3
Cpeb1-208  ENSMUST00000153419.2  654   No protein  -                     lncRNA           -          -        TSL:3
Cpeb1-205  ENSMUST00000136689.1  639   No protein  -                     lncRNA           -          -        TSL:3
Cpeb1-202  ENSMUST00000124937.1  335   No protein  -                     lncRNA           -          -        TSL:3
Cpeb1-206  ENSMUST00000151661.1  326   No protein  -                     lncRNA           -          -        TSL:3


[Figure: Ensembl genome browser view of a 128.44 kb region of chromosome 7 (81.34 Mb to 81.46 Mb).
Forward strand: Rpl7a-ps9-201 (processed pseudogene), Cpeb1os1-201 (TEC).
Contigs: AC162897.4, AC161215.6, AC169082.2, AC167120.2.
Reverse strand: Rps17-201 to Rps17-206; Cpeb1-201 to Cpeb1-209; Ap3b2-201, Ap3b2-204, Ap3b2-207, Ap3b2-209.
Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank.
Gene legend: Protein Coding (merged Ensembl/Havana, Ensembl protein coding); Non-Protein Coding (RNA gene, pseudogene, processed transcript).]


Transcript: ENSMUST00000178892

[Figure: protein domain view of Cpeb1-209 (protein coding; reverse strand, 106.40 kb), translation ENSMUSP00000137... Annotation tracks:
MobiDB lite; Low complexity (Seg)
Superfamily: RNA-binding domain superfamily
SMART: RNA recognition motif domain
Pfam: Cytoplasmic polyadenylation element-binding protein 1, N-terminal; RNA recognition motif domain; Cytoplasmic polyadenylation element-binding protein, ZZ domain
PROSITE profiles: RNA recognition motif domain
PANTHER: PTHR12566:SF9; Cytoplasmic polyadenylation element-binding protein
Gene3D: Nucleotide-binding alpha-beta plait domain superfamily; CEBP, ZZ domain superfamily
CDD: CPEB-1, RNA recognition motif 1; cd12725
Sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.
Scale bar: 0 to 562.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
