https://www.alphaknockout.com

Mouse Cpeb1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cpeb1 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cpeb1 gene (NCBI Reference Sequence: NM_001252525; Ensembl: ENSMUSG00000025586) is located on Mouse chromosome 7. Twelve exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000178892). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Cpeb1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP24-335C7 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a null allele are viable and overtly normal but display a developmental arrest of both female and male germ cells at the pachytene stage, defective synaptonemal complex formation, and impaired neuronal synaptic plasticity.

Exon 2 starts at about 0.95% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 18256 bp, and the size of intron 2 for 3'-loxP site insertion is 63999 bp. The size of the effective cKO region is ~675 bp. The cKO region does not contain any other known gene.
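The frameshift statement rests on simple arithmetic: removing an exon shifts the reading frame when the number of coding bases it contributes is not a multiple of 3. The following is a minimal Python sketch of that check, not part of the project workflow. The function names and the example lengths are hypothetical illustrations; this report does not state the length of exon 2.

```python
def deletion_causes_frameshift(coding_bases_removed: int) -> bool:
    """Return True if removing this many coding bases shifts the reading frame."""
    if coding_bases_removed <= 0:
        raise ValueError("coding_bases_removed must be positive")
    return coding_bases_removed % 3 != 0


def cds_percent_upstream(cds_offset: int, cds_length: int) -> float:
    """Percentage of the coding sequence lying upstream of a given CDS offset."""
    if cds_length <= 0 or not 0 <= cds_offset <= cds_length:
        raise ValueError("offset must lie within a positive-length CDS")
    return 100.0 * cds_offset / cds_length


# Hypothetical exon lengths, for illustration only
for length in (100, 101, 102):
    print(length, deletion_causes_frameshift(length))
```

A deletion that is a multiple of 3 bases would remove codons in frame instead, which is why the exon chosen for a cKO region is normally one whose removal changes the frame.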

Overview of the Targeting Strategy

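[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exons 1, 2 and 12 labeled, with the 5' gRNA region and the 3' gRNA region marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination).]

Legends: Exon of mouse Cpeb1; Homology arm; cKO region; loxP site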
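The step from the targeted allele to the constitutive KO allele can be illustrated on a toy sequence: Cre recombination between two loxP sites in the same orientation removes the DNA between them and leaves a single loxP site. The Python sketch below assumes the canonical 34 bp loxP sequence; the arm and cKO sequences are hypothetical placeholders, far shorter than the real ones, and `cre_excise` is an illustrative helper rather than any tool used in this project.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Collapse two same-orientation loxP sites into one, dropping the DNA between them."""
    first = allele.find(loxp)
    second = allele.find(loxp, first + len(loxp)) if first != -1 else -1
    if first == -1 or second == -1:
        raise ValueError("allele must contain two loxP sites in the same orientation")
    # Keep everything up to and including the first loxP, then resume after the second.
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Toy sequences (hypothetical placeholders)
five_arm, cko, three_arm = "AAAACCCC", "GGGGTTTTGG", "CCCCAAAA"
targeted = five_arm + LOXP + cko + LOXP + three_arm
ko = cre_excise(targeted)
print(len(targeted) - len(ko))  # bases lost: the cKO region plus one loxP site
```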

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself; forward and reverse-complement matches are plotted.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
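The self-alignment behind a dot plot can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it reports exact forward-strand matches only (the plot above also shows reverse-complement matches), and the tandem-repeat thresholds `max_offset` and `min_run` are hypothetical values, not the criteria used for this report.

```python
def dot_plot_matches(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


def has_tandem_repeat(seq: str, window: int = 10, max_offset: int = 50, min_run: int = 5) -> bool:
    """Flag a short off-diagonal line: min_run consecutive matches at one small offset."""
    matches = set(dot_plot_matches(seq, window))
    for i, j in matches:
        if j - i <= max_offset and all((i + k, j + k) in matches for k in range(min_run)):
            return True
    return False


# A 12 bp unit repeated four times produces lines parallel to the main diagonal.
print(has_tandem_repeat("ACGTACGGTCAT" * 4))
```

In a real dot plot the same signal appears visually as lines running parallel to the main diagonal at a small offset.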

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7175bp) | A(29.95% 2149) | C(17.62% 1264) | T(32.96% 2365) | G(19.47% 1397)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
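The base composition and windowed GC content can be reproduced with a short Python sketch. The helper names are illustrative assumptions, and the window step of 1 is a guess, since the report does not say how the windows are advanced. The final lines only recompute the overall GC percentage from the counts in the Summary line.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Count and percentage of each base, in the style of the Summary line."""
    seq = seq.upper()
    counts = Counter(seq)
    return {b: (counts[b], round(100.0 * counts[b] / len(seq), 2)) for b in "ACGT"}


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each full window of the given size."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Overall GC content implied by the counts in the Summary line
counts = {"A": 2149, "C": 1264, "T": 2365, "G": 1397}
total = sum(counts.values())
gc_percent = 100.0 * (counts["G"] + counts["C"]) / total
print(total, round(gc_percent, 2))
```

The four counts sum to the stated full length of 7175 bp, so the Summary line is internally consistent.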

BLAT Search Results (up)

QUERY     SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000   1      3000  3000   100.0%    chr7   -       81436628   81439627   3000
YourSeq   119    255    429   3000   90.7%     chr10  +       96065702   96065921   220
YourSeq   118    255    443   3000   91.8%     chr10  +       58924354   58924587   234
YourSeq   118    257    442   3000   89.7%     chr1   +       56810058   56810278   221
YourSeq   106    255    394   3000   89.8%     chr15  -       79009264   79009420   157
YourSeq   105    1143   1279  3000   95.0%     chr4   +       44248321   44248459   139
YourSeq   103    91     387   3000   85.4%     chr12  +       12539319   12539935   617
YourSeq   101    292    442   3000   92.5%     chr14  -       34046428   34046592   165
YourSeq   100    255    387   3000   88.4%     chr16  +       31073173   31073315   143
YourSeq   98     1142   1278  3000   86.1%     chrX   -       92046604   92046742   139
YourSeq   98     292    441   3000   92.4%     chr1   +       22070653   22070833   181
YourSeq   97     290    442   3000   91.5%     chr1   +       181183240  181183422  183
YourSeq   96     292    443   3000   89.5%     chr12  +       44307192   44307364   173
YourSeq   95     281    438   3000   94.5%     chr5   -       123837821  123837981  161
YourSeq   95     292    442   3000   91.6%     chr2   +       14679815   14679970   156
YourSeq   95     255    442   3000   83.1%     chr11  +       46814514   46814700   187
YourSeq   95     255    385   3000   88.0%     chr11  +       36669357   36669497   141
YourSeq   94     255    382   3000   87.5%     chr2   -       180958935  180959072  138
YourSeq   94     292    437   3000   89.3%     chr15  -       8779266    8779421    156
YourSeq   94     256    387   3000   88.7%     chr14  -       56078519   56078663   145

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY     SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000   1      3000  3000   100.0%    chr7   -       81432953   81435952   3000
YourSeq   223    2545   2935  3000   87.1%     chr17  +       66128453   66129183   731
YourSeq   222    2542   2859  3000   85.2%     chr2   +       71720661   71720980   320
YourSeq   215    2561   2861  3000   86.3%     chr5   +       116328984  116329426  443
YourSeq   214    2550   2859  3000   86.0%     chr4   -       99411226   99411547   322
YourSeq   212    2550   2858  3000   83.8%     chr6   +       66818602   66818892   291
YourSeq   211    2550   2861  3000   85.0%     chr10  -       127735768  127736041  274
YourSeq   209    2550   2858  3000   84.8%     chr7   +       142353183  142353452  270
YourSeq   209    2550   2861  3000   88.6%     chr10  +       61192456   61287960   95505
YourSeq   208    2550   2877  3000   87.8%     chr13  +       53006669   53007188   520
YourSeq   205    2570   2861  3000   85.3%     chr17  -       31354651   31354943   293
YourSeq   204    2570   2859  3000   84.4%     chr8   -       11156695   11156963   269
YourSeq   204    2570   2861  3000   85.0%     chr3   +       66587755   66588046   292
YourSeq   203    2559   2861  3000   86.3%     chr14  -       120463100  120463403  304
YourSeq   203    2566   2861  3000   84.4%     chr10  +       121883722  121883997  276
YourSeq   202    2629   2914  3000   87.7%     chr8   +       125953421  125953824  404
YourSeq   201    2550   2861  3000   83.7%     chr2   +       168139608  168139932  325
YourSeq   200    2573   2861  3000   84.3%     chr4   -       115982542  115982810  269
YourSeq   199    2544   2861  3000   82.0%     chr4   -       99239633   99239943   311
YourSeq   198    2536   2861  3000   88.7%     chr5   +       30884934   30885335   402

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
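One way to screen BLAT results like those above is to parse each row and flag any hit, other than the full-length self match, that covers a large share of the query at high identity. The Python sketch below is an illustration only: the `min_fraction` and `min_identity` thresholds are hypothetical, the report does not state the criteria behind "no significant similarity", and the rows are assumed to have the "browser details" link text already stripped.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated result row (columns QUERY through SPAN)."""
    f = row.split()
    # f[0] is the query name; f[5] is identity such as "90.7%"; f[10] (SPAN) is not stored.
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7], int(f[8]), int(f[9]))


def significant_off_targets(hits, min_fraction=0.5, min_identity=90.0):
    """Hits other than the full-length self match that cover much of the query."""
    return [h for h in hits
            if h.score < h.q_size
            and (h.q_end - h.q_start + 1) / h.q_size >= min_fraction
            and h.identity >= min_identity]


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr7 - 81436628 81439627 3000",
    "YourSeq 119 255 429 3000 90.7% chr10 + 96065702 96065921 220",
]
hits = [parse_hit(r) for r in rows]
print(significant_off_targets(hits))
```

With these two rows the list is empty: the first is the query matching its own locus on chr7, and the second covers only 175 of the 3000 query bases.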

Gene information: Cpeb1 cytoplasmic polyadenylation element binding protein 1 [ Mus musculus (house mouse) ] Gene ID: 12877, updated on 3-Sep-2019

Gene summary

Official Symbol: Cpeb1 (provided by MGI)
Official Full Name: cytoplasmic polyadenylation element binding protein 1 (provided by MGI)
Primary source: MGI:MGI:108442
See related: Ensembl:ENSMUSG00000025586
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cpeb; mCPEB; Cpe-bp1; mCpeb-1; AU024112
Expression: Broad expression in cortex adult (RPKM 9.8), frontal lobe adult (RPKM 9.6) and 15 other tissues

Genomic context

Location: 7; 7 D3

Exon count: 16

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (81347026..81458001, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (88492329..88599562, complement)

[Figure: genomic context of Cpeb1 on Chromosome 7 - NC_000073.6]

Transcript information: This gene has 9 transcripts

Gene: Cpeb1 ENSMUSG00000025586

Description: cytoplasmic polyadenylation element binding protein 1 [Source:MGI Symbol;Acc:MGI:108442]
Location: Chromosome 7: 81,347,026-81,455,465 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 9 transcripts (splice variants), 216 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 13 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Protein ID            Biotype          CCDS       UniProt  Flags
Cpeb1-201  ENSMUST00000098331.9  3111  561aa       ENSMUSP00000095936.3  Protein coding   CCDS40007  P70166   TSL:1 GENCODE basic APPRIS P3
Cpeb1-209  ENSMUST00000178892.2  1777  562aa       ENSMUSP00000137079.1  Protein coding   CCDS57557  Q059Z2   TSL:1 GENCODE basic APPRIS ALT1
Cpeb1-203  ENSMUST00000130310.7  2969  551aa       ENSMUSP00000120139.1  Protein coding   -          F6X7Z3   CDS 5' incomplete TSL:5
Cpeb1-207  ENSMUST00000152319.1  718   No protein  -                     Retained intron  -          -        TSL:3
Cpeb1-204  ENSMUST00000135177.7  821   No protein  -                     lncRNA           -          -        TSL:3
Cpeb1-208  ENSMUST00000153419.2  654   No protein  -                     lncRNA           -          -        TSL:3
Cpeb1-205  ENSMUST00000136689.1  639   No protein  -                     lncRNA           -          -        TSL:3
Cpeb1-202  ENSMUST00000124937.1  335   No protein  -                     lncRNA           -          -        TSL:3
Cpeb1-206  ENSMUST00000151661.1  326   No protein  -                     lncRNA           -          -        TSL:3

[Figure: Ensembl genome browser view of a 128.44 kb region, 81.34 Mb to 81.46 Mb, with the following tracks.]

Forward strand: Rpl7a-ps9-201 (processed pseudogene); Cpeb1os1-201 (TEC)
Contigs: AC162897.4 >, AC161215.6 >, < AC169082.2, < AC167120.2
Reverse strand, Rps17: Rps17-201 (protein coding); Rps17-202, Rps17-204, Rps17-206 (lncRNA); Rps17-203, Rps17-205 (retained intron)
Reverse strand, Cpeb1: Cpeb1-201, Cpeb1-203, Cpeb1-209 (protein coding); Cpeb1-207 (retained intron); Cpeb1-202, Cpeb1-204, Cpeb1-205, Cpeb1-206, Cpeb1-208 (lncRNA)
Reverse strand, Ap3b2: Ap3b2-201, Ap3b2-207, Ap3b2-209 (protein coding); Ap3b2-204 (retained intron)
Regulatory Build

Regulation Legend: CTCF; Open Chromatin; Promoter; Promoter Flank
Gene Legend, Protein Coding: merged Ensembl/Havana; Ensembl protein coding
Gene Legend, Non-Protein Coding: RNA gene; pseudogene; processed transcript

Transcript: ENSMUST00000178892

[Figure: Ensembl protein summary for Cpeb1-209 (protein coding, reverse strand, 106.40 kb), translation ENSMUSP00000137..., with the following annotation tracks on a scale of 0 to 562 amino acids.]

MobiDB lite
Low complexity (Seg)
Superfamily: RNA-binding domain superfamily
SMART: RNA recognition motif domain
Pfam: Cytoplasmic polyadenylation element-binding protein 1, N-terminal; RNA recognition motif domain; Cytoplasmic polyadenylation element-binding protein, ZZ domain
PROSITE profiles: RNA recognition motif domain
PANTHER: PTHR12566:SF9; Cytoplasmic polyadenylation element-binding protein
Gene3D: Nucleotide-binding alpha-beta plait domain superfamily; CEBP, ZZ domain superfamily
CDD: CPEB-1, RNA recognition motif 1; cd12725
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant Legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
