https://www.alphaknockout.com

Mouse Celf5 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Celf5 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Celf5 gene (NCBI Reference Sequence: NM_176954; Ensembl: ENSMUSG00000034818) is located on mouse chromosome 10. Twelve exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000118763). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Celf5 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP24-277O5 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 12.32% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 5509 bp, and the size of intron 2 for 3'-loxP site insertion is 5511 bp. The size of the effective cKO region is ~583 bp. The cKO region does not contain any other known gene.
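The frameshift statement in the note rests on simple arithmetic: deleting an exon whose length is not a multiple of 3 shifts the reading frame of everything downstream. The sketch below illustrates that check and converts the 12.32% offset into an approximate coding-sequence position. It is an illustration only: the report does not state the length of exon 2, so the exon lengths used here are hypothetical placeholders, and the coding-region length is assumed to be the 395 aa of Celf5-201 plus one stop codon.

```python
def deletion_causes_frameshift(exon_length_bp: int) -> bool:
    """Return True if deleting an exon of this length shifts the reading frame."""
    if exon_length_bp <= 0:
        raise ValueError("exon length must be positive")
    # A deletion keeps the frame only when it removes whole codons.
    return exon_length_bp % 3 != 0


def cds_offset_bp(fraction_of_cds: float, protein_length_aa: int) -> int:
    """Approximate CDS position (bp) for a fractional offset into the coding region.

    Assumption: coding region length = (protein length + 1 stop codon) * 3.
    """
    cds_length_bp = (protein_length_aa + 1) * 3
    return round(fraction_of_cds * cds_length_bp)


if __name__ == "__main__":
    # 12.32% and 395 aa come from the report; the exon lengths are hypothetical.
    print("approximate CDS offset of exon 2 (bp):", cds_offset_bp(0.1232, 395))
    for hypothetical_length in (82, 83, 84):
        print(hypothetical_length, deletion_causes_frameshift(hypothetical_length))
```

Under these assumptions exon 2 would begin roughly 146 bp into the coding sequence, so a frameshift there would disrupt most of the protein.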


Overview of the Targeting Strategy

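[Figure: schematic of the targeting strategy showing the wildtype allele (exons 1, 2 and 12 labeled; 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Celf5; homology arm; cKO region; loxP site.]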


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence (labeled "Sequence 12") aligned against itself, showing forward and reverse complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
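A self dot plot of this kind can be approximated with exact k-mer matching. The sketch below is a minimal version that assumes exact matches over the 10 bp window; the report does not say which tool or matching rule produced the plot, so this is not a reproduction of it. The function names are illustrative.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) dot lists for an exact-match self dot plot.

    forward: (i, j) with i < j where seq[i:i+window] == seq[j:j+window]
    reverse: (i, j) where seq[i:i+window] equals the reverse complement
             of seq[j:j+window]
    """
    seq = seq.upper()
    # Index every window-sized k-mer by its start positions.
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, starts in positions.items():
        # Repeated k-mers give off-diagonal dots on the forward strand.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                forward.append((starts[a], starts[b]))
        # .get avoids inserting new keys while iterating over the dict.
        for i in starts:
            for j in positions.get(reverse_complement(kmer), []):
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)


if __name__ == "__main__":
    fwd, rev = self_dot_plot("ACGTACGTAC", window=4)
    print(fwd)
```

Runs of forward dots lying on a diagonal close to the main diagonal are the signature of tandem repeats, which is what the note above reports for this sequence.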

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (labeled "Sequence 12").]

Summary: Full Length(7083bp) | A(22.38% 1585) | C(24.35% 1725) | T(23.92% 1694) | G(29.35% 2079)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Significant high GC-content regions are found. It may be difficult to construct this targeting vector.
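The overall GC content follows directly from the base counts in the summary, and the windowed profile can be sketched with a simple sliding window. The code below is a minimal illustration, not the tool used for the report; the function names are assumptions, and no threshold for "high GC" is applied because the report does not state one.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """Yield (start, gc_fraction) for every full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


def gc_percent_from_counts(a: int, c: int, t: int, g: int) -> float:
    """Overall GC percentage computed from per-base counts."""
    return 100.0 * (g + c) / (a + c + t + g)


if __name__ == "__main__":
    # Base counts taken from the summary line of the report.
    print(round(gc_percent_from_counts(a=1585, c=1725, t=1694, g=2079), 2))
```

With the reported counts the overall GC content comes to about 53.7%, so the regions flagged in the note are local peaks above that average.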


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr10  -        81477202   81480201  3000
YourSeq    351    221   741   3000     92.6%  chr10  +        79641170   79642127   958
YourSeq    343    221   718   3000     94.2%  chr9   +       120237868  120238406   539
YourSeq    339    232   742   3000     92.6%  chr13  -       103756475  103757130   656
YourSeq    339    264   742   3000     93.5%  chr1   -       164539734  164540257   524
YourSeq    327    224   753   3000     93.5%  chr11  -       119582848  119583434   587
YourSeq    326    226   742   3000     93.7%  chr4   +       142111029  142111885   857
YourSeq    304    242   668   3000     92.7%  chr18  -        82631821   82632304   484
YourSeq    297    224   704   3000     93.5%  chr6   -       137308989  137309606   618
YourSeq    296    224   721   3000     94.5%  chr10  +        79641337   79642172   836
YourSeq    291    336   742   3000     94.0%  chr13  -       103756491  103757096   606
YourSeq    290    221   742   3000     95.4%  chr4   +       130204823  130207247  2425
YourSeq    286    264   739   3000     93.7%  chr4   +       142111075  142111665   591
YourSeq    282    331   741   3000     90.9%  chr10  +        79641102   79642035   934
YourSeq    266    208   679   3000     88.6%  chr18  -        82631826   82632284   459
YourSeq    265    224   737   3000     84.9%  chr4   +       154537923  154538354   432
YourSeq    263    278   726   3000     92.4%  chr1   -       164539722  164540258   537
YourSeq    260    338   726   3000     96.2%  chr18  -        82631827   82632274   448
YourSeq    256    231   742   3000     90.7%  chr9   +       120237843  120238346   504
YourSeq    254    221   733   3000     93.3%  chr10  +        79641490   79642160   671

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr10  -        81473619   81476618  3000
YourSeq     59    564   671   3000     78.8%  chr15  -        36975586   36975661    76
YourSeq     52    573   671   3000     76.3%  chr18  +        79859470   79859531    62
YourSeq     37    578   671   3000     68.3%  chr9   -        74056260   74056318    59
YourSeq     36    561   621   3000     88.4%  chr5   +        90576203   90576262    60
YourSeq     33    627   692   3000     97.3%  chr13  -        96513718   96513802    85
YourSeq     32    612   671   3000     77.8%  chr18  -        11306876   11306929    54
YourSeq     30    672   719   3000     71.9%  chr1   -       121578440  121578475    36
YourSeq     28    712   751   3000     87.5%  chr3   +        27285955   27285993    39
YourSeq     26    597   629   3000     96.5%  chr15  -        69718484   69718516    33
YourSeq     26    676   715   3000     89.3%  chr3   +        27285955   27285993    39
YourSeq     25   1495  1545   3000     74.6%  chr5   -        92788636   92788686    51
YourSeq     22   2270  2294   3000     95.9%  chr4   +        18361131   18361157    27
YourSeq     21    163   199   3000     78.4%  chrX   +        38069205   38069241    37
YourSeq     20   2038  2063   3000     88.5%  chr1   -        85290826   85290851    26

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
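When reviewing BLAT results like those above, it can help to parse each row and compute how much of the 3000 bp query a hit covers. The sketch below does that for rows in the column order shown (QUERY, SCORE, START, END, QSIZE, IDENTITY, CHROM, STRAND, START, END, SPAN), with or without the leading "browser details" link text. The class and function names are illustrative, and treating a full-length 100% identity hit as the query's own locus is an assumption of this sketch, not a rule stated in the report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by this hit."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]  # drop the link text
    # fields[0] is the query name (e.g. YourSeq); the rest are the columns.
    (score, q_start, q_end, q_size, identity,
     chrom, strand, t_start, t_end, span) = fields[1:]
    return BlatHit(int(score), int(q_start), int(q_end), int(q_size),
                   float(identity.rstrip("%")), chrom, strand,
                   int(t_start), int(t_end), int(span))


def secondary_hits(hits):
    """Hits other than full-length, 100% identity matches (assumed self hits)."""
    return [h for h in hits
            if not (h.query_coverage == 1.0 and h.identity == 100.0)]


if __name__ == "__main__":
    rows = [
        "browser details YourSeq 3000 1 3000 3000 100.0% chr10 - 81477202 81480201 3000",
        "browser details YourSeq 351 221 741 3000 92.6% chr10 + 79641170 79642127 958",
    ]
    for hit in secondary_hits([parse_blat_row(r) for r in rows]):
        print(hit.chrom, hit.score, round(hit.query_coverage, 3))
```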


Gene and information:

Celf5 CUGBP, Elav-like family member 5 [Mus musculus (house mouse)]
Gene ID: 319586, updated on 24-Oct-2019

Gene summary

Official Symbol: Celf5 (provided by MGI)
Official Full Name: CUGBP, Elav-like family member 5 (provided by MGI)
Primary source: MGI:MGI:2442333
See related: Ensembl:ENSMUSG00000034818
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Brunol5; 4930565A21Rik
Expression: Biased expression in CNS E18 (RPKM 40.2), whole brain E14.5 (RPKM 38.4) and 7 other tissues
Orthologs: human

Genomic context

Location: 10; 10 C1

Exon count: 17

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (81459227..81482730, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (80921973..80945454, complement)

[Figure: genomic context of Celf5 on chromosome 10 (NC_000076.6).]


Transcript information: This gene has 9 transcripts

Gene: Celf5 ENSMUSG00000034818

Description: CUGBP, Elav-like family member 5 [Source:MGI Symbol;Acc:MGI:2442333]
Gene Synonyms: 4930565A21Rik, Brunol5
Location: Chromosome 10: 81,459,227-81,482,709 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 9 transcripts (splice variants), 42 orthologues, 6 paralogues and is a member of 3 Ensembl protein families.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Celf5-201  ENSMUST00000118763.7  4391  395aa       ENSMUSP00000113675.1  Protein coding   CCDS48644  D3Z4T1   TSL:1, GENCODE basic
Celf5-202  ENSMUST00000119060.7  4166  319aa       ENSMUSP00000113546.1  Protein coding   CCDS78867  D3Z588   TSL:1, GENCODE basic
Celf5-204  ENSMUST00000120856.7  3861  318aa       ENSMUSP00000113784.1  Protein coding   CCDS78868  D3Z4C5   TSL:1, GENCODE basic
Celf5-203  ENSMUST00000120508.7  2042  394aa       ENSMUSP00000113592.1  Protein coding   CCDS78869  D3Z580   TSL:1, GENCODE basic
Celf5-209  ENSMUST00000238823.1  1477  446aa       ENSMUSP00000158971.1  Protein coding   -          -        GENCODE basic, APPRIS P1
Celf5-208  ENSMUST00000147524.2  442   92aa        ENSMUSP00000117430.1  Protein coding   -          D3Z4K9   CDS 3' incomplete, TSL:3
Celf5-206  ENSMUST00000141207.7  945   No protein  -                     Retained intron  -          -        TSL:1
Celf5-205  ENSMUST00000128494.1  748   No protein  -                     Retained intron  -          -        TSL:3
Celf5-207  ENSMUST00000145375.7  605   No protein  -                     Retained intron  -          -        TSL:3


[Figure: Ensembl genome browser view of a 43.48 kb region of chromosome 10 (81.45 Mb to 81.49 Mb). Forward strand: Gm16105-201 (lncRNA). Contigs: AC159474.9. Reverse strand genes (comprehensive set): Nfic-207 (protein coding); Celf5-201, -202, -203, -204, -208 and -209 (protein coding) and Celf5-205, -206 and -207 (retained intron); Ncln-201, -202 and -203 (protein coding) and Ncln-204, -205, -206, -208 and -209 (retained intron). A Regulatory Build track is also shown. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000118763

[Figure: protein domain view of Celf5-201 (protein coding; reverse strand, 23.48 kb) for ENSMUSP00000113..., on a scale of 0 to 395 amino acids. Annotation tracks:
Superfamily: RNA-binding domain superfamily
SMART: RNA recognition motif domain
Pfam: RNA recognition motif domain
PROSITE profiles: RNA recognition motif domain
PANTHER: PTHR24012:SF728; PTHR24012
Gene3D: Nucleotide-binding alpha-beta plait domain superfamily
CDD: CELF-3/4/5/6, RNA recognition motif 1 (cd12635)
Sequence variants (dbSNP and all other sources); variant legend: synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
