https://www.alphaknockout.com

Mouse Cep250 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cep250 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cep250 gene (NCBI Reference Sequence: NM_001129999; Ensembl: ENSMUSG00000038241) is located on mouse chromosome 2. A total of 33 exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 33 (Transcript: ENSMUST00000039994). Exons 6~7 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cep250 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-316K12 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 6 starts at about 6.71% of the coding region. Knockout of exons 6~7 will result in a frameshift of the gene. The size of intron 5 for 5'-loxP site insertion is 713 bp, and the size of intron 7 for 3'-loxP site insertion is 1993 bp. The size of the effective cKO region is ~1076 bp. The cKO region does not contain any other known gene.
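The frameshift reasoning in the note can be illustrated with a minimal Python sketch. It rests on two assumptions that are not stated in this report: the exon lengths in the example are hypothetical placeholders, and the coding region is taken as the 2434 aa protein length listed for ENSMUST00000039994 plus one stop codon. The function names are illustrative only.

```python
def causes_frameshift(deleted_coding_lengths):
    """Return True if deleting these coding segments shifts the reading frame.

    A deletion keeps the frame only when the total number of removed
    coding nucleotides is a multiple of 3.
    """
    return sum(deleted_coding_lengths) % 3 != 0


def coding_offset_nt(protein_length_aa, fraction):
    """Approximate nucleotide offset into the coding region for a fraction of it.

    Assumption: coding length = (amino acids + 1 stop codon) * 3.
    """
    cds_length = (protein_length_aa + 1) * 3
    return round(cds_length * fraction)


# Hypothetical exon lengths, for illustration only (not taken from this report).
hypothetical_exon_lengths = [100, 85]
print(causes_frameshift(hypothetical_exon_lengths))

# Rough position of "about 6.71% of the coding region" for a 2434 aa protein.
print(coding_offset_nt(2434, 0.0671))
```

With the real coding lengths of exons 6 and 7 in place of the placeholders, the same check would confirm whether the deletion is out of frame.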


Overview of the Targeting Strategy

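[Figure: schematic of the targeting strategy. From top to bottom: wildtype allele (5' to 3', with the two gRNA regions marked; exon labels recovered from the figure: 1, 3, 4, 5, 6, 7, 8, 33), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Cep250, homology arm, cKO region, loxP site.]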


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
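As an illustration of the self-alignment idea, the sketch below lists the off-diagonal points a dot plot would draw, using exact matches of fixed-size windows on both strands. This is a simplified stand-in, not the tool used to produce the figure; the function names and the toy sequences in the usage note are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot_matches(seq, window=10):
    """Return (i, j, strand) for every pair of identical windows with i < j.

    strand is "+" for a forward match and "-" when the window at j equals
    the reverse complement of the window at i. The main diagonal (i == j)
    is skipped because every sequence trivially matches itself there.
    """
    positions = {}
    last_start = len(seq) - window
    for i in range(last_start + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    matches = []
    for i in range(last_start + 1):
        word = seq[i:i + window]
        for j in positions[word]:
            if j > i:
                matches.append((i, j, "+"))
        for j in positions.get(reverse_complement(word), []):
            if j > i:
                matches.append((i, j, "-"))
    return matches
```

For example, `dot_plot_matches("AAACGGGAAAC", window=4)` reports the repeated `AAAC` word as a forward match. A run of such points parallel to the main diagonal is what a tandem repeat would look like in the plot.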

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(7576bp) | A(25.04% 1897) | C(22.7% 1720) | T(26.91% 2039) | G(25.34% 1920)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
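The overall GC content follows directly from the base counts in the summary line. The sketch below recomputes it and also shows one simple way to obtain a windowed GC profile. The one-base step of the sliding window is an assumption, since the report only gives the 300 bp window size, and `windowed_gc` is an illustrative name.

```python
# Base counts copied from the summary line above.
counts = {"A": 1897, "C": 1720, "T": 2039, "G": 1920}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, GC = {gc_percent:.2f}%")


def windowed_gc(seq, window=300):
    """GC fraction of each full window, sliding one base at a time."""
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]
```

The counts add up to the stated full length of 7576 bp, which gives an overall GC content of about 48%.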


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       155961740  155964739  3000
YourSeq  155    775    965   3000   96.0%     chr15  +       76581279   76581496   218
YourSeq  103    846    970   3000   89.6%     chr9   -       97254331   97254454   124
YourSeq  103    846    966   3000   90.9%     chr9   -       79741245   79741364   120
YourSeq  100    846    972   3000   88.9%     chr13  +       98674672   98674794   123
YourSeq  99     847    970   3000   90.3%     chr14  -       114987981  114988104  124
YourSeq  99     846    963   3000   92.8%     chr8   +       14293834   14293950   117
YourSeq  98     847    967   3000   91.3%     chr11  +       23125519   23125638   120
YourSeq  97     846    969   3000   93.4%     chr16  -       8961054    8961175    122
YourSeq  97     846    969   3000   93.8%     chr15  -       36410993   36411116   124
YourSeq  96     846    969   3000   88.4%     chr10  +       128215955  128216073  119
YourSeq  95     847    967   3000   90.0%     chr12  -       51696706   51696828   123
YourSeq  95     846    967   3000   90.0%     chr8   +       28232464   28232583   120
YourSeq  95     847    962   3000   91.4%     chr2   +       136015191  136015306  116
YourSeq  94     847    969   3000   89.5%     chr14  -       16151115   16151236   122
YourSeq  94     845    970   3000   95.4%     chr12  +       84440705   84440837   133
YourSeq  92     847    954   3000   95.2%     chrX   -       137870422  137870529  108
YourSeq  92     847    967   3000   85.5%     chr7   +       63680798   63680915   118
YourSeq  92     847    970   3000   93.2%     chr3   +       90530975   90531097   123
YourSeq  92     850    970   3000   89.9%     chr16  +       88306713   88306834   122

Note: The 3000 bp section upstream of Exon 6 is searched against the genome with BLAT. Apart from the query's own locus on chr2, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM                 STRAND  START      END        SPAN
-------------------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2                  +       155965816  155968815  3000
YourSeq  165    723    1094  3000   93.3%     chr4                  +       86724255   86724962   708
YourSeq  157    708    1031  3000   90.7%     chr11                 -       97259402   97259758   357
YourSeq  156    707    1034  3000   92.1%     chr7                  +       37262131   37262484   354
YourSeq  152    653    1016  3000   92.7%     chr14                 -       52125327   52125852   526
YourSeq  148    860    1030  3000   91.8%     chr2                  -       58874289   58874457   169
YourSeq  147    860    1034  3000   93.7%     chr18                 +       18903470   18903666   197
YourSeq  147    860    1031  3000   92.9%     chr12                 +       75805294   75805473   180
YourSeq  146    860    1034  3000   90.1%     chr19                 -       51071479   51071651   173
YourSeq  146    860    1027  3000   94.7%     chr19                 +       21389088   21389262   175
YourSeq  146    860    1031  3000   91.8%     chr17                 +       32148439   32148609   171
YourSeq  146    709    993   3000   95.1%     chr17                 +       28544712   28545073   362
YourSeq  146    860    1033  3000   92.4%     chr17                 +       25788394   25788578   185
YourSeq  145    727    1030  3000   85.9%     chr11                 +       84501671   84501958   288
YourSeq  144    860    1034  3000   89.4%     chr10                 -       15555495   15555664   170
YourSeq  144    860    1035  3000   90.5%     chr19                 +       21341936   21342108   173
YourSeq  144    868    1032  3000   94.5%     chr17                 +       28310697   28310864   168
YourSeq  143    844    1017  3000   93.4%     chr1_GL456221_random  -       124673     124849     177
YourSeq  143    842    1060  3000   88.2%     chr10                 +       88396386   88396821   436
YourSeq  142    860    1034  3000   89.6%     chr2                  -       142938037  142938202  166

Note: The 3000 bp section downstream of Exon 7 is searched against the genome with BLAT. Apart from the query's own locus on chr2, no significant similarity is found.
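The self-hits of the two BLAT queries give the genomic coordinates of the flanking sections on chr2. As a minimal sketch, assuming the two 3000 bp sections directly abut the cKO region (the report does not state this explicitly), the interval between them can be computed as follows; `gap_between` is an illustrative name.

```python
def gap_between(upstream_end, downstream_start):
    """Return (start, end, size) of the interval lying strictly between two flanks.

    Coordinates are 1-based and inclusive, as in the BLAT tables above.
    """
    start = upstream_end + 1
    end = downstream_start - 1
    return start, end, end - start + 1


# End of the upstream self-hit and start of the downstream self-hit on chr2.
start, end, size = gap_between(155964739, 155965816)
print(f"chr2:{start}-{end} ({size} bp)")
```

The resulting size of 1076 bp agrees with the effective cKO region size given in the strategy note.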


Gene and information:

Cep250 centrosomal protein 250 [Mus musculus (house mouse)]
Gene ID: 16328, updated on 10-Oct-2019

Gene summary

Official Symbol: Cep250 (provided by MGI)
Official Full Name: centrosomal protein 250 (provided by MGI)
Primary source: MGI:MGI:108084
See related: Ensembl:ENSMUSG00000038241
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cep2; Inmp; AW490617; B230210E21Rik
Expression: Ubiquitous expression in CNS E11.5 (RPKM 11.5), limb E14.5 (RPKM 7.8) and 28 other tissues

Genomic context

Location: 2; 2 H1

Exon count: 36

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (155956285..155998900)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (155782294..155824636)

[Figure: genomic context of Cep250 on Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 10 transcripts

Gene: Cep250 ENSMUSG00000038241

Description: centrosomal protein 250 [Source:MGI Symbol;Acc:MGI:108084]
Gene Synonyms: B230210E21Rik, Cep2, Inmp
Location: Chromosome 2: 155,956,458-155,998,900 forward strand. GRCm38:CM000995.2
About this gene: This gene has 10 transcripts (splice variants), 310 orthologues, 4 paralogues, is a member of 1 Ensembl protein family and is associated with 8 phenotypes.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Cep250-201  ENSMUST00000039994.13  7979  2434aa      ENSMUSP00000038255.7  Protein coding           CCDS50773  E9Q5A8   TSL:5, GENCODE basic, APPRIS ALT2
Cep250-204  ENSMUST00000109619.8   7977  2435aa      ENSMUSP00000105248.2  Protein coding           CCDS50772  A3KGJ7   TSL:5, GENCODE basic, APPRIS P4
Cep250-202  ENSMUST00000094421.10  7919  2414aa      ENSMUSP00000091988.4  Protein coding           CCDS50771  Q60952   TSL:5, GENCODE basic, APPRIS ALT2
Cep250-208  ENSMUST00000151569.7   2310  706aa       ENSMUSP00000114426.1  Protein coding           -          B7ZCN3   CDS 3' incomplete, TSL:1
Cep250-203  ENSMUST00000109618.1   1254  338aa       ENSMUSP00000105247.1  Protein coding           -          Q8BZX9   TSL:1, GENCODE basic, APPRIS ALT2
Cep250-210  ENSMUST00000156355.7   672   224aa       ENSMUSP00000122223.1  Protein coding           -          F7BUJ0   CDS 5' and 3' incomplete, TSL:3
Cep250-205  ENSMUST00000128683.1   872   138aa       ENSMUSP00000119845.1  Nonsense mediated decay  -          F6QCB4   CDS 5' incomplete, TSL:5
Cep250-206  ENSMUST00000148191.1   6361  No protein  -                     Retained intron          -          -        TSL:1
Cep250-207  ENSMUST00000149905.7   4930  No protein  -                     Retained intron          -          -        TSL:2
Cep250-209  ENSMUST00000155160.1   352   No protein  -                     Retained intron          -          -        TSL:3


[Figure: Ensembl genome browser view of a 62.44 kb region of chromosome 2 (155.95 Mb to 156.00 Mb), comprehensive gene set.
Forward strand: Cep250 transcripts (protein coding: Cep250-201, -202, -203, -204, -208, -210; retained intron: Cep250-206, -207, -209; nonsense mediated decay: Cep250-205) and neighboring Ergic3 transcripts (protein coding: Ergic3-201, -202, -205; retained intron: Ergic3-206).
Contig: AL833786.8.
Reverse strand: 6430550D23Rik transcripts (protein coding: 6430550D23Rik-201, -202, -205, -206, -207, -208, -209; nonsense mediated decay: 6430550D23Rik-203; retained intron: 6430550D23Rik-204).
Regulation legend: CTCF, open chromatin, promoter, promoter flank, transcription factor binding site.
Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000039994

[Figure: protein feature and variant view of Cep250-201 (protein coding; 42.34 kb on the forward strand), protein ENSMUSP00000038...
Tracks: MobiDB lite, low complexity (Seg), coiled-coils (Ncoils), Pfam PF15035, PANTHER PTHR23159:SF1 and PTHR23159, sequence variants (dbSNP and all other sources).
Variant legend: splice acceptor variant, missense variant, splice region variant, synonymous variant.
Scale bar: 0 to 2434.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
