https://www.alphaknockout.com

Mouse Cep192 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cep192 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cep192 gene (NCBI Reference Sequence: NM_027556; Ensembl: ENSMUSG00000024542) is located on mouse chromosome 18. 46 exons are identified, with the ATG start codon in exon 3 and the TAA stop codon in exon 46 (Transcript: ENSMUST00000025425). Exon 5 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cep192 gene.

To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-129C9 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 3.9% of the coding region. Knockout of exon 5 will result in a frameshift of the gene. The size of intron 4, used for 5'-loxP site insertion, is 2762 bp, and the size of intron 5, used for 3'-loxP site insertion, is 2670 bp. The size of the effective cKO region is ~649 bp. The cKO region does not contain any other known gene.
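As a rough check on the note above, the 3.9% figure can be converted into an approximate position in the coding sequence. The sketch below assumes the 2514 aa protein length listed for ENSMUST00000025425 in the transcript table of this report and adds one codon for the stop. The length of exon 5 itself is not stated in this report, so the frameshift helper only illustrates the general rule (a deletion shifts the reading frame when the number of coding bases removed is not a multiple of 3) and is not a calculation for this exon. All function names are hypothetical.

```python
PROTEIN_LENGTH_AA = 2514   # from the transcript table (Cep192-201)
FRACTION_INTO_CDS = 0.039  # "about 3.9% of the coding region"


def cds_length_nt(protein_length_aa: int) -> int:
    """Coding sequence length in nucleotides, including the stop codon."""
    return (protein_length_aa + 1) * 3


def approx_cds_offset(protein_length_aa: int, fraction: float) -> int:
    """Approximate nucleotide offset into the CDS for a fractional position."""
    return round(cds_length_nt(protein_length_aa) * fraction)


def deletion_shifts_frame(deleted_coding_nt: int) -> bool:
    """General rule: removing a coding length not divisible by 3 shifts the frame."""
    return deleted_coding_nt % 3 != 0


if __name__ == "__main__":
    offset = approx_cds_offset(PROTEIN_LENGTH_AA, FRACTION_INTO_CDS)
    print("CDS length (nt):", cds_length_nt(PROTEIN_LENGTH_AA))
    print("Approximate start of exon 5 in the CDS (nt):", offset)
```

Because the 3.9% value is itself rounded, the resulting offset is only an estimate of where exon 5 begins in the coding sequence.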


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1 to 46 of mouse Cep192, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Cep192, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to detect tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
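The self-alignment described in the note can be illustrated with a minimal word-match dot plot. This is a sketch only: the tool actually used for this report is not named, the function names are hypothetical, and the toy sequences are not taken from the Cep192 locus. A point (i, j) is recorded when the window starting at i equals the window starting at j. Points off the main diagonal indicate repeated segments, and matches against the reverse complement indicate inverted repeats.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def dot_plot_matches(seq_a: str, seq_b: str, window: int = 10):
    """Return (i, j) pairs where seq_a[i:i+window] == seq_b[j:j+window]."""
    index = {}
    for j in range(len(seq_b) - window + 1):
        index.setdefault(seq_b[j:j + window], []).append(j)
    matches = []
    for i in range(len(seq_a) - window + 1):
        for j in index.get(seq_a[i:i + window], []):
            matches.append((i, j))
    return matches


def off_diagonal(matches):
    """Drop the trivial self-matches on the main diagonal (i == j)."""
    return [(i, j) for i, j in matches if i != j]


if __name__ == "__main__":
    tandem = "GATTACAGATTACA"   # toy sequence with one tandem repeat
    unique = "ACGTTGCAAG"       # toy sequence without repeated 4-mers
    print(off_diagonal(dot_plot_matches(tandem, tandem, window=7)))
    print(off_diagonal(dot_plot_matches(unique, unique, window=4)))
    # Inverted repeats: compare the sequence with its reverse complement.
    print(dot_plot_matches(unique, reverse_complement(unique), window=4))
```

With the 10 bp window used in the report, the same functions would be called with `window=10` on the 7149 bp sequence of the homologous arms and cKO region.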

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length 7149 bp | A: 2031 (28.41%) | C: 1270 (17.76%) | T: 2311 (32.33%) | G: 1537 (21.50%)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
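The overall GC content implied by the base counts in the summary can be worked out directly, and the windowed analysis can be sketched in the same way. The base counts below are taken from this report. The function names are hypothetical, and the report does not state what threshold counts as a "significant" high-GC region, so none is applied here.

```python
# Base counts from the summary line of this report (full length 7149 bp).
BASE_COUNTS = {"A": 2031, "C": 1270, "T": 2311, "G": 1537}


def gc_fraction_from_counts(counts: dict) -> float:
    """Overall GC fraction from per-base counts."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each window of the given size along the sequence."""
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


if __name__ == "__main__":
    print("Total bases:", sum(BASE_COUNTS.values()))
    print("Overall GC %:", round(100 * gc_fraction_from_counts(BASE_COUNTS), 2))
    # Toy sequence only; the report uses a 300 bp window on the real sequence.
    print(windowed_gc("GGCCAATT", window=4, step=2))
```

The counts sum to the stated full length of 7149 bp, and the overall GC content comes to roughly 39%, which is consistent with the note that no high-GC region was found.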


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr18  +       67804008   67807007   3000
YourSeq  357    1373    2236  3000   91.8%     chr10  +       68413503   68414075   573
YourSeq  354    1358    2245  3000   90.8%     chr5   +       144090454  144091143  690
YourSeq  350    1853    2242  3000   95.9%     chr9   +       3800059    3800570    512
YourSeq  349    1853    2247  3000   97.4%     chr9   -       39960989   39961393   405
YourSeq  349    1382    2246  3000   91.2%     chr8   -       103563513  103564066  554
YourSeq  349    1853    2245  3000   96.0%     chr3   +       33276412   33276803   392
YourSeq  347    1853    2353  3000   93.1%     chr16  +       88410304   88410724   421
YourSeq  346    1853    2247  3000   94.7%     chr3   -       156312381  156312870  490
YourSeq  346    1853    2246  3000   94.9%     chr18  -       18896149   18896521   373
YourSeq  345    1853    2245  3000   94.4%     chr17  -       60842994   60843431   438
YourSeq  342    1853    2244  3000   95.7%     chr18  +       58429746   58430139   394
YourSeq  341    1853    2247  3000   96.0%     chr16  -       65032779   65033217   439
YourSeq  340    1853    2246  3000   95.9%     chrX   -       163774878  163775269  392
YourSeq  340    1853    2236  3000   95.0%     chr1   +       123329743  123330170  428
YourSeq  339    1380    2233  3000   95.5%     chr1   +       191384896  191385874  979
YourSeq  338    1853    2247  3000   95.2%     chr17  -       83346246   83346639   394
YourSeq  338    1853    2247  3000   94.1%     chr16  +       97721405   97721795   391
YourSeq  338    1853    2258  3000   95.0%     chr15  +       56876422   56877076   655
YourSeq  338    1851    2233  3000   95.0%     chr12  +       110305115  110308306  3192

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. Apart from the expected match to the Cep192 locus itself on chr18, no significant similarity is found.
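One way to read a BLAT result table like the one above is to separate the expected self-match from all other hits and compare the best remaining score with the query size. The sketch below does this for the first three upstream rows, copied from this report. The type and function names are hypothetical, and the report does not state the cutoff used to call a hit "significant", so the score-to-query-size ratio is shown only as an illustration.

```python
from typing import NamedTuple

# First three rows of the upstream BLAT table, copied from this report.
ROWS = """\
YourSeq 3000 1 3000 3000 100.0% chr18 + 67804008 67807007 3000
YourSeq 357 1373 2236 3000 91.8% chr10 + 68413503 68414075 573
YourSeq 354 1358 2245 3000 90.8% chr5 + 144090454 144091143 690
"""


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated row; the first field is the query name."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_match(hit: BlatHit) -> bool:
    """Full-length, 100% identical hit: the query aligned to its own locus."""
    return hit.identity == 100.0 and hit.q_start == 1 and hit.q_end == hit.q_size


hits = [parse_hit(line) for line in ROWS.strip().splitlines()]
off_target = [h for h in hits if not is_self_match(h)]
best = max(off_target, key=lambda h: h.score)

if __name__ == "__main__":
    print(best.chrom, best.score, round(best.score / best.q_size, 3))
```

For these rows the best hit other than the self-match scores 357 against a 3000 bp query, a small fraction of the full-length score.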

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr18  +       67807657   67810656   3000
YourSeq  470    1       2144  3000   93.5%     chr5   -       122746891  123348316  601426
YourSeq  221    44      645   3000   85.7%     chr1   -       93743410   93743907   498
YourSeq  214    1       613   3000   90.8%     chr3   -       104543211  104543848  638
YourSeq  209    5       618   3000   84.5%     chr4   +       135848053  135848319  267
YourSeq  199    7       624   3000   82.3%     chr10  -       59245339   59245857   519
YourSeq  163    1       572   3000   84.7%     chr11  -       107187593  107187936  344
YourSeq  156    33      616   3000   92.0%     chr2   -       121260561  121261146  586
YourSeq  154    1957    2134  3000   93.1%     chr10  -       26750122   26750298   177
YourSeq  153    1       552   3000   80.9%     chr11  -       106064239  106064430  192
YourSeq  146    1       485   3000   84.1%     chr10  +       61093183   61093515   333
YourSeq  145    1       501   3000   84.1%     chr9   -       35196403   35196582   180
YourSeq  145    1963    2134  3000   92.9%     chr17  +       21565510   21565685   176
YourSeq  143    1963    2145  3000   87.5%     chr11  +       94302101   94302269   169
YourSeq  141    1980    2134  3000   96.2%     chr4   -       129037328  129037488  161
YourSeq  140    1       581   3000   86.5%     chr12  +       4115453    4115909    457
YourSeq  138    1       503   3000   82.6%     chr14  +       76039881   76040090   210
YourSeq  136    1959    2134  3000   87.6%     chr14  +       99414740   99414903   164
YourSeq  135    1       524   3000   93.6%     chrX   -       144253511  144254129  619
YourSeq  135    471     645   3000   87.5%     chr9   +       101316351  101316517  167

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. Apart from the expected match to the Cep192 locus itself on chr18, no significant similarity is found.
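The chr18 self-matches in the two BLAT tables also allow a quick consistency check on the stated cKO region size. The sketch below takes the upstream and downstream self-match coordinates from this report and computes the number of bases lying between them, assuming the coordinates are 1-based and inclusive (which the reported 3000 bp spans suggest). The function names are hypothetical.

```python
# chr18 self-match coordinates from the two BLAT tables in this report.
UPSTREAM = ("chr18", 67804008, 67807007)
DOWNSTREAM = ("chr18", 67807657, 67810656)


def interval_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(left: tuple, right: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    if left[0] != right[0]:
        raise ValueError("intervals are on different chromosomes")
    return right[1] - left[2] - 1


if __name__ == "__main__":
    print("Upstream length:", interval_length(UPSTREAM[1], UPSTREAM[2]))
    print("Downstream length:", interval_length(DOWNSTREAM[1], DOWNSTREAM[2]))
    print("Bases between the two windows:", gap_between(UPSTREAM, DOWNSTREAM))
```

The gap between the two 3000 bp windows is 649 bp, which agrees with the effective cKO region size of ~649 bp given in the strategy note.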


Gene information: Cep192 centrosomal protein 192 [Mus musculus (house mouse)]
Gene ID: 70799, updated on 12-Aug-2019

Gene summary

Official Symbol: Cep192 (provided by MGI)
Official Full Name: centrosomal protein 192 (provided by MGI)
Primary source: MGI:MGI:1918049
See related: Ensembl:ENSMUSG00000024542
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 4631422C13Rik; D430014P18Rik
Expression: Ubiquitous expression in CNS E11.5 (RPKM 7.7), limb E14.5 (RPKM 5.8) and 25 other tissues
Orthologs: human

Genomic context

Location: 18; 18 E1

Exon count: 48

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (67800054..67885170)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (67959761..68044824)

Chromosome 18 - NC_000084.6


Transcript information: This gene has 12 transcripts

Gene: Cep192 ENSMUSG00000024542

Description: centrosomal protein 192 [Source:MGI Symbol;Acc:MGI:1918049]
Gene Synonyms: 4631422C13Rik, D430014P18Rik
Location: Chromosome 18: 67,800,107-67,885,170 forward strand. GRCm38:CM001011.2
About this gene: This gene has 12 transcripts (splice variants), 210 orthologues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Cep192-201  ENSMUST00000025425.6  8093  2514aa      ENSMUSP00000025425.5  Protein coding           CCDS50313  E9Q4Y4      TSL:5, GENCODE basic, APPRIS P1
Cep192-208  ENSMUST00000225303.1  4548  1105aa      ENSMUSP00000153461.1  Nonsense mediated decay  -          A0A286YDK4  CDS 5' incomplete
Cep192-206  ENSMUST00000224921.1  3113  No protein  -                     Retained intron          -          -           -
Cep192-207  ENSMUST00000225077.1  2607  No protein  -                     Retained intron          -          -           -
Cep192-204  ENSMUST00000224387.1  1425  No protein  -                     Retained intron          -          -           -
Cep192-205  ENSMUST00000224817.1  680   No protein  -                     Retained intron          -          -           -
Cep192-209  ENSMUST00000225580.1  646   No protein  -                     Retained intron          -          -           -
Cep192-212  ENSMUST00000225681.1  600   No protein  -                     Retained intron          -          -           -
Cep192-210  ENSMUST00000225589.1  524   No protein  -                     Retained intron          -          -           -
Cep192-211  ENSMUST00000225677.1  497   No protein  -                     Retained intron          -          -           -
Cep192-202  ENSMUST00000223571.1  456   No protein  -                     Retained intron          -          -           -
Cep192-203  ENSMUST00000223715.1  323   No protein  -                     Retained intron          -          -           -


[Figure: Ensembl genome browser view of the Cep192 locus, 105.06 kb window from about 67.80 Mb to 67.88 Mb, forward and reverse strands. Forward-strand tracks: Cep192-201 (protein coding), Cep192-208 (nonsense mediated decay), Cep192-202 to Cep192-207 and Cep192-209 to Cep192-212 (retained intron), Seh1l-201 and Seh1l-202 (protein coding). Contigs: AC108434.12, AC127236.3. Reverse-strand genes: 4930549G23Rik-201 and 4930549G23Rik-202 (lncRNA). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000025425

[Figure: protein view of Cep192-201 (protein coding; 85.06 kb, forward strand) on a scale of 0 to 2514 aa. Tracks: MobiDB lite; low complexity (Seg); PANTHER (centrosomal protein Spd-2/CEP192); Gene3D (immunoglobulin-like fold); sequence variants (dbSNP and all other sources). Variant legend: stop gained, missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
