http://www.alphaknockout.com/ Mouse Cep192 Knockout Project (CRISPR/Cas9)

Objective: To create a Cep192 knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cep192 gene (NCBI Reference Sequence: NM_027556; Ensembl: ENSMUSG00000024542) is located on mouse chromosome 18. 46 exons are identified, with the ATG start codon in exon 3 and the TAA stop codon in exon 46 (Transcript: ENSMUST00000025425). Exons 6~41 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 6 starts from about 5.87% of the coding region. Exons 6~41 cover 86.66% of the coding region. The size of the effective KO region is ~62487 bp.


Overview of the Targeting Strategy

[Figure: wildtype allele of mouse Cep192, with exon 1, exons 6 to 41 and exon 46 labeled and the 5' and 3' gRNA regions marked. Legend: exon of mouse Cep192; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 6 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
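The self-alignment described in this note can be sketched as follows. This is a minimal illustration that assumes exact matching of 15 bp windows; the tool actually used to draw the dot plots is not named in this report, and the function names here are hypothetical.

```python
from collections import defaultdict

# Translation table for complementing an uppercase DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window matches.

    Returns two lists of (i, j) start positions (0-based):
    forward matches (main diagonal excluded) and reverse-complement matches.
    Assumption: exact matching only; real tools may allow mismatches.
    """
    seq = seq.upper()
    index = defaultdict(list)
    last_start = len(seq) - window + 1
    for i in range(last_start):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start):
        word = seq[i:i + window]
        # Same word elsewhere in the sequence: an off-diagonal dot.
        for j in index[word]:
            if j != i:
                forward.append((i, j))
        # Word whose reverse complement occurs in the sequence.
        for j in index.get(reverse_complement(word), []):
            reverse.append((i, j))
    return forward, reverse
```

A tandem repeat shows up as runs of off-diagonal forward matches parallel to the main diagonal; an empty forward list at a given window size means no exact repeat of that length exists in the sequence.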

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 41 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(29.4% 588) | C(18.55% 371) | G(19.65% 393) | T(32.4% 648)

Note: The 2000 bp section upstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
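As a worked check of the summary line, the overall GC content follows directly from the reported base counts: (371 + 393) / 2000 = 38.2% for the upstream section. The sketch below computes that figure and also shows one way a 300 bp sliding-window profile could be produced. The function names and the step size are assumptions; the report does not say how its plot was generated.

```python
def gc_percent_from_counts(a, c, g, t):
    """Overall GC percentage from per-base counts."""
    total = a + c + g + t
    return 100.0 * (c + g) / total


def sliding_gc(seq, window=300, step=1):
    """GC percentage for each window of the given size.

    Assumption: windows are taken every `step` bases; the step used for
    the plot in this report is not stated.
    """
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Upstream section, counts taken from the summary line above.
upstream_gc = gc_percent_from_counts(a=588, c=371, g=393, t=648)
```

The same function can be applied to the base counts of any other 2000 bp section in this report.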

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(26.4% 528) | C(19.1% 382) | G(20.85% 417) | T(33.65% 673)

Note: The 2000 bp section downstream of Exon 41 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr18  +       67808077   67810076   2000
YourSeq  284    63     1740  2000   90.4%     chr17  +       35444320   35541463   97144
YourSeq  159    1537   1724  2000   91.3%     chr10  -       26750113   26750298   186
YourSeq  152    1542   1724  2000   92.8%     chr14  +       62781447   62781653   207
YourSeq  146    1558   1724  2000   94.0%     chr5   -       122746891  122747058  168
YourSeq  145    1484   1713  2000   85.9%     chr17  -       24410883   24411065   183
YourSeq  145    1542   1724  2000   93.1%     chr16  -       55657907   55658094   188
YourSeq  145    1543   1714  2000   92.9%     chr17  +       21565510   21565685   176
YourSeq  143    1543   1725  2000   87.5%     chr11  +       94302101   94302269   169
YourSeq  142    18     225   2000   84.2%     chr1   -       93743410   93743589   180
YourSeq  141    1560   1714  2000   96.2%     chr4   -       129037328  129037488  161
YourSeq  141    1542   1714  2000   91.3%     chr5   +       142910902  142944531  33630
YourSeq  140    1542   1714  2000   91.3%     chr17  -       40936576   40936750   175
YourSeq  140    1552   1714  2000   93.2%     chr7   +       92209565   92209741   177
YourSeq  138    1562   1714  2000   96.1%     chr5   +       77452845   77453003   159
YourSeq  137    1556   1714  2000   94.9%     chr19  -       41441888   41442047   160
YourSeq  136    44     218   2000   87.2%     chr15  -       73459041   73459196   156
YourSeq  136    1539   1714  2000   87.6%     chr14  +       99414740   99414903   164
YourSeq  136    1564   1723  2000   93.7%     chr1   +       160966887  160967048  162
YourSeq  135    51     225   2000   87.5%     chr9   +       101316351  101316517  167

Note: The 2000 bp section upstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
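Rows of a BLAT result table like the one above can be parsed and screened programmatically. The following is a minimal sketch: the field names, the `BlatHit` type and the rule for identifying the self-match are assumptions made for illustration, and the criterion this report used to judge "significant similarity" is not stated.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row):
    """Parse one whitespace-separated BLAT result row into a BlatHit."""
    fields = row.split()
    # Drop the leading "browser details" link text if it is present.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    return BlatHit(
        fields[0], int(fields[1]), int(fields[2]), int(fields[3]),
        int(fields[4]), float(fields[5].rstrip("%")), fields[6], fields[7],
        int(fields[8]), int(fields[9]), int(fields[10]),
    )


def best_off_target(hits):
    """Highest-scoring hit that is not the perfect full-length self-match.

    Assumption: the self-match is the hit with score == q_size and
    100% identity. Returns None if no other hit exists.
    """
    others = [h for h in hits
              if not (h.score == h.q_size and h.identity == 100.0)]
    return max(others, key=lambda h: h.score, default=None)


rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr18 + 67808077 67810076 2000",
    "YourSeq 284 63 1740 2000 90.4% chr17 + 35444320 35541463 97144",
]
hits = [parse_blat_row(r) for r in rows]
top = best_off_target(hits)
score_fraction = top.score / top.q_size  # share of the query length scored
```

For the upstream section, the best secondary hit scores 284 against a query size of 2000, a score fraction of about 0.14.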

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr18  +       67871559   67873558   2000
YourSeq  306    383    911   2000   94.3%     chr5   -       138015066  138015645  580
YourSeq  305    389    913   2000   94.8%     chr11  -       101588617  101899275  310659
YourSeq  296    391    914   2000   89.2%     chr9   -       119988764  119989133  370
YourSeq  289    399    913   2000   87.5%     chr2   -       77689334   77689774   441
YourSeq  288    390    913   2000   86.2%     chr9   +       63619254   63619622   369
YourSeq  275    392    913   2000   89.4%     chr9   -       21965149   21965499   351
YourSeq  267    375    916   2000   90.4%     chr14  +       64673454   64686895   13442
YourSeq  262    387    914   2000   90.7%     chr1   -       86478221   86478782   562
YourSeq  256    374    913   2000   87.3%     chr11  +       69625325   69625715   391
YourSeq  240    382    858   2000   91.2%     chr7   +       16291500   16292148   649
YourSeq  233    391    842   2000   93.7%     chr7   +       13058361   13059000   640
YourSeq  229    388    898   2000   91.7%     chr10  -       127995666  127996321  656
YourSeq  227    398    910   2000   92.6%     chr13  +       98826207   98826837   631
YourSeq  210    387    907   2000   93.4%     chr19  -       45305201   45305728   528
YourSeq  191    387    601   2000   96.2%     chr2   -       38867992   38868230   239
YourSeq  187    389    761   2000   95.2%     chr8   +       106767748  106768169  422
YourSeq  186    389    597   2000   96.1%     chr6   -       32983227   32983435   209
YourSeq  186    383    580   2000   96.0%     chr17  -       27037424   27037620   197
YourSeq  184    385    580   2000   97.0%     chr7   -       34443435   34443630   196

Note: The 2000 bp section downstream of Exon 41 is BLAT searched against the genome. No significant similarity is found.
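The two 100% self-matches in the BLAT results give genomic coordinates for the flanking sections (chr18:67808077-67810076 upstream, chr18:67871559-67873558 downstream), so the distance between them can be checked against the stated KO region size. The sketch below assumes that each 2000 bp section directly abuts exon 6 or exon 41, and that the coordinates are 1-based and inclusive (consistent with SPAN = END - START + 1 in the tables); neither point is stated explicitly in this report.

```python
def bases_between(left_end, right_start):
    """Number of bases strictly between two 1-based inclusive coordinates."""
    return right_start - left_end - 1


UPSTREAM_END = 67810076      # last base of the upstream 2000 bp section
DOWNSTREAM_START = 67871559  # first base of the downstream 2000 bp section
STATED_KO_SIZE = 62487       # "effective KO region" from the strategy note

# Assumed extent from the start of exon 6 to the end of exon 41.
exon6_to_exon41 = bases_between(UPSTREAM_END, DOWNSTREAM_START)

# Remainder that would have to lie in the flanking sequence.
flank_contribution = STATED_KO_SIZE - exon6_to_exon41
```

Under these assumptions the exon 6 to exon 41 interval is 61482 bp, about 1005 bp shorter than the stated ~62487 bp. That is consistent with cut sites placed within the flanking sections rather than at the exon boundaries, although the report does not give the gRNA positions.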

Gene and information:

Cep192 centrosomal protein 192 [Mus musculus (house mouse)]

Gene ID: 70799, updated on 4-Feb-2021

Gene summary

Official Symbol: Cep192 (provided by MGI)
Official Full Name: centrosomal protein 192 (provided by MGI)
Primary source: MGI:MGI:1918049
See related: Ensembl:ENSMUSG00000024542
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 4631422C13Rik; D430014P18Rik
Expression: Ubiquitous expression in CNS E11.5 (RPKM 7.7), limb E14.5 (RPKM 5.8) and 25 other tissues
Orthologs: human; all

Genomic context

Location: 18; 18 E1-E2

Exon count: 48

Annotation release  Status             Assembly                      Chr  Location
109                 current            GRCm39 (GCF_000001635.27)     18   NC_000084.7 (67933124..68018241)
108.20200622        previous assembly  GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (67800054..67885170)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (67959761..68044824)

Chromosome 18 - NC_000084.7


Transcript information: This gene has 12 transcripts

Gene: Cep192 ENSMUSG00000024542

Description: centrosomal protein 192 [Source:MGI Symbol;Acc:MGI:1918049]
Gene Synonyms: 4631422C13Rik, D430014P18Rik
Location: Chromosome 18: 67,933,177-68,018,241 forward strand. GRCm39:CM001011.3
About this gene: This gene has 12 transcripts (splice variants), 213 orthologues and is associated with 1 phenotype.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt Match  Flags
Cep192-201  ENSMUST00000025425.7  8093  2514aa      ENSMUSP00000025425.6  Protein coding           CCDS50313  E9Q4Y4         TSL:5, GENCODE basic, APPRIS P1
Cep192-208  ENSMUST00000225303.2  4548  1105aa      ENSMUSP00000153461.2  Nonsense mediated decay  -          A0A286YDK4     CDS 5' incomplete
Cep192-206  ENSMUST00000224921.2  3113  No protein  -                     Retained intron          -          -              -
Cep192-207  ENSMUST00000225077.2  2607  No protein  -                     Retained intron          -          -              -
Cep192-204  ENSMUST00000224387.2  1425  No protein  -                     Retained intron          -          -              -
Cep192-205  ENSMUST00000224817.2  680   No protein  -                     Retained intron          -          -              -
Cep192-209  ENSMUST00000225580.2  646   No protein  -                     Retained intron          -          -              -
Cep192-212  ENSMUST00000225681.2  600   No protein  -                     Retained intron          -          -              -
Cep192-210  ENSMUST00000225589.2  524   No protein  -                     Retained intron          -          -              -
Cep192-211  ENSMUST00000225677.2  497   No protein  -                     Retained intron          -          -              -
Cep192-202  ENSMUST00000223571.2  456   No protein  -                     Retained intron          -          -              -
Cep192-203  ENSMUST00000223715.2  323   No protein  -                     Retained intron          -          -              -


[Figure: Ensembl genome browser view of a 105.06 kb region (67.94Mb to 68.02Mb). Forward strand: Seh1l-201 and Seh1l-202 (protein coding), Cep192-201 (protein coding), Cep192-208 (nonsense mediated decay), and Cep192-202 to Cep192-207 and Cep192-209 to Cep192-212 (retained intron). Contig: AC127236.3. Reverse strand: 4930549G23Rik-201 and 4930549G23Rik-202 (lncRNA). Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000025425

[Figure: Cep192-201 (protein coding), 85.06 kb, forward strand, with protein ENSMUSP00000025... feature tracks: MobiDB lite; Low complexity (Seg); PANTHER (Centrosomal protein Spd-2/CEP192); Gene3D (Immunoglobulin-like fold); sequence variants (dbSNP and all other sources). Variant legend: stop gained, missense variant, splice region variant, synonymous variant. Scale bar: 0 to 2514.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
