https://www.alphaknockout.com

Mouse Cped1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cped1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Cped1 gene (NCBI Reference Sequence: NM_001081351; Ensembl: ENSMUSG00000062980) is located on mouse chromosome 6. 22 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 22 (Transcript: ENSMUST00000115383). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cped1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-305J6 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 14.1% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 34206 bp, and the size of intron 3 for 3'-loxP site insertion is 8477 bp. The size of the effective cKO region is ~607 bp. The cKO region does not contain any other known gene.
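The position and frameshift statements in the note can be illustrated with a rough sketch. It assumes the 1026 aa protein length listed for ENSMUST00000115383 in the transcript table, a CDS length of three times that (stop codon excluded), and that the 14.1% figure is rounded. The function names are hypothetical, and the exon 3 length itself is not stated in this report, so the frame check is shown only as a general rule.

```python
# Assumptions (not the vendor's method): protein length taken from the
# transcript table in this report; CDS length excludes the stop codon.
PROTEIN_LENGTH_AA = 1026
EXON3_START_FRACTION = 0.141  # "about 14.1% of the coding region"


def cds_length_nt(protein_length_aa):
    """Coding sequence length in nucleotides, stop codon excluded."""
    return 3 * protein_length_aa


def approx_codon(fraction, protein_length_aa):
    """Approximate 1-based codon in which a given CDS fraction falls."""
    nt_before = round(fraction * cds_length_nt(protein_length_aa))
    return nt_before // 3 + 1


def shifts_frame(deleted_coding_nt):
    """A deletion shifts the reading frame when its coding length
    is not a multiple of three."""
    return deleted_coding_nt % 3 != 0


print(approx_codon(EXON3_START_FRACTION, PROTEIN_LENGTH_AA))
```

Under these assumptions exon 3 begins near codon 145, so a frameshift there would disrupt most of the 1026-residue protein.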


Overview of the Targeting Strategy

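[Figure: schematic of the targeting strategy. Panels, top to bottom: wildtype allele (exons 1, 3 and 22 labelled, with the 5' and 3' gRNA regions flanking exon 3); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Cped1, homology arm, cKO region, loxP site.]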


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence (homologous arms and cKO region) aligned against itself. Legend: Forward, Reverse Complement.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (homologous arms and cKO region).]

Summary: Full Length(7107bp) | A(29.08% 2067) | C(19.88% 1413) | T(30.91% 2197) | G(20.12% 1430)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
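As a cross-check on the base composition summary, the overall GC content can be derived from the four base counts reported above. This is a minimal sketch; the function names are illustrative and not part of any tool used in the report.

```python
# Base counts copied from the summary line (full length 7107 bp).
BASE_COUNTS = {"A": 2067, "C": 1413, "T": 2197, "G": 1430}


def base_percentages(counts):
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_percent(counts):
    """Overall GC content as a percentage of all bases."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


print(sum(BASE_COUNTS.values()))         # 7107
print(base_percentages(BASE_COUNTS))
print(f"{gc_percent(BASE_COUNTS):.2f}")  # 40.00
```

The counts sum to the stated 7107 bp, and the overall GC content comes out at about 40%.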


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr6   +       22048044   22051043   3000
YourSeq  215    1499   2105  3000   82.1%     chr8   +       73018898   73019337   440
YourSeq  205    1508   2103  3000   89.5%     chr10  +       75941864   75942458   595
YourSeq  165    1510   1705  3000   93.7%     chr3   -       58502973   58503171   199
YourSeq  163    1869   2105  3000   87.3%     chr2   -       51237095   51237348   254
YourSeq  163    1512   1697  3000   96.1%     chr11  -       85163310   85163505   196
YourSeq  162    1514   1696  3000   95.1%     chr12  -       116518382  116518573  192
YourSeq  155    1512   1700  3000   92.0%     chr8   -       70536143   70536335   193
YourSeq  155    1868   2105  3000   90.3%     chr12  +       78351586   78351847   262
YourSeq  155    1869   2108  3000   87.7%     chr1   +       81038721   81038977   257
YourSeq  154    1872   2105  3000   88.5%     chr8   +       82988849   82989092   244
YourSeq  153    1867   2105  3000   85.0%     chr12  -       28901820   28902082   263
YourSeq  152    1511   1692  3000   94.8%     chr12  -       100524053  100524243  191
YourSeq  150    1869   2105  3000   90.9%     chr3   +       51076610   51076872   263
YourSeq  149    1869   2105  3000   88.7%     chr2   -       118239620  118239879  260
YourSeq  149    1867   2117  3000   86.1%     chr1   -       186446516  186446984  469
YourSeq  149    1512   1694  3000   91.7%     chr1   -       161199320  161199502  183
YourSeq  144    1865   2105  3000   89.7%     chr13  +       40116239   40116504   266
YourSeq  142    1511   1686  3000   91.0%     chr14  +       21822529   21822701   173
YourSeq  141    1883   2105  3000   90.7%     chr14  -       97263038   97263282   245

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr6   +       22051651   22054650   3000
YourSeq  369    861    1372  3000   95.6%     chr9   -       26304060   26577267   273208
YourSeq  363    857    1374  3000   96.5%     chr3   -       123502364  123502905  542
YourSeq  361    856    1234  3000   98.7%     chr2   +       19290242   19292081   1840
YourSeq  358    859    1234  3000   98.2%     chr2   -       48243673   48244050   378
YourSeq  357    857    1350  3000   92.3%     chr1   +       127283522  127283966  445
YourSeq  352    856    1233  3000   97.1%     chr8   -       126193688  126194066  379
YourSeq  349    856    1234  3000   97.4%     chr1   +       156735107  156735489  383
YourSeq  348    857    1329  3000   95.6%     chr8   -       53615388   53617251   1864
YourSeq  347    856    1233  3000   96.6%     chr17  -       46740521   46741112   592
YourSeq  347    854    1246  3000   95.6%     chr16  -       32042391   32043104   714
YourSeq  345    857    1233  3000   96.6%     chr3   -       8408327    8652278    243952
YourSeq  345    855    1233  3000   96.8%     chr14  -       10820917   10821299   383
YourSeq  344    857    1233  3000   96.6%     chr5   -       124212579  124212959  381
YourSeq  343    857    1233  3000   96.8%     chr19  +       16547616   16547996   381
YourSeq  342    856    1233  3000   96.3%     chr9   -       118804568  118804947  380
YourSeq  341    856    1233  3000   96.0%     chr16  -       45005975   45006357   383
YourSeq  341    857    1239  3000   96.0%     chr1   -       175889708  175890094  387
YourSeq  340    857    1235  3000   96.0%     chr6   -       126222797  126223178  382
YourSeq  340    857    1233  3000   96.5%     chr13  -       20927682   20928061   380

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
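The two self-hits at the top of the BLAT tables (100.0% identity on chr6) give the genomic positions of the upstream and downstream 3000 bp sections, so the interval between them can be compared with the stated cKO region size. The sketch below assumes the coordinates are 1-based and inclusive, which is consistent with the SPAN column reporting 3000 for 22048044 to 22051043. The function names are illustrative.

```python
# chr6 (+ strand) coordinates of the 100% identity self-hits in the
# upstream and downstream BLAT tables of this report.
UP_FLANK = (22048044, 22051043)
DOWN_FLANK = (22051651, 22054650)


def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def region_between(up, down):
    """Interval lying strictly between two non-overlapping flanks."""
    return (up[1] + 1, down[0] - 1)


cko = region_between(UP_FLANK, DOWN_FLANK)
print(cko, span(*cko))  # (22051044, 22051650) 607
```

The resulting 607 bp interval agrees with the effective cKO region size of ~607 bp given in the strategy summary.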


Gene and information: Cped1
cadherin-like and PC-esterase domain containing 1 [Mus musculus (house mouse)]
Gene ID: 214642, updated on 12-Aug-2019

Gene summary

Official Symbol: Cped1 (provided by MGI)
Official Full Name: cadherin-like and PC-esterase domain containing 1 (provided by MGI)
Primary source: MGI:MGI:2444814
See related: Ensembl:ENSMUSG00000062980
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI552584; 6720481P07; A430107O13Rik
Expression: Broad expression in subcutaneous fat pad adult (RPKM 13.7), bladder adult (RPKM 10.2) and 19 other tissues
Orthologs: human

Genomic context

Location: 6; 6 A3.1

Exon count: 24

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (21985710..22256407)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (21935910..22205606)

Chromosome 6 - NC_000072.6


Transcript information: This gene has 8 transcripts

Gene: Cped1 ENSMUSG00000062980

Description: cadherin-like and PC-esterase domain containing 1 [Source:MGI Symbol;Acc:MGI:2444814]
Gene Synonyms: A430107O13Rik
Location: Chromosome 6: 21,985,916-22,256,404 forward strand. GRCm38:CM000999.2
About this gene: This gene has 8 transcripts (splice variants), 180 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Cped1-202  ENSMUST00000115383.8  5690  1026aa      ENSMUSP00000111041.2  Protein coding           CCDS39435  B2RX70   TSL:1, GENCODE basic, APPRIS P2
Cped1-203  ENSMUST00000137437.5  2484  828aa       ENSMUSP00000119808.2  Protein coding           -          E9Q7L8   CDS 5' and 3' incomplete, TSL:1, APPRIS ALT2
Cped1-201  ENSMUST00000115382.7  2002  473aa       ENSMUSP00000111040.1  Protein coding           -          D3YUQ2   TSL:5, GENCODE basic, APPRIS ALT2
Cped1-206  ENSMUST00000153922.7  3520  162aa       ENSMUSP00000138562.1  Nonsense mediated decay  -          S4R2A1   TSL:1
Cped1-207  ENSMUST00000154734.1  643   No protein  -                     Retained intron          -          -        TSL:3
Cped1-204  ENSMUST00000141064.1  1794  No protein  -                     lncRNA                   -          -        TSL:1
Cped1-205  ENSMUST00000151315.4  645   No protein  -                     lncRNA                   -          -        TSL:3
Cped1-208  ENSMUST00000156621.1  399   No protein  -                     lncRNA                   -          -        TSL:5


[Figure: Ensembl genomic view of the Cped1 locus on chromosome 6, 290.49 kb shown, axis ticks at 22.0 Mb, 22.1 Mb and 22.2 Mb. Forward-strand gene tracks (Comprehensive set): Ing3-201 (protein coding); Cped1-201, Cped1-202 and Cped1-203 (protein coding); Cped1-206 (nonsense mediated decay); Cped1-207 (retained intron); Cped1-204, Cped1-205 and Cped1-208 (lncRNA). Contigs: AC117213.3, AC133955.4. Reverse-strand gene track: Gm42573-201 (processed pseudogene). Regulatory Build track legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; RNA gene; processed transcript).]


Transcript: ENSMUST00000115383

[Figure: Ensembl protein view for Cped1-202 (protein coding), 270.49 kb, forward strand. Tracks: ENSMUSP00000111..., Transmembrane heli..., Low complexity (Seg), Pfam (Cadherin-like beta sandwich domain), PANTHER (PTHR14776), sequence variants (dbSNP and all other sources). Variant legend: inframe insertion, missense variant, splice region variant, synonymous variant. Scale bar: 0 to 1026.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
