https://www.alphaknockout.com

Mouse Ccbe1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Ccbe1 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ccbe1 gene (NCBI Reference Sequence: NM_178793; Ensembl: ENSMUSG00000046318) is located on mouse chromosome 18. Eleven exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 11 (Transcript: ENSMUST00000130300). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Ccbe1 gene. To engineer the targeting vector, homology arms and the cKO region will be generated by PCR using BAC clone RP23-138G22 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit prenatal lethality associated with edema and absence of lymphatic vessels.

Exon 3 starts at about 17.65% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 197081 bp, and the size of intron 3 for 3'-loxP site insertion is 9125 bp. The size of the effective cKO region is ~553 bp. The cKO region does not contain any other known gene.
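The frameshift reasoning can be illustrated with a short Python sketch. It assumes the coding region is 408 codons x 3 nt (the protein length listed for ENSMUST00000130300, stop codon excluded). The exon 3 coding length is not stated in this report, so the lengths passed to the frameshift check are hypothetical values for illustration only.

```python
# Minimal sketch; function names and example lengths are illustrative assumptions.
PROTEIN_LENGTH_AA = 408        # protein length listed for ENSMUST00000130300
EXON3_START_FRACTION = 0.1765  # "about 17.65% of the coding region"


def approx_coding_offset(fraction, protein_length_aa):
    """Approximate number of coding nucleotides that precede a given fraction.

    Assumption: coding length = protein length x 3 (stop codon excluded).
    """
    cds_length = protein_length_aa * 3
    return round(fraction * cds_length)


def causes_frameshift(deleted_coding_bp):
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


exon3_offset = approx_coding_offset(EXON3_START_FRACTION, PROTEIN_LENGTH_AA)
print(f"Exon 3 begins after roughly {exon3_offset} coding nucleotides")

# Hypothetical exon lengths, NOT taken from this report:
for length in (99, 100):
    print(length, "bp deleted -> frameshift:", causes_frameshift(length))
```

A deletion whose coding length is a multiple of 3 would leave the downstream frame intact, which is why the exon chosen for a cKO region is normally one whose removal does not preserve the frame.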


Overview of the Targeting Strategy

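[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exons 1, 3 and 11 labeled, with the 5' and 3' gRNA regions flanking exon 3), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Ccbe1; homology arm; cKO region; loxP site.]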
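The step from the targeted allele to the constitutive KO allele can be sketched as a string operation in Python. This is a toy model under stated assumptions: both loxP sites are in the same orientation, the allele is represented as a plain DNA string, and the function name `cre_excise` is hypothetical. It does not represent the actual vector sequence.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele):
    """Toy model of Cre recombination between two directly repeated loxP sites.

    The sequence between the first two loxP sites is removed and a single
    loxP site is left behind. Alleles with fewer than two sites are returned unchanged.
    """
    first = allele.find(LOXP)
    if first == -1:
        return allele
    second = allele.find(LOXP, first + len(LOXP))
    if second == -1:
        return allele
    return allele[: first + len(LOXP)] + allele[second + len(LOXP):]


# Placeholder flanks and "exon" sequence, for illustration only.
targeted = "AAAA" + LOXP + "GGGGCCCC" + LOXP + "TTTT"
knockout = cre_excise(targeted)
print(len(targeted), "->", len(knockout))
```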


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
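A self-comparison of this kind can be sketched in Python as follows. This is a minimal exact-match version with a 10 bp window; the tool used for the report is not named, so the function names and the exact-match criterion are assumptions rather than a description of the actual analysis.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) sets of (i, j) positions where window-sized words match.

    forward: the word at i equals the word at j (i != j, so the main diagonal is skipped).
    reverse: the word at i equals the reverse complement of the word at j.
    """
    seq = seq.upper()
    last_start = len(seq) - window
    index = defaultdict(list)
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = set(), set()
    for i in range(last_start + 1):
        word = seq[i:i + window]
        for j in index[word]:
            if j != i:
                forward.add((i, j))
        for j in index.get(reverse_complement(word), ()):
            reverse.add((i, j))
    return forward, reverse


# Toy sequence with one 10 bp direct repeat (positions 0 and 13).
fwd, rev = self_dot_plot("ACGTTGCAATGGGACGTTGCAAT", window=10)
print(sorted(fwd))
```

Off-diagonal runs of forward matches would indicate tandem or direct repeats; an empty or sparse result corresponds to the "no significant tandem repeat" conclusion in the note above.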

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7053bp) | A(30.13% 2125) | C(19.5% 1375) | T(27.66% 1951) | G(22.71% 1602)
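As a consistency check, the base counts in the summary can be recomputed in a few lines of Python. Only the four counts come from the report; the variable names are illustrative.

```python
# Base counts copied from the summary line above.
counts = {"A": 2125, "C": 1375, "T": 1951, "G": 1602}

total = sum(counts.values())
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)

print("Total length:", total)            # should equal the stated full length
print("Per-base percentages:", percent)
print("Overall GC content (%):", gc_percent)
```

The counts sum to the stated full length of 7053 bp, and the overall GC content works out to roughly 42%.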

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
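A sliding-window GC scan like the one summarized above might look like the following Python sketch. The 300 bp window matches the report; the step size and the 0.70 threshold for "high GC" are hypothetical, since the report does not state the cutoff it used.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window=300, step=1):
    """List of (start, gc_fraction) for each full window along the sequence."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq, window=300, step=1, threshold=0.70):
    """Window start positions whose GC fraction meets a (hypothetical) threshold."""
    return [start for start, gc in windowed_gc(seq, window, step) if gc >= threshold]


# Toy example with a 4 bp window so the result is easy to follow.
toy_profile = windowed_gc("GGCCAATT", window=4, step=2)
print(toy_profile)
```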


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr18  -       66094301   66097300   3000
YourSeq  196    1141   1920  3000   87.7%     chr6   +       54908158   54908961   804
YourSeq  184    980    1797  3000   86.5%     chrX   +       59984275   59985172   898
YourSeq  165    926    1714  3000   86.5%     chr2   -       80769639   80770697   1059
YourSeq  164    1328   2061  3000   88.1%     chr9   +       67854974   67855723   750
YourSeq  154    956    1607  3000   82.8%     chr1   +       128770365  128770996  632
YourSeq  147    1328   1804  3000   83.3%     chr16  +       22174855   22175336   482
YourSeq  140    888    1494  3000   75.9%     chr1   +       178470446  178470975  530
YourSeq  138    1322   1822  3000   85.8%     chr7   +       119482947  119483469  523
YourSeq  132    944    1922  3000   88.0%     chr19  -       23671931   24044890   372960
YourSeq  127    1184   2043  3000   90.5%     chr4   -       32114901   32115932   1032
YourSeq  127    1068   1495  3000   88.9%     chr3   -       99231551   99232017   467
YourSeq  127    1345   1922  3000   79.5%     chr19  -       43865128   43865653   526
YourSeq  125    1038   1785  3000   72.7%     chrX   -       42313602   42314148   547
YourSeq  122    1104   1806  3000   86.4%     chr7   -       37455977   37456682   706
YourSeq  122    1465   2055  3000   87.8%     chr13  -       23824073   23824707   635
YourSeq  122    1162   1802  3000   85.8%     chr6   +       42842745   42843402   658
YourSeq  121    991    1540  3000   87.1%     chr11  +       43371654   43372226   573
YourSeq  119    1328   1797  3000   87.0%     chr10  +       45199634   45200127   494
YourSeq  107    1418   1914  3000   80.9%     chr19  -       13888971   13889477   507

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr18  -       66090748   66093747   3000
YourSeq  343    2072   2519  3000   93.7%     chr14  +       123360324  123360973  650
YourSeq  300    2091   2519  3000   92.8%     chr14  +       123360325  123360985  661
YourSeq  282    2089   2499  3000   92.9%     chr14  +       123360541  123361025  485
YourSeq  276    2087   2521  3000   91.5%     chr7   +       20044344   20044752   409
YourSeq  272    2079   2517  3000   95.2%     chr12  -       7890052    7890572    521
YourSeq  260    2128   2516  3000   94.3%     chr13  -       95829094   95829470   377
YourSeq  238    2201   2521  3000   96.0%     chr14  +       123360389  123360949  561
YourSeq  228    2031   2486  3000   95.1%     chr14  +       123360439  123361004  566
YourSeq  209    2089   2467  3000   89.2%     chr14  +       123360697  123361037  341
YourSeq  195    2109   2492  3000   89.2%     chr17  -       49006434   49006773   340
YourSeq  182    2229   2519  3000   96.1%     chr14  +       123360355  123360771  417
YourSeq  174    2103   2393  3000   89.1%     chr7   +       20044572   20044790   219
YourSeq  173    2132   2487  3000   95.0%     chr14  +       123360388  123361029  642
YourSeq  162    2103   2319  3000   95.7%     chr14  +       123360579  123361009  431
YourSeq  153    2331   2523  3000   97.0%     chr3   +       31465589   31465925   337
YourSeq  146    2299   2519  3000   95.8%     chr7   +       20044422   20044642   221
YourSeq  145    2233   2453  3000   97.5%     chr12  -       7890520    7890746    227
YourSeq  137    2107   2307  3000   88.6%     chr3   +       31465751   31465933   183
YourSeq  127    2295   2518  3000   95.8%     chr14  +       123360351  123360722  372

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
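The top hit in each BLAT table gives the genomic position of the corresponding 3000 bp flank on chr18, and the gap between the two flanks can be compared with the stated cKO region size. The Python sketch below assumes the coordinates are 1-based and inclusive, which is consistent with the SPAN column; the variable and function names are illustrative.

```python
# Top (100% identity) hits from the two BLAT tables, chr18, minus strand.
# Ccbe1 lies on the reverse strand, so the upstream flank has the higher coordinates.
upstream_flank = (66094301, 66097300)
downstream_flank = (66090748, 66093747)


def span(interval):
    """Length of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(lower, upper):
    """Number of bases strictly between two non-overlapping inclusive intervals."""
    return upper[0] - lower[1] - 1


cko_size = gap_between(downstream_flank, upstream_flank)
print("Upstream flank span:", span(upstream_flank))
print("Downstream flank span:", span(downstream_flank))
print("Gap between flanks (bp):", cko_size)
```

The gap between the two flanks comes to 553 bp, in line with the effective cKO region size of ~553 bp given in the strategy summary.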


Gene and information:

Ccbe1 collagen and calcium binding EGF domains 1 [Mus musculus (house mouse)]
Gene ID: 320924, updated on 12-Aug-2019

Gene summary

Official Symbol: Ccbe1 (provided by MGI)
Official Full Name: collagen and calcium binding EGF domains 1 (provided by MGI)
Primary source: MGI:MGI:2445053
See related: Ensembl:ENSMUSG00000046318
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: mKIAA1983; 4933426F18Rik; 9430093N24Rik
Expression: Broad expression in adrenal adult (RPKM 8.2), ovary adult (RPKM 7.1) and 21 other tissues
Orthologs: human, all

Genomic context

Location: 18; 18 E1

Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (66056855..66291838, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (66216510..66451492, complement)

[Figure: genomic context of Ccbe1 on Chromosome 18 - NC_000084.6]


Transcript information: This gene has 7 transcripts

Gene: Ccbe1 ENSMUSG00000046318

Description: collagen and calcium binding EGF domains 1 [Source:MGI Symbol;Acc:MGI:2445053]
Gene Synonyms: 4933426F18Rik, 9430093N24Rik
Location: 66,045,302-66,302,741 reverse strand. GRCm38:CM001011.2
About this gene: This gene has 7 transcripts (splice variants), 180 orthologues, is a member of 1 Ensembl protein family and is associated with 16 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Ccbe1-202  ENSMUST00000130300.2   6572  408aa       ENSMUSP00000117636.1  Protein coding           CCDS29314  Q3MI99      TSL:1, GENCODE basic, APPRIS P1
Ccbe1-205  ENSMUST00000235872.1   2137  114aa       ENSMUSP00000157925.1  Protein coding           -          A0A494BA41  CDS 5' incomplete
Ccbe1-206  ENSMUST00000236402.1   1046  75aa        ENSMUSP00000157390.1  Protein coding           -          A0A494B905  GENCODE basic
Ccbe1-201  ENSMUST00000061103.13  5304  408aa       ENSMUSP00000052011.7  Nonsense mediated decay  -          Q3MI99      TSL:1
Ccbe1-207  ENSMUST00000236848.1   712   93aa        ENSMUSP00000157891.1  Nonsense mediated decay  -          A0A494BA47  -
Ccbe1-203  ENSMUST00000146610.1   3261  No protein  -                     Retained intron          -          -           TSL:1
Ccbe1-204  ENSMUST00000151343.2   1845  No protein  -                     lncRNA                   -          -           TSL:1


[Figure: Ensembl genome browser view of the Ccbe1 locus, 277.44 kb, spanning 66.1 Mb to 66.3 Mb. Forward strand (comprehensive gene set): Gm50142-201, Gm15958-201, Gm15958-202 and Gm15958-203 (all lncRNA). Contigs: AC102243.14, AC157784.2. Reverse strand (comprehensive gene set): Ccbe1-201 (nonsense mediated decay), Ccbe1-202 (protein coding), Ccbe1-203 (retained intron), Ccbe1-204 (lncRNA), Ccbe1-205 (protein coding), Ccbe1-206 (protein coding), Ccbe1-207 (nonsense mediated decay), Gm50143-201 (lncRNA) and Mir694-201 (miRNA). Regulatory Build legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000130300

[Figure: protein domain and variant view of Ccbe1-202 (protein coding; reverse strand, 235.81 kb), protein ENSMUSP00000117... Tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); cleavage site (Sign...); Superfamily SSF57196; SMART (EGF-like domain; EGF-like calcium-binding domain); Pfam (PF14670; collagen triple helix repeat); PROSITE profiles (EGF-like domain); PROSITE patterns (EGF-like calcium-binding, conserved site; EGF-type aspartate/asparagine hydroxylation site; EGF-like, conserved site); PANTHER (PTHR24034; PTHR24034:SF76); Gene3D 2.10.25.10; CDD cd00054; sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant. Scale bar: 0 to 408 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
