https://www.alphaknockout.com

Mouse Ccdc6 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Ccdc6 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ccdc6 gene (NCBI Reference Sequence: NM_001111121; Ensembl: ENSMUSG00000048701) is located on mouse chromosome 10. Nine exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 9 (Transcript: ENSMUST00000147545). Exon 5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Ccdc6 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-24G13 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 47.33% of the coding region. Knockout of exon 5 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion is 1986 bp, and the size of intron 5 for 3'-loxP site insertion is 5758 bp. The size of the effective cKO region is ~661 bp. The cKO region does not contain any other known gene.
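The frameshift statement rests on simple reading-frame arithmetic: removing a stretch of coding sequence whose length is not a multiple of 3 shifts the frame of everything downstream. The sketch below illustrates that check only. The function name and the example lengths are hypothetical, because this report does not state the coding length of exon 5.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True if deleting this many coding bases shifts the reading frame."""
    if deleted_coding_bp < 0:
        raise ValueError("length must be non-negative")
    # A codon is 3 bases, so only deletions that are a multiple of 3 keep the frame.
    return deleted_coding_bp % 3 != 0


# Hypothetical exon lengths, for illustration only (not taken from this report).
for length in (150, 151):
    print(length, "bp deleted -> frameshift:", causes_frameshift(length))
```

In practice the length to test is the coding portion of the deleted exon, not the ~661 bp cKO region, which also includes flanking intronic sequence.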


Overview of the Targeting Strategy

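[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exons 1, 4, 5 and 9 labelled, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Ccdc6, homology arm, cKO region, loxP site.]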


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
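The self-alignment described in the note can be approximated with an exact-match dot plot. The following is a minimal sketch, not the tool used to produce the figure: it reports every pair of positions whose 10 bp windows are identical on the forward strand only (reverse-complement matches are not handled). The function name and the toy sequence are assumptions for illustration.

```python
def self_dot_plot(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, where seq[i:i+window] == seq[j:j+window]."""
    seq = seq.upper()
    positions = {}
    # Index every window by its sequence so identical windows share one bucket.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy sequence (hypothetical): a 10 bp unit repeated after a 5 bp spacer.
toy = "ACGTACGTAC" + "TTTTT" + "ACGTACGTAC"
hits = self_dot_plot(toy, window=10)
print(hits)
```

Off-diagonal hits that line up in runs correspond to the repeats that can complicate vector construction.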

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7161bp) | A(23.74% 1700) | C(23.47% 1681) | T(28.07% 2010) | G(24.72% 1770)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
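The GC analysis can be sketched as a sliding-window calculation. The example below is an illustration under stated assumptions: the function name, the step parameter and the short test sequence are hypothetical, while the base counts are those reported in the summary line above.

```python
def gc_windows(seq: str, window: int = 300, step: int = 1):
    """Return (start, gc_fraction) for each window of the given size."""
    seq = seq.upper()
    out = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        out.append((start, gc / window))
    return out


# Overall GC fraction from the base counts reported in the summary.
counts = {"A": 1700, "C": 1681, "T": 2010, "G": 1770}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, overall GC = {overall_gc:.2%}")

# Hypothetical short sequence, just to show the window output format.
print(gc_windows("GGCCAATT", window=4, step=4))
```

The counts sum to the stated full length of 7161 bp, and the overall GC fraction comes out at roughly 48%, consistent with the note that no high-GC region stands out.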


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr10  +        70165843   70168842  3000
YourSeq    149   1268   1568   3000     92.6%  chr15  +        79017819   79018268   450
YourSeq    147   1384   1568   3000     88.3%  chr1   +        13424262   13424441   180
YourSeq    146   1384   1567   3000     88.2%  chr16  -        21970222   21970400   179
YourSeq    143   1384   1568   3000     85.9%  chr7   -       141105606  141105782   177
YourSeq    143   1384   1568   3000     85.9%  chr13  +        59699798   59699974   177
YourSeq    141   1389   1568   3000     86.8%  chr16  -        45078456   45078628   173
YourSeq    141   1390   1569   3000     86.8%  chr1   -       155759704  155759876   173
YourSeq    140   1389   1572   3000     89.1%  chr19  -         8668805    8668994   190
YourSeq    139   1389   1570   3000     85.8%  chr3   -        83229615   83229789   175
YourSeq    138   1263   1534   3000     92.0%  chr4   -        57919206   57919587   382
YourSeq    138   1389   1568   3000     85.5%  chr17  +        47237845   47238016   172
YourSeq    137   1389   1569   3000     88.7%  chr6   -        72089316   72089486   171
YourSeq    137   1389   1585   3000     85.0%  chr5   +       103307730  103307918   189
YourSeq    137   1388   1565   3000     89.2%  chr2   +       154649547  154649727   181
YourSeq    136   1389   1567   3000     86.8%  chr9   -        58508908   58509083   176
YourSeq    136   1384   1582   3000     84.3%  chr7   -         6205034    6205215   182
YourSeq    136   1389   1569   3000     85.1%  chr13  -       100537816  100537989   174
YourSeq    136   1389   1568   3000     88.7%  chr1   -       131198009  131198178   170
YourSeq    136   1389   1569   3000     85.1%  chr17  +        83579253   83579426   174

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr10  +        70169504   70172503  3000
YourSeq    141    965   1300   3000     88.1%  chr5   +       128395723  128396115   393
YourSeq    138    969   1304   3000     87.9%  chr9   +        72938619   72939007   389
YourSeq    133    965   1233   3000     89.8%  chr12  +        76201910   76202239   330
YourSeq    130    968   1222   3000     85.6%  chr8   -        23294822   23295119   298
YourSeq    128    965   1247   3000     85.4%  chr16  -        22637541   22637874   334
YourSeq    127    938   1306   3000     90.0%  chr18  +        71183827   71184267   441
YourSeq    126    965   1226   3000     88.9%  chr5   +       105641006  105641340   335
YourSeq    122    968   1304   3000     88.2%  chr4   +         8934388    8934772   385
YourSeq    121    945   1199   3000     89.6%  chr8   -       111751714  111752015   302
YourSeq    119    965   1304   3000     91.1%  chr6   -        67694082   67694446   365
YourSeq    118    965   1267   3000     84.2%  chr6   -       143370485  143371001   517
YourSeq    116    918   1223   3000     84.5%  chr11  -        31600274   31600624   351
YourSeq    115    918   1176   3000     82.6%  chr8   -        78764067   78764370   304
YourSeq    115    965   1217   3000     92.7%  chr10  +        42092971   42093266   296
YourSeq    113    946   1309   3000     90.8%  chr1   -       155066839  155067239   401
YourSeq    111    977   1304   3000     86.3%  chr14  -        62219175   62219720   546
YourSeq    110    965   1267   3000     86.9%  chr6   +       108107235  108107586   352
YourSeq    108    965   1301   3000     88.0%  chr6   -        65797359   65797716   358
YourSeq    107    968   1304   3000     88.0%  chr9   -        49889011   49889373   363

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
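As a consistency check, the 100%-identity chr10 hits in the two BLAT tables give the genomic positions of the upstream and downstream 3000 bp sections, and the gap between them should match the stated size of the cKO region. The short sketch below does that arithmetic. It assumes the displayed coordinates are 1-based and inclusive, which agrees with the SPAN column. The variable and function names are illustrative.

```python
# Top chr10 hits copied from the BLAT tables (upstream and downstream sections).
upstream = (70165843, 70168842)
downstream = (70169504, 70172503)


def span(region):
    """Length of a 1-based, inclusive (start, end) interval."""
    start, end = region
    return end - start + 1


# Bases lying strictly between the end of the upstream section
# and the start of the downstream section.
gap = downstream[0] - upstream[1] - 1

print("upstream span:", span(upstream))
print("downstream span:", span(downstream))
print("gap between sections:", gap)
```

The gap comes out at 661 bp, in line with the ~661 bp effective cKO region given in the strategy notes.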


Gene and information: Ccdc6 coiled-coil domain containing 6 [Mus musculus (house mouse)]
Gene ID: 76551, updated on 12-Aug-2019

Gene summary

Official Symbol: Ccdc6 (provided by MGI)
Official Full Name: coiled-coil domain containing 6 (provided by MGI)
Primary source: MGI:MGI:1923801
See related: Ensembl:ENSMUSG00000048701
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AA536681; AA960498; AW061011; 2810012H18Rik
Expression: Ubiquitous expression in ovary adult (RPKM 13.7), bladder adult (RPKM 12.8) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 10; 10 B5.3

Exon count: 11

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (70096878..70193200)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (69559869..69655948)

[Figure: genomic context of Ccdc6 on Chromosome 10 - NC_000076.6.]


Transcript information: This gene has 4 transcripts

Gene: Ccdc6 ENSMUSG00000048701

Description: coiled-coil domain containing 6 [Source:MGI Symbol;Acc:MGI:1923801]
Gene Synonyms: 2810012H18Rik
Location: 70,097,121-70,193,200 forward strand. GRCm38:CM001003.2
About this gene: This gene has 4 transcripts (splice variants), 210 orthologues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Ccdc6-203  ENSMUST00000147545.7  5511  469aa       ENSMUSP00000123374.1  Protein coding   CCDS48591  D3YZP9   TSL:5 GENCODE basic APPRIS P1
Ccdc6-201  ENSMUST00000135607.1  719   136aa       ENSMUSP00000116408.1  Protein coding   -          F7B4D5   CDS 5' incomplete TSL:2
Ccdc6-204  ENSMUST00000156001.7  630   137aa       ENSMUSP00000115678.1  Protein coding   -          F6SXB0   CDS 5' incomplete TSL:3
Ccdc6-202  ENSMUST00000145990.1  706   No protein  -                     Retained intron  -          -        TSL:3

[Figure: Ensembl genome browser view of the Ccdc6 locus (116.08 kb, forward strand, 70.10 Mb to 70.20 Mb) showing transcripts Ccdc6-203 (protein coding), Ccdc6-202 (retained intron), Ccdc6-204 (protein coding) and Ccdc6-201 (protein coding), contig AC132435.3, and the Regulatory Build track. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000147545

[Figure: protein view of Ccdc6-203 (protein coding; 96.08 kb, forward strand) for ENSMUSP00000123... Tracks: MobiDB lite, low complexity (Seg), coiled-coils (Ncoils), Pfam (Protein of unknown function DUF2046), PANTHER (Protein of unknown function DUF2046), and sequence variants (dbSNP and all other sources). Variant legend: inframe insertion, splice region variant, synonymous variant. Scale bar: 0 to 469.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
