https://www.alphaknockout.com

Mouse Ccdc70 Knockout Project (CRISPR/Cas9)

Objective: To create a Ccdc70 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ccdc70 gene (NCBI Reference Sequence: NM_026459; Ensembl: ENSMUSG00000017049) is located on mouse chromosome 8. Two exons are identified, with both the ATG start codon and the TAG stop codon in exon 2 (Transcript: ENSMUST00000017193). Exon 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 0.15% of the coding region and covers 100.0% of the coding region. The size of the effective KO region is ~669 bp. The KO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1 and 2 of mouse Ccdc70, the two gRNA regions, and the knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the upstream sequence, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
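The self-alignment described in this note can be illustrated with a minimal sketch. It assumes exact matching of 15 bp windows, which may be stricter than the scoring used by the tool that produced the report's dot plot, and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (assumes only A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window matches.

    Returns two sorted lists of (i, j) window start positions (0-based):
    forward matches with i < j, and reverse-complement matches.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1

    # Index every window by its sequence so repeats share one dictionary key.
    index = {}
    for i in range(n_windows):
        index.setdefault(seq[i:i + window], []).append(i)

    # Forward dots: the same window occurs at two different positions.
    forward = [(i, j) for pos in index.values() for i in pos for j in pos if i < j]

    # Reverse-complement dots: window j matches the reverse complement of window i.
    reverse = []
    for j in range(n_windows):
        rc = reverse_complement(seq[j:j + window])
        for i in index.get(rc, []):
            reverse.append((i, j))

    return sorted(forward), sorted(reverse)
```

Off-diagonal forward dots at a constant offset suggest a tandem repeat with that period, which is the kind of pattern a gRNA site would be placed to avoid.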

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the downstream sequence, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(22.95% 459) | C(25.55% 511) | T(33.35% 667) | G(18.15% 363)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(29.8% 596) | C(19.6% 392) | T(24.25% 485) | G(26.35% 527)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
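The overall GC content of each flank follows directly from the base counts in the two summaries, and the windowed profile can be sketched in the same way. The code below is an illustrative sketch only: the function names and the step size are assumptions, and the report does not state what threshold it treats as a "significant high GC-content region".

```python
def gc_percent_from_counts(a, c, t, g):
    """Overall GC percentage computed from per-base counts."""
    return 100.0 * (g + c) / (a + c + t + g)


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each window; returns (start, gc_percent) pairs."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile


# Base counts copied from the two Summary lines of this report.
upstream_gc = gc_percent_from_counts(a=459, c=511, t=667, g=363)
downstream_gc = gc_percent_from_counts(a=596, c=392, t=485, g=527)

print(round(upstream_gc, 2))    # 43.7
print(round(downstream_gc, 2))  # 45.95
```

Both flanks come out below 50% GC overall, which is consistent with the notes that no high GC-content region was found.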

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   +       21971197   21973196   2000
YourSeq  55     232    286   2000   100.0%    chr15  -       31055375   31055429   55
YourSeq  42     676    734   2000   97.8%     chr18  -       70402674   70402732   59
YourSeq  38     648    694   2000   88.1%     chr15  -       53122222   53122266   45
YourSeq  38     606    648   2000   95.4%     chr8   +       117959005  117959048  44
YourSeq  38     636    690   2000   82.3%     chr14  +       110096179  110096229  51
YourSeq  38     626    679   2000   95.5%     chr11  +       91078027   91078088   62
YourSeq  37     622    675   2000   85.2%     chr16  -       75617684   75617739   56
YourSeq  37     636    690   2000   76.2%     chr15  -       31055843   31055888   46
YourSeq  37     737    811   2000   67.5%     chr14  -       68040365   68040408   44
YourSeq  36     600    660   2000   66.7%     chr1   +       5066486    5066527    42
YourSeq  34     639    683   2000   92.4%     chr15  -       36106341   36106385   45
YourSeq  33     598    692   2000   64.2%     chr11  -       64947092   64947140   49
YourSeq  32     616    660   2000   72.3%     chr6   -       30907886   30907921   36
YourSeq  31     646    703   2000   97.1%     chr11  +       36650562   36650623   62
YourSeq  29     608    660   2000   65.7%     chr16  -       71911695   71911726   32
YourSeq  28     736    784   2000   96.7%     chrX   -       97549226   97549276   51
YourSeq  25     634    659   2000   100.0%    chr12  -       43548555   43548754   200
YourSeq  24     232    255   2000   100.0%    chr12  -       92224059   92224082   24

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   +       21973866   21975865   2000
YourSeq  25     1123   1164  2000   81.0%     chrX   -       39118886   39118928   43
YourSeq  25     345    369   2000   100.0%    chr1   +       73635251   73635275   25
YourSeq  24     1150   1175  2000   88.0%     chr1   +       106619170  106619194  25
YourSeq  22     728    749   2000   100.0%    chr4   -       100448571  100448592  22

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
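The chr8 coordinates of the two full-length BLAT hits can be used to cross-check the ~669 bp KO region size given in the strategy summary. The sketch below assumes the BLAT coordinates are 1-based and inclusive (the first hit spans 21971197 to 21973196 with a reported span of 2000, which fits that reading) and that the KO region is exactly the interval between the two flanks. The function name is hypothetical.

```python
def region_between_flanks(upstream_end, downstream_start):
    """Interval lying strictly between two flanks (1-based, inclusive)."""
    start = upstream_end + 1
    end = downstream_start - 1
    return start, end, end - start + 1


# End of the upstream flank and start of the downstream flank, both on chr8,
# taken from the top hit of each BLAT table in this report.
start, end, size = region_between_flanks(21973196, 21973866)
print(start, end, size)  # 21973197 21973865 669
```

The result agrees with the stated ~669 bp and also equals 223 x 3, the codon count of the 223 aa protein listed in the transcript table. How the report counts the stop codon in that figure is not stated.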

Gene and information:

Ccdc70 coiled-coil domain containing 70 [Mus musculus (house mouse)]
Gene ID: 67929, updated on 24-Oct-2019

Gene summary

Official Symbol: Ccdc70 (provided by MGI)
Official Full Name: coiled-coil domain containing 70 (provided by MGI)
Primary source: MGI:MGI:1915179
See related: Ensembl:ENSMUSG00000017049
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 1700112P19Rik
Expression: Restricted expression toward testis adult (RPKM 22.2)
Orthologs: human; all

Genomic context

Location: 8; 8 A2
Exon count: 3

Annotation release  Status             Assembly                        Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)    8    NC_000074.6 (21969775..21974055)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)      8    NC_000074.5 (23081068..23084513)

Chromosome 8 - NC_000074.6

Transcript information: This gene has 2 transcripts

Gene: Ccdc70 ENSMUSG00000017049

Description: coiled-coil domain containing 70 [Source:MGI Symbol;Acc:MGI:1915179]
Gene Synonyms: 1700112P19Rik
Location: Chromosome 8: 21,969,775-21,974,041 forward strand. GRCm38:CM001001.2
About this gene: This gene has 2 transcripts (splice variants), 88 orthologues, 4 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Ccdc70-201  ENSMUST00000017193.1  1054  223aa    ENSMUSP00000017193.1  Protein coding  CCDS22167  Q9D9B0   TSL:1 GENCODE basic APPRIS P1
Ccdc70-202  ENSMUST00000070649.1  961   223aa    ENSMUSP00000069249.1  Protein coding  CCDS22167  Q9D9B0   TSL:1 GENCODE basic APPRIS P1

[Figure: Ensembl genome browser view of a 24.27 kb region of chromosome 8 (21.96 Mb to 21.98 Mb), showing the protein-coding transcripts Fam90a1a-201, Ccdc70-201 and Ccdc70-202 on the forward strand, contig AL590619.9, and the Regulatory Build track (CTCF, promoter flank). Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding).]

Transcript: ENSMUST00000017193

[Figure: Ensembl view of transcript Ccdc70-201 (protein coding, 3.45 kb, forward strand) and its 223 aa protein, with tracks for MobiDB lite, low complexity (Seg), coiled-coils (Ncoils), PANTHER (PTHR21533:SF22, PTHR21533) and sequence variants from dbSNP and all other sources (missense and synonymous variants).]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
