https://www.alphaknockout.com

Mouse Ccdc113 Knockout Project (CRISPR/Cas9)

Objective: To create a Ccdc113 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ccdc113 gene (NCBI Reference Sequence: NM_172914; Ensembl: ENSMUSG00000036598) is located on mouse chromosome 8. Nine exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 9 (Transcript: ENSMUST00000041569). Exons 2~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 9.02% of the coding region. Exons 2~6 cover 60.3% of the coding region. The size of the effective KO region is ~9585 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons labeled 1, 2, 3, 4, 5, 6 and 9 and two gRNA regions marked. Legend: exon of mouse Ccdc113; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
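The self-alignment described in the note can be approximated with a short script. This is only a minimal sketch, not the tool used for the report: it assumes exact matching of 15 bp windows, and the function names and the toy sequence are hypothetical. Each match between two different positions is a dot off the main diagonal; many dots at one offset suggest a tandem repeat with that period.

```python
from collections import defaultdict


def window_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    seq = seq.upper()
    starts_by_word = defaultdict(list)
    for i in range(len(seq) - window + 1):
        starts_by_word[seq[i:i + window]].append(i)
    matches = []
    for starts in starts_by_word.values():
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                matches.append((i, j))
    return sorted(matches)


def offset_counts(matches):
    """Count matches per diagonal offset (j - i) of the dot plot."""
    counts = defaultdict(int)
    for i, j in matches:
        counts[j - i] += 1
    return dict(counts)


# Hypothetical toy sequence: an 8 bp unit repeated four times between unique flanks.
unit = "ACGTTGCA"
toy = "GATTACAGGC" + unit * 4 + "TTCCGGAATC"
counts = offset_counts(window_matches(toy, window=15))
print(counts)
```

On the toy sequence the most populated offset equals the length of the repeated unit. A real 2000 bp flank would be passed in place of `toy`, and a gRNA site would then be chosen away from the positions that appear in the matches.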

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up) Window size: 300 bp


Summary: Full Length(2000bp) | A(29.6% 592) | C(20.1% 402) | T(26.95% 539) | G(23.35% 467)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp


Summary: Full Length(2000bp) | A(26.95% 539) | C(26.3% 526) | T(23.1% 462) | G(23.65% 473)

Note: The 2000 bp section downstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
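The overall GC content of each flank follows directly from the base counts in the two Summary lines. The sketch below recomputes it and also shows one plausible way to scan a sequence with a sliding window. The function names, the step size and the toy sequence are assumptions for illustration; the report does not state how its plot was produced.

```python
def gc_percent_from_counts(a, c, t, g):
    """GC percentage from per-base counts."""
    total = a + c + t + g
    return 100.0 * (g + c) / total


def sliding_gc(seq, window=300, step=50):
    """Return (start, GC %) for each full window along seq."""
    seq = seq.upper()
    return [
        (i, 100.0 * sum(base in "GC" for base in seq[i:i + window]) / window)
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the Summary lines of the report.
upstream_gc = gc_percent_from_counts(a=592, c=402, t=539, g=467)
downstream_gc = gc_percent_from_counts(a=539, c=526, t=462, g=473)
print(round(upstream_gc, 2), round(downstream_gc, 2))

# Hypothetical toy sequence, only to show the windowed scan.
print(sliding_gc("GC" * 5 + "AT" * 5, window=10, step=10))
```

Both flanks come out below 50% GC overall (43.45% upstream, 49.95% downstream), in line with the notes that no significant high GC-content region is found.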


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   +       95534466   95536465   2000
YourSeq  124    1017   1339  2000   87.2%     chr18  +       42187977   42188321   345
YourSeq  120    1020   1427  2000   91.1%     chr11  +       53611039   53611513   475
YourSeq  106    838    1276  2000   74.5%     chr5   -       96788101   96788323   223
YourSeq  106    1015   1284  2000   85.7%     chr11  -       60988211   60988482   272
YourSeq  104    1020   1307  2000   83.9%     chr18  -       38410257   38410537   281
YourSeq  98     1017   1273  2000   87.2%     chr8   +       94044604   94044863   260
YourSeq  97     1200   1426  2000   89.5%     chr11  +       65611718   65612110   393
YourSeq  93     1020   1281  2000   90.4%     chr9   +       65050606   65050885   280
YourSeq  93     1020   1283  2000   90.5%     chr18  +       75063822   75064112   291
YourSeq  93     1040   1329  2000   78.4%     chr16  +       5230055    5230337    283
YourSeq  88     1120   1342  2000   90.2%     chr14  +       62268646   62268879   234
YourSeq  88     1072   1281  2000   80.0%     chr13  +       47699504   47699701   198
YourSeq  88     1200   1322  2000   86.2%     chr12  +       3934512    3934636    125
YourSeq  87     1015   1342  2000   86.5%     chr3   -       101196536  101196907  372
YourSeq  87     1199   1339  2000   79.9%     chr7   +       127316909  127317048  140
YourSeq  87     1120   1281  2000   92.4%     chr11  +       63478045   63478212   168
YourSeq  86     1020   1286  2000   89.2%     chr7   +       40780785   40781053   269
YourSeq  86     1136   1284  2000   83.2%     chr18  +       57329166   57329309   144
YourSeq  85     1017   1286  2000   87.2%     chr8   -       95588496   95588764   269

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   +       95546051   95548050   2000
YourSeq  61     890    1469  2000   94.2%     chr4   +       47377818   47467191   89374
YourSeq  50     841    951   2000   71.3%     chr13  -       117327701  117327779  79
YourSeq  46     870    931   2000   94.3%     chr8   -       3366713    3366776    64
YourSeq  45     763    817   2000   94.5%     chr3   -       35143403   35143481   79
YourSeq  43     879    946   2000   90.4%     chr5   -       31729213   31729281   69
YourSeq  42     758    931   2000   61.6%     chr1   -       9568754    9568839    86
YourSeq  41     842    1003  2000   70.8%     chr10  +       127917634  127917806  173
YourSeq  40     844    931   2000   71.2%     chr4   +       53401061   53401118   58
YourSeq  40     890    969   2000   90.0%     chr10  +       81645524   81645609   86
YourSeq  39     839    903   2000   73.7%     chr17  -       31248753   31248811   59
YourSeq  38     885    931   2000   91.4%     chr14  +       15070722   15070770   49
YourSeq  37     886    950   2000   85.8%     chr7   -       132957233  132957296  64
YourSeq  36     830    901   2000   95.0%     chr7   -       109441215  109441291  77
YourSeq  36     889    930   2000   95.0%     chr17  -       33473073   33473116   44
YourSeq  36     762    931   2000   92.2%     chr5   +       90113211   90113379   169
YourSeq  35     699    733   2000   100.0%    chr13  -       88923846   88923880   35
YourSeq  35     917    969   2000   83.1%     chr2   +       91706915   91706967   53
YourSeq  35     841    895   2000   73.2%     chr1   +       121815376  121815418  43
YourSeq  34     890    938   2000   94.8%     chr8   -       25747303   25747353   51

Note: The 2000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
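The top hit of each BLAT search places the corresponding 2000 bp flank on chr8, which allows a quick consistency check on the stated KO region size. The sketch below assumes the coordinates are 1-based and inclusive (consistent with the SPAN of 2000 in each top row) and that the KO region is exactly the interval between the two flanks. The variable and function names are hypothetical.

```python
# chr8 coordinates of the 100.0% identity hits in the two BLAT tables.
UP_FLANK = (95534466, 95536465)    # 2000 bp section upstream of Exon 2
DOWN_FLANK = (95546051, 95548050)  # 2000 bp section downstream of Exon 6


def span(interval):
    """Length of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(up, down):
    """Number of bases strictly between two non-overlapping inclusive intervals."""
    return down[0] - up[1] - 1


print(span(UP_FLANK), span(DOWN_FLANK), gap_between(UP_FLANK, DOWN_FLANK))
```

Under these assumptions the interval between the flanks is 9585 bp, which matches the effective KO region size of ~9585 bp given in the strategy summary.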


Gene and information: Ccdc113 coiled-coil domain containing 113 [ Mus musculus (house mouse) ] Gene ID: 244608, updated on 10-Oct-2019

Gene summary

Official Symbol: Ccdc113 (provided by MGI)
Official Full Name: coiled-coil domain containing 113 (provided by MGI)
Primary source: MGI:MGI:3606076
See related: Ensembl:ENSMUSG00000036598
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: BC060957; 4933409I22
Expression: Biased expression in testis adult (RPKM 24.6) and lung adult (RPKM 1.2)
Orthologs: human; all

Genomic context

Location: 8; 8 C5
Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (95534060..95558888)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (98058000..98082788)

[Figure: genomic context on Chromosome 8 - NC_000074.6]


Transcript information: This gene has 1 transcript

Gene: Ccdc113 ENSMUSG00000036598

Description: coiled-coil domain containing 113 [Source:MGI Symbol;Acc:MGI:3606076]
Location: Chromosome 8: 95,534,085-95,558,890 forward strand. GRCm38:CM001001.2
About this gene: This gene has 1 transcript (splice variant), 152 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Ccdc113-201  ENSMUST00000041569.4  1288  377aa    ENSMUSP00000049497.3  Protein coding  CCDS22563  Q8C5T8   TSL:1 GENCODE basic APPRIS P1

[Figure: Ensembl genome browser view of a 44.81 kb region (95.53 Mb to 95.56 Mb). Forward strand: Ccdc113-201 (protein coding) and 1700112L15Rik-201 (TEC); contig AC102555.11. Reverse strand: Prss54-204 (lncRNA), Prss54-205, Prss54-201 and Prss54-202 (protein coding). Additional track: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana) and non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000041569

[Figure: Ensembl protein summary for Ccdc113-201 (protein coding; 24.81 kb, forward strand), protein ENSMUSP00000049..., with a scale bar from 0 to 377. Tracks: Low complexity (Seg); Coiled-coils (Ncoils); Pfam (Domain of unknown function DUF4201); PANTHER (PTHR15654, PTHR15654:SF2); sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
