https://www.alphaknockout.com

Mouse Cchcr1 Knockout Project (CRISPR/Cas9)

Objective: To create a Cchcr1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cchcr1 gene (NCBI Reference Sequence: NM_146248; Ensembl: ENSMUSG00000040312) is located on mouse chromosome 17. Eighteen exons have been identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 18 (Transcript: ENSMUST00000164242). Exons 4~18 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 4 starts from about 10.0% of the coding region. Exons 4~18 cover 90.04% of the coding region. The size of the effective KO region is ~9957 bp. The KO region does not contain any other known gene.
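The ~9957 bp figure can be cross-checked against the coordinates listed in the BLAT Search Results sections of this report, where the 2000 bp upstream flank ends at chr17:35520875 and the 2000 bp downstream flank starts at chr17:35530833. The following is a minimal Python sketch, assuming the effective KO region is exactly the interval between those two flanks; the helper name `region_size` is illustrative and not part of any tool used in this project.

```python
def region_size(upstream_flank_end: int, downstream_flank_start: int) -> int:
    """Count the bases lying strictly between two flanks.

    Coordinates are 1-based and inclusive, as in the BLAT output.
    """
    return downstream_flank_start - upstream_flank_end - 1


# Coordinates taken from the self-hits (100.0% identity) in the BLAT tables.
UPSTREAM_FLANK_END = 35520875      # last base of the 2000 bp upstream section
DOWNSTREAM_FLANK_START = 35530833  # first base of the 2000 bp downstream section

ko_size = region_size(UPSTREAM_FLANK_END, DOWNSTREAM_FLANK_START)
print(ko_size)  # 9957
```

Under that assumption the interval runs from the first base of exon 4 to the last base of the stop codon, which agrees with the stated size.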


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with two gRNA regions marked. Exon labels shown: 1, 4, 5, 6, 11, 12, 13, 14, 15, 16, 17, 18. Legend: exon of mouse Cchcr1; knockout region.]


Overview of the Dot Plot (up)

[Figure: self-alignment dot plot of the upstream sequence. Window size: 15 bp. Forward and reverse-complement matches are shown.]

Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

[Figure: self-alignment dot plot of the downstream sequence. Window size: 15 bp. Forward and reverse-complement matches are shown.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
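A self-alignment dot plot of this kind can be sketched in a few lines of Python. The example below assumes exact matching of 15 bp windows and is not the tool that produced the plots in this report; the function names `reverse_complement` and `dot_plot` are illustrative. Off-diagonal forward dots indicate repeated sequence, and runs of dots parallel to the main diagonal suggest tandem repeats.

```python
from collections import defaultdict

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string; unknown symbols become N."""
    return "".join(_COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))


def dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) sets of dot coordinates for a self-alignment.

    forward: pairs (i, j), i != j, where the windows starting at i and j
             in the sequence are identical (main diagonal excluded).
    reverse: pairs (i, j) where the window starting at i in the sequence
             equals the window starting at j in its reverse complement.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1

    # Index every window of the sequence by its start positions.
    index = defaultdict(list)
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward = {
        (i, j)
        for starts in index.values()
        for i in starts
        for j in starts
        if i != j
    }

    rc = reverse_complement(seq)
    reverse = set()
    for j in range(n_windows):
        for i in index.get(rc[j:j + window], ()):
            reverse.add((i, j))
    return forward, reverse


# Toy sequence with a 7 bp unit repeated three times in tandem.
demo = "CCC" + "GATTACA" * 3 + "TTG"
forward, reverse = dot_plot(demo, window=7)
print(sorted(forward)[:3])
```

A gRNA site would then be chosen from positions that do not take part in any off-diagonal forward dot.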


Overview of the GC Content Distribution (up)

[Figure: GC content distribution along the upstream sequence. Window size: 300 bp.]

Summary: Full Length(2000bp) | A(22.8% 456) | C(25.35% 507) | T(29.1% 582) | G(22.75% 455)

Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

[Figure: GC content distribution along the downstream sequence. Window size: 300 bp.]

Summary: Full Length(2000bp) | A(23.05% 461) | C(26.1% 522) | T(24.95% 499) | G(25.9% 518)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
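The base counts in the two summaries give the overall GC content directly, and a sliding window shows how it varies along the sequence. The sketch below is illustrative Python rather than the software used for the plots; the 300 bp window comes from this report, while the step size and the function names are assumptions.

```python
def gc_percent_from_counts(a: int, c: int, t: int, g: int) -> float:
    """Overall GC percentage from per-base counts."""
    return 100.0 * (c + g) / (a + c + t + g)


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each window of length `window`, advancing by `step`."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / window)
    return values


# Counts copied from the two Summary lines of this report.
upstream_gc = gc_percent_from_counts(a=456, c=507, t=582, g=455)
downstream_gc = gc_percent_from_counts(a=461, c=522, t=499, g=518)
print(upstream_gc, downstream_gc)  # 48.1 52.0
```

Both flanks sit close to 50% GC overall, which is consistent with the notes that no significant high GC-content region is found.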


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr17  +       35518876   35520875   2000
YourSeq  307    98     1483  2000   90.9%     chr11  +       98841424   99032555   191132
YourSeq  234    110    577   2000   91.3%     chr11  -       6612874    6613762    889
YourSeq  228    154    577   2000   93.9%     chr8   -       84195276   84195799   524
YourSeq  216    165    694   2000   87.2%     chr4   +       132486816  132487559  744
YourSeq  212    95     489   2000   84.1%     chr11  -       97283246   97283616   371
YourSeq  211    101    489   2000   90.4%     chr2   +       122238791  122239318  528
YourSeq  209    154    490   2000   89.5%     chrX   -       137020180  137020517  338
YourSeq  208    154    490   2000   91.6%     chr4   -       149956549  150336319  379771
YourSeq  205    154    489   2000   89.0%     chr10  -       42553453   42554011   559
YourSeq  201    98     510   2000   83.6%     chr14  +       79394350   79394686   337
YourSeq  192    154    490   2000   91.4%     chr12  -       111237090  111607031  369942
YourSeq  185    154    488   2000   89.0%     chr17  -       31528893   31529375   483
YourSeq  185    96     489   2000   91.9%     chr15  -       38424528   38425037   510
YourSeq  185    138    487   2000   85.9%     chr17  +       33907490   33907803   314
YourSeq  174    361    668   2000   86.9%     chr7   +       102212596  102213021  426
YourSeq  169    165    487   2000   90.1%     chr1   -       63090941   63091460   520
YourSeq  164    165    491   2000   91.1%     chr7   -       101914781  101915266  486
YourSeq  157    205    528   2000   89.0%     chr4   +       150548958  150549511  554
YourSeq  155    309    490   2000   94.3%     chr16  +       20271161   20271346   186

Note: The 2000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr17  +       35530833   35532832   2000
YourSeq  137    727    1378  2000   78.9%     chr16  -       16993300   16993649   350
YourSeq  123    646    790   2000   96.3%     chr2   -       172587885  172588055  171
YourSeq  117    442    1157  2000   94.0%     chr15  +       88707527   89112394   404868
YourSeq  108    646    1213  2000   81.8%     chr1   +       16342773   16516912   174140
YourSeq  107    362    734   2000   90.5%     chr10  -       59975284   59975746   463
YourSeq  101    646    788   2000   92.5%     chr16  -       57082975   57083205   231
YourSeq  88     379    623   2000   78.0%     chr10  -       121490901  121491124  224
YourSeq  87     649    755   2000   93.2%     chr10  -       121849100  121849309  210
YourSeq  76     648    778   2000   93.4%     chr18  -       13430202   13430350   149
YourSeq  76     1109   1433  2000   79.0%     chr15  -       37678322   37678745   424
YourSeq  75     646    765   2000   94.2%     chr14  +       11617837   11618124   288
YourSeq  74     663    1216  2000   78.7%     chr14  +       48961878   48962432   555
YourSeq  73     1109   1346  2000   84.2%     chr1   -       133836586  133836881  296
YourSeq  72     686    791   2000   95.1%     chr11  +       35224765   35224934   170
YourSeq  67     1109   1274  2000   77.7%     chr14  -       69777894   69778087   194
YourSeq  67     367    619   2000   68.4%     chr16  +       18640852   18640992   141
YourSeq  66     374    481   2000   83.2%     chr13  +       55216797   55216913   117
YourSeq  61     649    728   2000   94.3%     chr12  -       105384956  105385172  217
YourSeq  61     409    806   2000   64.9%     chr10  -       99214281   99214372   92

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
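For downstream screening, rows of the BLAT tables can be parsed into records and ranked. The Python sketch below is an illustration only: the field names, the `BlatHit` type, and the `query_coverage` measure are assumptions made here, and the report does not state which criterion was used to judge similarity as significant. An optional leading "browser details" label is skipped if present.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int   # start position within the 2000 bp query
    q_end: int     # end position within the query
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int   # start position on the chromosome
    t_end: int     # end position on the chromosome
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated row of the BLAT result table."""
    f = row.split()
    if f[:2] == ["browser", "details"]:
        f = f[2:]
    return BlatHit(
        f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
        float(f[5].rstrip("%")), f[6], f[7], int(f[8]), int(f[9]), int(f[10]),
    )


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query spanned by the hit's query start and end positions."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream table.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr17 + 35518876 35520875 2000",
    "YourSeq 307 98 1483 2000 90.9% chr11 + 98841424 99032555 191132",
    "YourSeq 234 110 577 2000 91.3% chr11 - 6612874 6613762 889",
]
hits = [parse_hit(r) for r in rows]

# The self-hit has a score equal to the query size; everything else is off-target.
off_target = [h for h in hits if h.score < h.q_size]
best = max(off_target, key=lambda h: h.score)
print(best.chrom, best.score, best.identity)
```

In both tables the best off-target score is far below the self-hit score of 2000.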


Gene and information: Cchcr1 coiled-coil alpha-helical rod protein 1 [ Mus musculus (house mouse) ] Gene ID: 240084, updated on 24-Oct-2019

Gene summary

Official Symbol: Cchcr1 (provided by MGI)
Official Full Name: coiled-coil alpha-helical rod protein 1 (provided by MGI)
Primary source: MGI:MGI:2385321
See related: Ensembl:ENSMUSG00000040312
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Hcr
Expression: Biased expression in testis adult (RPKM 83.7), CNS E11.5 (RPKM 7.5) and 1 other tissue
Orthologs: human, all

Genomic context

Location: 17; 17 B1
Exon count: 21

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (35517054..35531015)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (35654061..35667960)

Chromosome 17 - NC_000083.6


Transcript information: This gene has 7 transcripts

Gene: Cchcr1 ENSMUSG00000040312

Description: coiled-coil alpha-helical rod protein 1 [Source:MGI Symbol;Acc:MGI:2385321]
Gene Synonyms: Hcr
Location: Chromosome 17: 35,517,100-35,531,015 forward strand. GRCm38:CM001010.2
About this gene: This gene has 7 transcripts (splice variants), 142 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Cchcr1-201  ENSMUST00000045956.13  2748  770aa       ENSMUSP00000046612.7  Protein coding   CCDS50090  Q8K2I2   TSL:1 GENCODE basic APPRIS P2
Cchcr1-202  ENSMUST00000164242.8   2656  770aa       ENSMUSP00000132028.2  Protein coding   CCDS50090  Q8K2I2   TSL:1 GENCODE basic APPRIS P2
Cchcr1-205  ENSMUST00000173903.1   3054  867aa       ENSMUSP00000133407.1  Protein coding   -          G3UWS7   TSL:5 GENCODE basic APPRIS ALT2
Cchcr1-207  ENSMUST00000174827.2   2333  No protein  -                     Retained intron  -          -        TSL:1
Cchcr1-203  ENSMUST00000172893.1   464   No protein  -                     Retained intron  -          -        TSL:3
Cchcr1-204  ENSMUST00000173582.1   439   No protein  -                     Retained intron  -          -        TSL:3
Cchcr1-206  ENSMUST00000173986.1   594   No protein  -                     lncRNA           -          -        TSL:3


[Figure: Ensembl genome browser view of a 33.92 kb region of chromosome 17 (35.51 Mb to 35.54 Mb).
Forward-strand transcripts: Pou5f1-201 to Pou5f1-206 (protein coding); Cchcr1-201, Cchcr1-202, Cchcr1-205 (protein coding); Cchcr1-203, Cchcr1-204, Cchcr1-207 (retained intron); Cchcr1-206 (lncRNA); Psors1c2-201 (protein coding).
Contig: CR974473.23.
Reverse-strand transcripts: Gm19553-201 (lncRNA); Tcf19-201 to Tcf19-204 (protein coding).
Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank.
Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000164242

[Figure: Ensembl protein view of Cchcr1-202 (protein coding; 13.90 kb, forward strand), protein ENSMUSP00000132...
Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Pfam: Coiled-coil alpha-helical rod protein 1; PANTHER: Coiled-coil alpha-helical rod protein 1; sequence variants (dbSNP and all other sources).
Variant legend: frameshift variant, missense variant, synonymous variant.
Scale bar: 0 to 770.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
