https://www.alphaknockout.com

Mouse Cchcr1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cchcr1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cchcr1 gene (NCBI Reference Sequence: NM_146248; Ensembl: ENSMUSG00000040312) is located on mouse chromosome 17. Eighteen exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 18 (Transcript: ENSMUST00000164242). Exons 5 to 10 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cchcr1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-152G18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 23.16% of the coding region. Knockout of exons 5 to 10 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion is 3267 bp, and the size of intron 10 for 3'-loxP site insertion is 1353 bp. The size of the effective cKO region is ~2770 bp. The cKO region does not contain any other known gene.
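The position quoted in the note can be turned into an approximate codon number. The sketch below is illustrative only: it takes the 770 aa protein length listed for ENSMUST00000164242 and the 23.16% figure from this report, and assumes the coding region is counted from the ATG through the stop codon. The function names and the frameshift helper are hypothetical, and the report does not list the coding length of exons 5 to 10, so no real value is supplied for that check.

```python
# Values quoted in this report; everything else is an illustrative assumption.
PROTEIN_LENGTH_AA = 770         # protein length listed for ENSMUST00000164242
EXON5_START_FRACTION = 0.2316   # "about 23.16% of the coding region"


def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding-sequence length in nucleotides (assumption: stop codon counted)."""
    return protein_aa * 3 + (3 if include_stop else 0)


def approx_codon(fraction: float, protein_aa: int) -> int:
    """1-based codon containing the nucleotide at `fraction` of the CDS."""
    nt_offset = int(fraction * cds_length_nt(protein_aa))  # 0-based offset
    return nt_offset // 3 + 1


def shifts_frame(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


if __name__ == "__main__":
    print("CDS length (nt):", cds_length_nt(PROTEIN_LENGTH_AA))
    print("Exon 5 starts near codon:", approx_codon(EXON5_START_FRACTION, PROTEIN_LENGTH_AA))
```

Under these assumptions the deletion begins in the first quarter of the protein, which is consistent with the stated expectation that removing exons 5 to 10 disrupts most of the coding sequence.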


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1 to 18, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Cchcr1, homology arm, cKO region, loxP site.]
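The relationship between the targeted allele and the constitutive KO allele can be illustrated with a toy string model. This is a simplified sketch, not the project's design: the flanking and cKO sequences are placeholders, the two loxP sites are assumed to be in the same orientation, and `cre_excise` is a hypothetical helper name. The loxP sequence is the standard 34 bp site (two 13 bp inverted repeats around an 8 bp spacer).

```python
# Standard 34 bp loxP site: 13 bp repeat + 8 bp spacer + 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Model Cre recombination between two same-orientation loxP sites.

    The sequence between the first two sites is removed and a single
    site is left behind. Alleles with fewer than two sites are returned
    unchanged (for example, the wildtype allele).
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[:first] + site + allele[second + len(site):]


if __name__ == "__main__":
    # Placeholder sequences standing in for the real genomic regions.
    upstream, cko_region, downstream = "AAAA", "CCCCGGGG", "TTTT"
    targeted = upstream + LOXP + cko_region + LOXP + downstream
    print(cre_excise(targeted))
```

In this model the targeted (floxed) allele keeps the cKO region intact until Cre is present, which is what makes the knockout conditional.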


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself; series: Forward, Reverse Complement.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
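A self dot plot of this kind can be approximated by indexing every 10 bp window and reporting positions where the same window recurs on the forward strand or as a reverse complement. The code below is a generic exact-match sketch, not the tool used to produce the plot in this report; the function names are hypothetical and only the 10 bp window size is taken from the text.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward_hits, revcomp_hits) for exact window matches.

    forward_hits: pairs (i, j), i < j, where seq[i:i+window] == seq[j:j+window].
    revcomp_hits: pairs (i, k) where seq[i:i+window] equals the reverse
    complement of seq[k:k+window].
    """
    n = len(seq)
    index = defaultdict(list)
    for i in range(n - window + 1):
        index[seq[i:i + window]].append(i)

    forward = [(i, j) for pos in index.values() for i in pos for j in pos if i < j]

    rc = reverse_complement(seq)
    revcomp = []
    for j in range(n - window + 1):
        k = n - window - j  # forward-strand start of the window behind rc[j:j+window]
        for i in index.get(rc[j:j + window], []):
            revcomp.append((i, k))
    return sorted(forward), sorted(revcomp)
```

Long diagonal runs of forward hits away from the main diagonal would indicate direct or tandem repeats; the report states that none of significance were seen for this region.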

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(9270bp) | A(22.96% 2128) | C(25.07% 2324) | T(23.34% 2164) | G(28.63% 2654)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
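The overall GC content follows directly from the base counts in the summary line, and a windowed profile can be computed the same way. The sketch below uses the reported counts and the 300 bp window size; the function names and the step parameter are illustrative assumptions, and the underlying 9270 bp sequence is not reproduced here.

```python
# Base counts as reported in the summary line of this report.
REPORTED_COUNTS = {"A": 2128, "C": 2324, "T": 2164, "G": 2654}


def gc_percent_from_counts(counts: dict) -> float:
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def gc_percent(seq: str) -> float:
    """GC percentage of a sequence string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each full window along the sequence."""
    return [gc_percent(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    print(round(gc_percent_from_counts(REPORTED_COUNTS), 2))
```

With the reported counts, G plus C is 4978 of 9270 bases, or about 53.7% overall; the note above concerns local windows that rise well above this average.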


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  +       35521197   35524196   3000
YourSeq  270    728    1039  3000   93.9%     chr16  -       20583434   20583955   522
YourSeq  269    730    1066  3000   92.2%     chr1   -       89027792   89028143   352
YourSeq  265    728    1077  3000   91.4%     chr5   -       124545475  124545922  448
YourSeq  183    2375   2617  3000   88.6%     chr1   -       160934939  160935368  430
YourSeq  180    2378   2614  3000   89.7%     chr12  +       54677594   54678183   590
YourSeq  161    728    1023  3000   88.5%     chr11  -       4956036    4956328    293
YourSeq  154    1904   2482  3000   81.9%     chr15  -       81218470   81218939   470
YourSeq  148    1894   2476  3000   80.8%     chr10  +       40996292   40996611   320
YourSeq  146    732    906   3000   92.5%     chr8   -       107637464  107638045  582
YourSeq  145    885    1042  3000   96.8%     chr9   +       106698445  106698603  159
YourSeq  144    2332   2664  3000   84.6%     chr10  +       77642491   77642652   162
YourSeq  142    885    1040  3000   95.6%     chr6   -       134686818  134686973  156
YourSeq  142    882    1040  3000   96.2%     chr1   +       59795859   59796020   162
YourSeq  141    885    1040  3000   95.5%     chr15  -       59660697   59660853   157
YourSeq  140    885    1043  3000   92.4%     chr6   +       47469026   47469182   157
YourSeq  140    882    1037  3000   94.9%     chr10  +       62769806   62769961   156
YourSeq  139    721    887   3000   92.3%     chr4   -       124644605  124644796  192
YourSeq  138    728    1067  3000   83.8%     chr7   +       39556136   39556317   182
YourSeq  137    886    1033  3000   94.6%     chr11  -       117623816  117623961  146

Note: The 3000 bp section upstream of exon 5 is BLAT searched against the genome. Apart from the expected self-match on chr17, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  +       35526967   35529966   3000
YourSeq  166    2251   2435  3000   96.7%     chr4   -       156163132  156163340  209
YourSeq  160    2250   2452  3000   93.1%     chr3   +       62947994   62948207   214
YourSeq  156    2249   2432  3000   94.9%     chr11  -       78809207   78809405   199
YourSeq  154    2248   2425  3000   95.9%     chr5   -       134212069  134212259  191
YourSeq  154    2257   2431  3000   95.4%     chr18  -       42206733   42473241   266509
YourSeq  154    2228   2432  3000   96.4%     chr7   +       116578894  116579371  478
YourSeq  154    2249   2433  3000   91.2%     chr11  +       86862641   86862817   177
YourSeq  153    2257   2432  3000   94.8%     chr4   -       35277369   35277555   187
YourSeq  152    2257   2433  3000   95.4%     chr2   -       155367285  155367467  183
YourSeq  152    2250   2425  3000   94.2%     chr13  -       87991684   87991859   176
YourSeq  152    2249   2432  3000   94.2%     chr11  -       100598791  100598981  191
YourSeq  151    2249   2432  3000   92.6%     chrX   -       7254446    7254621    176
YourSeq  151    2249   2425  3000   95.3%     chr5   -       124899259  124899584  326
YourSeq  151    2260   2432  3000   92.4%     chr3   -       23715709   23715879   171
YourSeq  150    2262   2429  3000   97.0%     chr10  -       40215350   40215523   174
YourSeq  150    2260   2432  3000   94.6%     chr10  +       60004283   60004476   194
YourSeq  149    2257   2431  3000   94.2%     chr3   -       138819477  138819662  186
YourSeq  148    2253   2447  3000   89.4%     chr13  +       103784577  103784762  186
YourSeq  147    2249   2432  3000   89.2%     chr17  -       37019895   37020065   171

Note: The 3000 bp section downstream of exon 10 is BLAT searched against the genome. Apart from the expected self-match on chr17, no significant similarity is found.
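One way to read the BLAT results is to compute how much of the 3000 bp query each hit covers and to separate the self-match from partial hits elsewhere. The sketch below parses rows in the column order used in the results above; the `BlatHit` class, the helper names and the 50% coverage threshold are hypothetical choices for illustration and are not criteria stated in this report.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by this hit."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated row; the first field is the query name."""
    f = row.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_match(hit: BlatHit) -> bool:
    return hit.query_coverage == 1.0 and hit.identity == 100.0


def off_target_hits(hits, min_coverage: float = 0.5):
    """Hits other than the self-match covering at least `min_coverage` of the query.

    The threshold is an assumed, illustrative cut-off.
    """
    return [h for h in hits
            if not is_self_match(h) and h.query_coverage >= min_coverage]


if __name__ == "__main__":
    rows = [
        "YourSeq 3000 1 3000 3000 100.0% chr17 + 35521197 35524196 3000",
        "YourSeq 270 728 1039 3000 93.9% chr16 - 20583434 20583955 522",
    ]
    hits = [parse_hit(r) for r in rows]
    print(off_target_hits(hits))
```

In the rows listed in this report, the non-self hits each cover only a small part of the query, which is in line with the notes that no significant similarity was found.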


Gene information:

Cchcr1 coiled-coil alpha-helical rod protein 1 [Mus musculus (house mouse)]
Gene ID: 240084, updated on 24-Oct-2019

Gene summary

Official Symbol: Cchcr1 (provided by MGI)
Official Full Name: coiled-coil alpha-helical rod protein 1 (provided by MGI)
Primary source: MGI:MGI:2385321
See related: Ensembl:ENSMUSG00000040312
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Hcr
Expression: Biased expression in testis adult (RPKM 83.7), CNS E11.5 (RPKM 7.5) and 1 other tissue

Genomic context

Location: 17; 17 B1

Exon count: 21

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (35517054..35531015)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (35654061..35667960)

[Figure: genomic context of Cchcr1 on Chromosome 17 - NC_000083.6.]


Transcript information: This gene has 7 transcripts

Gene: Cchcr1 ENSMUSG00000040312

Description: coiled-coil alpha-helical rod protein 1 [Source:MGI Symbol;Acc:MGI:2385321]
Gene Synonyms: Hcr
Location: Chromosome 17: 35,517,100-35,531,015 forward strand. GRCm38:CM001010.2
About this gene: This gene has 7 transcripts (splice variants), 142 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Cchcr1-201  ENSMUST00000045956.13  2748  770aa       ENSMUSP00000046612.7  Protein coding   CCDS50090  Q8K2I2   TSL:1 GENCODE basic APPRIS P2
Cchcr1-202  ENSMUST00000164242.8   2656  770aa       ENSMUSP00000132028.2  Protein coding   CCDS50090  Q8K2I2   TSL:1 GENCODE basic APPRIS P2
Cchcr1-205  ENSMUST00000173903.1   3054  867aa       ENSMUSP00000133407.1  Protein coding   -          G3UWS7   TSL:5 GENCODE basic APPRIS ALT2
Cchcr1-207  ENSMUST00000174827.2   2333  No protein  -                     Retained intron  -          -        TSL:1
Cchcr1-203  ENSMUST00000172893.1   464   No protein  -                     Retained intron  -          -        TSL:3
Cchcr1-204  ENSMUST00000173582.1   439   No protein  -                     Retained intron  -          -        TSL:3
Cchcr1-206  ENSMUST00000173986.1   594   No protein  -                     lncRNA           -          -        TSL:3


[Figure: Ensembl genome browser view of a 33.92 kb region (35.51 Mb to 35.54 Mb), forward and reverse strands. Forward-strand transcripts: Pou5f1-201 to Pou5f1-206 (protein coding); Cchcr1-201, Cchcr1-202 and Cchcr1-205 (protein coding); Cchcr1-203, Cchcr1-204 and Cchcr1-207 (retained intron); Cchcr1-206 (lncRNA); Psors1c2-201 (protein coding). Contig: CR974473.23. Reverse-strand transcripts: Gm19553-201 (lncRNA); Tcf19-201 to Tcf19-204 (protein coding). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000164242

[Figure: Ensembl protein view of Cchcr1-202 (protein coding), 13.90 kb, forward strand. Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Pfam and PANTHER (Coiled-coil alpha-helical rod protein 1); sequence variants (dbSNP and all other sources). Variant legend: frameshift variant, missense variant, synonymous variant. Scale bar: 0 to 770.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
