https://www.alphaknockout.com

Mouse Cnot2 Knockout Project (CRISPR/Cas9)

Objective: To create a Cnot2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cnot2 gene (NCBI Reference Sequence: NM_001037846; Ensembl: ENSMUSG00000020166) is located on mouse chromosome 10. 16 exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 16 (Transcript: ENSMUST00000105267). Exons 5~11 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 14.75% of the coding region. Exons 5~11 cover 58.02% of the coding region. The size of the effective KO region is ~9103 bp. The KO region does not contain any other known gene.
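The percentages in the note can be turned into approximate nucleotide counts. The sketch below assumes a coding length of 540 codons x 3 = 1620 nt, taken from the 540 aa protein length listed for ENSMUST00000105267 in the transcript table, with the stop codon excluded. That coding length and the function name are assumptions made for illustration, not values stated in the strategy.

```python
def coding_nt_from_percent(percent: float, cds_length_nt: int) -> int:
    """Convert a percentage of the coding region into a rounded nucleotide count."""
    return round(percent / 100 * cds_length_nt)


# Assumption: 540 aa protein -> 540 * 3 coding nucleotides, stop codon excluded.
CDS_LENGTH_NT = 540 * 3

# Coding nucleotides that precede exon 5 (exon 5 starts at ~14.75% of the CDS).
nt_before_exon5 = coding_nt_from_percent(14.75, CDS_LENGTH_NT)

# Coding nucleotides carried by exons 5~11 (58.02% of the CDS).
nt_in_exons_5_to_11 = coding_nt_from_percent(58.02, CDS_LENGTH_NT)

# A remainder other than 0 would suggest that the deletion shifts the reading frame.
print(nt_before_exon5, nt_in_exons_5_to_11, nt_in_exons_5_to_11 % 3)
```

These are rounded estimates only; exact exon boundaries and the reading-frame consequence should be taken from the annotated transcript rather than from the percentages.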


Overview of the Targeting Strategy

[Figure: wildtype allele of mouse Cnot2 drawn 5' to 3', with exons 1, 5 to 11 and 16 labeled and a gRNA region on each side of the knockout region (exons 5~11). Legend: exon of mouse Cnot2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
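The self-alignment idea behind the dot plot can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: it reports exact forward-strand matches only (no reverse complement, no mismatch tolerance), and the function name is hypothetical. Each returned pair (i, j) with i < j is an off-diagonal dot; a tandem repeat shows up as a run of dots at a constant offset j - i.

```python
def self_dot_plot(seq: str, window: int = 15) -> set[tuple[int, int]]:
    """Return off-diagonal positions (i, j), i < j, whose windows are identical."""
    seq = seq.upper()
    # Index every window-length word by the positions where it starts.
    starts_by_word: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        starts_by_word.setdefault(seq[i:i + window], []).append(i)

    dots: set[tuple[int, int]] = set()
    for starts in starts_by_word.values():
        # Any word that occurs more than once yields one dot per pair of positions.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.add((starts[a], starts[b]))
    return dots


# Toy example with a short window so the result can be checked by hand:
# "ACGT" occurs at positions 0 and 4.
print(self_dot_plot("ACGTACGT", window=4))
```

An empty result for a 2000 bp flank at window size 15 would correspond to a dot plot with only the main diagonal.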

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 11 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content along the sequence.]

Summary: Full Length(2000bp) | A(32.9% 658) | C(16.6% 332) | T(30.7% 614) | G(19.8% 396)

Note: The 2000 bp section upstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content along the sequence.]

Summary: Full Length(2000bp) | A(21.0% 420) | C(23.85% 477) | T(37.5% 750) | G(17.65% 353)

Note: The 2000 bp section downstream of Exon 11 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
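The base-composition summaries and the 300 bp windowed GC profile can be reproduced along these lines. This is a sketch under stated assumptions: the function names are illustrative, the window slides one base at a time (the step used for the report's plots is not stated), and the input is assumed to contain only A, C, G and T.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, tuple[int, float]]:
    """Return {base: (count, percent)} for A, C, T and G."""
    seq = seq.upper()
    counts = Counter(seq)
    return {base: (counts[base], 100 * counts[base] / len(seq)) for base in "ACTG"}


def windowed_gc(seq: str, window: int = 300) -> list[float]:
    """GC percentage of every window, sliding one base at a time."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [base in "GC" for base in seq]
    current = sum(is_gc[:window])
    profile = [100 * current / window]
    for i in range(window, len(seq)):
        # Update the running count: add the entering base, drop the leaving one.
        current += is_gc[i] - is_gc[i - window]
        profile.append(100 * current / window)
    return profile


def gc_percent_from_counts(c_count: int, g_count: int, total: int) -> float:
    """Overall GC percentage from reported base counts."""
    return 100 * (c_count + g_count) / total


# Overall GC from the counts reported in the two summaries.
print(gc_percent_from_counts(332, 396, 2000))  # upstream of Exon 5
print(gc_percent_from_counts(477, 353, 2000))  # downstream of Exon 11
```

From the reported counts, overall GC content works out to 36.4% upstream ((332 + 396) / 2000) and 41.5% downstream ((477 + 353) / 2000).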


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  2000   1      2000  2000   100.0%    chr10  -       116507189  116509188  2000
YourSeq  38     1297   1410  2000   93.2%     chr3   -       138571142  138571548  407
YourSeq  20     1112   1143  2000   81.3%     chr15  -       5519151    5519182    32

Note: The 2000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  2000   1      2000  2000   100.0%    chr10  -       116496086  116498085  2000
YourSeq  167    700    918   2000   89.8%     chr19  +       22302568   22302768   201
YourSeq  162    697    924   2000   87.8%     chr11  -       78227315   78227509   195
YourSeq  160    707    918   2000   89.8%     chrX   +       84679629   84679832   204
YourSeq  159    723    927   2000   96.0%     chrX   +       78770159   78770366   208
YourSeq  158    697    915   2000   92.0%     chr10  +       43237368   43504816   267449
YourSeq  158    702    928   2000   92.0%     chr1   +       16658301   16658547   247
YourSeq  156    709    916   2000   86.6%     chr11  +       6825120    6825305    186
YourSeq  155    739    923   2000   92.9%     chr1   +       171737670  171737924  255
YourSeq  153    697    928   2000   93.1%     chr5   -       134609167  134609583  417
YourSeq  150    724    918   2000   91.6%     chr11  -       89191656   89191857   202
YourSeq  149    704    924   2000   94.2%     chr11  -       70636244   70636505   262
YourSeq  149    739    924   2000   88.5%     chr7   +       110764559  110764724  166
YourSeq  148    715    911   2000   88.4%     chr17  -       14956991   14957154   164
YourSeq  146    724    915   2000   88.5%     chrX   -       35858218   35858382   165
YourSeq  146    707    923   2000   86.2%     chr12  +       15824013   15824179   167
YourSeq  145    728    915   2000   90.9%     chr5   +       71190657   71190849   193
YourSeq  145    739    928   2000   88.1%     chr1   +       36671701   36671872   172
YourSeq  143    731    915   2000   90.6%     chr16  +       15700199   15700400   202
YourSeq  142    708    923   2000   85.3%     chr13  -       51048900   51049062   163

Note: The 2000 bp section downstream of Exon 11 is BLAT searched against the genome. No significant similarity is found.
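The full-length chr10 hits in the two BLAT result sets give the genomic positions of the flanking sections, so the interval between them can be checked against the stated KO size. The sketch below assumes that the two 2000 bp sections directly flank the deleted interval and that the START/END columns are 1-based and inclusive (which matches SPAN = END - START + 1 in the tables). The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one result row; leading link text or query name is ignored."""
    f = row.split()[-10:]  # the last 10 fields are SCORE .. SPAN
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def bases_between(hit_a: BlatHit, hit_b: BlatHit) -> int:
    """Number of bases strictly between two non-overlapping hits on one chromosome."""
    if hit_a.chrom != hit_b.chrom:
        raise ValueError("hits are on different chromosomes")
    first, second = sorted((hit_a, hit_b), key=lambda h: h.t_start)
    return second.t_start - first.t_end - 1


# Full-length self hits copied from the BLAT results (up) and (down).
upstream = parse_blat_row("YourSeq 2000 1 2000 2000 100.0% chr10 - 116507189 116509188 2000")
downstream = parse_blat_row("YourSeq 2000 1 2000 2000 100.0% chr10 - 116496086 116498085 2000")

print(bases_between(upstream, downstream))  # 9103
```

The result, 9103 bases, agrees with the ~9103 bp effective KO region given in the strategy summary. Because Cnot2 lies on the reverse strand, the upstream section has the higher chromosome coordinates.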


Gene and information: Cnot2 CCR4-NOT transcription complex, subunit 2 [ Mus musculus (house mouse) ] Gene ID: 72068, updated on 24-Oct-2019

Gene summary

Official Symbol: Cnot2 (provided by MGI)
Official Full Name: CCR4-NOT transcription complex, subunit 2 (provided by MGI)
Primary source: MGI:MGI:1919318
See related: Ensembl:ENSMUSG00000020166
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: C79650; AA537049; AA959607; AW557563; 2600016M12Rik; 2810470K03Rik
Expression: Ubiquitous expression in limb E14.5 (RPKM 10.3), CNS E11.5 (RPKM 10.3) and 27 other tissues
Orthologs: human; all

Genomic context

Location: 10; 10 D2
Exon count: 19

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (116485160..116581900, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (115922217..116018567, complement)

Chromosome 10 - NC_000076.6


Transcript information: This gene has 19 transcripts

Gene: Cnot2 ENSMUSG00000020166

Description: CCR4-NOT transcription complex, subunit 2 [Source:MGI Symbol;Acc:MGI:1919318]
Gene Synonyms: 2600016M12Rik, 2810470K03Rik
Location: Chromosome 10: 116,485,161-116,581,511 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 19 transcripts (splice variants), 252 orthologues, 1 paralogue and is a member of 2 Ensembl protein families.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Cnot2-203  ENSMUST00000105267.7  2800  540aa       ENSMUSP00000100902.1  Protein coding           CCDS36064  Q8C5L3      TSL:1, GENCODE basic, APPRIS P1
Cnot2-210  ENSMUST00000168036.7  2693  499aa       ENSMUSP00000132315.1  Protein coding           CCDS24185  E9Q027      TSL:5, GENCODE basic
Cnot2-204  ENSMUST00000164088.7  2618  499aa       ENSMUSP00000127830.1  Protein coding           CCDS24185  E9Q027      TSL:1, GENCODE basic
Cnot2-213  ENSMUST00000169921.7  2120  540aa       ENSMUSP00000132152.1  Protein coding           CCDS36064  Q8C5L3      TSL:1, GENCODE basic, APPRIS P1
Cnot2-201  ENSMUST00000020374.5  1059  109aa       ENSMUSP00000020374.5  Protein coding           CCDS36065  H7BWX6      TSL:1, GENCODE basic
Cnot2-202  ENSMUST00000105265.7  2268  455aa       ENSMUSP00000100900.1  Protein coding           -          Q8C5L3      TSL:1, GENCODE basic
Cnot2-209  ENSMUST00000167706.7  1621  490aa       ENSMUSP00000128837.1  Protein coding           -          E9Q8D5      TSL:5, GENCODE basic
Cnot2-218  ENSMUST00000218744.1  358   78aa        ENSMUSP00000151501.1  Protein coding           -          A0A1W2P771  CDS 3' incomplete, TSL:3
Cnot2-212  ENSMUST00000169576.7  2955  48aa        ENSMUSP00000130192.1  Nonsense mediated decay  -          E9Q7M5      TSL:1
Cnot2-211  ENSMUST00000169507.7  810   57aa        ENSMUSP00000128720.1  Nonsense mediated decay  -          E9Q8R6      TSL:5
Cnot2-217  ENSMUST00000218490.1  777   57aa        ENSMUSP00000151847.1  Nonsense mediated decay  -          E9Q8R6      TSL:3
Cnot2-219  ENSMUST00000219544.1  611   No protein  -                     Retained intron          -          -           TSL:2
Cnot2-206  ENSMUST00000165527.7  1044  No protein  -                     lncRNA                   -          -           TSL:1
Cnot2-214  ENSMUST00000169937.7  891   No protein  -                     lncRNA                   -          -           TSL:5
Cnot2-205  ENSMUST00000164383.7  823   No protein  -                     lncRNA                   -          -           TSL:5
Cnot2-215  ENSMUST00000171214.7  623   No protein  -                     lncRNA                   -          -           TSL:5
Cnot2-208  ENSMUST00000167644.1  592   No protein  -                     lncRNA                   -          -           TSL:3
Cnot2-207  ENSMUST00000166166.7  442   No protein  -                     lncRNA                   -          -           TSL:3
Cnot2-216  ENSMUST00000171944.7  341   No protein  -                     lncRNA                   -          -           TSL:3

[Figure: Ensembl genome browser view of the 116.35 kb region on chromosome 10 (116.48 Mb to 116.58 Mb). Forward strand: Kcnmb4os2-203 (lncRNA), 5330438D12Rik-201 (pseudogene), 5330438D12Rik-202 to 5330438D12Rik-205 (lncRNA), Gm49344-201 (TEC); contig AC139376.3. Reverse strand: the 19 Cnot2 transcripts (Cnot2-201 to Cnot2-219) and Gm25190-201 (miRNA). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; processed transcript; RNA gene).]


Transcript: ENSMUST00000105267

[Figure: protein domain view of Cnot2-203 (protein coding; reverse strand, 96.33 kb) for ENSMUSP00000100... Tracks: MobiDB lite; low complexity (Seg); Pfam: NOT2/NOT3/NOT5, C-terminal; PANTHER: Not2/Not3/Not5, PTHR23326:SF3; Gene3D: CCR4-NOT complex subunit 2/3/5, N-terminal domain superfamily; sequence variants (dbSNP and all other sources). Variant legend: splice region variant; synonymous variant. Scale bar: 0 to 540.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
