https://www.alphaknockout.com

Mouse Cnot1 Knockout Project (CRISPR/Cas9)

Objective: To create a Cnot1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cnot1 gene (NCBI Reference Sequence: NM_001205226; Ensembl: ENSMUSG00000036550) is located on mouse chromosome 8. 49 exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 49 (Transcript: ENSMUST00000211887). Exons 5~12 are selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a conditional allele activated in cardiomyocytes exhibit postnatal lethality, decreased cardiac muscle contractility, prolonged QT interval and cardiac muscle cell death.

Exon 5 starts at about 4.28% of the coding region. Exons 5~12 cover 14.55% of the coding region. The effective KO region is ~9183 bp and contains no other known gene.
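As a rough illustration of what these percentages mean in nucleotides, the sketch below converts them using the 2369 aa protein length listed for ENSMUST00000211887 in the transcript table. It assumes the coding region is counted as 3 nt per residue without the stop codon; the report does not state how the percentages were derived, so the results are approximate.

```python
# Approximate conversion of coding-region percentages to nucleotides.
# Assumption: CDS length = 3 nt per amino acid, stop codon excluded.
PROTEIN_LENGTH_AA = 2369          # ENSMUST00000211887, from the transcript table
cds_nt = PROTEIN_LENGTH_AA * 3    # 7107 nt under the assumption above


def percent_to_nt(percent: float) -> float:
    """Convert a percentage of the coding region to nucleotides."""
    return cds_nt * percent / 100.0


exon5_start_nt = percent_to_nt(4.28)    # where exon 5 begins within the CDS
covered_nt = percent_to_nt(14.55)       # coding sequence spanned by exons 5~12

print(f"CDS length (assumed): {cds_nt} nt")
print(f"Exon 5 starts near CDS position {exon5_start_nt:.0f}")
print(f"Exons 5~12 cover about {covered_nt:.0f} nt of coding sequence")
```

The exact exon boundaries should be taken from the Ensembl exon table for the transcript rather than from these rounded percentages.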


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 5, 6, 7, 8, 9, 10, 11, 12 and 49 of mouse Cnot1. The two gRNA regions flank exons 5~12. Legend: exon of mouse Cnot1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 990 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
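The self-alignment idea can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: it records every pair of positions whose windows are identical (forward) or reverse-complementary, and the function names and toy sequence are hypothetical.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def self_dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) lists of (i, j) window matches.

    forward: seq[i:i+window] == seq[j:j+window] with i < j
             (the trivial main diagonal i == j is excluded).
    reverse: seq[i:i+window] == revcomp(seq[j:j+window]).
    """
    # Index every window by its start position.
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward = sorted(
        (i, j) for starts in index.values() for i in starts for j in starts if i < j
    )

    rc = revcomp(seq)
    reverse = []
    for k in range(len(rc) - window + 1):
        # Window k of the reverse complement corresponds to this start in seq.
        j = len(seq) - window - k
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, j))
    return forward, sorted(reverse)


# Toy example with a short window: "ACGG" occurs at positions 0 and 4.
fwd, rev = self_dot_plot("ACGGACGG", window=4)
print(fwd, rev)
```

Off-diagonal runs of forward matches would indicate tandem or dispersed repeats; an empty result, as the note reports for this flank, suggests the region is free of them at the chosen window size.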

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 1226 bp section downstream of Exon 12 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content along the sequence.]

Summary: Full Length(990bp) | A(27.07% 268) | C(17.88% 177) | T(34.95% 346) | G(20.1% 199)

Note: The 990 bp section upstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content along the sequence.]

Summary: Full Length(1226bp) | A(30.18% 370) | C(14.93% 183) | T(38.01% 466) | G(16.88% 207)

Note: The 1226 bp section downstream of Exon 12 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
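The overall GC content of each flank follows directly from the base counts in the two summaries. The sketch below uses only those reported counts; the dictionary keys and function name are illustrative.

```python
# Base counts copied from the two GC-content summaries in this report.
flanks = {
    "upstream of exon 5": {"A": 268, "C": 177, "T": 346, "G": 199},
    "downstream of exon 12": {"A": 370, "C": 183, "T": 466, "G": 207},
}


def gc_percent(counts: dict) -> float:
    """Overall GC content (%) from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


for name, counts in flanks.items():
    length = sum(counts.values())
    print(f"{name}: {length} bp, GC = {gc_percent(counts):.2f}%")
```

This gives roughly 37.98% GC for the 990 bp upstream flank and 31.81% for the 1226 bp downstream flank. These are whole-sequence averages; the plots use a 300 bp sliding window, so local values can differ.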


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  990    1       990   990    100.0%    chr8   -       95773640   95774629   990
YourSeq  276    466     949   990    94.9%     chr4   +       122993520  122994099  580
YourSeq  272    328     952   990    92.6%     chr2   +       121607825  121608472  648
YourSeq  256    459     964   990    87.5%     chr13  +       98806477   98806818   342
YourSeq  251    475     944   990    93.5%     chr1   +       13087406   13088042   637
YourSeq  228    506     949   990    92.3%     chr15  +       38111462   38112107   646
YourSeq  222    476     950   990    87.0%     chr11  +       100475501  100475842  342
YourSeq  211    466     940   990    84.6%     chr11  +       97724675   97725065   391
YourSeq  208    500     940   990    93.4%     chr19  -       32334394   32734713   400320
YourSeq  202    462     975   990    87.2%     chr1   +       183326221  183326640  420
YourSeq  199    482     919   990    85.4%     chr7   +       4818717    4818994    278
YourSeq  187    744     961   990    91.4%     chr8   +       80843352   80843560   209
YourSeq  183    767     953   990    99.0%     chr4   -       32575709   32575895   187
YourSeq  179    770     957   990    97.9%     chr18  -       38173391   38173579   189
YourSeq  177    749     951   990    97.4%     chr11  +       70902584   70902786   203
YourSeq  176    556     950   990    93.6%     chr14  -       60725227   60725648   422
YourSeq  176    767     952   990    97.4%     chr10  -       80860264   80860449   186
YourSeq  174    749     957   990    90.1%     chr7   +       52755896   52756097   202
YourSeq  174    768     951   990    97.3%     chr1   +       170185742  170185925  184
YourSeq  173    769     952   990    96.7%     chr5   -       115351236  115351418  183

Note: The 990 bp section upstream of Exon 5 is BLAT searched against the genome. Apart from the expected 100% match to its own locus on chr8, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  1226   1       1226  1226   100.0%    chr8   -       95763231   95764456   1226
YourSeq  93     74      235   1226   80.5%     chr12  -       6555285    6555444    160
YourSeq  89     69      208   1226   83.6%     chr1   -       42993913   43277189   283277
YourSeq  87     21      175   1226   91.6%     chr4   -       116067992  116068227  236
YourSeq  87     74      215   1226   82.8%     chr15  +       79225965   79226108   144
YourSeq  73     66      175   1226   84.2%     chr16  -       4174383    4174502    120
YourSeq  72     64      168   1226   83.0%     chr2   +       171631822  171631915  94
YourSeq  69     68      155   1226   89.8%     chr12  +       69190633   69190721   89
YourSeq  66     74      155   1226   91.5%     chrX   -       88771528   88771612   85
YourSeq  66     74      155   1226   91.5%     chr2   -       126457943  126458027  85
YourSeq  66     85      175   1226   86.7%     chr19  +       43852982   43853073   92
YourSeq  65     62      168   1226   87.4%     chr1   -       75354573   75354677   105
YourSeq  62     104     193   1226   84.5%     chr2   -       164506081  164506170  90
YourSeq  62     74      175   1226   83.7%     chr14  -       47553559   47553672   114
YourSeq  61     54      142   1226   89.8%     chr13  +       38002558   38002653   96
YourSeq  60     100     175   1226   89.5%     chr1   +       20762424   20762499   76
YourSeq  59     69      151   1226   85.8%     chr9   +       21319609   21319689   81
YourSeq  58     69      163   1226   81.3%     chrX   -       151354578  151354669  92
YourSeq  58     74      155   1226   86.6%     chrX   +       94185257   94185341   85
YourSeq  58     74      162   1226   92.8%     chr9   +       124474981  124475072  92

Note: The 1226 bp section downstream of Exon 12 is BLAT searched against the genome. Apart from the expected 100% match to its own locus on chr8, no significant similarity is found.
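The two top BLAT hits can be cross-checked against the stated KO region size. The sketch below assumes that both flanking sections directly abut the KO region and that the printed chr8 coordinates are 1-based and inclusive; neither is stated explicitly in the report. Because Cnot1 lies on the reverse strand, the upstream flank has the higher coordinates.

```python
# Top BLAT hits on chr8 (minus strand), coordinates as printed in the tables.
upstream_flank = (95773640, 95774629)     # 990 bp section upstream of exon 5
downstream_flank = (95763231, 95764456)   # 1226 bp section downstream of exon 12


def span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(lower: tuple, upper: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


# On genomic coordinates the downstream flank is the lower interval.
ko_region_bp = gap_between(downstream_flank, upstream_flank)

print(span(*upstream_flank), span(*downstream_flank), ko_region_bp)
```

Under these assumptions the gap between the flanks is 9183 bp, which agrees with the effective KO region size given in the strategy summary.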


Gene and information: Cnot1 CCR4-NOT transcription complex, subunit 1 [ Mus musculus (house mouse) ] Gene ID: 234594, updated on 24-Oct-2019

Gene summary

Official Symbol: Cnot1 (provided by MGI)
Official Full Name: CCR4-NOT transcription complex, subunit 1 (provided by MGI)
Primary source: MGI:MGI:2442402
See related: Ensembl:ENSMUSG00000036550
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AA815922; D830048B13; 6030411K04Rik
Expression: Ubiquitous expression in testis adult (RPKM 29.9), placenta adult (RPKM 21.5) and 28 other tissues
Orthologs: human

Genomic context

Location: 8; 8 D1
Exon count: 49

Annotation release  Status             Assembly                        Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)    8    NC_000074.6 (95719451..95808113, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)      8    NC_000074.5 (98243351..98331366, complement)

Chromosome 8 - NC_000074.6


Transcript information: This gene has 17 transcripts

Gene: Cnot1 ENSMUSG00000036550

Description: CCR4-NOT transcription complex, subunit 1 [Source:MGI Symbol;Acc:MGI:2442402]
Gene Synonyms: 6030411K04Rik
Location: Chromosome 8: 95,719,451-95,807,464 reverse strand. GRCm38:CM001001.2
About this gene: This gene has 17 transcripts (splice variants), 212 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Cnot1-203  ENSMUST00000211887.1   8364  2369aa      ENSMUSP00000148807.1  Protein coding           CCDS85589  A0A1D5RMJ8  TSL:1, GENCODE basic, APPRIS P2
Cnot1-201  ENSMUST00000068452.9   8152  2326aa      ENSMUSP00000063565.8  Protein coding           CCDS57635  B7ZWL1      TSL:1, GENCODE basic
Cnot1-202  ENSMUST00000098473.10  8390  2376aa      ENSMUSP00000096073.4  Protein coding           -          Q6ZQ08      TSL:5, GENCODE basic, APPRIS ALT1
Cnot1-205  ENSMUST00000211973.1   2469  823aa       ENSMUSP00000148831.1  Protein coding           -          A0A1D5RML9  CDS 5' and 3' incomplete, TSL:5
Cnot1-217  ENSMUST00000213046.1   550   92aa        ENSMUSP00000148709.1  Protein coding           -          A0A1D5RMB6  CDS 3' incomplete, TSL:3
Cnot1-216  ENSMUST00000213006.1   7479  1614aa      ENSMUSP00000148735.1  Nonsense mediated decay  -          A0A1D5RMD8  TSL:1
Cnot1-209  ENSMUST00000212323.1   3277  213aa       ENSMUSP00000148574.1  Nonsense mediated decay  -          A0A1D5RM03  TSL:2
Cnot1-212  ENSMUST00000212415.1   956   122aa       ENSMUSP00000148575.1  Nonsense mediated decay  -          A0A1D5RM04  CDS 5' incomplete, TSL:5
Cnot1-208  ENSMUST00000212302.1   3809  No protein  -                     Retained intron          -          -           TSL:1
Cnot1-207  ENSMUST00000212228.1   2925  No protein  -                     Retained intron          -          -           TSL:1
Cnot1-210  ENSMUST00000212340.1   2381  No protein  -                     Retained intron          -          -           TSL:NA
Cnot1-206  ENSMUST00000212195.1   696   No protein  -                     Retained intron          -          -           TSL:3
Cnot1-215  ENSMUST00000212712.1   656   No protein  -                     Retained intron          -          -           TSL:3
Cnot1-204  ENSMUST00000211937.1   500   No protein  -                     Retained intron          -          -           TSL:3
Cnot1-211  ENSMUST00000212369.1   479   No protein  -                     Retained intron          -          -           TSL:2
Cnot1-213  ENSMUST00000212535.1   1546  No protein  -                     lncRNA                   -          -           TSL:NA
Cnot1-214  ENSMUST00000212556.1   521   No protein  -                     lncRNA                   -          -           TSL:5

[Figure: Ensembl genome browser view of the Cnot1 locus on chromosome 8, 95.72 Mb to 95.80 Mb (108.01 kb).
Forward strand genes (comprehensive set): Ndrg4-201 to Ndrg4-206 and Ndrg4-208 to Ndrg4-211; 4930513N10Rik-201 to 4930513N10Rik-204 (lncRNA); Setd6-201 to Setd6-206.
Contigs: AC113951.13, AC127300.3.
Reverse strand genes (comprehensive set): Cnot1-201 to Cnot1-217; Mir7073-201 (miRNA); Gm26493-201 (snoRNA); Gm26265-201 and Gm26265-202 (snoRNA); Gm45762-201 (TEC); Gm31659-201 to Gm31659-203 (lncRNA).
Regulatory Build legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site.
Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000211887

[Figure: protein domain view of Cnot1-203 (protein coding; reverse strand, 88.01 kb), translation ENSMUSP00000148..., scale 0 to 2369 aa.
Tracks: MobiDB lite; low complexity (Seg).
Annotated domains: CCR4-NOT transcription complex subunit 1, HEAT repeat; CCR4-Not complex component, Not1, C-terminal; CCR4-Not complex, Not1 subunit, domain of unknown function DUF3819; CCR4-NOT transcription complex subunit 1, TTP binding domain; CCR4-NOT transcription complex subunit 1, CAF1-binding domain.
PANTHER: CCR4-NOT transcription complex subunit 1 (PTHR13162:SF8).
Gene3D: MIF4G-like domain superfamily (1.25.40.790, 1.25.40.800); CCR4-NOT subunit 1, TTP binding domain superfamily.
Sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
