https://www.alphaknockout.com

Mouse Cyth1 Knockout Project (CRISPR/Cas9)

Objective: To create a Cyth1 knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Cyth1 gene (NCBI Reference Sequence: NM_011180; Ensembl: ENSMUSG00000017132) is located on mouse chromosome 11. 13 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 13 (Transcript: ENSMUST00000106305). Exons 2~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a gene trap allele exhibit normal brain morphology and long term potentiation. Mice homozygous for a knock-out allele exhibit decreased myelin sheath thickness due to hypomyelination.

Exon 2 starts at about 1.93% of the coding region. Exons 2~6 cover 34.76% of the coding region. The size of the effective KO region is ~8726 bp. The KO region does not contain any other known gene.
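The percentages above can be turned into approximate base counts. The following is a minimal sketch, assuming the coding region is 398 codons x 3 = 1194 bp (the 398 aa protein listed for ENSMUST00000106305, stop codon excluded); the report does not state the coding length, so the constant and the helper name `percent_to_bp` are illustrative assumptions.

```python
# Assumption: coding region = 398 codons x 3 = 1194 bp (stop codon excluded).
# The report gives only percentages, so this length is an estimate.
CDS_LENGTH_BP = 398 * 3


def percent_to_bp(percent: float, cds_length: int = CDS_LENGTH_BP) -> int:
    """Convert a percentage of the coding region to a rounded base count."""
    return round(percent / 100 * cds_length)


upstream_coding_bp = percent_to_bp(1.93)   # coding bases before exon 2
deleted_coding_bp = percent_to_bp(34.76)   # coding bases within exons 2~6

print(upstream_coding_bp)        # 23
print(deleted_coding_bp)         # 415
print(deleted_coding_bp % 3)     # 1
```

If the assumed length holds, the deleted coding sequence is not a multiple of 3, which would be expected to shift the reading frame of any transcript that splices exon 1 to exon 7.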


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1 to 6 and exon 13, with two gRNA regions marked. Legend: exon of mouse Cyth1; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: forward; reverse complement.]

Note: The 977 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
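The self-alignment behind these dot plots can be sketched as follows. This is a simplified illustration that records only exact matches between 15 bp windows on the forward strand; the tool used for the plots, and any mismatch tolerance it applies, are not stated in this report, so the function `self_dot_plot` and the toy sequence are assumptions.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 15):
    """Return sorted (i, j) pairs, i < j, whose windows are identical.

    Each pair is an off-diagonal dot: the window starting at i has the
    same bases as the window starting at j.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Toy sequence (hypothetical): a 15 bp unit repeated twice in tandem.
unit = "ACGTTGCAAGCTTAG"
seq = "CCC" + unit * 2 + "TTT"
dots = self_dot_plot(seq, window=15)
print(dots)
```

Dots that lie on a diagonal close to the main diagonal indicate tandem repeats; an empty result means no 15 bp window occurs twice.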


Overview of the GC Content Distribution (up) Window size: 300 bp


Summary: Full Length(2000bp) | A(20.9% 418) | C(25.1% 502) | T(29.2% 584) | G(24.8% 496)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp


Summary: Full Length(977bp) | A(22.82% 223) | C(23.03% 225) | T(33.88% 331) | G(20.27% 198)

Note: The 977 bp section downstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
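The overall GC content of each flanking section follows directly from the base counts in the two summaries. A minimal sketch, using only the counts reported above (the helper name `gc_percent` is illustrative):

```python
def gc_percent(counts: dict) -> float:
    """Percentage of G and C bases among all counted bases."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


# Base counts copied from the two summary lines.
upstream = {"A": 418, "C": 502, "T": 584, "G": 496}      # 2000 bp
downstream = {"A": 223, "C": 225, "T": 331, "G": 198}    # 977 bp

print(f"{gc_percent(upstream):.2f}")     # 49.90
print(f"{gc_percent(downstream):.2f}")   # 43.30
```

These are whole-section averages; the 300 bp sliding window used for the plots can still reveal local peaks that an average hides.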


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM                 STRAND  START      END        SPAN
-------------------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11                 -       118193698  118195697  2000
YourSeq  38     679    716   2000   100.0%    chr4                  -       119813780  119813817  38
YourSeq  38     677    716   2000   97.5%     chr13                 +       42797905   42797944   40
YourSeq  37     679    715   2000   100.0%    chr4                  +       110027755  110027791  37
YourSeq  35     902    940   2000   94.9%     chr3                  +       85173163   85173201   39
YourSeq  34     683    716   2000   100.0%    chr18                 +       76222423   76222456   34
YourSeq  33     677    713   2000   94.6%     chr8                  +       25859358   25859394   37
YourSeq  33     680    712   2000   100.0%    chr4                  +       44249901   44249933   33
YourSeq  32     677    708   2000   100.0%    chr14                 -       63940957   63940988   32
YourSeq  30     689    718   2000   100.0%    chr9                  -       108049013  108049042  30
YourSeq  30     677    708   2000   96.9%     chr11                 -       57501920   57501951   32
YourSeq  27     677    707   2000   93.6%     chr4_JH584294_random  -       2460       2490       31
YourSeq  27     908    934   2000   100.0%    chr18                 -       32937240   32937266   27
YourSeq  27     680    706   2000   100.0%    chr10                 -       94344861   94344887   27
YourSeq  27     677    703   2000   100.0%    chr13                 +       107955546  107955572  27
YourSeq  26     905    937   2000   75.9%     chr12                 -       25084448   25084476   29

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  977    1      977  977    100.0%    chr11  -       118183995  118184971  977
YourSeq  154    277    560  977    91.9%     chr11  -       51798419   51798822   404
YourSeq  137    403    561  977    94.2%     chr1   +       5402876    5403036    161
YourSeq  136    403    559  977    93.9%     chr10  +       60050525   60050678   154
YourSeq  135    278    516  977    92.1%     chr11  +       3236650    3236989    340
YourSeq  132    405    559  977    93.9%     chr2   -       146453866  146454019  154
YourSeq  131    403    560  977    90.2%     chr10  -       8655522    8655676    155
YourSeq  131    348    559  977    93.4%     chr11  +       50181631   50181864   234
YourSeq  129    405    557  977    95.2%     chr2   -       5197982    5198134    153
YourSeq  129    405    561  977    91.4%     chr11  -       115506895  115507050  156
YourSeq  129    406    559  977    90.7%     chr10  +       34647649   34647799   151
YourSeq  129    402    551  977    94.5%     chr1   +       161960393  161960546  154
YourSeq  128    359    566  977    94.5%     chr7   -       114183122  114183415  294
YourSeq  126    411    571  977    91.9%     chr9   +       79789126   79789499   374
YourSeq  126    406    559  977    88.6%     chr10  +       39928858   39929006   149
YourSeq  124    406    558  977    91.4%     chr3   +       133829270  133829424  155
YourSeq  124    418    561  977    93.7%     chr3   +       57553727   57553876   150
YourSeq  123    418    559  977    93.7%     chr17  +       27830270   27830412   143
YourSeq  122    430    564  977    95.6%     chr12  +       38290308   38290445   138
YourSeq  121    405    564  977    88.7%     chr13  +       68040409   68040573   165

Note: The 977 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
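The genomic coordinates of the two top-scoring hits can be used to cross-check the stated KO region size. This is a sketch under the assumption that the 2000 bp upstream section and the 977 bp downstream section directly abut the KO region on chr11 (the report does not state this explicitly); the names `span` and `region_between` are illustrative.

```python
# Top-scoring (100% identity) hits from the two BLAT searches: chr11, minus strand.
UPSTREAM_FLANK = (118193698, 118195697)     # section upstream of exon 2
DOWNSTREAM_FLANK = (118183995, 118184971)   # section downstream of exon 6


def span(start: int, end: int) -> int:
    """Length of an inclusive coordinate interval."""
    return end - start + 1


def region_between(lower: tuple, upper: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


print(span(*UPSTREAM_FLANK))                             # 2000
print(span(*DOWNSTREAM_FLANK))                           # 977
print(region_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK))  # 8726
```

Because Cyth1 lies on the reverse strand, the upstream section has the higher chromosome coordinates. Under the stated assumption, the gap between the two sections matches the ~8726 bp effective KO region.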


Gene and information: Cyth1 cytohesin 1 [ Mus musculus (house mouse) ] Gene ID: 19157, updated on 24-Oct-2019

Gene summary

Official Symbol: Cyth1 (provided by MGI)
Official Full Name: cytohesin 1 (provided by MGI)
Primary source: MGI:MGI:1334257
See related: Ensembl:ENSMUSG00000017132
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CLM1; CTH-1; CYTIP; Pscd1
Expression: Ubiquitous expression in cerebellum adult (RPKM 19.7), frontal lobe adult (RPKM 16.5) and 27 other tissues
Orthologs: human

Genomic context

Location: 11; 11 E2
Exon count: 18

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (118164166..118248616, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (118025480..118109906, complement)

Chromosome 11 - NC_000077.6


Transcript information: This gene has 7 transcripts

Gene: Cyth1 ENSMUSG00000017132

Description: cytohesin 1 [Source:MGI Symbol;Acc:MGI:1334257]
Gene Synonyms: CLM1, CTH-1, Pscd1
Location: Chromosome 11: 118,132,019-118,248,592 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 7 transcripts (splice variants), 259 orthologues, 15 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt         Flags
Cyth1-204  ENSMUST00000106305.8   3179  398aa       ENSMUSP00000101912.2  Protein coding  CCDS48996  Q9QX11          TSL:1, GENCODE basic, APPRIS ALT1
Cyth1-201  ENSMUST00000017276.13  3135  397aa       ENSMUSP00000017276.7  Protein coding  CCDS48997  Q8K3E8, Q9QX11  TSL:1, GENCODE basic, APPRIS ALT1
Cyth1-203  ENSMUST00000106302.8   3119  400aa       ENSMUSP00000101909.2  Protein coding  CCDS48995  Q3TZ02          TSL:1, GENCODE basic, APPRIS P4
Cyth1-202  ENSMUST00000100181.10  1592  460aa       ENSMUSP00000097756.4  Protein coding  -          A2A517          CDS 5' incomplete, TSL:1
Cyth1-207  ENSMUST00000151165.1   363   99aa        ENSMUSP00000114792.1  Protein coding  -          B1AQE4          CDS 3' incomplete, TSL:2
Cyth1-206  ENSMUST00000141243.7   545   No protein  -                     lncRNA          -          -               TSL:3
Cyth1-205  ENSMUST00000131115.1   355   No protein  -                     lncRNA          -          -               TSL:3


[Figure: Ensembl genome browser view of a 136.57 kb region of chromosome 11 (118.14 Mb to 118.24 Mb). Contigs: AL591204.14, AL591109.8, AL591404.4. Genes (Comprehensive set...), reverse strand: Dnah17-202, Dnah17-203, Dnah17-204, Cyth1-201, Cyth1-202, Cyth1-203, Cyth1-204 and Cyth1-207 (protein coding); Cyth1-205 and Cyth1-206 (lncRNA); Gm24060-201 (snRNA); Gm11737-201 (processed pseudogene). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; RNA gene).]


Transcript: ENSMUST00000106305

[Figure: Cyth1-204 (protein coding), reverse strand, 84.43 kb, with protein domain annotations for ENSMUSP00000101... on a scale of 0 to 398 aa. Coiled-coils (Ncoils). Superfamily: Sec7 domain superfamily, SSF50729. SMART, Pfam and PROSITE profiles: Sec7 domain, Pleckstrin homology domain. PANTHER: PTHR10663, PTHR10663:SF340. Gene3D: 1.10.220.20; Sec7, C-terminal domain superfamily; PH-like domain superfamily. CDD: Sec7 domain, cd01252. Sequence variants (dbSNP and all other sources); variant legend: synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
