https://www.alphaknockout.com

Mouse Clstn1 Knockout Project (CRISPR/Cas9)

Objective: To create a Clstn1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Clstn1 gene (NCBI Reference Sequence: NM_023051; Ensembl: ENSMUSG00000039953) is located on mouse chromosome 4. A total of 19 exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 19 (Transcript: ENSMUST00000039144). Exons 3~9 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Juvenile mice homozygous for a null allele show reduced basal excitatory synaptic transmission, abnormal excitatory postsynaptic currents, enhanced NMDA receptor-dependent long-term potentiation, and delayed dendritic spine maturation in CA1 hippocampal pyramidal cells.

Exon 3 starts at about 7.32% of the coding region. Exons 3~9 cover 38.88% of the coding region. The size of the effective KO region is ~9269 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 3, 4, 5, 6, 7, 8, 9 and 19 labeled and two gRNA regions marked. Legend: exon of mouse Clstn1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot, labeled "Sequence 12", showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
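The self-alignment described in the note can be sketched in Python as follows. This is only an illustration of the general dot plot idea with the 15 bp window stated above; the function names and the exact-match rule are assumptions, and the tool actually used to produce the report's plots is not named in this document.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Find exact window matches of a sequence against itself.

    Returns two sorted lists of (i, j) start positions (0-based):
    forward matches with i < j, and reverse-complement matches where
    the window at j equals the reverse complement of the window at i.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for word, positions in index.items():
        # Forward dots: the same word at two different offsets.
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                forward.append((positions[a], positions[b]))
        # Reverse dots: the reverse complement of the word occurs somewhere.
        for j in index.get(reverse_complement(word), ()):
            for i in positions:
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy input (not taken from the report): one 15 bp unit repeated after a spacer.
unit = "ACGGTCATTGCAGGA"
forward_hits, reverse_hits = self_dot_plot(unit + "TTTT" + unit)
print(forward_hits)
```

Long diagonal runs of forward dots away from the main diagonal would indicate tandem or dispersed repeats; a region with none, as reported here, is easier to amplify and sequence.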

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot, labeled "Sequence 12", showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 9 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot, labeled "Sequence 12".]

Summary: Full Length(2000bp) | A(27.4% 548) | C(18.85% 377) | T(29.25% 585) | G(24.5% 490)

Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
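A sliding-window GC scan of this kind might look like the sketch below. The 300 bp window comes from the report; the step size, the 65% cut-off for "high GC", and the function names are illustrative assumptions, since the report does not state the threshold it applies.

```python
def gc_windows(seq, window=300, step=1):
    """Return (start, gc_percent) for each window along the sequence."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100.0 * gc / window))
    return result


def high_gc_windows(seq, window=300, step=1, threshold=65.0):
    """Return the windows whose GC percentage exceeds the threshold.

    The default threshold is a hypothetical value chosen for illustration.
    """
    return [(s, p) for s, p in gc_windows(seq, window, step) if p > threshold]


# Toy input (not taken from the report): a GC-only block followed by an AT-only block.
toy = "GC" * 150 + "AT" * 150
print(gc_windows(toy, window=300, step=300))
print(high_gc_windows(toy, window=300, step=300))
```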

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot, labeled "Sequence 12".]

Summary: Full Length(2000bp) | A(26.2% 524) | C(21.05% 421) | T(27.95% 559) | G(24.8% 496)

Note: The 2000 bp section downstream of Exon 9 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
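The overall GC content of each 2000 bp section follows directly from the base counts in the two summaries. The short check below uses only those reported counts; the variable and function names are illustrative.

```python
# Base counts copied from the two "Summary" lines of this report.
UPSTREAM = {"A": 548, "C": 377, "T": 585, "G": 490}
DOWNSTREAM = {"A": 524, "C": 421, "T": 559, "G": 496}


def gc_percent(counts):
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


for label, counts in (("upstream", UPSTREAM), ("downstream", DOWNSTREAM)):
    print(label, sum(counts.values()), "bp,", round(gc_percent(counts), 2), "% GC")
```

Both sections sum to 2000 bp, and the overall GC content comes out at 43.35% upstream and 45.85% downstream.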


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  2000   1       2000  2000   100.0%    chr4   +       149624088  149626087  2000
YourSeq  154    6       303   2000   90.0%     chr2   +       154415601  154415904  304
YourSeq  154    8       287   2000   92.8%     chr11  +       104428539  104428985  447
YourSeq  151    6       282   2000   94.2%     chr5   -       92492497   92492965   469
YourSeq  142    1       159   2000   95.0%     chr5   +       108128762  108128921  160
YourSeq  141    6       160   2000   96.2%     chr15  -       73404327   73404483   157
YourSeq  141    6       156   2000   97.4%     chr14  -       70697019   70697172   154
YourSeq  141    6       289   2000   90.8%     chr11  -       80591910   80592240   331
YourSeq  141    6       159   2000   96.2%     chr2   +       153367615  153367769  155
YourSeq  141    6       160   2000   96.2%     chr14  +       65047088   65047461   374
YourSeq  140    6       163   2000   92.3%     chr1   -       132320173  132320327  155
YourSeq  140    6       159   2000   96.2%     chr11  +       72762022   72762177   156
YourSeq  139    6       159   2000   95.5%     chr18  -       77852976   77860596   7621
YourSeq  139    1       159   2000   94.4%     chr10  -       75574638   75575070   433
YourSeq  139    6       165   2000   93.8%     chr6   +       100822442  100822602  161
YourSeq  139    9       169   2000   93.8%     chr16  +       91438034   91438204   171
YourSeq  138    1       159   2000   94.4%     chr5   -       142831092  142831533  442
YourSeq  138    8       157   2000   96.0%     chrX   +       41902792   41902941   150
YourSeq  138    6       160   2000   94.9%     chr11  +       62161050   62161207   158
YourSeq  137    6       159   2000   97.3%     chr6   -       149148356  149148509  154

Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
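One way to read the BLAT output is to compare how much of the 2000 bp query each hit covers. The sketch below parses three rows copied from the upstream results (with the "browser details" link text dropped) and computes that coverage; the `BlatHit` type and the function names are hypothetical, and coverage is just one plausible measure of "significant similarity", not necessarily the criterion used in the report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int


def parse_hit(row):
    """Parse one whitespace-separated BLAT row that starts with the query name."""
    f = row.split()
    # f[0] is the query name (YourSeq); the trailing SPAN field is not used here.
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7], int(f[8]), int(f[9]))


def query_coverage(hit):
    """Percentage of the query covered by the aligned query interval."""
    return 100.0 * (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream BLAT results in this report.
ROWS = [
    "YourSeq 2000 1 2000 2000 100.0% chr4 + 149624088 149626087 2000",
    "YourSeq 154 6 303 2000 90.0% chr2 + 154415601 154415904 304",
    "YourSeq 154 8 287 2000 92.8% chr11 + 104428539 104428985 447",
]

hits = [parse_hit(row) for row in ROWS]
for hit in hits:
    print(hit.chrom, hit.score, round(query_coverage(hit), 1))
```

The chr4 hit is the query locus itself and covers the whole query, while the next-best hits cover only about 14 to 15% of it.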

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  2000   1       2000  2000   100.0%    chr4   +       149635357  149637356  2000
YourSeq  239    255     720   2000   92.5%     chr2   +       30728405   30729036   632
YourSeq  221    280     675   2000   88.5%     chr14  -       48755120   48755481   362
YourSeq  213    259     712   2000   84.6%     chr10  -       80309943   80310293   351
YourSeq  207    275     675   2000   92.4%     chr15  -       79459651   79460222   572
YourSeq  189    275     676   2000   86.3%     chr13  +       93682022   93682382   361
YourSeq  178    280     634   2000   93.2%     chrX   +       152283059  152283681  623
YourSeq  158    230     405   2000   96.0%     chr2   -       37853039   37853225   187
YourSeq  154    252     599   2000   94.8%     chr2   -       120888965  120889518  554
YourSeq  151    275     647   2000   93.2%     chr15  -       99761552   99761993   442
YourSeq  150    231     403   2000   94.8%     chr17  +       73100972   73101148   177
YourSeq  149    231     403   2000   94.2%     chr14  +       72875393   72875582   190
YourSeq  147    236     405   2000   95.3%     chr8   -       41601042   41601219   178
YourSeq  147    233     403   2000   93.6%     chr4   +       133173986  133174158  173
YourSeq  146    242     405   2000   95.7%     chr10  -       50953940   50954112   173
YourSeq  145    231     399   2000   95.0%     chr1   -       142591186  142591354  169
YourSeq  145    230     399   2000   94.1%     chr13  +       43573783   43573960   178
YourSeq  144    248     403   2000   96.8%     chr19  +       6029740    6029907    168
YourSeq  143    230     399   2000   91.7%     chr18  -       18903478   18903639   162
YourSeq  143    251     405   2000   98.1%     chr6   +       72160796   72160956   161

Note: The 2000 bp section downstream of Exon 9 is BLAT searched against the genome. No significant similarity is found.
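The top hit in each BLAT table is the query itself on chr4, so the two coordinate pairs can be used to cross-check the stated size of the effective KO region. The sketch below assumes that the two 2000 bp sections sit immediately on either side of the KO region; the report does not say so explicitly, but the arithmetic is consistent with that reading. The names used are illustrative.

```python
# chr4 coordinates of the self-hits reported in the two BLAT tables.
UPSTREAM_SECTION = (149624088, 149626087)
DOWNSTREAM_SECTION = (149635357, 149637356)


def bases_between(left_end, right_start):
    """Number of bases strictly between two 1-based inclusive positions."""
    return right_start - left_end - 1


ko_size = bases_between(UPSTREAM_SECTION[1], DOWNSTREAM_SECTION[0])
print(ko_size)  # 9269
```

The interval between the two sections is 9269 bp, which agrees with the effective KO region size of ~9269 bp given in the strategy summary.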


Gene and information: Clstn1 calsyntenin 1 [ Mus musculus (house mouse) ] Gene ID: 65945, updated on 12-Aug-2019

Gene summary

Official Symbol: Clstn1 (provided by MGI)
Official Full Name: calsyntenin 1 (provided by MGI)
Primary source: MGI:MGI:1929895
See related: Ensembl:ENSMUSG00000039953
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cst-1; Cstn1; 1810034E21Rik
Expression: Broad expression in cortex adult (RPKM 93.0), frontal lobe adult (RPKM 80.3) and 23 other tissues
Orthologs: human

Genomic context

Location: 4; 4 E2
Exon count: 21

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (149585111..149648899)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (148960747..149022008)

[Figure: genomic context, Chromosome 4 - NC_000070.6]


Transcript information: This gene has 4 transcripts

Gene: Clstn1 ENSMUSG00000039953

Description: calsyntenin 1 [Source:MGI Symbol;Acc:MGI:1929895]
Gene Synonyms: 1810034E21Rik, Cst-1, alcadein alpha, calsyntenin-1
Location: Chromosome 4: 149,586,468-149,648,899 forward strand. GRCm38:CM000997.2
About this gene: This gene has 4 transcripts (splice variants), 205 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 15 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Clstn1-202  ENSMUST00000105691.7  4459  969aa       ENSMUSP00000101316.1  Protein coding   CCDS71522  Q9EPL2   TSL:1 GENCODE basic APPRIS ALT2
Clstn1-201  ENSMUST00000039144.6  3319  979aa       ENSMUSP00000036962.6  Protein coding   CCDS18963  Q9EPL2   TSL:1 GENCODE basic APPRIS P3
Clstn1-204  ENSMUST00000151895.1  822   No protein  -                     Retained intron  -          -        TSL:2
Clstn1-203  ENSMUST00000137232.1  358   No protein  -                     lncRNA           -          -        TSL:3


[Figure: Ensembl region view, 82.43 kb, chromosome 4 from about 149.58 Mb to 149.64 Mb. Transcripts drawn with ">": Clstn1-202 (protein coding), Clstn1-201 (protein coding), Clstn1-203 (lncRNA), Clstn1-204 (retained intron). Contig: AL607078.26. Transcripts drawn with "<": Pik3cd-204, Pik3cd-201, Pik3cd-214, Pik3cd-206, Pik3cd-205, Pik3cd-202, Pik3cd-203 (all protein coding) and Mir7023-201 (miRNA). A Regulatory Build track is also shown. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000039144

[Figure: Ensembl protein summary for Clstn1-201 (protein coding), 61.26 kb, forward strand; scale bar 0 to 979. Feature tracks: ENSMUSP00000036..., Transmembrane heli..., PDB-ENSP mappings, MobiDB lite, Low complexity (Seg), Coiled-coils (Ncoils), Cleavage site (Sign..., Superfamily (Cadherin-like superfamily; Concanavalin A-like lectin/glucanase domain superfamily), SMART (Cadherin-like), Prints (Cadherin-like), Pfam (Cadherin-like; PF13385), PROSITE profiles (PS50268), PANTHER (Calsyntenin; PTHR14139:SF4), Gene3D (2.60.40.60; 2.60.120.200), CDD (cd11304), and sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
