https://www.alphaknockout.com

Mouse Clstn1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Clstn1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Clstn1 gene (NCBI Reference Sequence: NM_023051; Ensembl: ENSMUSG00000039953) is located on mouse chromosome 4. Nineteen exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 19 (Transcript: ENSMUST00000039144). Exons 3~4 are selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Clstn1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP24-391E9 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Juvenile mice homozygous for a null allele show reduced basal excitatory synaptic transmission, abnormal excitatory postsynaptic currents, enhanced NMDA receptor-dependent long-term potentiation, and delayed dendritic spine maturation in CA1 hippocampal pyramidal cells.

Exon 3 starts at about 7.32% of the coding region. Knockout of exons 3~4 will result in a frameshift of the gene. The size of intron 2, for 5'-loxP site insertion, is 11914 bp, and the size of intron 4, for 3'-loxP site insertion, is 1895 bp. The size of the effective cKO region is ~1821 bp. The cKO region does not contain any other known gene.
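The frameshift reasoning above can be sketched in a few lines of Python. This is only an illustration: the 979 aa protein length comes from the transcript table for ENSMUST00000039144, but the report does not state whether the 7.32% figure counts the stop codon, and it does not list the coding lengths of exons 3 and 4. The exon lengths below are therefore hypothetical placeholders, and all function names are made up for this example.

```python
def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding length in nucleotides for a protein of the given length."""
    return (protein_aa + (1 if include_stop else 0)) * 3


def approx_cds_position(fraction: float, cds_nt: int) -> int:
    """Approximate nucleotide offset for a fractional position in the CDS."""
    return round(fraction * cds_nt)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame unless its length is a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# 979 aa is taken from the report; counting the stop codon is an assumption.
cds_nt = cds_length_nt(979)
exon3_start = approx_cds_position(0.0732, cds_nt)

# HYPOTHETICAL coding lengths for exons 3 and 4 (not given in the report).
hypothetical_exon_lengths = [100, 100]
deleted = sum(hypothetical_exon_lengths)

print(f"CDS length: {cds_nt} nt, exon 3 starts near nt {exon3_start}")
print(f"Deleting {deleted} coding nt causes frameshift: {causes_frameshift(deleted)}")
```

With the real exon coding lengths substituted for the placeholders, the same modulo-3 check would confirm the frameshift stated in the report.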


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3'; exons 1, 3, 4, 5 and 19 labeled, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Clstn1; homology arm; cKO region; loxP site.]
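The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model of Cre recombination. This is a minimal sketch, not the vendor's procedure. It assumes the canonical 34 bp loxP sequence and two sites in the same orientation, neither of which is stated in the report. The function name `cre_excise` and the short flanking sequences are hypothetical.

```python
# Canonical 34 bp loxP site (assumption: the report does not give the sequence).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Remove the sequence between the first two same-orientation loxP sites.

    One loxP site remains after excision. If fewer than two sites are found,
    the allele is returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[: first + len(site)] + allele[second + len(site):]


# Toy targeted allele: 5' flank, loxP, floxed region, loxP, 3' flank.
targeted = "AAAA" + LOXP + "CCCCGGGG" + LOXP + "TTTT"
ko = cre_excise(targeted)
print(len(targeted), len(ko))
```

In this model the floxed segment and one loxP copy are lost, which mirrors how deletion of exons 3~4 would leave a single loxP site in the constitutive KO allele.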


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
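A self dot plot of this kind can be approximated by listing off-diagonal exact word matches. The sketch below uses a 10 bp word to match the stated window size. Exact matching is an assumption, since the report does not say which tool or scoring was used. The function names and test sequences are illustrative only.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot_matches(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) positions sharing a word.

    forward: i < j and seq[i:i+window] == seq[j:j+window]
    reverse: seq[i:i+window] equals the reverse complement of seq[j:j+window]
    """
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward = sorted(
        (i, j) for pos in index.values() for i in pos for j in pos if i < j
    )

    rc = revcomp(seq)
    n = len(seq)
    reverse = []
    for k in range(n - window + 1):
        # rc[k:k+window] is the reverse complement of seq[n-k-window:n-k]
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, n - k - window))
    return forward, sorted(reverse)


# A toy sequence with one tandem duplication of a 10 bp unit.
tandem = "TTTTT" + "ACGGTCATTG" * 2 + "CCCCC"
fwd, _ = dot_plot_matches(tandem)
print(fwd)
```

Runs of forward matches parallel to the main diagonal would indicate tandem repeats. An empty or sparse list is consistent with the "no significant tandem repeat" conclusion.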

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(8321bp) | A(25.78% 2145) | C(23.92% 1990) | T(26.73% 2224) | G(23.58% 1962)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
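The summary figures above can be cross-checked, and a sliding-window GC profile computed, with a short script. This is a sketch only. The base counts are copied from the Summary line and the 300 bp window from the heading, but the step size and function names are assumptions, and the actual 8321 bp sequence is not included in the report.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Map each base to (count, percent of total length)."""
    counts = Counter(seq.upper())
    total = len(seq)
    return {b: (counts[b], round(100 * counts[b] / total, 2)) for b in "ACTG"}


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage for each window of the given size along the sequence."""
    seq = seq.upper()
    out = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        out.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return out


# Base counts as reported in the Summary line.
summary_counts = {"A": 2145, "C": 1990, "T": 2224, "G": 1962}
total = sum(summary_counts.values())
overall_gc = round(100 * (summary_counts["G"] + summary_counts["C"]) / total, 2)
print(total, overall_gc)
```

The counts sum to the stated full length of 8321 bp and give an overall GC content of about 47.5%, consistent with the absence of a high-GC region.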


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END   SPAN
------------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr4   +       149622838  149625837   3000
YourSeq    267      1    476   3000     88.0%  chr7   +        43407213   43407650    438
YourSeq    238      1    476   3000     90.8%  chr7   +        43407015   43407598    584
YourSeq    235      5    421   3000     87.6%  chr8   -        76619560   76620036    477
YourSeq    233      1    452   3000     90.0%  chr5   -       119924956  119925540    585
YourSeq    224      1    464   3000     90.1%  chr6   -       139721437  139721924    488
YourSeq    217      1    420   3000     93.0%  chr5   -        76757986   76758547    562
YourSeq    215      2    432   3000     89.1%  chr12  -        76179883   76180419    537
YourSeq    214      1    364   3000     90.4%  chr3   -         9049902    9050307    406
YourSeq    208     34    463   3000     85.8%  chr11  +       120439874  120440230    357
YourSeq    205      1    400   3000     85.7%  chr5   +       144277828  144278159    332
YourSeq    202      3    394   3000     90.2%  chr11  +       118886919  118887492    574
YourSeq    197      5    388   3000     89.5%  chr9   -       101057985  101058692    708
YourSeq    195      2    497   3000     90.3%  chr11  +       109877562  109878465    904
YourSeq    191      2    479   3000     84.7%  chr11  +        97620979   97621328    350
YourSeq    186      1    440   3000     82.7%  chr4   -        32356662   32356937    276
YourSeq    185      3    406   3000     90.0%  chr8   -        76619465   76620006    542
YourSeq    182     28    452   3000     83.1%  chr12  -        76180051   76180397    347
YourSeq    179      1    380   3000     83.9%  chr5   +       144277882  144278157    276
YourSeq    177      1    338   3000     87.1%  chr3   -         9049918    9050227    310

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END   SPAN
------------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr4   +       149627659  149630658   3000
YourSeq    193   2467   2734   3000     91.1%  chr4   +       134101107  134101553    447
YourSeq    190   2466   2734   3000     92.2%  chr5   -        29732685   29733339    655
YourSeq    189   2456   2737   3000     91.1%  chr19  -         4278567    4279167    601
YourSeq    154   2428   2620   3000     93.3%  chr10  -        52291175   52303254  12080
YourSeq    151   2428   2612   3000     91.8%  chr14  -        30841479   30841671    193
YourSeq    151   2475   2684   3000     93.2%  chr15  +        81502440   81503046    607
YourSeq    150   2185   2613   3000     89.1%  chr13  -        50375153   50375581    429
YourSeq    143   2465   2657   3000     91.4%  chr4   -       129539469  129539822    354
YourSeq    143   2426   2617   3000     86.7%  chrX   +       122849010  122849189    180
YourSeq    143   2449   2614   3000     94.5%  chr1   +       126907735  126907902    168
YourSeq    142   2427   2612   3000     88.1%  chr13  -        12075775   12075947    173
YourSeq    142   2454   3000   3000     82.0%  chr10  -         4229975    4230390    416
YourSeq    142   2454   2622   3000     94.0%  chr9   +        48980238   48980407    170
YourSeq    142   2432   2613   3000     92.9%  chr19  +        49331299   49331492    194
YourSeq    141   2445   2637   3000     90.9%  chr17  -        28403876   28404070    195
YourSeq    139   2423   2611   3000     89.1%  chrX   +       107964171  107964379    209
YourSeq    137   2434   2612   3000     87.7%  chr7   -        19506027   19506190    164
YourSeq    136   2459   2622   3000     91.9%  chr18  -        81003345   81003514    170
YourSeq    135   2426   2612   3000     85.3%  chr14  +       109731901  109732075    175

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
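One way to read the BLAT tables is to compare each secondary hit with the full-length self hit, for example by the fraction of the 3000 bp query it covers. The sketch below parses two rows copied from the upstream table. The report does not state which criterion was used to judge "significant similarity", so query coverage is only an illustrative metric, and the names `BlatHit`, `parse_blat_row` and `query_coverage` are invented for this example.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    fields = row.split()
    # Tolerate the "browser details" link labels that precede each row
    # in the raw BLAT web output.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query sequence covered by the aligned block."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr4 + 149622838 149625837 3000",
    "YourSeq 267 1 476 3000 88.0% chr7 + 43407213 43407650 438",
]
hits = [parse_blat_row(r) for r in rows]
for h in hits:
    print(h.chrom, h.score, f"{query_coverage(h):.3f}")
```

In this example the best secondary upstream hit covers roughly 16% of the query at 88.0% identity, compared with the 100% self match on chr4.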


Gene and information:

Clstn1 calsyntenin 1 [Mus musculus (house mouse)]
Gene ID: 65945, updated on 12-Aug-2019

Gene summary

Official Symbol: Clstn1 (provided by MGI)
Official Full Name: calsyntenin 1 (provided by MGI)
Primary source: MGI:MGI:1929895
See related: Ensembl:ENSMUSG00000039953
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cst-1; Cstn1; 1810034E21Rik
Expression: Broad expression in cortex adult (RPKM 93.0), frontal lobe adult (RPKM 80.3) and 23 other tissues
Orthologs: human, all

Genomic context

Location: 4; 4 E2

Exon count: 21

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (149585111..149648899)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (148960747..149022008)

Chromosome 4 - NC_000070.6


Transcript information: This gene has 4 transcripts

Gene: Clstn1 ENSMUSG00000039953

Description: calsyntenin 1 [Source:MGI Symbol;Acc:MGI:1929895]
Gene Synonyms: 1810034E21Rik, Cst-1, alcadein alpha, calsyntenin-1
Location: Chromosome 4: 149,586,468-149,648,899 forward strand. GRCm38:CM000997.2
About this gene: This gene has 4 transcripts (splice variants), 205 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 15 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Clstn1-202  ENSMUST00000105691.7  4459  969aa       ENSMUSP00000101316.1  Protein coding   CCDS71522  Q9EPL2   TSL:1 GENCODE basic APPRIS ALT2
Clstn1-201  ENSMUST00000039144.6  3319  979aa       ENSMUSP00000036962.6  Protein coding   CCDS18963  Q9EPL2   TSL:1 GENCODE basic APPRIS P3
Clstn1-204  ENSMUST00000151895.1  822   No protein  -                     Retained intron  -          -        TSL:2
Clstn1-203  ENSMUST00000137232.1  358   No protein  -                     lncRNA           -          -        TSL:3


[Figure: Ensembl genome browser view of an 82.43 kb region of chromosome 4 (axis ticks from 149.58 Mb to 149.64 Mb). Forward strand: Clstn1-202 (protein coding), Clstn1-201 (protein coding), Clstn1-203 (lncRNA), Clstn1-204 (retained intron). Contig: AL607078.26. Reverse strand: Pik3cd-204, Pik3cd-201, Pik3cd-214, Pik3cd-206, Pik3cd-205, Pik3cd-202 and Pik3cd-203 (all protein coding), and Mir7023-201 (miRNA). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000039144

[Figure: Ensembl protein view of Clstn1-201 (protein coding; 61.26 kb, forward strand), protein ENSMUSP00000036..., scale bar 0 to 979. Tracks: Transmembrane heli...; PDB-ENSP mappings; MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Cleavage site (Sign...; Superfamily (Cadherin-like superfamily; Concanavalin A-like lectin/glucanase domain superfamily); SMART (Cadherin-like); Prints (Cadherin-like); Pfam (Cadherin-like; PF13385); PROSITE profiles (PS50268); PANTHER (Calsyntenin; PTHR14139:SF4); Gene3D (2.60.40.60; 2.60.120.200); CDD (cd11304); sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
