https://www.alphaknockout.com

Mouse Shank1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Shank1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Shank1 gene (NCBI Reference Sequence: NM_001034115; Ensembl: ENSMUSG00000038738) is located on mouse chromosome 7. 25 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 25 (Transcript: ENSMUST00000107938). Exons 13~15 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Shank1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-361D8 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygous mutation of this gene results in smaller pyramidal neuron dendritic spines, smaller and thinner postsynaptic density of central excitatory synapses, weaker synaptic transmission, increased anxiety-related behavior, and impaired contextual fear memory, but enhanced spatial learning.

Exon 13 starts at about 26.89% of the coding region. Knockout of exons 13~15 will result in a frameshift of the gene. The size of intron 12 for 5'-loxP site insertion is 6065 bp, and the size of intron 15 for 3'-loxP site insertion is 8038 bp. The size of the effective cKO region is ~1268 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

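[Figure: schematic of the targeting strategy, showing four alleles. Wildtype allele: exons 1 to 25, with the 5' and 3' gRNA regions flanking exons 13~15. Targeting vector. Targeted allele. Constitutive KO allele (after Cre recombination). Legend: exon of mouse Shank1, homology arm, cKO region, loxP site.]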
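The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model. This is only a sketch: the 34-bp sequence is the commonly cited loxP site, while the flanking and "floxed" sequences and the function name `cre_excise` are hypothetical placeholders, not Shank1 sequence or part of this project's pipeline.

```python
# Toy model of Cre-mediated excision between two loxP sites in the same
# orientation. Placeholder sequences only; not real Shank1 sequence.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # commonly cited 34-bp loxP site


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Remove the segment between the first two loxP sites, leaving one site."""
    first = allele.find(site)
    if first == -1:
        return allele  # no loxP site: nothing to recombine
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele  # only one loxP site: nothing to excise
    # Keep everything up to and including the first site, then resume
    # after the second site, so a single loxP site remains.
    return allele[: first + len(site)] + allele[second + len(site):]


# Hypothetical targeted allele: 5' flank, loxP, floxed region, loxP, 3' flank.
targeted = "GGGG" + LOXP + "CCCCTTTT" + LOXP + "AAAA"
ko_allele = cre_excise(targeted)
print(len(targeted) - len(ko_allele), "bp removed")
```

In this model the floxed segment and one loxP copy are lost, which mirrors the single remaining loxP site drawn on the constitutive KO allele.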


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself (axis label: Sequence 12), showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (axis label: Sequence 12).]

Summary: Full Length(7768bp) | A(18.15% 1410) | C(28.06% 2180) | T(27.23% 2115) | G(26.56% 2063)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Significant high GC-content regions are found. It may be difficult to construct this targeting vector.
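The overall GC content implied by the base counts in the summary can be worked out directly. The sketch below uses only the counts reported above; the helper `windowed_gc` is a hypothetical illustration of a per-window calculation and assumes non-overlapping 300 bp windows, which may differ from how the plot in this report was produced.

```python
# Base counts taken from the summary line of this report.
counts = {"A": 1410, "C": 2180, "T": 2115, "G": 2063}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300) -> list[float]:
    """GC fraction of each consecutive, non-overlapping window (assumption)."""
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


print(f"total = {total} bp, overall GC = {gc_percent:.2f}%")
```

The counts sum to the stated full length of 7768 bp, and the overall GC content comes out a little above 50%; the regions flagged in the note are local peaks that only a windowed calculation would reveal.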


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   +       44330043   44333042   3000
YourSeq  276    686    1055  3000   97.7%     chr3   -       129236924  129237514  591
YourSeq  256    693    1054  3000   95.4%     chr7   -       118187042  118187474  433
YourSeq  247    685    974   3000   91.5%     chr8   +       15680795   15681072   278
YourSeq  237    685    1002  3000   91.3%     chr19  -       17668449   17668736   288
YourSeq  237    695    950   3000   97.3%     chr4   +       33695569   33695955   387
YourSeq  237    711    1055  3000   93.8%     chr17  +       66230455   66230899   445
YourSeq  235    695    1055  3000   91.6%     chr10  +       95991770   95992070   301
YourSeq  231    754    1054  3000   94.9%     chr9   +       22392473   22392882   410
YourSeq  229    695    1055  3000   89.8%     chr10  -       88880205   88880472   268
YourSeq  225    697    1055  3000   91.4%     chr5   +       88800707   88800937   231
YourSeq  225    695    1030  3000   96.7%     chr4   +       33695576   33695955   380
YourSeq  215    700    1055  3000   87.5%     chr10  +       7486636    7486945    310
YourSeq  213    685    943   3000   94.7%     chr16  -       94115544   94115899   356
YourSeq  209    699    1055  3000   89.5%     chr12  +       67693595   67693848   254
YourSeq  204    733    976   3000   94.9%     chrX   -       7182756    7183037    282
YourSeq  203    772    1012  3000   91.5%     chr5   +       126017776  126018010  235
YourSeq  199    700    1054  3000   92.6%     chr8   +       102122588  102122829  242
YourSeq  196    695    1056  3000   96.4%     chr2   -       140018940  140019380  441
YourSeq  193    787    1055  3000   91.5%     chr5   -       65310135   65310686   552

Note: The 3000 bp section upstream of Exon 13 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   +       44334311   44337310   3000
YourSeq  39     2557   2633  3000   93.2%     chr1   -       39145875   39145972   98
YourSeq  36     2557   2633  3000   92.9%     chr12  -       19884180   19884427   248
YourSeq  32     2605   2649  3000   92.2%     chr5   +       105403636  105403680  45
YourSeq  31     2600   2638  3000   92.4%     chr5   -       142045666  142045708  43
YourSeq  29     2600   2633  3000   84.4%     chr2   -       179048894  179048925  32
YourSeq  27     2602   2634  3000   93.4%     chr5   -       103014262  103014303  42
YourSeq  27     1351   1389  3000   79.4%     chr2   -       170317206  170317239  34
YourSeq  25     1357   1389  3000   75.0%     chr6   +       122666632  122666659  28
YourSeq  24     1595   1624  3000   84.7%     chr3   +       143806061  143806088  28
YourSeq  23     1691   1715  3000   96.0%     chr15  -       93336884   93336908   25
YourSeq  22     2720   2743  3000   87.0%     chr1   -       113330979  113331001  23

Note: The 3000 bp section downstream of Exon 15 is BLAT searched against the genome. No significant similarity is found.
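The self-hits in the two BLAT tables (the 100.0% matches on chr7) locate the upstream and downstream 3000 bp sections, and the gap between them should correspond to the cKO region. The sketch below checks this, assuming the coordinates are 1-based and inclusive, as the reported SPAN values suggest; the variable and function names are illustrative only.

```python
# chr7 coordinates of the 100% self-hits from the two BLAT tables.
upstream_start, upstream_end = 44_330_043, 44_333_042      # 3000 bp before exon 13
downstream_start, downstream_end = 44_334_311, 44_337_310  # 3000 bp after exon 15


def span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval (assumed convention)."""
    return end - start + 1


# The cKO region is whatever lies strictly between the two flanking sections.
cko_start = upstream_end + 1
cko_end = downstream_start - 1
cko_size = span(cko_start, cko_end)

print(f"cKO region: chr7:{cko_start}-{cko_end} ({cko_size} bp)")
```

Under that assumption the gap is 1268 bp, which agrees with the effective cKO region size of ~1268 bp stated in the strategy summary.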


Gene and information: Shank1, SH3 and multiple ankyrin repeat domains 1 [Mus musculus (house mouse)]. Gene ID: 243961, updated on 1-Oct-2019

Gene summary

Official Symbol: Shank1 (provided by MGI)

Official Full Name: SH3 and multiple ankyrin repeat domains 1 (provided by MGI)

Primary source: MGI:MGI:3613677

See related: Ensembl:ENSMUSG00000038738

Gene type: protein coding

RefSeq status: VALIDATED

Organism: Mus musculus

Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus

Expression: Biased expression in frontal lobe adult (RPKM 16.9), cortex adult (RPKM 13.3) and 9 other tissues

Orthologs: human

Genomic context

Location: 7; 7 B3

Exon count: 32

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (44308916..44360094)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (51565634..51613723)

Chromosome 7 - NC_000073.6


Transcript information: This gene has 6 transcripts

Gene: Shank1 ENSMUSG00000038738

Description: SH3 and multiple ankyrin repeat domains 1 [Source:MGI Symbol;Acc:MGI:3613677]

Location: Chromosome 7: 44,310,253-44,360,572 forward strand. GRCm38:CM001000.2

About this gene: This gene has 6 transcripts (splice variants), 223 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 12 phenotypes.
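As a quick consistency check, the length of the locus can be derived from the coordinates quoted in this report. The sketch below assumes 1-based, inclusive coordinates for both the Ensembl and the NCBI (GRCm38.p6) locations; the function name `span_bp` is illustrative.

```python
def span_bp(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval (assumed convention)."""
    return end - start + 1


# Ensembl gene location on chromosome 7 (GRCm38).
ensembl_span = span_bp(44_310_253, 44_360_572)

# NCBI location on NC_000073.6 (GRCm38.p6), from the genomic context table.
ncbi_span = span_bp(44_308_916, 44_360_094)

print(f"Ensembl: {ensembl_span} bp ({ensembl_span / 1000:.2f} kb)")
print(f"NCBI:    {ncbi_span} bp ({ncbi_span / 1000:.2f} kb)")
```

The Ensembl span works out to 50.32 kb, the same figure shown on the transcript view for ENSMUST00000107938. The NCBI annotation uses slightly different gene boundaries, so its span differs by under 1 kb.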

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Shank1-203  ENSMUST00000107938.7  9826  2167aa      ENSMUSP00000103571.1  Protein coding   CCDS52229  D3YZU1   TSL:5 GENCODE basic APPRIS P2
Shank1-202  ENSMUST00000107935.7  6649  2159aa      ENSMUSP00000103568.1  Protein coding   -          D3YZU4   TSL:5 GENCODE basic APPRIS ALT2
Shank1-201  ENSMUST00000107934.1  6477  2158aa      ENSMUSP00000103567.1  Protein coding   -          D3YZU5   TSL:5 GENCODE basic APPRIS ALT2
Shank1-206  ENSMUST00000154776.1  987   No protein  -                     Retained intron  -          -        TSL:1
Shank1-204  ENSMUST00000127164.1  784   No protein  -                     Retained intron  -          -        TSL:2
Shank1-205  ENSMUST00000134470.7  608   No protein  -                     lncRNA           -          -        TSL:3


[Figure: Ensembl genomic view of the Shank1 locus (70.32 kb). Forward strand: Shank1-203, Shank1-202 and Shank1-201 (protein coding), Shank1-205 (lncRNA), Shank1-204 and Shank1-206 (retained intron), Gm44780-201 (lncRNA), Gm44757-201 (TEC). Contig: AC152939.2. Reverse strand: Gm7238-201 (processed pseudogene), Clec11a-201 (protein coding), 1700008O03Rik-201 and 1700008O03Rik-205 (protein coding), 1700008O03Rik-203 (retained intron), 1700008O03Rik-204 (lncRNA). Tracks include the Regulatory Build (CTCF, open chromatin, promoter, promoter flank, transcription factor binding site).]


Transcript: ENSMUST00000107938

[Figure: Ensembl protein domain view of Shank1-203 (protein coding; 50.32 kb, forward strand; protein scale 0 to 2167 aa). Annotated features: ankyrin repeats, SH3 domain, PDZ domain and sterile alpha motif (SAM) domain, as reported by Superfamily, SMART, Pfam, PROSITE profiles, PANTHER (PTHR24135, PTHR24135:SF3), Gene3D (3.10.20.90, 1.25.40.960, 2.30.30.40, 2.30.42.10) and CDD (cd17175, cd00992, cd09506), plus MobiDB lite, low complexity (Seg) and coiled-coil (Ncoils) regions. Sequence variants (dbSNP and all other sources) are shown as missense and synonymous variants.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
