https://www.alphaknockout.com

Mouse Szt2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Szt2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Szt2 gene (NCBI Reference Sequence: NM_198170; Ensembl: ENSMUSG00000033253) is located on mouse chromosome 4. It has 72 exons, with the ATG start codon in exon 1 and the TGA stop codon in exon 72 (Transcript: ENSMUST00000075406). Exon 42 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Szt2 gene. To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP23-391J10 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for mutations in this gene display increased susceptibility to induced seizures. Mice homozygous for null mutations also display partial penetrance of prenatal lethality.

Exon 42 starts at about 57.34% of the coding region. Knocking out exon 42 will cause a frameshift in the gene. Intron 41, which receives the 5'-loxP site, is 1057 bp long; intron 42, which receives the 3'-loxP site, is 1766 bp long. The effective cKO region is ~630 bp and contains no other known gene.
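The position and frameshift statements above come down to simple arithmetic, sketched below. The 3431-aa protein length is taken from the transcript table for ENSMUST00000075406; counting the stop codon as part of the coding region is an assumption, and the exon lengths passed to `causes_frameshift` are hypothetical because the report does not state the length of exon 42.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


def codon_at_fraction(protein_len_aa: int, fraction: float) -> int:
    """Approximate 1-based codon index located at `fraction` of the coding region."""
    cds_len_nt = (protein_len_aa + 1) * 3  # assumption: stop codon counted in the CDS
    nt_pos = max(1, round(cds_len_nt * fraction))
    return (nt_pos - 1) // 3 + 1


if __name__ == "__main__":
    # 3431 aa is the protein length reported for ENSMUST00000075406.
    print(codon_at_fraction(3431, 0.5734))
    # Hypothetical exon lengths, for illustration only.
    for length in (150, 151):
        print(length, causes_frameshift(length))
```

Under these assumptions the reading frame would be disrupted a little past the midpoint of the protein, which is consistent with the expectation that the deletion causes loss of function.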


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Rows: wildtype allele (exons 1, 41, 42, 43, 44, 45 and 72 labeled, with the 5' and 3' gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Szt2, homology arm, cKO region, loxP site.]
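The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model of Cre recombination. This is a minimal sketch: it assumes the two loxP sites are in the same orientation (so Cre excises the segment between them and leaves a single loxP site), it uses the canonical 34-bp loxP sequence, and the flanking sequences are made-up placeholders rather than real Szt2 sequence. The function name `cre_excise` is hypothetical.

```python
# Canonical 34-bp loxP site: 13-bp repeat, 8-bp spacer, 13-bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Remove the segment between the first two same-orientation loxP sites.

    One loxP site is left behind. The allele is returned unchanged when
    fewer than two sites are found.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


if __name__ == "__main__":
    # Toy sequences standing in for intron 41, exon 42 and intron 42.
    intron41, exon42, intron42 = "AAAACCAA", "GGGTCTGGG", "TTTTGGTT"
    targeted = intron41 + LOXP + exon42 + LOXP + intron42
    print(cre_excise(targeted))
```

In the toy run the exon 42 placeholder disappears and one loxP site remains between the two intron placeholders, mirroring the constitutive KO allele in the schematic.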


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
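A self-alignment dot plot of this kind can be approximated by indexing every window of the sequence and reporting pairs of positions whose windows match. The sketch below is not the tool used for the report; it is a simple exact-match version with the same 10 bp window size, and the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot_matches(seq: str, window: int = 10, reverse: bool = False):
    """Return (i, j) start positions whose windows match exactly.

    Forward mode skips the trivial main diagonal (i == j).
    Reverse mode pairs window j with the reverse complement found at i.
    """
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    hits = []
    for j in range(len(seq) - window + 1):
        kmer = seq[j:j + window]
        query = reverse_complement(kmer) if reverse else kmer
        for i in index.get(query, []):
            if reverse or i != j:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    unit = "ACGTACCTGA"
    repeated = unit + "TTTTT" + unit  # toy sequence with one direct repeat
    print(dot_plot_matches(repeated))       # off-diagonal dots mark the repeat
    print(dot_plot_matches(unit + "TTTTT")) # no repeat, so no off-diagonal dots
```

An empty forward result corresponds to a plot with only the main diagonal, which is the situation the note describes as having no significant tandem repeat.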

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(7130bp) | A(22.86% 1630) | C(24.52% 1748) | T(25.69% 1832) | G(26.93% 1920)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
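The base composition in the summary line and a windowed GC profile can be reproduced with a few lines of code. This is a sketch only: the base counts are those reported above, the 300 bp default matches the stated window size, and the function names and the toy sequence are illustrative assumptions rather than the tool used for the report.

```python
def base_percentages(counts: dict[str, int]) -> dict[str, float]:
    """Convert base counts to percentages of the total, rounded to 2 decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC fraction of each full window along the sequence."""
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Counts from the summary line of the report (full length 7130 bp).
    counts = {"A": 1630, "C": 1748, "T": 1832, "G": 1920}
    print(sum(counts.values()), base_percentages(counts))
    overall_gc = round(100 * (counts["G"] + counts["C"]) / sum(counts.values()), 2)
    print(overall_gc)
    # Toy sequence: one GC-only window followed by one AT-only window.
    print(gc_windows("GGCCAATT", window=4, step=4))
```

The counts sum to the reported 7130 bp and give an overall GC content of about 51%, in line with the note that no high GC-content region stands out.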


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr4   -       118380547  118383546  3000
YourSeq  103    1156   1263  3000   98.2%     chr1   +       125172961  125173070  110
YourSeq  75     1131   1257  3000   87.4%     chr7   -       79760319   79760433   115
YourSeq  75     1158   1481  3000   71.9%     chr1   +       79780570   79780806   237
YourSeq  71     1325   1458  3000   89.9%     chr11  -       76091960   76092094   135
YourSeq  70     1311   1465  3000   90.6%     chr7   -       84071314   84071480   167
YourSeq  70     1378   1470  3000   82.8%     chr3   -       15201237   15201323   87
YourSeq  70     1379   1480  3000   82.2%     chr1   +       71888973   71889071   99
YourSeq  64     1379   1466  3000   86.4%     chr14  -       25897869   25897956   88
YourSeq  64     1379   1466  3000   86.4%     chr14  -       26177253   26177340   88
YourSeq  63     1379   1464  3000   87.3%     chr14  -       26037492   26037726   235
YourSeq  62     1323   1466  3000   93.1%     chr4   -       142643200  142643343  144
YourSeq  62     1379   1466  3000   88.8%     chr16  +       56161841   56161937   97
YourSeq  61     1379   1466  3000   85.1%     chr11  -       109314021  109314108  88
YourSeq  60     1400   1478  3000   89.7%     chr3   +       108583031  108583452  422
YourSeq  59     1166   1435  3000   90.5%     chr3   -       86176458   86176982   525
YourSeq  59     1370   1451  3000   82.5%     chr2   -       125586166  125586245  80
YourSeq  58     1376   1451  3000   88.2%     chr7   -       118838843  118838918  76
YourSeq  57     1379   1479  3000   78.3%     chr6   -       30201041   30201141   101
YourSeq  57     1379   1458  3000   80.6%     chr11  -       83390312   83390388   77

Note: The 3000 bp section upstream of Exon 42 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr4   -       118376917  118379916  3000
YourSeq  172    1102   1297  3000   96.3%     chr5   -       134609142  134609352  211
YourSeq  163    1109   1293  3000   95.6%     chr12  +       111661504  111661689  186
YourSeq  162    1113   1297  3000   96.0%     chr1   +       51504536   51504720   185
YourSeq  160    1118   1284  3000   98.3%     chr17  -       88020761   88020928   168
YourSeq  160    1118   1281  3000   98.8%     chr14  -       61613858   61614021   164
YourSeq  160    1118   1285  3000   97.7%     chr12  -       19048819   19048986   168
YourSeq  160    1118   1281  3000   98.8%     chr7   +       27399861   27400024   164
YourSeq  160    1118   1281  3000   98.8%     chr2   +       20817050   20817213   164
YourSeq  160    1118   1281  3000   98.8%     chr15  +       97149450   97149613   164
YourSeq  159    1116   1280  3000   98.2%     chr13  -       44708884   44709048   165
YourSeq  159    1113   1281  3000   95.9%     chr2   +       119968851  119969018  168
YourSeq  159    1113   1281  3000   97.1%     chr2   +       31070375   31070543   169
YourSeq  158    1119   1280  3000   98.8%     chr1   -       6091894    6092055    162
YourSeq  158    1118   1281  3000   98.2%     chrX   +       36815736   36815899   164
YourSeq  157    1118   1280  3000   98.2%     chr11  -       78175664   78175826   163
YourSeq  157    1118   1280  3000   98.2%     chr18  +       36393775   36393937   163
YourSeq  157    1118   1280  3000   98.2%     chr14  +       22232970   22233132   163
YourSeq  157    1118   1280  3000   98.2%     chr14  +       8108564    8108726    163
YourSeq  157    1118   1280  3000   98.2%     chr12  +       44271610   44271772   163

Note: The 3000 bp section downstream of Exon 42 is BLAT searched against the genome. No significant similarity is found.
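The top hit in each BLAT table is the query itself, so the two chr4 intervals can be used as a consistency check on the stated cKO size. The sketch below assumes the listed start and end coordinates are 1-based and inclusive, which agrees with the reported 3000 bp spans; the `Hit` class and helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    chrom: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


def span(hit: Hit) -> int:
    """Length of the interval in bp."""
    return hit.end - hit.start + 1


def gap_between(a: Hit, b: Hit) -> int:
    """Number of bases strictly between two non-overlapping hits on one chromosome."""
    if a.chrom != b.chrom:
        raise ValueError("hits are on different chromosomes")
    lo, hi = sorted((a, b), key=lambda h: h.start)
    return hi.start - lo.end - 1


if __name__ == "__main__":
    # Self-hits of the upstream and downstream 3000 bp queries (minus strand, chr4).
    upstream = Hit("chr4", 118380547, 118383546)
    downstream = Hit("chr4", 118376917, 118379916)
    print(span(upstream), span(downstream), gap_between(upstream, downstream))
```

With these coordinates the interval between the two flanks is 630 bp, matching the ~630 bp effective cKO region given in the strategy summary.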


Gene and information:

Szt2: SZT2 subunit of KICSTOR complex [Mus musculus (house mouse)]
Gene ID: 230676, updated on 12-Aug-2019

Gene summary

Official Symbol: Szt2 (provided by MGI)
Official Full Name: SZT2 subunit of KICSTOR complex (provided by MGI)
Primary source: MGI:MGI:3033336
See related: Ensembl:ENSMUSG00000033253
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: TIGR; 6430407N22
Summary: This gene encodes a protein associated with low seizure threshold in mice and may contribute to susceptibility to epilepsy. [provided by RefSeq, Aug 2011]
Expression: Ubiquitous expression in thymus adult (RPKM 18.9), lung adult (RPKM 13.3) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 4; 4 D2.1

Exon count: 72

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (118362740..118409286, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (118035350..118081868, complement)

[Figure: genomic context of Szt2 on Chromosome 4 - NC_000070.6.]


Transcript information: This gene has 6 transcripts

Gene: Szt2 ENSMUSG00000033253

Description: SZT2 subunit of KICSTOR complex [Source:MGI Symbol;Acc:MGI:3033336]
Gene Synonyms: seizure threshold 2
Location: Chromosome 4: 118,362,743-118,409,273 reverse strand. GRCm38:CM000997.2
About this gene: This gene has 6 transcripts (splice variants), 227 orthologues, is a member of 1 Ensembl protein family and is associated with 9 phenotypes.

Transcripts

Name      Transcript ID          bp     Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Szt2-201  ENSMUST00000075406.11  10965  3431aa      ENSMUSP00000074862.5  Protein coding           CCDS51283  A2A9C3   TSL:1, GENCODE basic, APPRIS P1
Szt2-206  ENSMUST00000183402.7   3652   559aa       ENSMUSP00000139348.1  Nonsense mediated decay  -          V9GXW1   CDS 5' incomplete, TSL:1
Szt2-204  ENSMUST00000138386.1   791    No protein  -                     Retained intron          -          -        TSL:3
Szt2-205  ENSMUST00000138632.1   605    No protein  -                     Retained intron          -          -        TSL:3
Szt2-202  ENSMUST00000130058.1   599    No protein  -                     lncRNA                   -          -        TSL:2
Szt2-203  ENSMUST00000136737.1   444    No protein  -                     lncRNA                   -          -        TSL:5


[Figure: Ensembl genome browser view of chromosome 4, 118.36 Mb to 118.41 Mb (66.53 kb). Forward strand: Hyi-201 (protein coding), Hyi-202 (nonsense mediated decay), Hyi-203 and Hyi-204 (retained intron), Hyi-205 (polymorphic pseudogene); Med8-201, Med8-202, Med8-203, Med8-204, Med8-206 and Med8-209 (protein coding); Med8-205, Med8-207, Med8-208 and Med8-210 (lncRNA). Contig: AL627212.21. Reverse strand: Szt2-201 (protein coding), Szt2-202 and Szt2-203 (lncRNA), Szt2-204 and Szt2-205 (retained intron), Szt2-206 (nonsense mediated decay). Additional track: Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (pseudogene, RNA gene, processed transcript).]


Transcript: ENSMUST00000075406

[Figure: transcript Szt2-201 (protein coding) on the reverse strand, 46.53 kb. Protein tracks for ENSMUSP00000074...: MobiDB lite, low complexity (Seg), PANTHER (Protein SZT2). Variant track: sequence variants (dbSNP and all other sources), with legend for frameshift, missense, splice region and synonymous variants. Scale bar: 0 to 3431.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
