https://www.alphaknockout.com

Mouse Strn4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Strn4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Strn4 gene (NCBI Reference Sequence: NM_133789; Ensembl: ENSMUSG00000030374) is located on mouse chromosome 7. 18 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 17 (Transcript: ENSMUST00000019220). Exons 2~3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Strn4 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-316H6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 12.41% of the coding region. Knockout of exons 2~3 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 5861 bp, and the size of intron 3 for 3'-loxP site insertion is 1059 bp. The size of the effective cKO region is ~1084 bp. The cKO region does not contain any other known gene.
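The frameshift statement follows from a simple rule: removing coding exons shifts the reading frame whenever their combined coding length is not a multiple of 3. The sketch below illustrates only that rule. The function name and the exon lengths are hypothetical placeholders, not the actual Strn4 exon sizes, which are not listed in this report.

```python
def causes_frameshift(deleted_coding_lengths):
    """Return True if deleting coding exons of these lengths (bp) shifts the reading frame."""
    return sum(deleted_coding_lengths) % 3 != 0


# Hypothetical coding lengths (bp) for two adjacent exons.
# These are placeholders for illustration, NOT the real Strn4 exon 2 and exon 3 sizes.
placeholder_lengths = [125, 96]
print(causes_frameshift(placeholder_lengths))
```

To apply the check to the real design, the placeholder list would be replaced with the coding lengths of exons 2 and 3 taken from the ENSMUST00000019220 annotation.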


Overview of the Targeting Strategy

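[Figure: schematic of the targeting strategy, showing the wildtype allele (5' to 3', exons 1, 2, 3, 4 ... 18, with two gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Strn4, homology arm, cKO region, loxP site.]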


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself (both axes: Sequence 12), showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
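The self-alignment described in the note can be sketched as follows. This is a minimal illustration, assuming exact matching of 10 bp words and a sequence made up only of A, C, G, T and N. The tool actually used for the report is not stated, and the names `reverse_complement` and `self_dot_plot` are hypothetical.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (assumes only A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=10):
    """List off-diagonal exact word matches of seq against itself.

    Returns (i, j, strand) tuples: "+" for forward matches with j > i,
    "-" for reverse-complement matches with j >= i.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    # Index every word of length `window` by its start positions.
    positions = {}
    for i in range(n_windows):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for i in range(n_windows):
        word = seq[i:i + window]
        for j in positions[word]:
            if j > i:  # skip the trivial main diagonal
                hits.append((i, j, "+"))
        for j in positions.get(reverse_complement(word), []):
            if j >= i:  # report each reverse-complement pair once
                hits.append((i, j, "-"))
    return hits


# Toy sequence with one direct repeat of a 10 bp word.
demo = "ACGTTGCAAG" + "TTTTT" + "ACGTTGCAAG"
print(self_dot_plot(demo, window=10))
```

Runs of "+" hits along one off-diagonal would indicate a direct or tandem repeat, while "-" hits indicate inverted repeats. An empty or sparse result corresponds to the "no significant tandem repeat" conclusion.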

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (x-axis: Sequence 12).]

Summary: Full length (7584 bp) | A (23.31%, 1768) | C (23.09%, 1751) | T (27.24%, 2066) | G (26.36%, 1999)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high-GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
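As a rough illustration of the analysis above, the sketch below computes GC content in a sliding 300 bp window and recomputes the overall GC fraction from the base counts in the summary line. The function name `gc_windows` and the step size are assumptions. The report states only the window size, not how the plot was produced.

```python
def gc_windows(seq, window=300, step=1):
    """Return (start, gc_fraction) for each window of the given size along seq."""
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, gc / window))
    return results


# Base counts copied from the summary line of this report.
counts = {"A": 1768, "C": 1751, "T": 2066, "G": 1999}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(total, round(overall_gc * 100, 2))
```

The counts sum to the stated full length of 7584 bp, and the overall GC fraction is a little under 50%, which is consistent with the note that no high-GC region stands out.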


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   +       16819247   16822246   3000
YourSeq  50     197    452   3000   73.1%     chr11  +       57593303   57593471   169
YourSeq  50     191    451   3000   69.3%     chr1   +       54843405   54843539   135
YourSeq  41     193    233   3000   100.0%    chr11  +       50690542   50690582   41
YourSeq  39     191    229   3000   100.0%    chr1   -       192303563  192303601  39
YourSeq  39     193    231   3000   100.0%    chr1   -       94546681   94546719   39
YourSeq  39     191    229   3000   100.0%    chr10  +       44920098   44920136   39
YourSeq  38     192    229   3000   100.0%    chr11  -       6892468    6892505    38
YourSeq  38     193    230   3000   100.0%    chr12  +       90508710   90508747   38
YourSeq  38     193    230   3000   100.0%    chr11  +       10952638   10952675   38
YourSeq  38     192    229   3000   100.0%    chr1   +       84555129   84555166   38

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the expected self-match on chr7, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   +       16823331   16826330   3000
YourSeq  131    1436   1601  3000   92.9%     chr3   +       154916030  154916202  173
YourSeq  127    1452   1600  3000   94.6%     chr1   -       69932215   69932448   234
YourSeq  124    1430   1601  3000   93.2%     chr4   -       100924695  100924874  180
YourSeq  123    1433   1601  3000   91.4%     chr2   -       76460711   76460881   171
YourSeq  121    1452   1597  3000   91.8%     chr16  -       26877131   26877295   165
YourSeq  120    1452   1600  3000   92.4%     chr18  +       59873546   59874157   612
YourSeq  120    1437   1587  3000   94.3%     chr1   +       93527247   93527421   175
YourSeq  119    1452   1600  3000   92.9%     chr15  -       34856614   34856802   189
YourSeq  118    1431   1597  3000   91.1%     chr14  -       96663215   96663383   169
YourSeq  118    1452   1600  3000   93.4%     chr19  +       9628389    9628584    196
YourSeq  118    1432   1590  3000   87.2%     chr1   +       16152555   16152704   150
YourSeq  117    1452   1600  3000   87.2%     chr5   -       52384345   52384487   143
YourSeq  117    1452   1600  3000   92.2%     chr16  +       95163901   95164057   157
YourSeq  116    1452   1589  3000   93.5%     chr14  -       78296575   78374656   78082
YourSeq  115    1452   1600  3000   90.0%     chr1   -       194480884  194481026  143
YourSeq  115    1452   1600  3000   88.0%     chr8   +       95318893   95319035   143
YourSeq  114    1436   1602  3000   84.6%     chr13  -       30773064   30773206   143
YourSeq  114    1452   1600  3000   91.5%     chr10  +       24986407   24986960   554
YourSeq  113    1452   1601  3000   86.9%     chr9   +       71740089   71740221   133

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. Apart from the expected self-match on chr7, no significant similarity is found.
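The two self-match rows (score 3000, chr7) can be used as a consistency check on the design: the gap between the end of the upstream query and the start of the downstream query should equal the size of the effective cKO region. The sketch below parses those two rows, copied from the tables above, and computes the gap. The parser and its field names are hypothetical and assume whitespace-separated columns in the order shown in the table header.

```python
# Self-match rows copied from the two BLAT tables in this report.
UP_SELF = "YourSeq 3000 1 3000 3000 100.0% chr7 + 16819247 16822246 3000"
DOWN_SELF = "YourSeq 3000 1 3000 3000 100.0% chr7 + 16823331 16826330 3000"


def parse_blat_row(row):
    """Parse one whitespace-separated BLAT summary row into a dict."""
    f = row.split()
    return {
        "query": f[0],
        "score": int(f[1]),
        "q_start": int(f[2]),
        "q_end": int(f[3]),
        "q_size": int(f[4]),
        "identity": float(f[5].rstrip("%")),
        "chrom": f[6],
        "strand": f[7],
        "t_start": int(f[8]),
        "t_end": int(f[9]),
        "span": int(f[10]),
    }


up = parse_blat_row(UP_SELF)
down = parse_blat_row(DOWN_SELF)

# Number of bases strictly between the two 3000 bp flanking sections.
gap_bp = down["t_start"] - up["t_end"] - 1
print(up["chrom"], gap_bp)
```

With the coordinates listed in the tables, the gap comes to 1084 bp, which agrees with the "~1084 bp" effective cKO region given in the strategy summary.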


Gene and information:

Strn4 striatin, calmodulin binding protein 4 [Mus musculus (house mouse)]
Gene ID: 97387, updated on 12-Aug-2019

Gene summary

Official Symbol: Strn4 (provided by MGI)
Official Full Name: striatin, calmodulin binding protein 4 (provided by MGI)
Primary source: MGI:MGI:2142346
See related: Ensembl:ENSMUSG00000030374
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: ZIN; C80611; zinedin
Expression: Ubiquitous expression in whole brain E14.5 (RPKM 50.8), CNS E18 (RPKM 43.7) and 28 other tissues
Orthologs: human

Genomic context

Location: 7; 7 A2

Exon count: 19

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (16815889..16840931)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (17401238..17426280)

[Figure: genomic context on Chromosome 7 - NC_000073.6]


Transcript information: This gene has 14 transcripts

Gene: Strn4 ENSMUSG00000030374

Description: striatin, calmodulin binding protein 4 [Source:MGI Symbol;Acc:MGI:2142346]
Gene Synonyms: ZIN, zinedin
Location: Chromosome 7: 16,815,889-16,840,931 forward strand. GRCm38:CM001000.2
About this gene: This gene has 14 transcripts (splice variants), 180 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Strn4-201  ENSMUST00000019220.15  3535  760aa       ENSMUSP00000019220.8  Protein coding           CCDS39786  P58404   TSL:1, GENCODE basic, APPRIS P4
Strn4-202  ENSMUST00000108495.8   3082  753aa       ENSMUSP00000104135.1  Protein coding           CCDS39787  P58404   TSL:1, GENCODE basic, APPRIS ALT1
Strn4-212  ENSMUST00000184694.1   921   198aa       ENSMUSP00000139113.1  Protein coding           -          V9GXE7   CDS 5' incomplete, TSL:2
Strn4-213  ENSMUST00000184708.7   1262  293aa       ENSMUSP00000139171.1  Nonsense mediated decay  -          V9GXI9   CDS 5' incomplete, TSL:5
Strn4-211  ENSMUST00000184280.7   913   225aa       ENSMUSP00000139307.1  Nonsense mediated decay  -          V9GXT1   CDS 5' incomplete, TSL:5
Strn4-214  ENSMUST00000185011.7   910   72aa        ENSMUSP00000139290.1  Nonsense mediated decay  -          V9GXS1   CDS 5' incomplete, TSL:5
Strn4-206  ENSMUST00000131721.1   4114  No protein  -                     Retained intron          -          -        TSL:1
Strn4-204  ENSMUST00000123510.2   839   No protein  -                     Retained intron          -          -        TSL:3
Strn4-203  ENSMUST00000123247.7   813   No protein  -                     Retained intron          -          -        TSL:3
Strn4-207  ENSMUST00000134350.2   653   No protein  -                     Retained intron          -          -        TSL:3
Strn4-209  ENSMUST00000138070.1   435   No protein  -                     Retained intron          -          -        TSL:3
Strn4-210  ENSMUST00000183680.1   373   No protein  -                     Retained intron          -          -        TSL:1
Strn4-208  ENSMUST00000137885.2   402   No protein  -                     lncRNA                   -          -        TSL:3
Strn4-205  ENSMUST00000127835.1   389   No protein  -                     lncRNA                   -          -        TSL:2


[Figure: Ensembl genome browser view of a 45.04 kb region of chromosome 7 (16.81 Mb to 16.85 Mb). Forward strand, comprehensive gene set: Strn4-201 to Strn4-214 (protein coding: 201, 202, 212; nonsense mediated decay: 211, 213, 214; retained intron: 203, 204, 206, 207, 209, 210; lncRNA: 205, 208) and Prkd2 transcripts (protein coding: 201, 202; retained intron: 203, 204; lncRNA: 206, 207). Contig: AC148981.7. Reverse strand, comprehensive gene set: Fkrp-201 and Fkrp-204 (protein coding), Fkrp-202 and Fkrp-203 (retained intron). Tracks: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000019220

[Figure: transcript Strn4-201 (protein coding), 25.04 kb, forward strand, with protein domain annotations for ENSMUSP00000019... Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily (WD40-repeat-containing domain superfamily); SMART (WD40 repeat); Prints (G-protein beta WD-40 repeat); Pfam (Striatin, N-terminal; WD40 repeat); PROSITE profiles (PS51257; WD40-repeat-containing domain; WD40 repeat); PROSITE patterns (WD40 repeat, conserved site); PANTHER (PTHR15653:SF1; PTHR15653); Gene3D (1.20.5.300; WD40/YVTN repeat-like-containing domain superfamily); CDD (cd00200); sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 760.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
