https://www.alphaknockout.com

Mouse Stkld1 Knockout Project (CRISPR/Cas9)

Objective: To create a Stkld1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Stkld1 gene (NCBI Reference Sequence: NM_198628; Ensembl: ENSMUSG00000049897) is located on mouse chromosome 2. 19 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 19 (Transcript: ENSMUST00000055406). Exons 3~11 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts from about 3.98% of the coding region. Exons 3~11 cover 45.82% of the coding region. The size of the effective KO region: ~9347 bp. The KO region does not contain any other known gene.
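The percentages in the note can be converted into approximate base-pair counts. The sketch below assumes the coding region is 662 codons x 3 = 1986 bp, taken from the 662 aa protein length listed for ENSMUST00000055406 with the stop codon excluded. That length and the helper name percent_to_bp are assumptions for illustration, not values stated in the note.

```python
# Assumption: coding region = 662 codons * 3 bp, stop codon excluded.
CDS_LENGTH_BP = 662 * 3


def percent_to_bp(percent, total_bp=CDS_LENGTH_BP):
    """Convert a percentage of the coding region into a rounded bp count."""
    return round(percent / 100 * total_bp)


# Coding sequence upstream of exon 3 (3.98% of the coding region).
bp_before_exon3 = percent_to_bp(3.98)
# Coding sequence contained in exons 3~11 (45.82% of the coding region).
bp_in_exons_3_to_11 = percent_to_bp(45.82)

print(bp_before_exon3, bp_in_exons_3_to_11)
```

Under that assumption the note corresponds to roughly 79 bp of coding sequence before exon 3 and roughly 910 bp removed with exons 3~11.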


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with labeled exons 1, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 19 and a gRNA region on each side of the knockout region (exons 3~11). Legend: exon of mouse Stkld1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 1984 bp section downstream of Exon 11 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
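The self-alignment behind a dot plot can be sketched as an exact-match word search: every 15 bp window is compared with every other window of the same sequence, on the forward strand and as a reverse complement. This is a minimal illustration only. The tool actually used for these plots is not named in this report, and the function names and the toy sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) exact window matches.

    forward holds off-diagonal matches with j > i; reverse holds windows
    whose reverse complement occurs at position j.
    """
    seq = seq.upper()
    starts = range(len(seq) - window + 1)
    index = defaultdict(list)
    for i in starts:
        index[seq[i:i + window]].append(i)
    forward, reverse = [], []
    for i in starts:
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy input (hypothetical): a 16 bp unit repeated twice in tandem.
unit = "ACGTTGCAAGGCTTAC"
forward, reverse = self_dot_plot(unit * 2)
print(len(forward), len(reverse))
```

Off-diagonal forward matches that line up parallel to the main diagonal are the signature of a tandem repeat. A region with no such matches gives a plot with only the main diagonal.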


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot]

Summary: Full length (2000 bp) | A (23.6%, 472) | C (21.3%, 426) | T (30.15%, 603) | G (24.95%, 499)

Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot]

Summary: Full length (1984 bp) | A (25.4%, 504) | C (25.86%, 513) | T (26.16%, 519) | G (22.58%, 448)

Note: The 1984 bp section downstream of Exon 11 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
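The overall GC content of each flank follows directly from the base counts in the two summaries. The sketch below recomputes it. The function name gc_percent is hypothetical, and only the whole-region value is calculated, not the 300 bp sliding-window profile shown in the plots.

```python
def gc_percent(a, c, t, g):
    """Return GC content in percent from per-base counts."""
    total = a + c + t + g
    return 100 * (g + c) / total


# Base counts copied from the two summaries above.
upstream = gc_percent(a=472, c=426, t=603, g=499)    # 2000 bp upstream of exon 3
downstream = gc_percent(a=504, c=513, t=519, g=448)  # 1984 bp downstream of exon 11

print(f"upstream GC: {upstream:.2f}%  downstream GC: {downstream:.2f}%")
```

Both flanks come out a little under 50% GC, which is consistent with the notes that no significant high GC-content region is found.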


BLAT Search Results (up)

QUERY    SCORE  Q_START  Q_END  QSIZE  IDENTITY  CHROM  STRAND  T_START    T_END      SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1        2000   2000   100.0%    chr2   +       26935982   26937981   2000
YourSeq  314    1115     1587   2000   86.9%     chr4   -       74283060   74283478   419
YourSeq  314    1115     1595   2000   91.0%     chr1   +       164021830  164022410  581
YourSeq  277    1116     1595   2000   89.0%     chr2   -       167592156  167592625  470
YourSeq  268    1165     1588   2000   89.5%     chr6   -       120175232  120418126  242895
YourSeq  258    1259     1595   2000   90.3%     chr3   +       130862286  130862617  332
YourSeq  250    1115     1595   2000   92.0%     chr9   -       66113187   66113722   536
YourSeq  250    1164     1595   2000   90.4%     chr7   -       4429512    4430226    715
YourSeq  249    1131     1587   2000   88.6%     chr11  -       80765467   80766170   704
YourSeq  247    1297     1596   2000   90.7%     chr9   +       120768036  120768334  299
YourSeq  242    1116     1561   2000   85.5%     chrX   -       48464861   48465405   545
YourSeq  237    951      1596   2000   90.2%     chr6   +       122884438  122885182  745
YourSeq  235    1115     1595   2000   84.5%     chr1   +       86539603   86539937   335
YourSeq  231    950      1596   2000   90.3%     chr19  +       44317825   44318599   775
YourSeq  231    1298     1595   2000   90.9%     chr12  +       3636250    3636580    331
YourSeq  228    1115     1595   2000   84.4%     chr10  +       117753606  117753925  320
YourSeq  227    1307     1596   2000   89.4%     chr5   +       104796861  104797151  291
YourSeq  227    1289     1594   2000   89.3%     chr11  +       95112102   95112404   303
YourSeq  226    1307     1596   2000   89.3%     chr2   -       158436148  158436429  282
YourSeq  225    1115     1596   2000   87.2%     chr3   +       94910727   94911078   352

Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. Apart from the expected 100% match to its own location on chr2, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  Q_START  Q_END  QSIZE  IDENTITY  CHROM  STRAND  T_START    T_END      SPAN
------------------------------------------------------------------------------------------
YourSeq  1984   1        1984   1984   100.0%    chr2   +       26947329   26949312   1984
YourSeq  231    697      1005   1984   87.7%     chr1   -       166020046  166020357  312
YourSeq  230    698      1012   1984   88.1%     chr9   -       46437540   46437854   315
YourSeq  228    703      1008   1984   87.6%     chr10  -       82685955   82686264   310
YourSeq  227    697      1006   1984   87.3%     chr5   +       125584946  125585275  330
YourSeq  226    697      1005   1984   88.3%     chr7   +       35060151   35060465   315
YourSeq  222    697      1007   1984   88.7%     chr18  -       35753748   35754065   318
YourSeq  222    697      1007   1984   92.5%     chr10  -       121319708  121320020  313
YourSeq  218    697      998    1984   86.9%     chr5   +       92779522   92984592   205071
YourSeq  213    693      1005   1984   87.5%     chr13  -       9007796    9008115    320
YourSeq  213    698      1009   1984   88.5%     chr1   +       17136389   17136709   321
YourSeq  210    717      1037   1984   84.7%     chr17  -       87077808   87078119   312
YourSeq  209    697      1006   1984   88.6%     chr8   -       119326641  119326952  312
YourSeq  208    703      998    1984   86.3%     chr6   -       100769606  100769907  302
YourSeq  205    369      991    1984   79.8%     chr6   -       28952706   28953137   432
YourSeq  202    697      1005   1984   88.0%     chr5   +       37548051   37548360   310
YourSeq  202    697      1005   1984   87.6%     chr13  +       25145538   25145844   307
YourSeq  199    703      1001   1984   87.6%     chr4   +       127920149  127920453  305
YourSeq  198    684      986    1984   85.9%     chr10  -       127348001  127348329  329
YourSeq  198    717      1005   1984   84.8%     chr5   +       124642892  124643186  295

Note: The 1984 bp section downstream of Exon 11 is BLAT searched against the genome. Apart from the expected 100% match to its own location on chr2, no significant similarity is found.
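The chr2 coordinates of the two 100% BLAT hits can be used as a consistency check on the reported KO region size. The sketch below assumes that the upstream and downstream query sequences directly border the knockout region. The report does not say so explicitly, and the variable names are hypothetical.

```python
# chr2 coordinates of the 100% identity hits in the two BLAT tables above.
UP_START, UP_END = 26935982, 26937981      # 2000 bp upstream query
DOWN_START, DOWN_END = 26947329, 26949312  # 1984 bp downstream query

# Inclusive lengths of the two flanking queries.
up_len = UP_END - UP_START + 1
down_len = DOWN_END - DOWN_START + 1

# Bases strictly between the two flanks (assumed to be the KO region).
ko_span_bp = DOWN_START - UP_END - 1

print(up_len, down_len, ko_span_bp)
```

Under that assumption the interval between the flanks is 9347 bp, in agreement with the effective KO region size given in the strategy summary.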


Gene and protein information:
Stkld1 serine/threonine kinase-like domain containing 1 [Mus musculus (house mouse)]
Gene ID: 279029, updated on 12-Aug-2019

Gene summary

Official Symbol: Stkld1 (provided by MGI)
Official Full Name: serine/threonine kinase-like domain containing 1 (provided by MGI)
Primary source: MGI:MGI:2685557
See related: Ensembl:ENSMUSG00000049897
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm711; Sgk071
Expression: Biased expression in testis adult (RPKM 36.3), duodenum adult (RPKM 31.6) and 8 other tissues

Genomic context

Location: 2; 2 A3
Exon count: 22

Annotation release  Status             Assembly                       Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)   2    NC_000068.7 (26933521..26953496)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     2    NC_000068.6 (26789589..26809016)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 2 transcripts

Gene: Stkld1 ENSMUSG00000049897

Description: serine/threonine kinase-like domain containing 1 [Source:MGI Symbol;Acc:MGI:2685557]
Gene Synonyms: Gm711, LOC279029
Location: Chromosome 2: 26,934,047-26,953,496 forward strand. GRCm38:CM000995.2
About this gene: This gene has 2 transcripts (splice variants), 106 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Stkld1-201  ENSMUST00000055406.8  2169  662aa    ENSMUSP00000062967.8  Protein coding  CCDS15818  Q80YS9   TSL:1 GENCODE basic APPRIS P1
Stkld1-202  ENSMUST00000153771.7  400   116aa    ENSMUSP00000121332.1  Protein coding  -          B0R044   CDS 3' incomplete TSL:5

[Figure: Ensembl genome browser view of a 39.45 kb region, 26.93 Mb to 26.96 Mb. Forward strand: Stkld1-201 and Stkld1-202 (protein coding); contig AL773563.12. Reverse strand: Surf4-201 and Surf4-203 (protein coding), Surf4-202 (nonsense mediated decay); Rexo4-201, Rexo4-202 and Rexo4-207 (protein coding); Rexo4-203, Rexo4-204, Rexo4-205, Rexo4-206, Rexo4-208 and Rexo4-209 (lncRNA). Tracks: Genes (Comprehensive set...), Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: Protein Coding (merged Ensembl/Havana, Ensembl protein coding); Non-Protein Coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000055406

[Figure: Stkld1-201 (protein coding), 19.43 kb, forward strand, with protein annotation tracks for ENSMUSP00000062...: MobiDB lite; Low complexity (Seg); Superfamily (Protein kinase-like domain superfamily, Armadillo-type fold); Pfam (Protein kinase domain); PROSITE profiles (Protein kinase domain); PANTHER (PTHR24363, PTHR24363:SF5); Gene3D (1.10.510.10); CDD (cd00180); Sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 662 aa.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
