https://www.alphaknockout.com

Mouse Stkld1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Stkld1 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Stkld1 gene (NCBI Reference Sequence: NM_198628; Ensembl: ENSMUSG00000049897) is located on Mouse chromosome 2. Nineteen exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 19 (Transcript: ENSMUST00000055406). Exon 6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Stkld1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-414L19 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 6 starts from about 15.31% of the coding region. The knockout of Exon 6 will result in a frameshift of the gene. The size of intron 5 for 5'-loxP site insertion is 2015 bp, and the size of intron 6 for 3'-loxP site insertion is 675 bp. The size of the effective cKO region is ~571 bp. The cKO region does not contain any other known gene.
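The frameshift statement in the note rests on reading-frame arithmetic: deleting a coding segment whose length is not a multiple of three shifts every downstream codon. The following is a minimal sketch of that check, not the vendor's method. It estimates the first affected codon from the 15.31% figure and the 662 aa protein length listed for Stkld1-201. Treating the percentage as a fraction of the 662-codon open reading frame is an assumption, and the exon length used in the check is a hypothetical placeholder, because this report does not state the coding length of exon 6.

```python
# Values taken from this report.
PROTEIN_AA = 662                    # Stkld1-201 protein length
CDS_FRACTION_BEFORE_EXON6 = 0.1531  # "Exon 6 starts from about 15.31%"

# Hypothetical placeholder: the report does not give the coding length of exon 6.
HYPOTHETICAL_EXON6_CODING_BP = 100


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame unless its length is a multiple of 3."""
    return deleted_coding_bp % 3 != 0


def approx_first_codon(fraction: float, protein_aa: int) -> int:
    """Estimate the 1-based codon in which the deleted exon begins.

    Assumes the fraction is measured over protein_aa * 3 coding nucleotides.
    """
    coding_nt = protein_aa * 3
    nt_before = int(fraction * coding_nt)  # coding nucleotides upstream of the exon
    return nt_before // 3 + 1


if __name__ == "__main__":
    codon = approx_first_codon(CDS_FRACTION_BEFORE_EXON6, PROTEIN_AA)
    print(f"Exon 6 begins near codon {codon} of {PROTEIN_AA}")
    print("Frameshift for placeholder length:",
          causes_frameshift(HYPOTHETICAL_EXON6_CODING_BP))
```

Replacing the placeholder with the real coding length of exon 6 from the Ensembl exon table would turn this into an actual check of the note's claim.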


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Rows: Wildtype allele (exons 1, 5, 6, 7, 8 and 19 labelled, with the 5' and 3' gRNA regions marked), Targeting vector, Targeted allele, and Constitutive KO allele (after Cre recombination). Legend: exon of mouse Stkld1, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot. Legend: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
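The self-alignment described in the note can be sketched as an exact-match dot plot: every 10 bp window of the sequence is compared with every other window, on the forward strand and against the reverse complement. This is an illustrative sketch only. The tool actually used for the report is not named, and the toy sequence below is hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def window_matches(query: str, target: str, window: int = 10):
    """Return (i, j) pairs where query[i:i+window] == target[j:j+window]."""
    index = defaultdict(list)
    for j in range(len(target) - window + 1):
        index[target[j:j + window]].append(j)
    dots = []
    for i in range(len(query) - window + 1):
        for j in index.get(query[i:i + window], ()):
            dots.append((i, j))
    return dots


def self_dot_plot(seq: str, window: int = 10):
    """Dots for a sequence against itself, excluding the trivial main diagonal."""
    forward = [(i, j) for i, j in window_matches(seq, seq, window) if i != j]
    reverse = window_matches(seq, reverse_complement(seq), window)
    return forward, reverse


# Hypothetical toy sequence: a 10 bp unit repeated in tandem.
toy = "ACGTTGCAAG" * 2
forward, reverse = self_dot_plot(toy, window=10)
print(forward)  # off-diagonal dots reveal the tandem repeat
```

Off-diagonal runs of forward dots parallel to the main diagonal are the signature of tandem repeats that the note warns about.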

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(7071bp) | A(25.03% 1770) | C(24.1% 1704) | T(25.46% 1800) | G(25.41% 1797)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
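The base counts in the summary line can be checked directly, and the windowed analysis can be approximated with a simple sliding window. The sketch below is an assumption about how such a profile might be computed, not the report's actual tool. The counts are taken from the summary, and they give an overall GC content of about 49.5%.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """GC fraction of each full window along the sequence."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts from the summary line of this report.
counts = {"A": 1770, "C": 1704, "T": 1800, "G": 1797}
total = sum(counts.values())
overall_gc_percent = 100 * (counts["G"] + counts["C"]) / total

print(f"Total length: {total} bp")
print(f"Overall GC content: {overall_gc_percent:.2f}%")
```

With the real 7071 bp sequence, `sliding_gc(sequence, window=300)` would give the per-window values behind the plotted distribution.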


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       26939882   26942881   3000
YourSeq  138    1232   1447  3000   86.4%     chr4   +       77868593   77868813   221
YourSeq  115    1280   1447  3000   91.5%     chr6   +       87421896   87422079   184
YourSeq  99     1250   1397  3000   84.4%     chr19  +       5977047    5977188    142
YourSeq  97     1252   1396  3000   86.1%     chr11  +       22961750   22961890   141
YourSeq  96     1273   1431  3000   87.1%     chr10  -       42050193   42050766   574
YourSeq  96     1280   1745  3000   93.0%     chr10  +       29850749   29851320   572
YourSeq  94     1266   1553  3000   90.5%     chr15  +       76359515   76359946   432
YourSeq  93     1252   1396  3000   84.3%     chr18  -       3355858    3355998    141
YourSeq  93     1252   1396  3000   83.2%     chr6   +       116190434  116190573  140
YourSeq  92     1233   1397  3000   83.0%     chr11  -       46660232   46660390   159
YourSeq  90     1280   1397  3000   90.3%     chr5   -       139373882  139373996  115
YourSeq  89     1280   1398  3000   88.3%     chr1   -       16077295   16077412   118
YourSeq  89     1189   1368  3000   86.3%     chr18  +       79495021   79495583   563
YourSeq  83     1272   1399  3000   86.8%     chr2   -       128475175  128475300  126
YourSeq  83     1252   1396  3000   83.7%     chr4   +       85268575   85268715   141
YourSeq  81     1251   1364  3000   92.8%     chr15  +       80172611   80172726   116
YourSeq  80     1287   1397  3000   87.7%     chr6   +       52891188   52891295   108
YourSeq  80     1282   1399  3000   84.0%     chr10  +       84806597   84806711   115
YourSeq  79     1282   1397  3000   88.4%     chr8   -       94202933   94203051   119

Note: The 3000 bp section upstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
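One way to see why the secondary hits are not treated as significant is to compare each with the perfect self-hit. The sketch below parses rows in the QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN layout and reports how much of the query the best non-self hit covers. Only the first three upstream rows are included for brevity. The `BlatHit` type and the coverage measure are illustrative choices, not part of BLAT or of this report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated row; the first field is the query name."""
    f = row.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


# First three rows of the upstream BLAT result in this report.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr2 + 26939882 26942881 3000",
    "YourSeq 138 1232 1447 3000 86.4% chr4 + 77868593 77868813 221",
    "YourSeq 115 1280 1447 3000 91.5% chr6 + 87421896 87422079 184",
]

hits = [parse_hit(r) for r in rows]
self_hit = max(hits, key=lambda h: h.score)
best_other = max((h for h in hits if h is not self_hit), key=lambda h: h.score)

# Fraction of the 3000 bp query covered by the best non-self alignment.
coverage = (best_other.q_end - best_other.q_start + 1) / best_other.q_size
print(self_hit.chrom, best_other.chrom, best_other.score, f"{coverage:.1%}")
```

In these rows the strongest non-self hit aligns only a short stretch of the query at well under 100% identity, which is consistent with the note's conclusion.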

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       26943453   26946452   3000
YourSeq  189    1093   1353  3000   91.0%     chr10  -       61525524   61525985   462
YourSeq  174    1111   1328  3000   90.7%     chr11  -       42476360   42956679   480320
YourSeq  154    1092   1259  3000   96.5%     chr11  -       52674676   52674854   179
YourSeq  152    1091   1264  3000   94.8%     chr1   -       37249281   37249464   184
YourSeq  151    1094   1344  3000   90.7%     chr3   -       157583525  157583770  246
YourSeq  150    1092   1273  3000   95.2%     chr5   -       147751059  147751253  195
YourSeq  150    1094   1352  3000   86.3%     chr7   +       100589131  100589330  200
YourSeq  148    1095   1329  3000   89.0%     chr11  -       67023079   67023275   197
YourSeq  145    1092   1253  3000   96.8%     chr5   -       141235480  141235643  164
YourSeq  145    1094   1253  3000   97.4%     chr7   +       28531446   28531607   162
YourSeq  144    959    1254  3000   91.2%     chr5   -       107275292  107275586  295
YourSeq  144    1092   1259  3000   96.8%     chr1   -       34050507   34050685   179
YourSeq  143    1093   1259  3000   93.9%     chrX   -       42230139   42230312   174
YourSeq  143    1090   1250  3000   95.0%     chr9   -       82892378   82892549   172
YourSeq  143    1051   1252  3000   91.8%     chr1   -       136888318  136888753  436
YourSeq  143    1091   1254  3000   94.5%     chr5   +       108691058  108691228  171
YourSeq  143    1093   1254  3000   95.0%     chr1   +       33723129   33723297   169
YourSeq  142    1093   1353  3000   86.2%     chr13  -       72868226   72868451   226
YourSeq  142    1077   1254  3000   92.1%     chr1   -       9807383    9807559    177

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
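The self-hits of the two BLAT queries also give a quick consistency check on the cKO region size. Assuming the coordinates are 1-based and inclusive, and that the two 3000 bp queries directly flank the cKO region, the gap between the end of the upstream hit and the start of the downstream hit should equal the ~571 bp quoted in the strategy notes. A small sketch of that arithmetic:

```python
# chr2 coordinates of the 100% identity self-hits in the two BLAT tables.
UPSTREAM_END = 26942881      # last base of the upstream 3000 bp query
DOWNSTREAM_START = 26943453  # first base of the downstream 3000 bp query


def gap_between(left_end: int, right_start: int) -> int:
    """Number of bases strictly between two 1-based inclusive coordinates."""
    return right_start - left_end - 1


# Interval enclosed by the two flanks (assumed to be the cKO region).
cko_start = UPSTREAM_END + 1
cko_end = DOWNSTREAM_START - 1
cko_size = gap_between(UPSTREAM_END, DOWNSTREAM_START)

print(f"chr2:{cko_start}-{cko_end} ({cko_size} bp)")
```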


Gene and protein information:

Stkld1 serine/threonine kinase-like domain containing 1 [Mus musculus (house mouse)]
Gene ID: 279029, updated on 12-Aug-2019

Gene summary

Official Symbol: Stkld1 (provided by MGI)
Official Full Name: serine/threonine kinase-like domain containing 1 (provided by MGI)
Primary source: MGI:MGI:2685557
See related: Ensembl:ENSMUSG00000049897
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm711; Sgk071
Expression: Biased expression in testis adult (RPKM 36.3), duodenum adult (RPKM 31.6) and 8 other tissues

Genomic context

Location: 2; 2 A3

Exon count: 22

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (26933521..26953496)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (26789589..26809016)

[Figure: genomic context of Stkld1, Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 2 transcripts

Gene: Stkld1 ENSMUSG00000049897

Description: serine/threonine kinase-like domain containing 1 [Source:MGI Symbol;Acc:MGI:2685557]
Gene Synonyms: Gm711, LOC279029
Location: Chromosome 2: 26,934,047-26,953,496 forward strand. GRCm38:CM000995.2
About this gene: This gene has 2 transcripts (splice variants), 106 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Stkld1-201  ENSMUST00000055406.8  2169  662aa    ENSMUSP00000062967.8  Protein coding  CCDS15818  Q80YS9   TSL:1, GENCODE basic, APPRIS P1
Stkld1-202  ENSMUST00000153771.7  400   116aa    ENSMUSP00000121332.1  Protein coding  -          B0R044   CDS 3' incomplete, TSL:5

[Figure: Ensembl region view, 39.45 kb, from 26.93 Mb to 26.96 Mb. Forward strand: Stkld1-201 and Stkld1-202 (protein coding). Contig: AL773563.12. Reverse strand: Surf4-201 (protein coding), Surf4-202 (nonsense mediated decay), Surf4-203 (protein coding); Rexo4-201, Rexo4-202 and Rexo4-207 (protein coding); Rexo4-203, Rexo4-204, Rexo4-205, Rexo4-206, Rexo4-208 and Rexo4-209 (lncRNA). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000055406

[Figure: protein domain view of Stkld1-201 (protein coding; 19.43 kb, forward strand; protein ENSMUSP00000062...; scale bar 0 to 662). Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily: Protein kinase-like domain superfamily, Armadillo-type fold; Pfam: Protein kinase domain; PROSITE profiles: Protein kinase domain; PANTHER: PTHR24363, PTHR24363:SF5; Gene3D: 1.10.510.10; CDD: cd00180; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
