https://www.alphaknockout.com

Mouse St8sia6 Knockout Project (CRISPR/Cas9)

Objective: To create a St8sia6 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The St8sia6 gene (NCBI Reference Sequence: NM_145838; Ensembl: ENSMUSG00000003418) is located on mouse chromosome 2. Eight exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 8 (Transcript: ENSMUST00000003509). Exon 5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 31.66% of the coding region and covers 12.14% of it. The size of the effective KO region is ~145 bp. The KO region does not contain any other known gene.
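The percentages in the note can be turned into approximate base-pair positions. The sketch below is only an illustration: it assumes the coding region is counted as 398 codons without the stop codon (the 398 aa protein length comes from the transcript table later in this report), and the names `CDS_BP` and `percent_to_bp` are hypothetical. Because the KO region size is given as approximate (~145 bp), the frameshift conclusion is an inference under these assumptions, not a statement from the report.

```python
# Assumption: coding region = 398 codons (protein length of St8sia6-201),
# counted without the stop codon.
PROTEIN_LENGTH_AA = 398
CDS_BP = PROTEIN_LENGTH_AA * 3  # 1194 bp


def percent_to_bp(percent: float, cds_bp: int = CDS_BP) -> int:
    """Convert a percentage of the coding region into a rounded bp count."""
    return round(percent / 100 * cds_bp)


# Coding bases upstream of exon 5 (exon 5 starts at about 31.66% of the CDS).
exon5_offset_bp = percent_to_bp(31.66)
# Coding bases contributed by exon 5 (12.14% of the CDS).
exon5_size_bp = percent_to_bp(12.14)
# A deletion whose length is not a multiple of 3 shifts the reading frame.
causes_frameshift = exon5_size_bp % 3 != 0

print(exon5_offset_bp, exon5_size_bp, causes_frameshift)
```

Under these assumptions the exon 5 size agrees with the stated ~145 bp KO region, and removing it would shift the reading frame of the downstream exons.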


Overview of the Targeting Strategy

[Figure: Wildtype allele with exons numbered 1 to 8 (exons 1, 5 and 8 labeled); the 5' gRNA region and 3' gRNA region are marked. Legend: Exon of mouse St8sia6; Knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
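The self-alignment behind the two dot plots can be illustrated with a minimal Python sketch. This is not the tool used to produce the plots: it only reports exact forward-strand matches between windows of a fixed size (the reverse-complement comparison is omitted), and the function name `self_dot_plot` and the toy sequences are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return off-diagonal (i, j) pairs, i < j, whose windows are identical."""
    positions = defaultdict(list)
    # Index every window-length word by its start position.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    hits = []
    # Any word seen at two or more positions yields off-diagonal dots.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy example: a 4 bp unit repeated three times, scanned with a 4 bp window.
repeat_hits = self_dot_plot("ACGTACGTACGT", window=4)
# Toy example without any repeated 4 bp word.
clean_hits = self_dot_plot("ACGTTGCA", window=4)
print(len(repeat_hits), len(clean_hits))
```

Runs of hits parallel to the main diagonal correspond to tandem repeats; an empty result corresponds to a plot with only the main diagonal.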


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(27.05% 541) | C(19.9% 398) | T(32.45% 649) | G(20.6% 412)

Note: The 2000 bp section upstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(29.2% 584) | C(19.85% 397) | T(30.8% 616) | G(20.15% 403)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
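The base counts in the two Summary lines can be cross-checked and reduced to a single overall GC percentage per region. The sketch below uses only the counts given in this report; the function name `gc_percent` is hypothetical, and it computes the full-length value rather than the 300 bp sliding-window profile shown in the figures.

```python
def gc_percent(counts: dict[str, int]) -> float:
    """Overall GC content (in percent) from per-base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


# Base counts copied from the Summary lines of the two 2000 bp regions.
upstream = {"A": 541, "C": 398, "T": 649, "G": 412}
downstream = {"A": 584, "C": 397, "T": 616, "G": 403}

for name, counts in (("upstream", upstream), ("downstream", downstream)):
    print(name, sum(counts.values()), gc_percent(counts))
```

Both regions sum to 2000 bp and come out at roughly 40% GC overall, which is consistent with the notes that no significant high GC-content region is found.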


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
------------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr2   -        13672628   13674627  2000
YourSeq    108    352   693   2000     79.2%  chr17  -        50254143   50254389   247
YourSeq     76    354   473   2000     91.4%  chr9   -        25661203   25661326   124
YourSeq     63    352   473   2000     83.2%  chr3   +        54288087   54288204   118
YourSeq     61   1380  1525   2000     82.2%  chr4   +        86916145   86916292   148
YourSeq     60   1072  1416   2000     84.9%  chr1   +       153993355  153993688   334
YourSeq     56   1393  1524   2000     91.2%  chr5   -       124761975  124762107   133
YourSeq     55   1383  1580   2000     88.8%  chr14  -        47336332   47336531   200
YourSeq     54   1364  1465   2000     91.1%  chr7   +       138958131  138958233   103
YourSeq     53    352   461   2000     96.6%  chrX   -       153065686  153065797   112
YourSeq     53   1380  1513   2000     85.1%  chr2   -       170060262  170060393   132
YourSeq     51   1372  1593   2000     72.9%  chr13  -       100736507  100736861   355
YourSeq     50   1395  1501   2000     89.1%  chr2   +        24746118   24746225   108
YourSeq     48   1373  1453   2000     93.0%  chr1   +       134531963  134532176   214
YourSeq     46    450   508   2000     96.1%  chr15  +        39283593   39283676    84
YourSeq     45   1372  1454   2000     91.0%  chr15  -        79189630   79189712    83
YourSeq     45   1380  1464   2000     89.5%  chr12  +        86121228   86121312    85
YourSeq     44   1077  1120   2000    100.0%  chr15  -        30065681   30065724    44
YourSeq     44   1380  1433   2000     88.5%  chrX   +       163655658  163655710    53
YourSeq     44   1380  1465   2000     92.4%  chr4   +       137669192  137669277    86

Note: The 2000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
------------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr2   -        13670483   13672482  2000
YourSeq     42   1155  1237   2000     74.5%  chr14  -         7209176    7209226    51
YourSeq     37    143   200   2000     95.3%  chr15  -        84420901   84420959    59
YourSeq     31    142   199   2000     97.1%  chr4   +        57671389   57671447    59
YourSeq     29   1209  1244   2000     94.2%  chr4   +        80077054   80077090    37
YourSeq     27   1244  1272   2000     96.6%  chr16  +        44200053   44200081    29
YourSeq     26   1244  1269   2000    100.0%  chr5   +       136679577  136679602    26
YourSeq     24   1244  1272   2000     81.5%  chr2   -       166128605  166128631    27
YourSeq     24   1249  1272   2000    100.0%  chr17  -        69404445   69404468    24
YourSeq     23   1246  1268   2000    100.0%  chr17  -        85215226   85215248    23
YourSeq     22    144   165   2000    100.0%  chr4   -       139748000  139748021    22
YourSeq     22   1244  1265   2000    100.0%  chr11  +        22904915   22904936    22
YourSeq     21    645   665   2000    100.0%  chr2   -        39790926   39790946    21
YourSeq     21   1374  1394   2000    100.0%  chr1   -        80277181   80277201    21
YourSeq     21   1248  1268   2000    100.0%  chr3   +       104550271  104550291    21

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
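The BLAT rows above can be read programmatically to compare the self-hit with the best secondary hit and to measure the interval between the two flanking regions on chr2. The sketch below is illustrative only: `BlatHit` and `parse_hit` are hypothetical names, only three upstream rows and the downstream self-hit are copied in, and the report does not state what score it treats as "significant".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int


def parse_hit(row: str) -> BlatHit:
    # Keep the last 11 whitespace-separated fields so that any leading
    # link text in front of the QUERY column is ignored.
    f = row.split()[-11:]
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7], int(f[8]), int(f[9]))


upstream_rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr2 - 13672628 13674627 2000",
    "YourSeq 108 352 693 2000 79.2% chr17 - 50254143 50254389 247",
    "YourSeq 76 354 473 2000 91.4% chr9 - 25661203 25661326 124",
]
downstream_self = parse_hit(
    "YourSeq 2000 1 2000 2000 100.0% chr2 - 13670483 13672482 2000")

hits = [parse_hit(r) for r in upstream_rows]
self_hit, secondary = hits[0], hits[1:]
best_secondary = max(secondary, key=lambda h: h.score)
# Fraction of the 2000 bp query spanned by the best secondary hit.
coverage = (best_secondary.q_end - best_secondary.q_start + 1) / best_secondary.q_size
# Bases lying strictly between the downstream and upstream flanks on chr2.
gap_bp = self_hit.t_start - downstream_self.t_end - 1
print(best_secondary.chrom, best_secondary.score, round(coverage, 3), gap_bp)
```

The best secondary hit in the upstream search scores 108 against 2000 for the self-hit, and the interval between the two flanks is 145 bp, matching the stated size of the effective KO region.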


Gene and information:

St8sia6 ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 6 [Mus musculus (house mouse)]
Gene ID: 241230, updated on 12-Aug-2019

Gene summary

Official Symbol: St8sia6 (provided by MGI)
Official Full Name: ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 6 (provided by MGI)
Primary source: MGI:MGI:2386797
See related: Ensembl:ENSMUSG00000003418
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Siat8f; AI314453; AI875066; ST8SiaVI; 1700007J08Rik
Expression: Biased expression in bladder adult (RPKM 5.5), genital fat pad adult (RPKM 4.9) and 10 other tissues
Orthologs: human

Genomic context

Location: 2; 2 A1
Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (13651018..13794734, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (13576561..13715147, complement)

[Figure: genomic context, Chromosome 2 - NC_000068.7]


Transcript information: This gene has 2 transcripts

Gene: St8sia6 ENSMUSG00000003418

Description: ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 6 [Source:MGI Symbol;Acc:MGI:2386797]
Gene Synonyms: 1700007J08Rik, ST8Sia VI, Siat8f
Location: Chromosome 2: 13,651,021-13,794,064 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 2 transcripts (splice variants), 312 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt        Flags
St8sia6-201  ENSMUST00000003509.9  7618  398aa       ENSMUSP00000003509.8  Protein coding  CCDS15697  Q148N5 Q8K4T1  TSL:1 GENCODE basic APPRIS P1
St8sia6-202  ENSMUST00000150781.1  810   No protein  -                     lncRNA          -          -              TSL:3

[Figure: Ensembl genome browser view of a 163.04 kb region of chromosome 2 (13.65Mb to 13.80Mb), forward and reverse strands. Tracks: Contigs (AL772303.11, BX322642.15, AL928918.8); Genes (Comprehensive set): Gm37126-201 (lncRNA), St8sia6-201 (protein coding), St8sia6-202 (lncRNA); Regulatory Build. Regulation Legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene Legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (RNA gene).]


Transcript: ENSMUST00000003509

[Figure: transcript and protein feature view of St8sia6-201 (protein coding), reverse strand, 143.04 kb. Protein tracks for ENSMUSP00000003...: Transmembrane heli..., Low complexity (Seg), Cleavage site (Sign..., Pfam (Glycosyl transferase family 29), PIRSF (Sialyltransferase), PANTHER (PTHR11987, PTHR11987:SF29), Gene3D (GT29-like superfamily). Variant track: sequence variants (dbSNP and all other sources). Variant Legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 398.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
