https://www.alphaknockout.com

Mouse Sp2 Knockout Project (CRISPR/Cas9)

Objective: To create an Sp2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Sp2 gene (NCBI Reference Sequence: NM_030220; Ensembl: ENSMUSG00000018678) is located on mouse chromosome 11. Seven exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 7 (Transcript: ENSMUST00000107624). Exons 3~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: No homozygous null mice survived beyond E10.5, with decreased embryo size and embryonic growth retardation starting at E7.5.

Exon 3 starts at about 4.79% of the coding region. Exons 3~6 cover 89.94% of the coding region. The size of the effective KO region is ~6235 bp. The KO region does not contain any other known gene.
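
The percentages above can be converted into approximate base-pair counts. The sketch below assumes the coding region is the 613-codon open reading frame of ENSMUST00000107624 (613 aa x 3 = 1839 bp, stop codon excluded). That length is inferred from the protein size listed for Sp2-203 in the transcript table, not stated in the strategy itself, and the function and variable names are illustrative only.

```python
# Assumption: coding length = 613 codons x 3 nt (stop codon excluded).
CDS_LENGTH_BP = 613 * 3  # 1839 bp


def percent_to_bp(percent: float, cds_length: int = CDS_LENGTH_BP) -> int:
    """Convert a percentage of the coding region into a rounded bp count."""
    return round(percent / 100 * cds_length)


# Coding bases that lie upstream of exon 3 (exon 3 starts at ~4.79%).
coding_bp_before_exon3 = percent_to_bp(4.79)
# Coding bases carried by exons 3~6 (~89.94% of the coding region).
coding_bp_in_exons_3_to_6 = percent_to_bp(89.94)

print(coding_bp_before_exon3, coding_bp_in_exons_3_to_6)
```

Under that assumption, exon 3 begins roughly 88 bp into the coding sequence and exons 3~6 carry roughly 1654 bp of it.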

Overview of the Targeting Strategy

[Figure: wildtype Sp2 allele drawn 5' to 3', with two gRNA regions marked and exons labeled 1, 3, 4, 5, 6 and 7. Legend: exon of mouse Sp2; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self dot plot of the upstream sequence showing forward and reverse-complement matches.]

Note: The 1355 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
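
The self-alignment idea behind a dot plot can be sketched in a few lines. This is a minimal illustration, not the tool used to produce the report: it marks a dot wherever a 15 bp window recurs elsewhere in the same sequence (forward) or matches the reverse complement of another window. The function names are hypothetical, and exact matching is assumed, whereas real dot plot software may tolerate mismatches.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) lists of (i, j) exact window matches.

    forward: the window at i recurs at a later position j (off-diagonal dots).
    reverse: the window at j equals the reverse complement of the window at i.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    positions = defaultdict(list)
    for i in range(n_windows):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in positions[word] if j > i)
        rc_hits = positions.get(reverse_complement(word), [])
        reverse.extend((i, j) for j in rc_hits)
    return forward, reverse


# Toy example: a 7 bp unit repeated three times gives dots at offsets of 7 and 14.
fwd, rev = self_dot_plot("GATTACA" * 3, window=7)
print(sorted({j - i for i, j in fwd}))
```

A tandem repeat shows up as forward dots sharing a constant offset `j - i` (lines parallel to the main diagonal); a region with no such dots is the kind the notes describe as suitable for PCR screening.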

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self dot plot of the downstream sequence showing forward and reverse-complement matches.]

Note: The 1204 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(1355bp) | A(24.87% 337) | C(23.69% 321) | T(24.43% 331) | G(27.01% 366)

Note: The 1355 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(1204bp) | A(23.84% 287) | C(25.33% 305) | T(24.25% 292) | G(26.58% 320)

Note: The 1204 bp section downstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
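
The overall GC content of each flank follows directly from the base counts in the two summaries, and a sliding window gives the per-position profile that the plots display. The sketch below is illustrative: the function names are hypothetical, and the windowed profile is shown on a toy sequence because the flank sequences themselves are not included in this report.

```python
def gc_percent(counts: dict) -> float:
    """Overall GC percentage from a dict of base counts (keys A, C, G, T)."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300) -> list:
    """GC percentage of every full-length window, sliding one base at a time."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Base counts copied from the two summary lines above.
upstream = {"A": 337, "C": 321, "T": 331, "G": 366}    # 1355 bp upstream of exon 3
downstream = {"A": 287, "C": 305, "T": 292, "G": 320}  # 1204 bp downstream of exon 6

print(round(gc_percent(upstream), 2), round(gc_percent(downstream), 2))
print(windowed_gc("GGCCAATT", window=4))  # toy sequence, not project data
```

Both flanks come out close to 50% GC overall (about 50.70% upstream and 51.91% downstream), which is consistent with the notes reporting no significant high-GC region.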

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  1355   1      1355  1355   100.0%    chr11  -       96962010   96963364   1355
YourSeq  132    443    588   1355   95.9%     chr11  +       74715349   74715506   158
YourSeq  130    444    588   1355   95.2%     chr3   -       37562404   37562557   154
YourSeq  129    443    590   1355   94.0%     chr1   -       136703211  136703367  157
YourSeq  126    444    587   1355   95.1%     chr8   -       76699117   76699283   167
YourSeq  126    445    590   1355   94.5%     chr6   -       113633791  113633948  158
YourSeq  125    444    592   1355   93.8%     chr11  +       69265267   69265444   178
YourSeq  125    444    590   1355   93.2%     chr10  +       45428841   45428997   157
YourSeq  124    201    574   1355   93.2%     chr19  -       43505284   43505744   461
YourSeq  124    449    706   1355   86.4%     chr19  +       3797769    3798011    243
YourSeq  123    443    590   1355   92.5%     chr6   -       143488931  143489084  154
YourSeq  123    449    589   1355   94.3%     chr4   -       133952123  133952274  152
YourSeq  123    444    589   1355   93.1%     chrX   +       101611055  101611208  154
YourSeq  123    405    589   1355   93.7%     chr17  +       92884304   92884583   280
YourSeq  122    444    590   1355   91.9%     chr18  -       53663881   53664036   156
YourSeq  122    444    590   1355   92.5%     chr17  -       83951600   83951761   162
YourSeq  122    449    590   1355   93.7%     chr15  -       34450625   34450803   179
YourSeq  122    445    590   1355   93.1%     chr18  +       76755446   76755593   148
YourSeq  122    444    590   1355   92.5%     chr11  +       100473820  100473974  155
YourSeq  121    442    590   1355   91.8%     chr2   +       20974086   20974253   168

Note: The 1355 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  1204   1      1204  1204   100.0%    chr11  -       96954571   96955774   1204
YourSeq  50     433    582   1204   89.3%     chr19  +       3676757    3676907    151
YourSeq  48     427    773   1204   62.5%     chr17  -       87601067   87601155   89
YourSeq  44     433    528   1204   90.8%     chr6   +       114165779  114166084  306
YourSeq  43     452    509   1204   91.7%     chr4   +       148628266  148628322  57
YourSeq  40     434    482   1204   93.5%     chr5   -       131547236  131547285  50
YourSeq  40     425    485   1204   93.5%     chr9   +       45143703   45143763   61
YourSeq  37     433    484   1204   87.9%     chr10  +       85174714   85174763   50
YourSeq  35     600    772   1204   95.2%     chr4   -       54068277   54068450   174
YourSeq  34     433    479   1204   97.3%     chr19  +       6159370    6159417    48
YourSeq  34     183    249   1204   97.4%     chr13  +       97510951   97511044   94
YourSeq  33     433    478   1204   97.2%     chr14  +       25938905   25938951   47
YourSeq  33     433    478   1204   97.2%     chr14  +       26078658   26078704   47
YourSeq  33     433    478   1204   97.2%     chr14  +       26218273   26218319   47
YourSeq  33     433    470   1204   97.2%     chr12  +       55011428   55011466   39
YourSeq  30     433    475   1204   87.5%     chr11  -       106702616  106702656  41
YourSeq  30     455    489   1204   84.9%     chr14  +       100004506  100004538  33
YourSeq  30     454    487   1204   94.2%     chr11  +       29236050   29236083   34
YourSeq  28     433    468   1204   93.8%     chr10  -       115312529  115312566  38
YourSeq  28     435    475   1204   96.7%     chr11  +       51446040   51446082   43

Note: The 1204 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
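
The top BLAT hit in each table places the two flanks on chr11, and the interval between them can be checked against the stated KO region size. The sketch below assumes that the flanks directly abut the deleted interval and that the BLAT coordinates are 1-based and inclusive (which matches the SPAN column); neither is stated explicitly in this report, and the `Interval` helper is hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def gap_between(left: Interval, right: Interval) -> Interval:
    """Interval lying strictly between two non-overlapping intervals."""
    if left.chrom != right.chrom or left.end >= right.start:
        raise ValueError("intervals must be ordered on the same chromosome")
    return Interval(left.chrom, left.end + 1, right.start - 1)


# Top (100% identity) hits from the two BLAT tables. Sp2 is on the reverse
# strand, so the flank upstream of exon 3 has the higher genomic coordinates.
upstream_flank = Interval("chr11", 96962010, 96963364)
downstream_flank = Interval("chr11", 96954571, 96955774)

ko_region = gap_between(downstream_flank, upstream_flank)
print(ko_region, ko_region.length)
```

Under those assumptions the interval between the flanks is chr11:96955775-96962009, 6235 bp long, which agrees with the ~6235 bp effective KO region given in the strategy summary.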

Gene and information: Sp2

Sp2 [Mus musculus (house mouse)]
Gene ID: 78912, updated on 24-Oct-2019

Gene summary

Official Symbol: Sp2 (provided by MGI)
Official Full Name: (provided by MGI)
Primary source: MGI:MGI:1926162
See related: Ensembl:ENSMUSG00000018678
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: mKIAA0048; 4930480I16Rik
Summary: This gene encodes a member of the Sp subfamily of Sp/XKLF transcription factors. Sp family are sequence-specific DNA-binding proteins characterized by an amino-terminal trans-activation domain and three carboxy-terminal motifs. This protein contains the least conserved DNA-binding domain within the Sp subfamily of proteins, and its DNA sequence specificity differs from the other Sp proteins. The protein can act as a transcriptional activator or repressor, depending on the promoter and cell type. Multiple transcript variants encoding different isoforms have been found for this gene. [provided by RefSeq, Jul 2008]
Expression: Ubiquitous expression in thymus adult (RPKM 15.0), spleen adult (RPKM 11.1) and 28 other tissues
Orthologs: human all

Genomic context

Location: 11; 11 D

Exon count: 11

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (96953332..96982959, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (96814651..96839002, complement)

[Figure: genomic context of Sp2 on chromosome 11 (NC_000077.6).]

Transcript information: This gene has 3 transcripts

Gene: Sp2 ENSMUSG00000018678

Description: Sp2 transcription factor [Source:MGI Symbol;Acc:MGI:1926162]
Location: Chromosome 11: 96,953,341-96,982,959 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 3 transcripts (splice variants), 195 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 13 phenotypes.

Transcripts

Name     Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Sp2-203  ENSMUST00000107624.7  3055  613aa    ENSMUSP00000103250.2  Protein coding  CCDS36291  Q8C5J0   TSL:1 GENCODE basic APPRIS P1
Sp2-202  ENSMUST00000107623.7  2907  607aa    ENSMUSP00000103249.1  Protein coding  -          Q8BNQ4   TSL:1 GENCODE basic
Sp2-201  ENSMUST00000062652.6  2367  607aa    ENSMUSP00000051403.6  Protein coding  -          Q8BNQ4   TSL:1 GENCODE basic

[Figure: Ensembl genome browser view of a 49.62 kb region of chromosome 11 (96.95 Mb to 96.99 Mb). Forward strand: lncRNA transcripts D030028A08Rik-201, -202, -203 and -204. Contig: AL596384.17. Reverse strand: Pnpo-201 (protein coding), Pnpo-202 (protein coding), Pnpo-203 (lncRNA), Sp2-201, Sp2-202 and Sp2-203 (protein coding). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene).]

Transcript: ENSMUST00000107624

[Figure: Sp2-203 (protein coding), reverse strand, 24.37 kb. Protein annotation tracks for ENSMUSP00000103...: MobiDB lite; Low complexity (Seg); Superfamily: Zinc finger C2H2 superfamily; SMART, Pfam, PROSITE profiles and PROSITE patterns: Zinc finger C2H2-type; PANTHER: Transcription factor Sp2 (PTHR23235); Gene3D: 3.30.160.60. Sequence variants (dbSNP and all other sources): missense and synonymous variants. Scale bar: 0 to 613.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
