https://www.alphaknockout.com

Mouse Sh2b3 Knockout Project (CRISPR/Cas9)

Objective: To create a Sh2b3 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Sh2b3 gene (NCBI Reference Sequence: NM_008507; Ensembl: ENSMUSG00000042594) is located on mouse chromosome 5. Eight exons have been identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000040308). Exons 2~8 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit severe perturbations in hematopoiesis, splenomegaly, and abnormal lymphoid and myeloid homeostasis. Mice homozygous for a different knock-out allele display altered mobility of hematopoietic stem/progenitor cells.

Exon 2 starts from about 0.06% of the coding region. Exons 2~8 cover 100.0% of the coding region. The size of the effective KO region is ~11274 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1 to 8 of mouse Sh2b3 with two gRNA regions marked. Legend: exon of mouse Sh2b3; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot. Legend: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot. Legend: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
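The self-alignment described in the two dot plot notes can be illustrated with a minimal Python sketch. It is not the tool used to produce the plots in this report; the function name `dot_plot` and the toy sequence are hypothetical, and only forward-strand matches are reported (the plots also show reverse-complement matches). A pair (i, j) is recorded whenever the windows starting at positions i and j are identical, which is what places an off-diagonal dot in the matrix.

```python
def dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, whose windows of length `window` are identical.

    Each pair corresponds to an off-diagonal dot in a self-alignment dot plot.
    Forward-strand matches only; 0-based positions.
    """
    seq = seq.upper()
    positions = {}
    # Index every window by its sequence so identical windows share one bucket.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy example (hypothetical sequence): "GATTACA" occurs twice in tandem.
toy = "GATTACAGATTACACCCC"
print(dot_plot(toy, window=7))  # [(0, 7)]
```

For a real 2000 bp flank the call would use window=15 as in the report; an empty result would mean no repeated 15-mers on the forward strand.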


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(20.75% 415) | C(25.3% 506) | T(28.3% 566) | G(25.65% 513)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(24.5% 490) | C(27.2% 544) | T(24.8% 496) | G(23.5% 470)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
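As a rough cross-check of the two Summary lines, the overall GC percentage can be recomputed from the reported base counts, and a sliding-window profile can be sketched in the same way. This is a minimal illustration under assumptions, not the software behind the plots: the function names are hypothetical, and the step between windows is not stated in the report, so `step` is an assumed parameter.

```python
def gc_percent_from_counts(counts):
    """GC percentage from a dict of base counts such as {'A': 415, ...}."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def gc_windows(seq, window=300, step=1):
    """GC percentage of each window of length `window`, sliding by `step` (assumed)."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return values


# Base counts copied from the two Summary lines above.
upstream = {"A": 415, "C": 506, "T": 566, "G": 513}
downstream = {"A": 490, "C": 544, "T": 496, "G": 470}

print(sum(upstream.values()), sum(downstream.values()))  # 2000 2000
print(round(gc_percent_from_counts(upstream), 2))        # 50.95
print(round(gc_percent_from_counts(downstream), 2))      # 50.7
```

Both flanks come out close to 50% GC overall, which agrees with the notes that no significant high GC-content region is found; the per-window profile would still need the actual sequences.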


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr5   -       121829050  121831049  2000
YourSeq  253    620    1480  2000   94.8%     chr10  +       61194047   61606421   412375
YourSeq  247    642    1489  2000   93.4%     chr1   -       33646538   33792122   145585
YourSeq  223    638    1482  2000   94.9%     chr1   -       36290978   36362941   71964
YourSeq  162    589    793   2000   97.7%     chr12  -       83551816   83552188   373
YourSeq  157    613    793   2000   95.5%     chr12  +       105736205  105736612  408
YourSeq  155    635    793   2000   98.8%     chrX   -       164367894  164368052  159
YourSeq  153    632    808   2000   93.8%     chr3   -       127257178  127257372  195
YourSeq  153    213    793   2000   87.4%     chr16  +       18298308   18298758   451
YourSeq  152    635    793   2000   98.2%     chr12  -       3313763    3313923    161
YourSeq  152    635    793   2000   98.2%     chr10  -       111568110  111568270  161
YourSeq  152    635    793   2000   98.2%     chrX   +       104920743  104920902  160
YourSeq  152    635    793   2000   98.2%     chr12  +       102541968  102542128  161
YourSeq  151    635    792   2000   98.2%     chr15  -       61617529   61617688   160
YourSeq  151    635    793   2000   97.5%     chr17  +       87631335   87631493   159
YourSeq  151    635    793   2000   97.5%     chr10  +       25337305   25337463   159
YourSeq  151    635    794   2000   97.5%     chr10  +       3176273    3176434    162
YourSeq  151    635    794   2000   97.5%     chr1   +       140242928  140243089  162
YourSeq  150    635    792   2000   97.5%     chr9   -       106871354  106871511  158
YourSeq  150    635    810   2000   93.0%     chr17  -       7436354    7436528    175

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr5   -       121815774  121817773  2000
YourSeq  21     22     42    2000   100.0%    chr13  -       96322436   96322456   21
YourSeq  20     262    281   2000   100.0%    chr1   -       130294818  130294837  20

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
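The coordinates of the two full-length (100.0% identity) hits can be used for a quick sanity check of the flank lengths and of the distance between them. The sketch below is only illustrative: the helper names `span` and `gap_between` are hypothetical, and it assumes the BLAT coordinates are 1-based and inclusive, which is consistent with the SPAN column (end - start + 1 = 2000).

```python
def span(interval):
    """Length of a 1-based, inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1


def gap_between(lower, upper):
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


# chr5 coordinates copied from the first row of each BLAT table (minus strand).
upstream_flank = (121829050, 121831049)    # 2000 bp upstream of the start codon
downstream_flank = (121815774, 121817773)  # 2000 bp downstream of the stop codon

print(span(upstream_flank), span(downstream_flank))   # 2000 2000
print(gap_between(downstream_flank, upstream_flank))  # 11276
```

The interval between the two flanks is 11276 bp, the same order as the ~11274 bp effective KO region stated in the strategy summary. The report does not explain the small difference, so this should be read as a consistency check rather than a derivation of the KO size.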


Gene and information:

Sh2b3 SH2B adaptor protein 3 [Mus musculus (house mouse)]
Gene ID: 16923, updated on 12-Aug-2019

Gene summary

Official Symbol: Sh2b3 (provided by MGI)
Official Full Name: SH2B adaptor protein 3 (provided by MGI)
Primary source: MGI:MGI:893598
See related: Ensembl:ENSMUSG00000042594
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Lnk; AI429800
Summary: This gene encodes a member of the SH2B family of adapter proteins that play an important role in T cell receptor signaling. This gene is preferentially expressed in hematopoietic stem cells, hematopoietic progenitors, pre and immature B cells, as well as megakaryocytes and mastocytes. In hematopoietic stem cells, the encoded protein is a key regulator of self-renewal, proliferation and apoptosis. Mice lacking the encoded protein exhibit pre and immature B cell expansion in the spleen and the bone marrow. Alternative splicing results in multiple transcript variants encoding different isoforms. [provided by RefSeq, Apr 2015]
Expression: Ubiquitous expression in spleen adult (RPKM 35.7), testis adult (RPKM 30.2) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 5 F; 5 61.99 cM

Exon count: 12

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (121815481..121837645, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (122267224..122286810, complement)

Chromosome 5 - NC_000071.6


Transcript information: This gene has 8 transcripts

Gene: Sh2b3 ENSMUSG00000042594

Description: SH2B adaptor protein 3 [Source:MGI Symbol;Acc:MGI:893598]
Gene Synonyms: Lnk
Location: Chromosome 5: 121,815,488-121,837,646 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 8 transcripts (splice variants), 191 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 24 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein  Translation ID        Biotype                  CCDS       UniProt     Flags
Sh2b3-202  ENSMUST00000086310.7   4195  548aa    ENSMUSP00000083490.1  Protein coding           CCDS19641  O09039      TSL:1, GENCODE basic, APPRIS P3
Sh2b3-204  ENSMUST00000122426.7   2777  548aa    ENSMUSP00000113926.1  Protein coding           CCDS19641  O09039      TSL:1, GENCODE basic, APPRIS P3
Sh2b3-201  ENSMUST00000040308.13  2530  548aa    ENSMUSP00000041611.7  Protein coding           CCDS19641  O09039      TSL:1, GENCODE basic, APPRIS P3
Sh2b3-203  ENSMUST00000118580.5   2393  538aa    ENSMUSP00000113808.1  Protein coding           CCDS84952  D3Z3Y5      TSL:2, GENCODE basic, APPRIS ALT2
Sh2b3-207  ENSMUST00000197892.2   850   250aa    ENSMUSP00000142666.1  Protein coding           -          A0A0G2JE79  CDS 3' incomplete, TSL:3
Sh2b3-206  ENSMUST00000137682.1   642   160aa    ENSMUSP00000118523.1  Protein coding           -          D3Z0T9      CDS 3' incomplete, TSL:3
Sh2b3-208  ENSMUST00000198161.1   391   102aa    ENSMUSP00000143505.1  Protein coding           -          A0A0G2JGB8  CDS 5' incomplete, TSL:3
Sh2b3-205  ENSMUST00000136960.7   4313  370aa    ENSMUSP00000119086.1  Nonsense mediated decay  -          D6RCY9      TSL:2


[Figure: Ensembl genome browser view of a 42.16 kb region (121.81Mb to 121.84Mb). Forward strand: Atxn2 transcripts (Atxn2-201, -203, -205, -206, -209, -210, -211, -212, -214, -215, -217, -218) and 1700008B11Rik-201 (lncRNA). Contig: AC113302.13. Reverse strand: Sh2b3-201 to Sh2b3-208 and Mir7031-201 (miRNA). Regulatory Build legend: CTCF, Enhancer, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000040308

[Figure: protein domain view of Sh2b3-201 (protein coding; reverse strand, 19.64 kb) for ENSMUSP00000041..., scale 0 to 548. Annotation tracks as labelled in the figure:
MobiDB lite; Low complexity (Seg)
Superfamily: SSF50729; SH2 domain superfamily; Phenylalanine zipper superfamily
SMART: SH2 domain
Prints: SH2 domain
Pfam: Phenylalanine zipper; SH2 domain
PROSITE profiles: SH2 domain
PANTHER: SH2B adapter protein 3; SH2B adapter protein
Gene3D: PH-like domain superfamily; SH2 domain superfamily
CDD: cd01231; SH2B3, SH2 domain
Sequence variants (dbSNP and all other sources). Variant legend: inframe insertion, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
