https://www.alphaknockout.com

Mouse Sh2d1b1 Knockout Project (CRISPR/Cas9)

Objective: To create a Sh2d1b1 knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Sh2d1b1 gene (NCBI Reference Sequence: NM_012009; Ensembl: ENSMUSG00000102418) is located on mouse chromosome 1. Four exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000179976). Exon 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit NK cells with enhanced cytolytic capacity and increased IFN-gamma secretion. Mice homozygous for a different knock-out allele exhibit impaired NK cell cytolysis.

Exon 2 starts at about 34.09% of the coding region and covers 16.16% of it. The size of the effective KO region is ~64 bp. The KO region does not contain any other known gene.
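The percentages above can be checked with simple arithmetic. The sketch below assumes the coding region is counted as 132 codons x 3 = 396 bp (the 132 aa protein length listed for ENSMUST00000179976 in this report, stop codon excluded); that convention and the function names are assumptions for illustration, not part of the original analysis.

```python
# Assumption: coding-region length = 132 aa x 3 bp, stop codon excluded.
CDS_LENGTH_BP = 132 * 3          # 396 bp
KO_REGION_BP = 64                # size of the effective KO region
EXON2_START_FRACTION = 0.3409    # exon 2 starts at about 34.09% of the CDS


def percent_of_cds(length_bp, cds_bp=CDS_LENGTH_BP):
    """Return length_bp as a percentage of the coding region, to 2 decimals."""
    return round(100.0 * length_bp / cds_bp, 2)


def is_multiple_of_three(length_bp):
    """True if a deletion of this length would keep the codon frame intact."""
    return length_bp % 3 == 0


# Approximate offset (in bp) of exon 2 from the start of the coding region.
exon2_offset_bp = round(EXON2_START_FRACTION * CDS_LENGTH_BP)

print(percent_of_cds(KO_REGION_BP))        # share of the CDS covered by the KO region
print(exon2_offset_bp)                     # bp of coding sequence upstream of exon 2
print(is_multiple_of_three(KO_REGION_BP))  # 64 is not divisible by 3
```

Under these assumptions the 64 bp region corresponds to the stated 16.16% of the coding region. Because 64 is not a multiple of 3, removing the whole region would be expected to shift the reading frame downstream of exon 2.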

Overview of the Targeting Strategy

[Figure: wildtype allele of mouse Sh2d1b1 with the 5' and 3' gRNA regions marked. Exon labels visible: 1, 2, 4. Legend: exon of mouse Sh2d1b1; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence aligned with itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence aligned with itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
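The self-alignment described in the two notes above can be approximated with an exact word-match dot plot. The following is a minimal sketch under that assumption; it is not the tool used to produce this report, and the function names are hypothetical. Each reported pair (i, j) marks a window starting at position i that also occurs at position j, either directly (forward) or as its reverse complement.

```python
from collections import defaultdict


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) exact window matches.

    forward: seq[i:i+window] == seq[j:j+window] with j > i
             (off-diagonal dots; runs of them indicate repeats).
    reverse: seq[j:j+window] equals the reverse complement of seq[i:i+window].
    """
    seq = seq.upper()
    last_start = len(seq) - window + 1

    # Index every window by its sequence so matches are found in one pass.
    positions = defaultdict(list)
    for i in range(last_start):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start):
        word = seq[i:i + window]
        forward.extend((i, j) for j in positions[word] if j > i)
        reverse.extend((i, j) for j in positions.get(reverse_complement(word), []))
    return forward, reverse


# Toy example: a 7 bp unit repeated twice in tandem, scanned with window=7.
fwd, rev = self_dot_plot("GATTACAGATTACA", window=7)
print(fwd, rev)
```

In a real 2000 bp flank, a tandem repeat would appear as a run of forward matches at a constant offset j - i, which is the kind of region the gRNA site is chosen to avoid.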

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(27.7% 554) | C(18.2% 364) | T(27.25% 545) | G(26.85% 537)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(31.6% 632) | C(19.85% 397) | T(29.8% 596) | G(18.75% 375)

Note: The 2000 bp section downstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
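The overall GC content of each flank follows directly from the base counts in the two summaries. The sketch below recomputes it and also shows one way a 300 bp sliding-window profile could be calculated; the windowing function is an illustrative assumption, not the method used to draw the plots in this report.

```python
# Base counts copied from the two "Summary" lines of this report.
UPSTREAM_COUNTS = {"A": 554, "C": 364, "T": 545, "G": 537}
DOWNSTREAM_COUNTS = {"A": 632, "C": 397, "T": 596, "G": 375}


def gc_percent(counts):
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return round(100.0 * (counts["G"] + counts["C"]) / total, 2)


def windowed_gc(seq, window=300):
    """GC percentage of every full window along seq (step of 1 bp)."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


print(gc_percent(UPSTREAM_COUNTS))     # overall GC % of the upstream 2000 bp
print(gc_percent(DOWNSTREAM_COUNTS))   # overall GC % of the downstream 2000 bp
print(windowed_gc("GGCCAATT", window=4))  # toy sequence, not from the report
```

From the reported counts, the upstream flank is about 45% GC and the downstream flank about 39% GC, which is consistent with the notes that no high GC-content region was found.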

BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
------------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr1   +       170277785  170279784    2000
YourSeq    828      1  1987   2000     93.3%  chr1   +       170233280  170248120   14841
YourSeq    358   1009  1423   2000     92.8%  chr6   -        70111491   70111904     414
YourSeq    355   1009  1423   2000     93.4%  chrX   +        15392617   15393031     415
YourSeq    354   1009  1423   2000     92.8%  chr13  -        93916990   93917401     412
YourSeq    351   1009  1493   2000     91.3%  chr9   +        38210538   38211050     513
YourSeq    350   1009  1416   2000     93.6%  chr3   -        34500447   34500854     408
YourSeq    350   1009  1423   2000     92.2%  chr7   +       110390470  110390882     413
YourSeq    348   1009  1423   2000     91.8%  chrX   -       164582135  164582544     410
YourSeq    348   1009  1423   2000     92.1%  chr13  +        38667161   38667575     415
YourSeq    347   1009  1416   2000     92.9%  chr2   +         6671769    6672179     411
YourSeq    347   1011  1423   2000     91.7%  chr11  +        23809893   23810303     411
YourSeq    346    747  1423   2000     92.0%  chr12  -        13400731   13401440     710
YourSeq    346   1009  1423   2000     92.4%  chr12  +        85799341   85799755     415
YourSeq    345   1009  1416   2000     93.5%  chr13  +        48557441   48557848     408
YourSeq    344   1017  1423   2000     91.9%  chr8   +        68246479   68246878     400
YourSeq    343   1009  1423   2000     91.0%  chr16  -        52627126   52627538     413
YourSeq    342   1009  1429   2000     91.0%  chrX   -        32946327   32947005     679
YourSeq    342   1009  1423   2000     91.0%  chr3   -        15120186   15120599     414
YourSeq    342   1019  1423   2000     92.4%  chr12  -        51384776   51385181     406

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
------------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr1   +       170279849  170281848    2000
YourSeq    379    289  1307   2000     88.9%  chr1   +       170249265  170534946  285682
YourSeq    109   1131  1307   2000     88.7%  chr17  -        54333622   54333788     167
YourSeq    107   1150  1335   2000     85.2%  chr3   +        37301195   37301362     168
YourSeq    106   1207  1441   2000     85.2%  chr5   -       120591527  120591740     214
YourSeq    104   1136  1307   2000     86.4%  chr11  +        26177556   26177710     155
YourSeq    102   1131  1307   2000     88.1%  chr9   -         9240133    9240302     170
YourSeq    100   1132  1315   2000     83.4%  chr1   -       109862313  109862479     167
YourSeq     98   1132  1307   2000     88.3%  chr14  -        20904931   20905099     169
YourSeq     96   1150  1313   2000     87.8%  chr4   -       123671377  123671534     158
YourSeq     96   1132  1307   2000     83.2%  chr4   -        58658786   58658936     151
YourSeq     95   1339  1674   2000     85.3%  chr19  -        13442556   13443093     538
YourSeq     95   1133  1306   2000     84.0%  chr11  -        32108254   32108396     143
YourSeq     94   1160  1316   2000     88.2%  chr17  -        25916413   25916562     150
YourSeq     94   1144  1307   2000     85.1%  chr11  +        97170054   97170201     148
YourSeq     93   1137  1307   2000     84.1%  chr13  +        66962421   66962576     156
YourSeq     92   1149  1308   2000     84.5%  chr18  +        11968499   11968646     148
YourSeq     91   1137  1302   2000     84.2%  chr5   -       135963651  135963798     148
YourSeq     89   1163  1315   2000     85.5%  chr6   -       115472498  115472662     165
YourSeq     89   1136  1307   2000     84.2%  chr11  -        40660616   40660770     155

Note: The 2000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
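BLAT rows like those in the two result tables above can be read programmatically to separate the query's own locus from secondary hits. The sketch below is illustrative only: the `BlatHit` type, the function names, and the rule used to recognise the self-hit (full-length match at 100% identity) are assumptions, not part of the original analysis.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row):
    """Parse one whitespace-separated BLAT result row into a BlatHit."""
    fields = row.split()
    # Rows copied from the web interface may start with "browser details".
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    (query, score, q_start, q_end, q_size,
     identity, chrom, strand, t_start, t_end, span) = fields
    return BlatHit(query, int(score), int(q_start), int(q_end), int(q_size),
                   float(identity.rstrip("%")), chrom, strand,
                   int(t_start), int(t_end), int(span))


def secondary_hits(hits):
    """Drop the self-hit (assumed: full query length at 100% identity)."""
    return [
        h for h in hits
        if not (h.identity == 100.0 and h.q_end - h.q_start + 1 == h.q_size)
    ]


# First two rows of the upstream BLAT results in this report.
rows = [
    "browser details YourSeq 2000 1 2000 2000 100.0% chr1 + 170277785 170279784 2000",
    "YourSeq 828 1 1987 2000 93.3% chr1 + 170233280 170248120 14841",
]
hits = [parse_blat_row(r) for r in rows]
for hit in secondary_hits(hits):
    print(hit.chrom, hit.t_start, hit.t_end, hit.score, hit.identity)
```

Listing secondary hits this way makes it easier to inspect which parts of a flank (for example, query positions 1009 to 1423 in the upstream results) match elsewhere in the genome when placing PCR primers.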

Gene and information:

Sh2d1b1 SH2 domain containing 1B1 [Mus musculus (house mouse)]
Gene ID: 26904, updated on 12-Aug-2019

Gene summary

Official Symbol: Sh2d1b1 (provided by MGI)
Official Full Name: SH2 domain containing 1B1 (provided by MGI)
Primary source: MGI:MGI:1349420
See related: Ensembl:ENSMUSG00000102418
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Eat2; EAT-2; Eat2a; EAT-2A; Sh2d1b
Expression: Biased expression in lung adult (RPKM 2.2), spleen adult (RPKM 1.5) and 12 other tissues

Genomic context

Location: 1; 1 H3

Exon count: 4

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 1 | NC_000067.6 (170277318..170286769)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 1 | NC_000067.5 (172207455..172215793)

[Figure: genomic context of Sh2d1b1 on Chromosome 1 - NC_000067.6.]

Transcript information: This gene has 1 transcript

Gene: Sh2d1b1 ENSMUSG00000102418

Description: SH2 domain containing 1B1 [Source:MGI Symbol;Acc:MGI:1349420]
Gene Synonyms: EAT-2, Eat2, Eat2a, Sh2d1b
Location: 170,277,320-170,286,769 forward strand. GRCm38:CM000994.2
About this gene: This gene has 1 transcript (splice variant), 133 orthologues, 14 paralogues, is a member of 1 Ensembl protein family and is associated with 5 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Sh2d1b1-201 | ENSMUST00000179976.2 | 2544 | 132aa | ENSMUSP00000137069.1 | Protein coding | CCDS35767 | Q149T1 | TSL:1, GENCODE basic, APPRIS P1

[Figure: gene view, 29.45 kb window (170.27 Mb to 170.29 Mb), forward and reverse strands. Tracks: Sh2d1b1-201 (protein coding, merged Ensembl/Havana); contig AC123650.7; Regulatory Build. Regulation legend: CTCF, Enhancer, Promoter Flank, Transcription Factor Binding Site.]

Transcript: ENSMUST00000179976

[Figure: transcript view of Sh2d1b1-201 (protein coding), 9.45 kb, forward strand, with protein domain annotations for ENSMUSP00000137... and sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant. Scale bar: 0 to 132.]

Protein domain annotations:
Superfamily: SH2 domain superfamily
SMART: SH2 domain
Prints: SH2 domain
Pfam: SH2 domain
PROSITE profiles: SH2 domain
PANTHER: PTHR46051:SF7; PTHR46051
Gene3D: SH2 domain superfamily
CDD: EAT-2, SH2 domain

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
