https://www.alphaknockout.com

Mouse Sh3bp5 Knockout Project (CRISPR/Cas9)

Objective: To create a Sh3bp5 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Sh3bp5 gene (NCBI Reference Sequence: NM_011894; Ensembl: ENSMUSG00000021892) is located on mouse chromosome 14. Nine exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 9 (Transcript: ENSMUST00000091903). Exons 4-5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 4 starts at about 24.75% of the coding region. Exons 4-5 cover 21.54% of the coding region. The size of the effective KO region is ~8495 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 4, 5 and 9 labeled and two gRNA regions marked. Legend: exon of mouse Sh3bp5; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 1068 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
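The self-alignment described in the dot plot notes can be sketched in Python as follows. This is a minimal illustration, not the tool used to produce the report: the names `reverse_complement` and `self_dot_plot` are hypothetical, and scoring a dot only when two 15 bp windows match exactly is an assumption.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (assumes only A/C/G/T).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window matches.

    Returns two sorted lists of (i, j) window start positions (0-based):
    forward matches with i < j (the trivial diagonal i == j is skipped),
    and matches between window i and the reverse complement of window j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward = []
    for positions in index.values():
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                forward.append((positions[a], positions[b]))

    reverse = []
    for i in range(len(seq) - window + 1):
        rc = reverse_complement(seq[i:i + window])
        for j in index.get(rc, ()):
            reverse.append((i, j))

    return sorted(forward), sorted(reverse)
```

Under these assumptions, a tandem repeat would show up as a run of forward matches at a constant offset `j - i`, so an empty or near-empty forward list corresponds to the "no significant tandem repeat" reading of the plot.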


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(2000bp) | A(20.25% 405) | C(24.45% 489) | T(26.9% 538) | G(28.4% 568)

Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(1068bp) | A(21.54% 230) | C(25.66% 274) | T(30.15% 322) | G(22.66% 242)

Note: The 1068 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
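The base composition summaries and the windowed GC content described above can be reproduced with a short Python sketch. The function names `base_composition` and `gc_windows` are hypothetical, and the step size between windows is an assumption, since the report states only the 300 bp window size.

```python
from collections import Counter


def base_composition(seq):
    """Return {base: (count, percent)} for A, C, T and G."""
    counts = Counter(seq.upper())
    total = len(seq)
    return {
        base: (counts[base], round(100 * counts[base] / total, 2))
        for base in "ACTG"
    }


def gc_windows(seq, window=300, step=1):
    """Return (start, gc_percent) for each full window along the sequence.

    `start` is 0-based; `step` is an illustrative parameter, not a value
    taken from the report.
    """
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100 * gc / window))
    return result
```

From the counts given in the summaries, the overall GC content works out to (489 + 568) / 2000 = 52.85% for the upstream section and (274 + 242) / 1068 = 48.31% for the downstream section.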


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr14  -       31387884   31389883   2000
YourSeq  59     148    526   2000   85.1%     chr14  -       64082239   64082609   371
YourSeq  52     725    1299  2000   92.1%     chr12  +       4116851    4126362    9512
YourSeq  48     727    1036  2000   74.1%     chrX   -       12535136   12535400   265
YourSeq  46     763    874   2000   90.6%     chr17  +       5556951    5557447    497
YourSeq  40     436    521   2000   73.3%     chr2   +       21466663   21466748   86
YourSeq  38     437    494   2000   82.8%     chr4   +       97519944   97520001   58
YourSeq  36     799    874   2000   92.9%     chr2   -       4396170    4396398    229
YourSeq  36     856    939   2000   82.0%     chr12  +       77352952   77353034   83
YourSeq  34     452    524   2000   76.4%     chr3   -       98233981   98234044   64
YourSeq  34     445    521   2000   89.5%     chr1   +       182152922  182152997  76
YourSeq  33     366    499   2000   79.0%     chr13  -       51279684   51279813   130
YourSeq  33     294    368   2000   97.2%     chr12  +       13595639   13595715   77
YourSeq  31     482    526   2000   84.5%     chr11  +       65692820   65692864   45
YourSeq  30     732    783   2000   94.2%     chr3   -       41408170   41408225   56
YourSeq  29     961    1004  2000   87.2%     chr12  +       100396082  100396126  45
YourSeq  29     745    781   2000   94.0%     chr12  +       85504708   85504842   135
YourSeq  29     480    526   2000   80.9%     chr10  +       24399817   24399863   47
YourSeq  28     745    776   2000   93.8%     chr6   +       9235741    9235772    32
YourSeq  27     436    474   2000   93.6%     chr13  +       102105581  102105621  41

Note: The 2000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1068   1      1068  1068   100.0%    chr14  -       31378321   31379388   1068
YourSeq  28     613    641   1068   100.0%    chr1   -       101367553  101367948  396
YourSeq  24     613    640   1068   96.2%     chr1   -       192767731  192767759  29
YourSeq  24     848    875   1068   80.0%     chr11  +       59011524   59011548   25
YourSeq  23     1034   1060  1068   84.7%     chr2   +       158128766  158128791  26
YourSeq  22     128    152   1068   96.0%     chr1   +       13135591   13135617   27
YourSeq  21     271    293   1068   95.7%     chr1   -       184772774  184772796  23
YourSeq  20     1019   1038  1068   100.0%    chr10  -       126462905  126462924  20

Note: The 1068 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
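As a cross-check on the two top-scoring BLAT hits, the distance between the flanking sections can be computed from their chr14 coordinates. The sketch below is illustrative only: the `Hit` type and `gap_between` function are hypothetical names, the coordinates are treated as 1-based and inclusive (consistent with SPAN = END - START + 1 in both tables), and it assumes the two flanking sections directly abut the knockout region.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One genomic hit with 1-based, inclusive coordinates."""
    chrom: str
    strand: str
    start: int
    end: int

    @property
    def span(self):
        # Number of bases covered by the hit on the genome.
        return self.end - self.start + 1


def gap_between(a, b):
    """Count the bases lying strictly between two hits on one chromosome."""
    if a.chrom != b.chrom:
        raise ValueError("hits are on different chromosomes")
    first, second = sorted((a, b), key=lambda h: h.start)
    return max(0, second.start - first.end - 1)


# Top hits copied from the BLAT tables (up and down, respectively).
upstream = Hit("chr14", "-", 31387884, 31389883)
downstream = Hit("chr14", "-", 31378321, 31379388)

print(gap_between(upstream, downstream))  # 31387884 - 31379388 - 1 = 8495
```

Under these assumptions the gap is 8495 bp, which agrees with the stated size of the effective KO region.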


Gene and information: Sh3bp5 SH3-domain binding protein 5 (BTK-associated) [ Mus musculus (house mouse) ] Gene ID: 24056, updated on 12-Aug-2019

Gene summary

Official Symbol: Sh3bp5 (provided by MGI)
Official Full Name: SH3-domain binding protein 5 (BTK-associated) (provided by MGI)
Primary source: MGI:MGI:1344391
See related: Ensembl:ENSMUSG00000021892
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Sab; SH3BP-5; AI606498
Expression: Ubiquitous expression in CNS E18 (RPKM 46.1), CNS E14 (RPKM 37.5) and 25 other tissues
Orthologs: human, all

Genomic context

Location: 14; 14 B
Exon count: 12

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 14 | NC_000080.6 (31372614..31436100, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 14 | NC_000080.5 (32187141..32249219, complement)

Chromosome 14 - NC_000080.6


Transcript information: This gene has 4 transcripts

Gene: Sh3bp5 ENSMUSG00000021892

Description: SH3-domain binding protein 5 (BTK-associated) [Source:MGI Symbol;Acc:MGI:1344391]
Gene Synonyms: Sab
Location: Chromosome 14: 31,359,880-31,436,078 reverse strand. GRCm38:CM001007.2
About this gene: This gene has 4 transcripts (splice variants), 263 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Sh3bp5-201 | ENSMUST00000091903.4 | 2684 | 458aa | ENSMUSP00000089517.4 | Protein coding | CCDS36855 | Q9Z131 | TSL:1, GENCODE basic, APPRIS P3
Sh3bp5-202 | ENSMUST00000100730.9 | 2633 | 456aa | ENSMUSP00000098296.3 | Protein coding | CCDS84108 | Q9Z131 | TSL:1, GENCODE basic, APPRIS ALT2
Sh3bp5-203 | ENSMUST00000140002.7 | 1686 | 463aa | ENSMUSP00000117152.1 | Nonsense mediated decay | - | Q9Z131 | TSL:1
Sh3bp5-204 | ENSMUST00000147586.1 | 382 | No protein | - | lncRNA | - | - | TSL:5


[Figure: Ensembl genomic region view, 96.20 kb, chromosome 14 from 31.36 Mb to 31.44 Mb. Forward strand genes: Capn7-201 (protein coding), Capn7-202 (nonsense mediated decay), Capn7-203 (nonsense mediated decay), Capn7-204 (retained intron). Contigs: AC108416.5, AC154616.2. Reverse strand genes: Sh3bp5-203 (nonsense mediated decay), Sh3bp5-201 (protein coding), Sh3bp5-202 (protein coding), Sh3bp5-204 (lncRNA). A Regulatory Build track is also shown. Regulation legend: CTCF, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000091903

[Figure: Sh3bp5-201 (protein coding), reverse strand, 62.28 kb. Protein-level tracks for ENSMUSP00000089...: MobiDB lite, Low complexity (Seg), Coiled-coils (Ncoils), Pfam (SH3-binding 5), PANTHER (PTHR19423:SF11, SH3-binding 5), and sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 458.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
