https://www.alphaknockout.com

Mouse Sh3glb1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Sh3glb1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Sh3glb1 gene (NCBI Reference Sequence: NM_001282037; Ensembl: ENSMUSG00000037062) is located on mouse chromosome 3. Eleven exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 11 (Transcript: ENSMUST00000198254). Exon 2 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Sh3glb1 gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-271A9 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygous mutation of this gene results in delayed apoptosis of embryonic fibroblasts in response to serum withdrawal or treatment with a mitochondrial stress inducer.

Exon 2 starts at about 6.3% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. Size of intron 1 (5'-loxP site insertion): 7253 bp. Size of intron 2 (3'-loxP site insertion): 3225 bp. Size of the effective cKO region: ~642 bp. The cKO region does not contain any other known gene.
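The arithmetic behind the position and frameshift statements can be sketched as follows. This is a minimal illustration, not the design pipeline itself: the 386 aa protein length is taken from the transcript table for ENSMUST00000198254, the stop codon is assumed to be counted in the coding length, and the exon length used in the frameshift check is a hypothetical placeholder because the report does not state the size of exon 2.

```python
def cds_length_bp(protein_aa: int) -> int:
    """Coding sequence length in bp, counting the stop codon (assumption)."""
    return (protein_aa + 1) * 3


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame unless its coding length is a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# 386 aa is the protein length listed for ENSMUST00000198254 in this report.
cds_bp = cds_length_bp(386)

# "About 6.3% of the coding region" converted to an approximate bp offset.
exon2_start_bp = round(0.063 * cds_bp)

# Hypothetical exon length, used only to demonstrate the check.
HYPOTHETICAL_EXON_BP = 100

print(f"CDS length: {cds_bp} bp")
print(f"Exon 2 starts near coding position {exon2_start_bp}")
print(f"{HYPOTHETICAL_EXON_BP} bp deletion shifts frame: "
      f"{causes_frameshift(HYPOTHETICAL_EXON_BP)}")
```

Any exon whose coding length is not divisible by 3 gives a frameshift when deleted, which is the property the strategy relies on.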


Overview of the Targeting Strategy

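[Figure: schematic of the targeting strategy. Panels, top to bottom: Wildtype allele (exons 1, 2 and 11 labelled; 5' and 3' gRNA regions marked), Targeting vector, Targeted allele, Constitutive KO allele (after Cre recombination). Legend: exon of mouse Sh3glb1, homology arm, cKO region, loxP site.]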
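The relationship between the targeted allele and the constitutive KO allele can be sketched in a few lines. This is a simplified model under stated assumptions: both loxP sites are in the same orientation, the canonical 34 bp loxP sequence is used, and the flanking and cKO sequences are toy placeholders rather than Sh3glb1 sequence. The function names are hypothetical.

```python
# Canonical 34 bp loxP site: 13 bp repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def build_targeted_allele(upstream: str, cko: str, downstream: str) -> str:
    """Flank the cKO region with two loxP sites in the same orientation."""
    return upstream + LOXP + cko + LOXP + downstream


def cre_recombine(allele: str) -> str:
    """Excise everything between two same-orientation loxP sites, leaving one site."""
    first = allele.find(LOXP)
    if first == -1:
        return allele
    second = allele.find(LOXP, first + len(LOXP))
    if second == -1:
        return allele
    return allele[:first] + allele[second:]


# Toy sequences standing in for intron 1, the cKO region and intron 2.
upstream, cko, downstream = "AAAACCCC", "GGGGTTTTGG", "CCCCAAAA"

targeted = build_targeted_allele(upstream, cko, downstream)
knockout = cre_recombine(targeted)

print("Targeted allele:", targeted)
print("KO allele:      ", knockout)
```

After recombination the allele keeps a single loxP site and loses the cKO region plus one loxP copy.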


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself. Legend: Forward, Reverse Complement.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so this targeting vector may be difficult to construct.
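A self-alignment dot plot of this kind can be approximated by matching exact words of the window length. The sketch below is an assumption about the general technique, not the tool used for this report: it records a forward dot wherever two different positions share the same 10 bp word, and a reverse-complement dot wherever a word also occurs in the reverse complement. The sequences are toy examples.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of dot coordinates for exact word matches.

    forward: (i, j) with i != j where seq[i:i+window] == seq[j:j+window]
    reverse: (i, j) where seq[i:i+window] == reverse_complement(seq)[j:j+window]
    """
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward = [(i, j)
               for positions in index.values()
               for i in positions
               for j in positions
               if i != j]

    rc = reverse_complement(seq)
    reverse = [(i, j)
               for j in range(len(rc) - window + 1)
               for i in index.get(rc[j:j + window], [])]
    return sorted(forward), sorted(reverse)


# Toy tandem repeat: a 10 bp unit repeated three times.
tandem = "ACGGTCATTG" * 3
forward_dots, reverse_dots = self_dot_plot(tandem)
print("Off-diagonal forward dots:", len(forward_dots))
```

Off-diagonal forward dots that line up parallel to the main diagonal are the signature of tandem repeats. A sequence with no repeated 10 bp word gives no forward dots at all.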

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (7142 bp) | A (29.53%, 2109) | C (21.91%, 1565) | T (29.21%, 2086) | G (19.35%, 1382)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
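The base counts in the summary can be checked, and a windowed GC profile computed, with a short sketch such as the one below. The counts are those reported above. The windowing function is an illustrative assumption about how a 300 bp profile might be computed, and the step size is a hypothetical parameter that the report does not specify.

```python
# Base counts as reported in the summary line.
reported_counts = {"A": 2109, "C": 1565, "T": 2086, "G": 1382}

total_bp = sum(reported_counts.values())
gc_percent = 100 * (reported_counts["G"] + reported_counts["C"]) / total_bp
percent_by_base = {base: round(100 * n / total_bp, 2)
                   for base, n in reported_counts.items()}


def windowed_gc(seq: str, window: int = 300, step: int = 300):
    """GC percentage for each full window of the sequence (step is an assumption)."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


print(f"Total length: {total_bp} bp")
print(f"Overall GC content: {gc_percent:.2f}%")
print("Per-base percentages:", percent_by_base)
```

The counts sum to the stated full length, and the per-base percentages agree with the summary line to two decimal places.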


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   -       144712979  144715978  3000
YourSeq  131    2341   2529  3000   94.2%     chr10  -       121052693  121052989  297
YourSeq  130    2340   2534  3000   94.1%     chr19  +       56972148   56972657   510
YourSeq  129    2337   2537  3000   87.6%     chr6   -       148738776  148738962  187
YourSeq  126    2348   2534  3000   95.1%     chr11  -       109565500  109565854  355
YourSeq  124    2337   2542  3000   86.6%     chr12  -       75694735   75694898   164
YourSeq  122    2376   2536  3000   95.7%     chr3   +       35034072   35034336   265
YourSeq  116    2402   2531  3000   96.8%     chr5   +       52991263   52991615   353
YourSeq  113    2335   2534  3000   95.4%     chr12  +       106191006  106191414  409
YourSeq  98     2399   2532  3000   94.7%     chr11  +       19284983   19285317   335
YourSeq  97     2390   2536  3000   94.6%     chr16  -       18355812   18355982   171
YourSeq  95     2328   2536  3000   81.5%     chr14  +       64224048   64224176   129
YourSeq  88     2366   2519  3000   81.4%     chr1   -       151987922  151988054  133
YourSeq  86     2369   2488  3000   93.0%     chr4   +       58005536   58005677   142
YourSeq  79     2418   2540  3000   86.3%     chr10  -       5335971    5336141    171
YourSeq  78     2436   2538  3000   96.6%     chr5   +       52991385   52991498   114
YourSeq  72     2335   2495  3000   94.1%     chr5   -       147207856  147208152  297
YourSeq  72     2430   2537  3000   87.5%     chr1   +       187904554  187904663  110
YourSeq  71     2421   2509  3000   95.2%     chr15  -       101296914  101297211  298
YourSeq  69     2408   2499  3000   95.0%     chr10  -       39325221   39325454   234

Note: The 3000 bp section upstream of exon 2 is BLAT searched against the genome. Apart from the expected 100% self-match on chr3, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   -       144709337  144712336  3000
YourSeq  159    159    544   3000   85.8%     chrX   -       157867092  157867474  383
YourSeq  150    179    544   3000   83.4%     chr7   +       109362421  109362748  328
YourSeq  148    161    517   3000   84.0%     chr11  -       40462334   40462686   353
YourSeq  148    161    544   3000   81.7%     chr16  +       13000483   13000860   378
YourSeq  145    1203   1373  3000   92.9%     chr17  -       7552362    7552539    178
YourSeq  143    1210   1378  3000   92.2%     chr11  +       107866649  107866816  168
YourSeq  142    1208   1372  3000   90.7%     chr4   -       131853271  131853431  161
YourSeq  142    180    544   3000   86.5%     chr11  -       80869398   80869757   360
YourSeq  142    1217   1377  3000   95.0%     chr12  +       24701541   24701701   161
YourSeq  141    1217   1373  3000   96.7%     chr1   +       13570751   13570912   162
YourSeq  140    1208   1364  3000   92.3%     chr2   +       120291924  120292077  154
YourSeq  140    1202   1369  3000   90.1%     chr1   +       189735241  189735403  163
YourSeq  139    1216   1373  3000   94.9%     chr1   -       60175829   60175986   158
YourSeq  139    1208   1373  3000   90.2%     chr5   +       135401149  135401311  163
YourSeq  139    1219   1368  3000   96.7%     chr12  +       87294556   87294706   151
YourSeq  139    183    544   3000   81.5%     chr11  +       35089064   35089393   330
YourSeq  138    1208   1368  3000   91.1%     chr14  -       34690696   34690853   158
YourSeq  138    1210   1374  3000   89.2%     chr11  +       85175027   85175183   157
YourSeq  137    1208   1363  3000   91.6%     chr4   -       129275141  129275293   153

Note: The 3000 bp section downstream of exon 2 is BLAT searched against the genome. Apart from the expected 100% self-match on chr3, no significant similarity is found.
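The chr3 self-matches of the two BLAT queries also allow a quick consistency check on the size of the region between them. The sketch below uses only the coordinates reported in the two tables. Because the gene lies on the reverse strand, the upstream section has the higher genomic coordinates. Variable and function names are illustrative.

```python
def span(start: int, end: int) -> int:
    """Length of a closed genomic interval [start, end] in bp."""
    return end - start + 1


# chr3 self-match coordinates from the two BLAT tables (reverse strand).
upstream_hit = (144712979, 144715978)    # 3000 bp upstream of exon 2
downstream_hit = (144709337, 144712336)  # 3000 bp downstream of exon 2

# Region lying between the two searched sections.
gap_start = downstream_hit[1] + 1
gap_end = upstream_hit[0] - 1
gap_bp = span(gap_start, gap_end)

print(f"Upstream section:   {span(*upstream_hit)} bp")
print(f"Downstream section: {span(*downstream_hit)} bp")
print(f"Region between:     chr3:{gap_start}-{gap_end} ({gap_bp} bp)")
```

The interval between the two sections comes to 642 bp, in line with the ~642 bp effective cKO region stated in the strategy.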


Gene information: Sh3glb1, SH3-domain GRB2-like B1 (endophilin) [Mus musculus (house mouse)]
Gene ID: 54673, updated on 12-Aug-2019

Gene summary

Official Symbol: Sh3glb1 (provided by MGI)
Official Full Name: SH3-domain GRB2-like B1 (endophilin) (provided by MGI)
Primary source: MGI:MGI:1859730
See related: Ensembl:ENSMUSG00000037062
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Bif-1; AA409932; AI314629; AU015566; mKIAA0491
Expression: Ubiquitous expression in testis adult (RPKM 28.5), CNS E18 (RPKM 9.1) and 27 other tissues

Genomic context

Location: 3; 3 H2

Exon count: 12

Annotation release  Status             Assembly                       Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)   3    NC_000069.6 (144683678..144720556, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     3    NC_000069.5 (144351808..144383287, complement)

[Figure: genomic context on Chromosome 3 - NC_000069.6.]


Transcript information: This gene has 6 transcripts

Gene: Sh3glb1 ENSMUSG00000037062

Description: SH3-domain GRB2-like B1 (endophilin) [Source:MGI Symbol;Acc:MGI:1859730]
Gene Synonyms: Bif-1, Endophilin B1
Location: Chromosome 3: 144,683,678-144,720,335 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 6 transcripts (splice variants), 258 orthologues, 12 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name         Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt     Flags
Sh3glb1-202  ENSMUST00000198254.4  5945  386aa    ENSMUSP00000143312.1  Protein coding  CCDS80043  Q9JK48      TSL:1, GENCODE basic, APPRIS ALT1
Sh3glb1-201  ENSMUST00000163279.5  3961  365aa    ENSMUSP00000129800.1  Protein coding  CCDS17885  Q9JK48      TSL:1, GENCODE basic, APPRIS P3
Sh3glb1-204  ENSMUST00000199531.4  2584  355aa    ENSMUSP00000143433.1  Protein coding  CCDS80042  Q9JK48      TSL:1, GENCODE basic
Sh3glb1-205  ENSMUST00000199854.4  1416  394aa    ENSMUSP00000142716.1  Protein coding  -          A0A0G2JEC4  TSL:5, GENCODE basic
Sh3glb1-203  ENSMUST00000199350.4  665   140aa    ENSMUSP00000143031.1  Protein coding  -          A0A0G2JF57  CDS 5' incomplete, TSL:3
Sh3glb1-206  ENSMUST00000200532.1  582   194aa    ENSMUSP00000142626.1  Protein coding  -          A0A0G2JE45  CDS 5' and 3' incomplete, TSL:3


[Figure: Ensembl genome browser view of a 56.66 kb region of chromosome 3 (144.68 Mb to 144.73 Mb), forward and reverse strands. Forward strand: Gm9419-201 (processed pseudogene). Contig: AC134404.17. Reverse strand: Sh3glb1-201, Sh3glb1-202, Sh3glb1-203, Sh3glb1-204, Sh3glb1-205 and Sh3glb1-206 (all protein coding), Clca3a1-202 (protein coding), Clca3a1-201 (nonsense mediated decay). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, pseudogene).]


Transcript: ENSMUST00000198254

[Figure: Ensembl protein summary for Sh3glb1-202 (protein coding, reverse strand, 33.41 kb), translation ENSMUSP00000143..., scale 0 to 386 aa. Annotation tracks:
Low complexity (Seg)
Coiled-coils (Ncoils)
Superfamily: AH/BAR domain superfamily; SH3-like domain superfamily
SMART: BAR domain; SH3 domain
Pfam: BAR domain; SH3 domain
PROSITE profiles: BAR domain; SH3 domain
PANTHER: PTHR14167; Endophilin-B1
Gene3D: AH/BAR domain superfamily; 2.30.30.40
CDD: Endophilin-B1, BAR domain; cd11945
Sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
