https://www.alphaknockout.com

Mouse Bag2 Knockout Project (CRISPR/Cas9)

Objective: To create a Bag2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Bag2 gene (NCBI Reference Sequence: NM_145392; Ensembl: ENSMUSG00000042215) is located on mouse chromosome 1. Three exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 3 (Transcript: ENSMUST00000044691). Exons 1~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 starts from about 0.16% of the coding region. Exons 1~3 cover 100.0% of the coding region. The size of the effective KO region is ~12080 bp. Deleting this KO region will affect the function of mouse Gm37905.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2 and 3 of mouse Bag2 with a gRNA region on each side of the knockout region. Legend: exon of mouse Bag2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence aligned with itself. Legend: Forward; Reverse Complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
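The self-alignment described in the two dot plot notes can be sketched in a few lines. This is a minimal illustration, not the tool used to produce the report: the function names are hypothetical, and exact matching of 15 bp windows is an assumption (the report states only the window size).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def dot_plot(seq_x, seq_y, window=15):
    """Return sorted (i, j) pairs where the window starting at i in seq_x
    exactly matches the window starting at j in seq_y."""
    seq_x, seq_y = seq_x.upper(), seq_y.upper()
    index = {}
    for j in range(len(seq_y) - window + 1):
        index.setdefault(seq_y[j:j + window], []).append(j)
    hits = []
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], []):
            hits.append((i, j))
    return sorted(hits)


def repeat_offsets(seq, window=15):
    """Distances between repeated windows in a self comparison.
    The main diagonal (i == j) is ignored; a cluster of short offsets
    suggests tandem repeats."""
    return sorted({j - i for i, j in dot_plot(seq, seq, window) if i < j})


if __name__ == "__main__":
    repetitive = "ACGTTGCA" * 5  # toy sequence with an 8 bp repeat unit
    print(repeat_offsets(repetitive))
    # Reverse-complement matches, as in the second series of the plot
    print(len(dot_plot(repetitive, reverse_complement(repetitive))))
```

An empty list from `repeat_offsets` corresponds to a dot plot with only the main diagonal, which is the situation the note calls suitable for PCR screening or sequencing analysis.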

Overview of the Dot Plot (down) Window size: 15 bp

Forward Reverse Complement

Sequence 12

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence. Label: Sequence 12.]

Summary: Full Length(2000bp) | A(25.2% 504) | C(24.15% 483) | T(24.75% 495) | G(25.9% 518)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
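The GC analysis can be illustrated with a sliding-window sketch. The 300 bp window comes from the report; the 65% cutoff for a "high GC" window, the step size, and the function names are assumptions made for illustration, since the report does not state its criterion. The last lines recompute the overall GC fraction from the base counts given in the two summaries.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """Return (start, gc_fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq, window=300, threshold=0.65, step=1):
    """Windows whose GC fraction reaches the (assumed) threshold."""
    return [(s, gc) for s, gc in sliding_gc(seq, window, step) if gc >= threshold]


def overall_gc(counts):
    """Overall GC fraction from a dict of base counts."""
    return (counts["G"] + counts["C"]) / sum(counts.values())


# Base counts copied from the two summary lines of the report
UPSTREAM = {"A": 504, "C": 483, "T": 495, "G": 518}
DOWNSTREAM = {"A": 484, "C": 529, "T": 574, "G": 413}

if __name__ == "__main__":
    print(f"upstream GC:   {overall_gc(UPSTREAM):.4f}")
    print(f"downstream GC: {overall_gc(DOWNSTREAM):.4f}")
```

The overall figures are averages over 2000 bp, so a locally GC-rich window can still occur in a sequence whose overall GC fraction is near 50%; that is why the windowed profile, not the summary line, drives gRNA placement.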

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence. Label: Sequence 12.]

Summary: Full Length(2000bp) | A(24.2% 484) | C(26.45% 529) | T(28.7% 574) | G(20.65% 413)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   -       33757692   33759691   2000
YourSeq  84     770    1043  2000   91.3%     chr11  -       61646044   61646357   314
YourSeq  79     790    1038  2000   91.6%     chr18  +       4843541    4843839    299
YourSeq  77     790    1043  2000   93.4%     chr4   +       111764747  111765044  298
YourSeq  75     805    1043  2000   88.7%     chr14  -       70813743   70814026   284
YourSeq  65     755    910   2000   86.6%     chr18  -       24023511   24023706   196
YourSeq  64     760    857   2000   83.7%     chr16  -       21101399   21101598   200
YourSeq  63     754    1039  2000   87.1%     chr9   -       60986467   60986798   332
YourSeq  63     754    982   2000   85.4%     chr11  -       20842630   20843036   407
YourSeq  63     753    1043  2000   86.3%     chr1   -       170641513  170641885  373
YourSeq  63     754    1043  2000   84.7%     chr11  +       70817184   70817550   367
YourSeq  63     900    1043  2000   92.0%     chr10  +       89441416   89441596   181
YourSeq  62     757    1041  2000   88.8%     chr3   -       154095462  154095798  337
YourSeq  62     754    1038  2000   90.8%     chr13  +       35711595   35711970   376
YourSeq  62     754    941   2000   94.5%     chr10  +       72579936   72580144   209
YourSeq  61     756    848   2000   85.9%     chr9   -       118800451  118800546  96
YourSeq  59     804    1042  2000   94.1%     chr1   +       90316703   90316988   286
YourSeq  57     924    1044  2000   92.6%     chr6   -       24705210   24705332   123
YourSeq  56     752    863   2000   76.1%     chr13  +       38754138   38754227   90
YourSeq  54     754    824   2000   92.2%     chr1   +       75990076   75990149   74

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   -       33744610   33746609   2000
YourSeq  278    1242   1808  2000   96.4%     chr10  +       128101400  128170377  68978
YourSeq  275    1242   1808  2000   91.0%     chr19  +       37478341   37478677   337
YourSeq  271    1238   1811  2000   95.4%     chr1   +       136649465  136650049  585
YourSeq  270    1241   1807  2000   88.9%     chr7   -       127243031  127243384  354
YourSeq  264    1242   1807  2000   86.4%     chr16  +       91484123   91484456   334
YourSeq  256    1249   1822  2000   91.1%     chr17  -       29515518   29516074   557
YourSeq  255    1249   1809  2000   88.7%     chr17  -       33893664   33893992   329
YourSeq  246    1242   1808  2000   88.3%     chr4   -       133760470  133760958  489
YourSeq  241    1255   1808  2000   88.9%     chr6   +       51519549   51519888   340
YourSeq  237    1238   1808  2000   88.3%     chr10  -       62579141   62579462   322
YourSeq  237    1238   1808  2000   89.3%     chr11  +       72801771   72802142   372
YourSeq  236    1238   1808  2000   90.6%     chr11  +       80452630   80453142   513
YourSeq  229    1268   1808  2000   88.0%     chr11  -       97272356   97272854   499
YourSeq  219    1286   1808  2000   89.4%     chr11  -       3314643    3315100    458
YourSeq  207    1238   1808  2000   93.7%     chr9   -       65778205   65778798   594
YourSeq  205    1244   1808  2000   88.6%     chr11  -       20029651   20030041   391
YourSeq  203    1301   1807  2000   93.6%     chr19  -       45339616   45340205   590
YourSeq  198    1310   1822  2000   98.1%     chr10  +       128052046  128052670  625
YourSeq  178    1636   1845  2000   96.4%     chrX   +       162723681  162724170  490

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
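One way to screen BLAT rows like those in the two tables is to parse each row and compare every secondary hit's score with the query size. The sketch below is an illustration only: the report does not say how "significant similarity" was judged, so the 50% score-to-query-size cutoff and all names (`BlatHit`, `parse_blat_row`, `off_target_hits`) are assumptions. The sample rows are the first three rows of the upstream table.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated row of the BLAT results table."""
    fields = row.split()
    # Drop the "browser details" link text if it is still present
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    q, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(q, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def is_self_hit(hit: BlatHit) -> bool:
    """The query's own locus: full-length score at 100% identity."""
    return hit.score == hit.q_size and hit.identity == 100.0


def off_target_hits(hits: List[BlatHit], min_score_fraction: float = 0.5):
    """Secondary hits whose score is at least the assumed fraction of the query size."""
    return [h for h in hits
            if not is_self_hit(h) and h.score / h.q_size >= min_score_fraction]


ROWS = [
    "YourSeq 2000 1 2000 2000 100.0% chr1 - 33757692 33759691 2000",
    "YourSeq 84 770 1043 2000 91.3% chr11 - 61646044 61646357 314",
    "YourSeq 79 790 1038 2000 91.6% chr18 + 4843541 4843839 299",
]
hits = [parse_blat_row(r) for r in ROWS]

if __name__ == "__main__":
    print(off_target_hits(hits))
```

Under this assumed cutoff, the best secondary hit in the upstream table (score 84 of 2000) and in the downstream table (score 278 of 2000) both fall well below the threshold.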


Gene and information: Bag2 BCL2-associated athanogene 2 [ Mus musculus (house mouse) ] Gene ID: 213539, updated on 12-Aug-2019

Gene summary

Official Symbol: Bag2 (provided by MGI)
Official Full Name: BCL2-associated athanogene 2 (provided by MGI)
Primary source: MGI:MGI:1891254
See related: Ensembl:ENSMUSG00000042215
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: BC016230; 2610042A13Rik
Expression: Ubiquitous expression in bladder adult (RPKM 26.9), placenta adult (RPKM 26.6) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 1; 1 B
Exon count: 4

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (33745484..33757750, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (33802329..33814595, complement)

Chromosome 1 - NC_000067.6


Transcript information: This gene has 4 transcripts

Gene: Bag2 ENSMUSG00000042215

Description: BCL2-associated athanogene 2 [Source:MGI Symbol;Acc:MGI:1891254]
Gene Synonyms: 2610042A13Rik
Location: Chromosome 1: 33,745,484-33,757,795 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 4 transcripts (splice variants), 197 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Bag2-201  ENSMUST00000044691.8  1860  210aa       ENSMUSP00000042009.7  Protein coding   CCDS14864  Q91YN9      TSL:1 GENCODE basic APPRIS P1
Bag2-203  ENSMUST00000187602.1  378   57aa        ENSMUSP00000139538.1  Protein coding   -          A0A087WNX9  TSL:2 GENCODE basic
Bag2-204  ENSMUST00000189741.1  955   No protein  -                     Retained intron  -          -           TSL:NA
Bag2-202  ENSMUST00000155484.1  781   No protein  -                     Retained intron  -          -           TSL:2

[Figure: Ensembl region view of chromosome 1, 33.74 Mb to 33.76 Mb (32.31 kb). Forward strand: Rab23-201 and Rab23-202 (protein coding), Rab23-203, Rab23-204 and Rab23-208 (retained intron), Rab23-206 (nonsense mediated decay), Gm37905-201 (TEC). Reverse strand: Bag2-201 and Bag2-203 (protein coding), Bag2-202 and Bag2-204 (retained intron), Zfp451-201 (protein coding), Zfp451-206 (nonsense mediated decay). Contig: AC163668.5. Tracks: Genes (Comprehensive set...), Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: Protein Coding (merged Ensembl/Havana; Ensembl protein coding), Non-Protein Coding (processed transcript).]


Transcript: ENSMUST00000044691

[Figure: protein domain view of Bag2-201 (protein coding; reverse strand, 12.31 kb) for ENSMUSP00000042..., scale bar 0 to 210. Tracks: PDB-ENSP mappings; Coiled-coils (Ncoils); SMART: BAG domain; PROSITE profiles: BAG domain; PANTHER: BAG family molecular chaperone regulator 2; Gene3D: 1.20.58.890; CDD: cd17282; All sequence SNPs/i...: Sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
