http://www.alphaknockout.com/ Mouse Bag2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Bag2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:

The Bag2 gene (NCBI Reference Sequence: NM_145392; Ensembl: ENSMUSG00000042215) is located on mouse chromosome 1. Three exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 3 (Transcript: ENSMUST00000044691). Exons 2~3 are selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Bag2 gene.

To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-23B12 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 2 starts at about 18.1% of the coding region. Knockout of exons 2~3 will result in a frameshift of the gene. The size of intron 1 available for 5'-loxP site insertion is 9252 bp. The size of the effective cKO region is ~3900 bp.

This strategy is designed based on genetic information in existing databases. Due to the complexity of biological processes, not all risks of loxP insertion to gene transcription, RNA splicing and translation can be predicted with existing technology.

The function of Gm37905-201 will be affected by deleting this cKO region.


Overview of the Targeting Strategy

[Figure: schematic of four alleles. Wildtype allele (exons 1, 2 and 3, with a 5' gRNA region and a 3' gRNA region); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Bag2; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
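The self-alignment described in the note can be illustrated with a minimal sketch. It assumes exact matching of 10 bp words on the forward strand only (the report's plot also shows reverse-complement matches), and the function names `self_dot_plot` and `offset_counts` are hypothetical, not those of the tool used for the report. The input is a toy sequence, not the Bag2 arms.

```python
from collections import Counter, defaultdict


def self_dot_plot(seq, window=10):
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    seq = seq.upper()
    starts_by_word = defaultdict(list)
    for i in range(len(seq) - window + 1):
        starts_by_word[seq[i:i + window]].append(i)

    hits = []
    for starts in starts_by_word.values():
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                hits.append((i, j))
    return sorted(hits)


def offset_counts(hits):
    """Count hits per diagonal offset (j - i)."""
    return Counter(j - i for i, j in hits)


# Toy input only: four tandem copies of a 5 bp unit, not the Bag2 sequence.
toy = "ACGTG" * 4
toy_offsets = offset_counts(self_dot_plot(toy, window=10))
print(toy_offsets.most_common())
```

Each hit is one off-diagonal dot. Many hits sharing the same small offset form a line parallel to the main diagonal, which is how a tandem repeat appears in such a plot.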

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(9343bp) | A(26.45% 2471) | C(23.22% 2169) | G(24.16% 2257) | T(26.18% 2446)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
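As a worked check of the summary line, the overall GC content follows directly from the reported base counts. The sketch below also shows one way a windowed GC profile could be computed. The non-overlapping 300 bp step is an assumption, since the report does not state how its windows advance, and the function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window=300, step=300):
    """GC fraction of each full window; the step size is an assumption."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the summary line of the report.
counts = {"A": 2471, "C": 2169, "G": 2257, "T": 2446}
total = sum(counts.values())
overall_gc = (counts["C"] + counts["G"]) / total

print(total)                        # 9343
print(round(100 * overall_gc, 1))   # 47.4
```

The counts sum to the stated full length of 9343 bp, and the overall GC content of about 47.4% agrees with the note that no high-GC region stands out.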


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       33748577   33751576   3000
YourSeq  306    1246   1614  3000   92.5%     chr7   +       110610525  110610897  373
YourSeq  277    1246   1588  3000   92.6%     chr9   +       21118832   21119173   342
YourSeq  275    1249   1577  3000   92.9%     chr14  -       52157176   52157505   330
YourSeq  269    1255   1581  3000   91.0%     chr16  -       13803490   13803815   326
YourSeq  267    1246   1581  3000   91.1%     chr10  +       100554761  100555290  530
YourSeq  262    1246   1605  3000   93.4%     chr12  +       75729498   75729990   493
YourSeq  261    1171   1576  3000   92.0%     chr16  -       17864139   18290628   426490
YourSeq  252    1246   1568  3000   88.4%     chr10  +       62385066   62385379   314
YourSeq  250    1246   1586  3000   92.0%     chr1   +       84834294   84834637   344
YourSeq  248    1246   1582  3000   92.0%     chr13  +       62798056   62798393   338
YourSeq  246    1246   1569  3000   89.0%     chr17  +       47510164   47510489   326
YourSeq  235    1245   1595  3000   92.5%     chr1   -       63244746   63245283   538
YourSeq  234    1246   1532  3000   92.4%     chr2   +       26070928   26071519   592
YourSeq  228    1246   1587  3000   87.6%     chr1   -       136972058  136972384  327
YourSeq  225    1248   1584  3000   92.5%     chr10  +       4352297    4352659    363
YourSeq  224    1265   1570  3000   91.7%     chr4   +       124878333  124878637  305
YourSeq  221    1247   1597  3000   92.4%     chr1   -       156551504  156551890  387
YourSeq  221    1273   1619  3000   92.4%     chrX   +       101483044  101483537  494
YourSeq  220    1246   1561  3000   92.4%     chr8   -       120384524  120384844  321

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the expected 100% match at the target locus on chr1, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       33742484   33745483   3000
YourSeq  326    500    1872  3000   91.0%     chr13  +       64331042   64383818   52777
YourSeq  289    116    982   3000   90.2%     chr19  +       37478341   37478768   428
YourSeq  271    112    685   3000   95.4%     chr1   +       136649465  136650049  585
YourSeq  270    115    681   3000   88.9%     chr7   -       127243031  127243384  354
YourSeq  269    123    696   3000   91.3%     chr17  -       29515518   29516074   557
YourSeq  265    116    683   3000   93.2%     chr9   -       21119007   21119592   586
YourSeq  264    116    681   3000   86.4%     chr16  +       91484123   91484456   334
YourSeq  255    123    683   3000   88.7%     chr17  -       33893664   33893992   329
YourSeq  246    116    682   3000   88.3%     chr4   -       133760470  133760958  489
YourSeq  241    129    682   3000   88.9%     chr6   +       51519549   51519888   340
YourSeq  236    112    682   3000   90.6%     chr11  +       80452630   80453142   513
YourSeq  234    112    682   3000   88.2%     chr10  -       62579141   62579462   322
YourSeq  225    1519   2183  3000   91.9%     chr11  +       101693579  101694308  730
YourSeq  219    160    682   3000   89.4%     chr11  -       3314643    3315100    458
YourSeq  215    1537   1878  3000   84.6%     chr16  -       91723465   91723880   416
YourSeq  207    112    682   3000   93.7%     chr9   -       65778205   65778798   594
YourSeq  203    175    681   3000   93.6%     chr19  -       45339616   45340205   590
YourSeq  198    184    696   3000   98.1%     chr10  +       128052046  128052670  625
YourSeq  197    194    682   3000   88.9%     chr11  -       20029651   20029989   339

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. Apart from the expected 100% match at the target locus on chr1, no significant similarity is found.
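The top hit of each BLAT search can be cross-checked against the gene coordinates. The sketch below assumes the BLAT coordinates are 1-based, inclusive and on the GRCm38 assembly, which is consistent with the SPAN column and with the GRCm38 location NC_000067.6 (33745484..33757750, complement) listed under Genomic context. Variable names are illustrative only.

```python
def span(interval):
    """Length of a 1-based, inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1


# Top (100% identity) hits copied from the two BLAT tables.
upstream_hit = (33748577, 33751576)    # 3000 bp upstream of Exon 2
downstream_hit = (33742484, 33745483)  # 3000 bp downstream of Exon 3

# GRCm38 gene location from the Genomic context section (reverse strand).
gene_grcm38 = (33745484, 33757750)

# Bag2 is on the reverse strand, so "upstream" has the higher coordinates.
between_flanks = upstream_hit[0] - downstream_hit[1] - 1
downstream_abuts_gene = downstream_hit[1] + 1 == gene_grcm38[0]

print(span(upstream_hit), span(downstream_hit))  # 3000 3000
print(between_flanks)                            # 3093
print(downstream_abuts_gene)                     # True
```

Under these assumptions the downstream query ends one base before the annotated gene boundary, and the two queries enclose 3093 bp. If the queries directly flank Exon 2 and Exon 3, as the notes state, that interval runs from the start of Exon 2 to the end of Exon 3. It is a different quantity from the ~3900 bp effective cKO region reported in the strategy notes.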

Gene and protein information:

Bag2 BCL2-associated athanogene 2 [Mus musculus (house mouse)]
Gene ID: 213539, updated on 25-Sep-2020

Gene summary

Official Symbol: Bag2 (provided by MGI)
Official Full Name: BCL2-associated athanogene 2 (provided by MGI)
Primary source: MGI:MGI:1891254
See related: Ensembl:ENSMUSG00000042215
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: BC016230; 2610042A13Rik
Expression: Ubiquitous expression in bladder adult (RPKM 26.9), placenta adult (RPKM 26.6) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 1; 1 B

Exon count: 3

Annotation release  Status             Assembly                      Chr  Location
109                 current            GRCm39 (GCF_000001635.27)     1    NC_000067.7 (33784565..33796831, complement)
108.20200622        previous assembly  GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (33745484..33757750, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (33802329..33814595, complement)

Chromosome 1 - NC_000067.7


Transcript information: This gene has 4 transcripts

Gene: Bag2 ENSMUSG00000042215

Description: BCL2-associated athanogene 2 [Source:MGI Symbol;Acc:MGI:1891254]
Gene Synonyms: 2610042A13Rik
Location: Chromosome 1: 33,745,484-33,757,795 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 4 transcripts (splice variants) and 290 orthologues.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt Match  Flags
Bag2-201  ENSMUST00000044691.8  1860  210aa       ENSMUSP00000042009.7  Protein coding   CCDS14864  Q91YN9         TSL:1 GENCODE basic APPRIS P1
Bag2-203  ENSMUST00000187602.1  378   57aa        ENSMUSP00000139538.1  Protein coding   -          A0A087WNX9     TSL:2 GENCODE basic
Bag2-204  ENSMUST00000189741.1  955   No protein  -                     Retained intron  -          -              TSL:NA
Bag2-202  ENSMUST00000155484.1  781   No protein  -                     Retained intron  -          -              TSL:2

[Figure: Ensembl genomic view of a 32.31 kb region of chromosome 1 (33.74 Mb to 33.76 Mb). Forward strand: Rab23-201 and Rab23-202 (protein coding), Rab23-206 (nonsense mediated decay), Rab23-203, Rab23-204 and Rab23-208 (retained intron), and Gm37905-201 (TEC). Contig: AC163668.5. Reverse strand: Bag2-201 and Bag2-203 (protein coding), Bag2-202 and Bag2-204 (retained intron), Zfp451-201 (protein coding) and Zfp451-206 (nonsense mediated decay). A Regulatory Build track is shown, with legend entries for CTCF, enhancer, open chromatin, promoter, promoter flank and transcription factor binding site.]


Transcript: ENSMUST00000044691

[Figure: Ensembl protein summary for Bag2-201 (protein coding, reverse strand, 12.31 kb), drawn on a scale of 0 to 210 residues. Annotation tracks: Coiled-coils (Ncoils); SMART BAG domain; PROSITE profiles BAG domain; PANTHER BAG family molecular chaperone regulator 2; Gene3D 1.20.58.890; CDD cd17282; sequence variants (dbSNP and all other sources), with legend entries for missense, splice region and synonymous variants.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
