https://www.alphaknockout.com

Mouse Chmp4b Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Chmp4b conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Chmp4b gene (NCBI Reference Sequence: NM_029362; Ensembl: ENSMUSG00000038467) is located on mouse chromosome 2. Five exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 5 (Transcript: ENSMUST00000044277). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Chmp4b gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-371E3 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a gene trap insertion die between E7.5 and E8.5.

Exon 2 starts at about 28.42% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 32146 bp, and the size of intron 2 for 3'-loxP site insertion is 1514 bp. The size of the effective cKO region is ~923 bp. The cKO region does not contain any other known gene.
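The frameshift reasoning above can be illustrated with a small sketch. It assumes a coding sequence of 224 codons (the 224 aa protein listed for ENSMUST00000044277, stop codon excluded) and uses the general rule that removing a coding segment shifts the reading frame unless its length is a multiple of 3. The function names and the example segment lengths are hypothetical; the report does not state the coding length of exon 2.

```python
def coding_offset(cds_length_bp: int, fraction: float) -> int:
    """Approximate coding position (in bp) that lies `fraction` of the way into the CDS."""
    return round(cds_length_bp * fraction)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deleted coding segment shifts the frame unless its length is divisible by 3."""
    if deleted_coding_bp < 0:
        raise ValueError("deleted length must be non-negative")
    return deleted_coding_bp % 3 != 0


# Assumption: 224 aa protein, stop codon not counted.
CDS_LENGTH_BP = 224 * 3

# Roughly where exon 2 begins within the coding sequence (28.42% from the report).
print(coding_offset(CDS_LENGTH_BP, 0.2842))

# Hypothetical segment lengths, for illustration only.
for length in (99, 100):
    print(length, causes_frameshift(length))
```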


Overview of the Targeting Strategy

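[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Chmp4b, homology arm, cKO region, loxP site.]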
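As a rough illustration of how the targeted allele becomes the constitutive KO allele, the sketch below models Cre recombination as a string operation: the segment between two loxP sites in the same orientation is removed and a single loxP site remains. The function name and the toy allele are hypothetical, and the model ignores inverted or mutant sites.

```python
# Canonical 34 bp loxP site: two 13 bp inverted repeats around an 8 bp spacer.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Remove the sequence between the first two directly repeated sites.

    One copy of the site is left behind. If fewer than two sites are found,
    the allele is returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[: first + len(site)] + allele[second + len(site):]


# Toy floxed allele (hypothetical sequence, not the Chmp4b locus).
upstream, floxed_segment, downstream = "AAAA", "CCCCGGGG", "TTTT"
targeted = upstream + LOXP + floxed_segment + LOXP + downstream
knockout = cre_excise(targeted)
print(len(targeted), len(knockout))
```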


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
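The self-alignment described in the note can be sketched as follows. This is a minimal exact-match dot plot using the 10 bp window from the report; it is not the tool used to produce the figure, and the function names and test sequence are hypothetical. Off-diagonal forward matches would indicate direct repeats, and reverse matches would indicate inverted repeats.

```python
from collections import defaultdict
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def dot_plot(seq: str, window: int = 10) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Return (forward, reverse) match coordinates for a self-comparison.

    forward: pairs (i, j), i < j, where the windows at i and j are identical
             (the main diagonal is skipped).
    reverse: pairs (i, j) where the window at i equals the reverse complement
             of the window at j.
    """
    n = len(seq) - window + 1
    index = defaultdict(list)
    for i in range(n):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for j in range(n):
        w = seq[j:j + window]
        for i in index.get(w, []):
            if i < j:
                forward.append((i, j))
        for i in index.get(revcomp(w), []):
            reverse.append((i, j))
    return forward, reverse


# Hypothetical sequence containing one 12 bp tandem duplication.
forward, reverse = dot_plot("ACGGTCATTGCA" * 2, window=10)
print(forward)
```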

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (7178 bp) | A (23.77%, 1706) | C (21.75%, 1561) | T (27.89%, 2002) | G (26.6%, 1909)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
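The base composition in the summary can be checked with a few lines of arithmetic, and the windowed analysis can be sketched in the same way. The base counts come from the summary above; the helper names, the toy sequence, and the choice of step size are assumptions, and the report does not say which tool produced the plot.

```python
from typing import Dict, List

# Base counts from the summary line (total 7178 bp).
BASE_COUNTS: Dict[str, int] = {"A": 1706, "C": 1561, "T": 2002, "G": 1909}


def base_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_fraction(counts: Dict[str, int]) -> float:
    """Overall GC fraction from base counts."""
    return (counts["G"] + counts["C"]) / sum(counts.values())


def gc_windows(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC fraction of each full window along the sequence."""
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


print(sum(BASE_COUNTS.values()))
print(base_percentages(BASE_COUNTS))
print(round(100 * gc_fraction(BASE_COUNTS), 2))
```

On a real sequence, `gc_windows(sequence, window=300)` would give the values plotted along the x-axis; a long run of windows with a very high fraction would be the kind of region the note says was not found.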


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       154686269  154689268  3000
YourSeq  92     1544   1683  3000   86.7%     chr3   +       98079140   98079281   142
YourSeq  91     1544   1972  3000   77.8%     chr13  -       67082695   67083032   338
YourSeq  86     1553   1680  3000   87.3%     chr2   +       93301085   93301215   131
YourSeq  80     1544   1669  3000   87.1%     chr15  +       63496254   63496382   129
YourSeq  79     1864   1999  3000   85.8%     chrX   -       104175247  104175378  132
YourSeq  78     1553   1669  3000   85.6%     chr7   +       117477994  117478113  120
YourSeq  76     1538   1964  3000   89.7%     chr3   -       51487366   51487912   547
YourSeq  76     1556   1878  3000   92.4%     chr2   -       29225289   29225671   383
YourSeq  76     1553   1715  3000   86.7%     chr2   +       141534356  141534542  187
YourSeq  75     1540   1657  3000   87.5%     chr7   +       129835610  129835726  117
YourSeq  74     1545   1669  3000   81.6%     chr17  +       43682707   43682833   127
YourSeq  73     1544   1657  3000   92.1%     chr12  -       111581446  111581563  118
YourSeq  71     1544   1640  3000   89.9%     chr18  -       73160709   73160807   99
YourSeq  71     1547   1653  3000   84.6%     chr19  +       27513961   27514057   97
YourSeq  70     1553   1649  3000   91.8%     chr6   +       34502057   34502158   102
YourSeq  68     1544   1638  3000   90.5%     chr3   +       87524189   87524284   96
YourSeq  68     1546   1640  3000   90.6%     chr10  +       117887382  117887478  97
YourSeq  67     1838   1945  3000   82.1%     chr14  -       78310475   78310575   101
YourSeq  67     1544   1669  3000   83.2%     chr11  -       105636664  105636790  127

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       154689947  154692946  3000
YourSeq  278    1488   1833  3000   92.5%     chr5   -       76903671   76904014   344
YourSeq  270    1478   1799  3000   94.8%     chr11  +       96117977   96118333   357
YourSeq  256    1488   1799  3000   95.1%     chr7   +       80163259   80163651   393
YourSeq  249    1488   1799  3000   94.7%     chr17  -       94753648   94754000   353
YourSeq  248    1488   1799  3000   94.4%     chrX   +       38567667   38568148   482
YourSeq  248    1491   1799  3000   94.0%     chr13  +       100801712  100802035  324
YourSeq  245    1488   1799  3000   95.9%     chr6   -       39415279   39415649   371
YourSeq  243    1488   1799  3000   94.3%     chrX   +       137169466  137170120  655
YourSeq  239    1495   1799  3000   94.8%     chr19  -       4639276    4639582    307
YourSeq  239    1489   1799  3000   93.2%     chr10  +       76034623   76453744   419122
YourSeq  229    1488   1759  3000   94.6%     chr11  +       72642223   72642888   666
YourSeq  228    1496   1763  3000   93.9%     chr14  -       47400785   47401065   281
YourSeq  227    1493   1962  3000   91.7%     chr2   +       180941017  180941474  458
YourSeq  214    1493   1737  3000   97.8%     chr4   -       152155892  152156323  432
YourSeq  210    1489   1784  3000   97.0%     chr9   -       109865208  109865513  306
YourSeq  205    1496   1732  3000   97.3%     chr5   +       37789676   37789924   249
YourSeq  199    1488   1799  3000   97.2%     chr8   -       71501079   71501575   497
YourSeq  199    1496   1735  3000   98.6%     chr17  +       27055689   27055975   287
YourSeq  198    1489   1867  3000   91.2%     chr4   -       34944825   34945044   220

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
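One way to read the BLAT tables is to compare the best off-target score with the full-length self match. The sketch below parses a few rows copied from the downstream table and reports that ratio. The `Hit` type, the function names, and the rule used to recognize the self match are assumptions made for illustration; the report does not state what cutoff it uses for "significant similarity", so no threshold is applied here.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> Hit:
    """Parse one BLAT result row; a leading 'browser details' is tolerated."""
    fields = line.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return Hit(int(score), int(qs), int(qe), int(qsize), float(ident.rstrip("%")),
               chrom, strand, int(ts), int(te), int(span))


def is_self_match(hit: Hit) -> bool:
    """Assumed rule: the query aligns end to end at 100% identity."""
    return hit.q_start == 1 and hit.q_end == hit.q_size and hit.identity == 100.0


def best_off_target(hits: List[Hit]) -> Hit:
    """Highest-scoring hit that is not the self match."""
    return max((h for h in hits if not is_self_match(h)), key=lambda h: h.score)


# Rows copied from the downstream BLAT table.
ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr2 + 154689947 154692946 3000",
    "YourSeq 278 1488 1833 3000 92.5% chr5 - 76903671 76904014 344",
    "YourSeq 270 1478 1799 3000 94.8% chr11 + 96117977 96118333 357",
]
hits = [parse_hit(row) for row in ROWS]
top = best_off_target(hits)
print(top.chrom, top.score, round(top.score / top.q_size, 3))
```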


Gene and information: Chmp4b charged multivesicular body protein 4B [ Mus musculus (house mouse) ] Gene ID: 75608, updated on 12-Aug-2019

Gene summary

Official Symbol: Chmp4b (provided by MGI)
Official Full Name: charged multivesicular body protein 4B (provided by MGI)
Primary source: MGI:MGI:1922858
See related: Ensembl:ENSMUSG00000038467
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: C76846; Snf7-2; 2010012F05Rik
Expression: Ubiquitous expression in placenta adult (RPKM 92.6), bladder adult (RPKM 83.6) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 H1

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (154657026..154694783)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (154482762..154520519)

[Figure: genomic context of Chmp4b on Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 3 transcripts

Gene: Chmp4b ENSMUSG00000038467

Description: charged multivesicular body protein 4B [Source:MGI Symbol;Acc:MGI:1922858]
Gene Synonyms: 2010012F05Rik, Snf7-2, chromatin modifying protein 4B
Location: Chromosome 2: 154,651,705-154,694,785 forward strand. GRCm38:CM000995.2
About this gene: This gene has 3 transcripts (splice variants), 290 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Chmp4b-201  ENSMUST00000044277.9  1616  224aa       ENSMUSP00000036206.9  Protein coding   CCDS16938  Q9D8B3   TSL:1 GENCODE basic APPRIS P1
Chmp4b-202  ENSMUST00000136788.1  720   No protein  -                     Retained intron  -          -        TSL:2
Chmp4b-203  ENSMUST00000151668.7  1528  No protein  -                     lncRNA           -          -        TSL:1

[Figure: Ensembl gene view of a 63.08 kb region on the forward strand (154.65 Mb to 154.70 Mb). Tracks shown: Chmp4b-201 (protein coding), Chmp4b-202 (retained intron), Chmp4b-203 (lncRNA), Zfp341-201 (protein coding), Zfp341-202 (protein coding), Zfp341-203 (nonsense mediated decay), Rpl5-ps2-201 (processed pseudogene), contig AL929557.15, and the Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000044277

[Figure: Ensembl transcript view of Chmp4b-201 (protein coding), 37.77 kb on the forward strand, with protein feature tracks for ENSMUSP00000036...: MobiDB lite, low complexity (Seg), coiled-coils (Ncoils), Pfam Snf7 family, PANTHER PTHR22761 and PTHR22761:SF4, Gene3D 1.10.287.1060, and sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 224 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
