https://www.alphaknockout.com

Mouse Gmeb2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Gmeb2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gmeb2 gene (NCBI Reference Sequence: NM_198169; Ensembl: ENSMUSG00000038705) is located on mouse chromosome 2. Ten exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 10 (Transcript: ENSMUST00000049032). Exon 4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Gmeb2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-274D7 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 4 starts from about 14.47% of the coding region. The knockout of exon 4 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion is 706 bp, and the size of intron 4 for 3'-loxP site insertion is 4582 bp. The size of the effective cKO region is ~628 bp. The cKO region does not contain any other known gene.
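The conditional logic described above (two loxP sites flanking exon 4, with Cre recombination removing the flanked segment) can be illustrated with a minimal sketch. The toy allele, the function names `cre_excise` and `causes_frameshift`, and the example lengths are hypothetical and are not taken from this project; the sketch assumes two loxP sites in the same orientation and uses the canonical 34 bp loxP sequence.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(seq: str) -> str:
    """Remove the segment between the first two directly repeated loxP sites.

    One loxP site remains in the product, as in a constitutive KO allele.
    Sequences with fewer than two sites are returned unchanged.
    """
    first = seq.find(LOXP)
    if first == -1:
        return seq
    second = seq.find(LOXP, first + len(LOXP))
    if second == -1:
        return seq
    return seq[:first] + seq[second:]


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# Hypothetical toy allele: flank + loxP + "exon" + loxP + flank.
floxed = "AAAA" + LOXP + "CCCCGGGGTT" + LOXP + "TTTT"
recombined = cre_excise(floxed)
```

In this toy case `recombined` keeps both flanks and a single loxP site, and `causes_frameshift(10)` is true because 10 is not divisible by 3.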


Overview of the Targeting Strategy

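[Figure: schematic of four alleles drawn 5' to 3': the wildtype allele (with the two gRNA regions marked and exon labels 1, 3, 4 and 10), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Gmeb2; homology arm; cKO region; loxP site.]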


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned with itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
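A self-alignment of this kind can be sketched as follows. This is an illustrative word-match approach with a 10 bp window, not necessarily the tool used to produce the plot; the function names and the toy sequence are hypothetical. Off-diagonal forward matches at a constant offset indicate tandem repeats.

```python
from collections import defaultdict

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def dot_plot_matches(seq, window=10):
    """Find identical windows when a sequence is compared with itself.

    Returns two sorted lists of (i, j) start positions:
    forward - same word at i and j, with i < j (main diagonal excluded);
    reverse - word at i is the reverse complement of the word at j.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for word, starts in positions.items():
        for k, i in enumerate(starts):
            forward.extend((i, j) for j in starts[k + 1:])
        rc = reverse_complement(word)
        if rc in positions:
            reverse.extend((i, j) for i in starts for j in positions[rc])
    return sorted(forward), sorted(reverse)


# Hypothetical toy input: a 12 bp unit repeated three times in tandem.
unit = "ACGGTCATTGCA"
forward, reverse = dot_plot_matches(unit * 3)
```

For the toy tandem repeat, `forward` contains pairs such as `(0, 12)` and `(0, 24)`, which would appear as diagonals parallel to the main diagonal in a dot plot.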

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length (7128 bp) | A (29.28%, 2087) | C (18.27%, 1302) | T (31.33%, 2233) | G (21.13%, 1506)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
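The overall GC content follows directly from the base counts in the summary, and a windowed profile can be computed in the same way. The sketch below uses the counts reported above; the `sliding_gc` helper, its `step` parameter and the toy sequence are illustrative assumptions, not the tool used for the plot.

```python
# Base counts taken from the summary line of this report.
counts = {"A": 2087, "C": 1302, "T": 2233, "G": 1506}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300, step=1):
    """Return the GC percentage of each full window along the sequence."""
    seq = seq.upper()
    profile = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        profile.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


# Hypothetical toy sequence with a small window, for illustration only.
toy_profile = sliding_gc("GGCCAATT", window=4, step=2)
```

With these counts the total is 7128 bp, matching the reported full length, and the overall GC content is about 39.4%.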


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr2   -       181265359  181268358    3000
YourSeq    168   1343   1872   3000     87.9%  chr13  -        55061789   55062115     327
YourSeq    163   1608   1873   3000     92.3%  chr18  -        73822682   73822941     260
YourSeq    162   1696   1874   3000     96.6%  chr4   -        98834308   98834494     187
YourSeq    159   1693   1873   3000     96.1%  chr19  -        21298528   21298819     292
YourSeq    159   1704   1884   3000     94.3%  chr10  -       119514062  119514240     179
YourSeq    157   1701   1873   3000     96.5%  chrX   +         7897323    7897498     176
YourSeq    156   1708   1874   3000     97.1%  chr8   -        85439801   85439972     172
YourSeq    156   1487   1874   3000     83.8%  chr16  -        13769888   13770106     219
YourSeq    156   1685   1873   3000     92.5%  chr7   +       120800781  120800964     184
YourSeq    155   1703   1873   3000     96.5%  chr5   -        23580934   23581107     174
YourSeq    155   1703   1873   3000     96.5%  chr9   +        41065420   41065593     174
YourSeq    155   1704   1876   3000     95.9%  chr6   +       120536836  120537011     176
YourSeq    154   1708   1875   3000     96.5%  chrX   -         8152255    8152430     176
YourSeq    154   1701   1870   3000     96.5%  chr8   +        46498650   46498830     181
YourSeq    153   1708   1873   3000     96.4%  chr2   -       152086038  152086208     171
YourSeq    153   1696   1873   3000     94.3%  chr19  -         5591373    5591563     191
YourSeq    153   1708   1873   3000     97.0%  chrX   +        48236010   48236177     168
YourSeq    153   1700   1873   3000     95.3%  chr9   +        43432853   43433033     181
YourSeq    153   1696   1873   3000     94.8%  chr5   +       130428189  130428373     185

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr2   -       181261731  181264730    3000
YourSeq    194   1789   2112   3000     93.3%  chr10  -       117491882  117563011   71130
YourSeq    175   1785   2131   3000     88.4%  chr11  +        61121514   61121953     440
YourSeq    173    308   1880   3000     89.9%  chr10  -        62272454   62556431  283978
YourSeq    143   1783   2024   3000     91.4%  chr14  +        76160019   76207389   47371
YourSeq    141   1783   1956   3000     87.9%  chr16  -        44071398   44071562     165
YourSeq    140   1783   1957   3000     89.3%  chr11  +        17074998   17075169     172
YourSeq    137   1783   1955   3000     88.1%  chr16  +        38541224   38541393     170
YourSeq    136   1783   1961   3000     90.2%  chr3   -        55865892   55866094     203
YourSeq    136   1783   1948   3000     89.6%  chr15  -        77421354   77421516     163
YourSeq    134   1783   1957   3000     87.0%  chr3   -        30712303   30712474     172
YourSeq    134   1635   1914   3000     86.9%  chr11  -         8749091    8749266     176
YourSeq    132   1783   1937   3000     90.1%  chr2   -       120651109  120651259     151
YourSeq    132   1786   1948   3000     90.3%  chr15  -        90656988   90657147     160
YourSeq    132   1472   1947   3000     78.5%  chr1   -         7142935    7143118     184
YourSeq    132   1783   1956   3000     87.3%  chr17  +        56489444   56489609     166
YourSeq    129   1783   1955   3000     88.2%  chr7   -       141287296  141287469     174
YourSeq    129   1783   1953   3000     88.7%  chr5   -       147413516  147413687     172
YourSeq    129   1803   1954   3000     93.9%  chr17  +        87605298   87605453     156
YourSeq    129   1783   1955   3000     87.1%  chr1   +        86790519   86790689     171

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
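The two self-hits on chr2 in the BLAT results can be cross-checked against the stated cKO region size. The sketch below uses only coordinates and scores from the result listings in this report and assumes the displayed START and END values are 1-based and inclusive (consistent with each 3000 bp query having SPAN 3000); the variable and function names are illustrative.

```python
def span(start, end):
    """Length of an interval given 1-based inclusive coordinates."""
    return end - start + 1


# Perfect (100.0%) self-hits from the two BLAT result listings in this report.
upstream = {"chrom": "chr2", "strand": "-", "start": 181265359, "end": 181268358}
downstream = {"chrom": "chr2", "strand": "-", "start": 181261731, "end": 181264730}

# The gene lies on the reverse strand, so the upstream flank has the higher coordinates.
gap_bp = upstream["start"] - downstream["end"] - 1

# Best non-self scores reported for each flank, compared with the self-hit score.
self_score = 3000
best_offtarget = {"upstream": 168, "downstream": 194}
offtarget_fraction = {k: v / self_score for k, v in best_offtarget.items()}
```

The gap between the two flanks is 628 bp, which agrees with the effective cKO region size given in the strategy note, and the best off-target scores are below 7% of the self-hit score.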


Gene and information: Gmeb2 glucocorticoid modulatory element binding protein 2 [ Mus musculus (house mouse) ] Gene ID: 229004, updated on 12-Aug-2019

Gene summary

Official Symbol: Gmeb2 (provided by MGI)
Official Full Name: glucocorticoid modulatory element binding protein 2 (provided by MGI)
Primary source: MGI:MGI:2652836
See related: Ensembl:ENSMUSG00000038705
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI839884
Expression: Ubiquitous expression in thymus adult (RPKM 14.6), colon adult (RPKM 10.2) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 H4

Exon count: 13

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (181251449..181288106, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (180986156..181022671, complement)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 6 transcripts

Gene: Gmeb2 ENSMUSG00000038705

Description: glucocorticoid modulatory element binding protein 2 [Source:MGI Symbol;Acc:MGI:2652836]
Location: Chromosome 2: 181,251,449-181,288,035 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 6 transcripts (splice variants), 199 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Gmeb2-201  ENSMUST00000049032.12  4121  530aa       ENSMUSP00000037075.6  Protein coding           CCDS17206  P58929   TSL:1, GENCODE basic, APPRIS P1
Gmeb2-203  ENSMUST00000130475.7   765   215aa       ENSMUSP00000116479.1  Protein coding           -          A2AS07   CDS 3' incomplete, TSL:5
Gmeb2-205  ENSMUST00000141110.7   1600  78aa        ENSMUSP00000115853.1  Nonsense mediated decay  -          D6RG43   TSL:1
Gmeb2-204  ENSMUST00000141003.1   811   127aa       ENSMUSP00000116854.1  Nonsense mediated decay  -          D6RGU7   TSL:5
Gmeb2-202  ENSMUST00000123148.1   888   No protein  -                     Retained intron          -          -        TSL:3
Gmeb2-206  ENSMUST00000147665.1   642   No protein  -                     Retained intron          -          -        TSL:2

[Figure: Ensembl genome browser view of a 56.59 kb region of chromosome 2 (181.25 Mb to 181.29 Mb). Tracks: forward-strand gene Gm7645-201 (processed pseudogene); contig AL845506.6; reverse-strand genes Helz2 (201, 202 and 203 protein coding; 205 retained intron) and Gmeb2 (201 and 203 protein coding; 204 and 205 nonsense mediated decay; 202 and 206 retained intron); Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, pseudogene).]


Transcript: ENSMUST00000049032

[Figure: protein domain view of Gmeb2-201 (protein coding; reverse strand, 36.59 kb) for ENSMUSP00000037... Annotation tracks: low complexity (Seg); coiled-coils (Ncoils); Superfamily SAND-like domain superfamily; SMART SAND domain; Pfam SAND domain; PROSITE profiles SAND domain; PANTHER Glucocorticoid modulatory element-binding protein 1/2 (PTHR10417:SF2); Gene3D SAND-like domain superfamily; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 530.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
