https://www.alphaknockout.com

Mouse Hexim2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Hexim2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hexim2 gene (NCBI Reference Sequence: NM_027658; Ensembl: ENSMUSG00000043372) is located on mouse chromosome 11. Three exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 3 (Transcript: ENSMUST00000062530). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Hexim2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-133L3 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 covers 93.29% of the coding region. The start codon is in exon 2, and the stop codon is in exon 3. The size of intron 2 for 5'-loxP site insertion: 4045 bp. The size of the effective cKO region: ~2440 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Hexim2; homology arm; cKO region; loxP site.]
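The step from the targeted allele to the constitutive KO allele can be illustrated with a toy model of Cre recombination. The sketch below is only an illustration: the arm and exon strings are hypothetical placeholders, not the real Hexim2 sequence, and the function name `cre_excise` is made up for this example. It assumes two loxP sites in the same orientation, so that the flanked segment and one loxP copy are removed and a single loxP site remains.

```python
# Canonical 34 bp loxP site: two 13 bp inverted repeats around an 8 bp spacer.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Toy model of Cre-mediated excision between two same-orientation sites.

    The segment between the first two sites, plus one copy of the site,
    is removed; a single site is left behind. If fewer than two sites are
    present, the allele is returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele  # no loxP site: nothing to recombine
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele  # a single site cannot be recombined
    # Keep everything through the first site, then resume after the second.
    return allele[: first + len(site)] + allele[second + len(site):]


# Hypothetical placeholder sequences (NOT the real arms or exon 3).
five_prime_arm = "GGGTTTAAACCC"
exon3 = "GCCGAGCTGTGA"
three_prime_arm = "CCCAAATTTGGG"

targeted = five_prime_arm + LOXP + exon3 + LOXP + three_prime_arm
knockout = cre_excise(targeted)
```

In this toy model, `knockout` consists of the 5' arm, one loxP site and the 3' arm, which mirrors the constitutive KO allele in the schematic.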


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so this targeting vector may be difficult to construct.
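A self dot plot of this kind can be sketched as follows. This is a minimal illustration using exact window matches, not the tool used to produce the report; the function names and the toy sequences are assumptions made for the example. Forward dots lying on a diagonal at a small fixed offset from the main diagonal are what a tandem repeat looks like in such a plot.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) sets of (i, j) dots for exact window matches.

    forward: seq[i:i+window] == seq[j:j+window] with i < j
             (the trivial main diagonal is excluded).
    reverse: seq[i:i+window] equals the reverse complement of seq[j:j+window].
    """
    # Index every window by its sequence so matches are found by lookup.
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = set(), set()
    for word, positions in index.items():
        for a in positions:
            for b in positions:
                if a < b:
                    forward.add((a, b))
        for j in index.get(reverse_complement(word), []):
            for a in positions:
                reverse.add((a, j))
    return forward, reverse


# Hypothetical toy sequence with a 12 bp unit repeated three times in tandem.
unit = "ACGGTCATGCAT"
toy = "TTTGAC" + unit * 3 + "GGATCC"
forward_dots, reverse_dots = self_dot_plot(toy, window=10)

# Offsets between matching windows; a tandem repeat shows up as its unit length.
offsets = {j - i for i, j in forward_dots}
```

For the toy sequence, `offsets` contains 12 and 24, the spacing of the repeated unit.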

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length (7376 bp) | A (25.41%, 1874) | C (24.62%, 1816) | T (22.94%, 1692) | G (27.03%, 1994)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
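The windowed GC calculation can be sketched as below. The function name `gc_windows` and the step size are assumptions for illustration; the report does not say how its plot was generated. The base counts are taken from the summary line above. They sum to the stated 7376 bp and correspond to an overall GC content of about 51.65%.

```python
def gc_windows(seq: str, window: int = 300, step: int = 1):
    """Yield (start, gc_fraction) for each full window along seq."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, (chunk.count("G") + chunk.count("C")) / window


# Base counts as reported in the summary line of this report.
counts = {"A": 1874, "C": 1816, "T": 1692, "G": 1994}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total

# Per-base percentages, rounded the same way as in the summary line.
percentages = {base: round(100 * n / total, 2) for base, n in counts.items()}
```

With the real sequence in hand, `max(gc for _, gc in gc_windows(sequence))` would give the GC fraction of the richest 300 bp window. What level counts as "significantly high" is not stated in the report.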


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  +       103134938  103137937  3000
YourSeq  180    2770   3000  3000   95.5%     chr19  -       8864818    8865267    450
YourSeq  179    2765   2963  3000   95.5%     chr3   +       90137666   90471203   333538
YourSeq  168    2762   3000  3000   95.2%     chr15  +       82415363   82415796   434
YourSeq  167    2781   3000  3000   93.8%     chr1   +       26744715   26745178   464
YourSeq  159    2737   2968  3000   95.5%     chr1   -       161044332  161044785  454
YourSeq  159    2764   2976  3000   94.5%     chr11  +       117411432  117411712  281
YourSeq  157    2710   3000  3000   93.9%     chr9   +       13526769   13527155   387
YourSeq  156    2765   3000  3000   94.4%     chr14  -       43384417   43485290   100874
YourSeq  153    2818   2990  3000   95.9%     chr6   -       37524099   37524284   186
YourSeq  152    2833   3000  3000   95.9%     chr3   -       104665876  104666059  184
YourSeq  151    2436   3000  3000   90.0%     chr17  -       26108576   26109183   608
YourSeq  151    2827   3000  3000   92.9%     chr16  +       4284323    4284494    172
YourSeq  148    2830   2999  3000   92.2%     chr7   -       86381221   86381387   167
YourSeq  148    2787   3000  3000   95.7%     chr18  -       64929141   64929781   641
YourSeq  147    2830   3000  3000   91.5%     chr6   -       119208744  119208910  167
YourSeq  147    2830   2995  3000   94.6%     chr2   -       157226713  157226892  180
YourSeq  147    2832   3000  3000   94.1%     chr11  -       89100203   89100375   173
YourSeq  147    2830   2998  3000   92.1%     chr11  -       77747105   77747270   166
YourSeq  147    2444   2975  3000   86.8%     chr4   +       155077274  155077775  502

Note: The 3000 bp section upstream of exon 3 is BLAT searched against the genome. Apart from the expected self-match on chr11, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  +       103139064  103142063  3000
YourSeq  79     214    320   3000   88.9%     chr16  +       94339214   94339637   424
YourSeq  76     196    321   3000   87.9%     chr10  -       80618863   80619379   517
YourSeq  74     203    321   3000   88.6%     chr14  +       78553819   78553949   131
YourSeq  73     204    321   3000   76.9%     chr15  -       12050505   12050613   109
YourSeq  72     177    321   3000   79.7%     chr15  +       82182428   82182600   173
YourSeq  69     209    320   3000   81.0%     chr12  -       55743545   55743668   124
YourSeq  67     205    436   3000   63.4%     chr5   -       100639575  100639733  159
YourSeq  67     222    321   3000   85.4%     chr2   -       93812278   93812376   99
YourSeq  67     204    321   3000   79.5%     chr11  -       59443249   59443376   128
YourSeq  65     220    321   3000   79.0%     chr3   +       104924431  104924530  100
YourSeq  65     255    568   3000   71.6%     chr11  +       76079883   76080138   256
YourSeq  62     241    318   3000   88.2%     chr7   -       80944340   80944416   77
YourSeq  62     221    318   3000   79.4%     chr2   -       163990010  163990106  97
YourSeq  62     218    320   3000   82.3%     chr19  -       32565861   32565962   102
YourSeq  61     242    320   3000   85.9%     chr5   -       114164803  114164880  78
YourSeq  61     224    314   3000   82.8%     chr13  -       54843292   54843381   90
YourSeq  61     215    321   3000   82.8%     chr5   +       25666621   25666724   104
YourSeq  61     219    321   3000   78.8%     chr18  +       61843204   61843305   102
YourSeq  60     241    320   3000   84.9%     chr2   +       167669010  167669088  79

Note: The 3000 bp section downstream of exon 3 is BLAT searched against the genome. Apart from the expected self-match on chr11, no significant similarity is found.
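One way to read the BLAT tables programmatically is to separate the self-match from the secondary hits and look at how much of the query each secondary hit covers. The sketch below is an illustration only: the `BlatHit` type, the helper names and the self-match rule (score equal to query size at 100% identity) are assumptions, and the report does not state the threshold it uses for "significant". The three sample rows are copied from the upstream table.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated row; leading link text and query name are ignored."""
    f = line.split()[-10:]  # the last ten fields are the numeric/positional columns
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def is_self_hit(hit: BlatHit) -> bool:
    """Assumed rule: a full-length, 100% identical hit is the query's own locus."""
    return hit.score == hit.q_size and hit.identity == 100.0


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query spanned by the aligned query interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr11 + 103134938 103137937 3000",
    "YourSeq 180 2770 3000 3000 95.5% chr19 - 8864818 8865267 450",
    "YourSeq 179 2765 2963 3000 95.5% chr3 + 90137666 90471203 333538",
]
hits = [parse_hit(r) for r in rows]
secondary = [h for h in hits if not is_self_hit(h)]
best_secondary = max(secondary, key=lambda h: h.score)
```

For these sample rows, the best secondary hit scores 180 and spans 231 of the 3000 query bases, which is in line with the note that no significant similarity is found.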


Gene and information:

Hexim2 hexamethylene bis-acetamide inducible 2 [Mus musculus (house mouse)]
Gene ID: 71059, updated on 12-Aug-2019

Gene summary

Official Symbol: Hexim2 (provided by MGI)
Official Full Name: hexamethylene bis-acetamide inducible 2 (provided by MGI)
Primary source: MGI:MGI:1918309
See related: Ensembl:ENSMUSG00000043372
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 4933402L21Rik
Expression: Broad expression in testis adult (RPKM 10.3), CNS E18 (RPKM 3.7) and 24 other tissues
Orthologs: human, all

Genomic context

Location: 11; 11 E1

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (103123260..103139908)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (102994653..103001222)

[Figure: genomic context of Hexim2 on Chromosome 11 - NC_000077.6]


Transcript information: This gene has 5 transcripts

Gene: Hexim2 ENSMUSG00000043372

Description: hexamethylene bis-acetamide inducible 2 [Source:MGI Symbol;Acc:MGI:1918309]
Gene Synonyms: 4933402L21Rik
Location: Chromosome 11: 103,132,429-103,139,876 forward strand. GRCm38:CM001004.2
About this gene: This gene has 5 transcripts (splice variants), 88 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Hexim2-201  ENSMUST00000062530.4  1910  313aa    ENSMUSP00000053678.4  Protein coding  CCDS25514  Q3TVI4   TSL:1, GENCODE basic, APPRIS P1
Hexim2-202  ENSMUST00000107037.7  1221  313aa    ENSMUSP00000102652.1  Protein coding  CCDS25514  Q3TVI4   TSL:2, GENCODE basic, APPRIS P1
Hexim2-205  ENSMUST00000150275.7  681   125aa    ENSMUSP00000122591.1  Protein coding  -          A2AB66   CDS 3' incomplete, TSL:5
Hexim2-203  ENSMUST00000124928.1  654   174aa    ENSMUSP00000116991.1  Protein coding  -          A2AB67   CDS 3' incomplete, TSL:1
Hexim2-204  ENSMUST00000130341.1  370   52aa     ENSMUSP00000114405.1  Protein coding  -          A2AB65   CDS 3' incomplete, TSL:1

[Figure: Ensembl genome browser view of a 27.45 kb region of chromosome 11 (103.125 Mb to 103.145 Mb), showing the protein-coding transcripts Hexim2-201 to Hexim2-205 on the forward strand, contigs AL731805.8 and AL662804.18, and the Regulatory Build track. Regulation legend: CTCF; promoter; promoter flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding).]


Transcript: ENSMUST00000062530

[Figure: Ensembl protein view of Hexim2-201 (6.53 kb, forward strand; ENSMUSP00000053...), with tracks for MobiDB lite, low complexity (Seg), coiled-coils (Ncoils), Prints (HEXIM), Pfam (HEXIM) and PANTHER (HEXIM2, HEXIM), plus sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant. Scale bar: 0 to 313.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
