https://www.alphaknockout.com

Mouse Kmt5a Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Kmt5a conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Kmt5a gene (NCBI Reference Sequence: NM_001310723.1; Ensembl: ENSMUSG00000049327) is located on mouse chromosome 5. Seven exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 7 (Transcript: ENSMUST00000100709). Exons 3~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Kmt5a gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-180G16 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a gene-trapped allele exhibit embryonic lethality. Mice homozygous for an allele lacking exon 7 die prior to implantation.

Exon 3 starts at about 29.85% of the coding region. Knockout of exons 3~4 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 3290 bp, and the size of intron 4 for 3'-loxP site insertion is 5685 bp. The size of the effective cKO region is ~2015 bp. The cKO region does not contain any other known gene.
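The frameshift statement rests on simple arithmetic: removing exons shifts the reading frame when the number of deleted coding bases is not a multiple of 3. The sketch below illustrates that check only. The function name and the exon lengths are hypothetical, since this report does not list the coding lengths of exons 3 and 4.

```python
def deletion_causes_frameshift(deleted_coding_lengths):
    """Return True if deleting the given coding segments shifts the reading frame.

    deleted_coding_lengths: coding lengths (bp) of the deleted exons.
    """
    total_deleted = sum(deleted_coding_lengths)
    # A deletion preserves the frame only if it removes whole codons.
    return total_deleted % 3 != 0


# Hypothetical coding lengths (bp) for two deleted exons; NOT taken from the report.
example_lengths = [112, 97]
print(deletion_causes_frameshift(example_lengths))
```

To apply this to the actual project, the real coding lengths of exons 3 and 4 from transcript ENSMUST00000100709 would replace the placeholder values.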


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', exons 1 to 7, with the two gRNA regions flanking exons 3 and 4), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Kmt5a; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region was aligned with itself to determine whether it contains tandem repeats. Tandem repeats were found in the dot plot matrix, so this targeting vector may be difficult to construct.
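As an illustration of how a self dot plot flags repeats, the following minimal sketch records every pair of positions where a 10 bp word recurs in the same sequence, either directly (forward) or as its reverse complement. This is not the tool used to produce the plot in this report. The function names are illustrative, and the input is assumed to contain only the bases A, C, G and T.

```python
from collections import defaultdict


def revcomp(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) lists of (i, j) dots for a self comparison.

    forward: word at i recurs at j > i (the main diagonal is omitted).
    reverse: reverse complement of the word at i occurs at j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(len(seq) - window + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(word), []))
    return forward, reverse


# Toy sequence with one 10 bp word repeated after a 5 bp spacer.
fwd, rev = self_dot_plot("ACGTTGCAAG" + "TTTTT" + "ACGTTGCAAG", window=10)
print(fwd)
```

Runs of forward dots parallel to the main diagonal indicate direct repeats; closely spaced runs are the signature of tandem repeats.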

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length (7713 bp) | A (23.54%, 1816) | C (23.91%, 1844) | T (26.4%, 2036) | G (26.15%, 2017)

Note: The sequence of the homologous arms and cKO region was analyzed to determine its GC content. Regions of significantly high GC content were found, so this targeting vector may be difficult to construct.
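The overall GC content can be recomputed from the base counts in the summary, and a windowed profile can be obtained by sliding a 300 bp window along the sequence. The sketch below shows both calculations. The function names and the step size are illustrative assumptions, and the report does not state which tool produced its plot.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each full-length window along seq."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts as given in the summary line of this report.
counts = {"A": 1816, "C": 1844, "T": 2036, "G": 2017}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(total, round(overall_gc * 100, 2))  # 7713 50.06
```

An overall GC content near 50% is compatible with the note, because local windows can still be far above the average. The windowed profile is what exposes those regions.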


BLAT Search Results (up)

QUERY SCORE Q_START Q_END QSIZE IDENTITY CHROM STRAND T_START T_END SPAN
------------------------------------------------------------------------
YourSeq 3000 1 3000 3000 100.0% chr5 + 124447412 124450411 3000
YourSeq 59 522 957 3000 72.5% chr14 + 67037376 67037702 327
YourSeq 47 1612 2033 3000 64.2% chr3 + 104998897 104999127 231
YourSeq 43 1959 2054 3000 73.0% chr14 - 21713062 21713132 71
YourSeq 43 1604 1751 3000 92.4% chr18 + 58172689 58172839 151
YourSeq 40 1954 2054 3000 93.5% chr13 + 100915591 100915692 102
YourSeq 39 1954 2014 3000 81.9% chr11 - 114569686 114569742 57
YourSeq 38 1954 2049 3000 95.3% chr10 + 81125439 81125674 236
YourSeq 35 1959 2007 3000 91.9% chr15 - 73446361 73446408 48
YourSeq 34 1946 1983 3000 94.8% chr6 - 29724404 29724441 38
YourSeq 34 1603 1642 3000 94.6% chr14 + 92015106 92015151 46
YourSeq 33 1945 1985 3000 90.3% chr10 + 77220832 77220872 41
YourSeq 32 1965 2054 3000 94.5% chr5 - 119626359 119626457 99
YourSeq 32 1954 2061 3000 86.4% chr16 - 58208896 58209003 108
YourSeq 32 1610 1667 3000 94.5% chr11 + 44921099 44921156 58
YourSeq 32 1960 2007 3000 86.9% chr1 + 36394876 36394922 47
YourSeq 31 1966 2054 3000 97.0% chr1_GL456213_random - 7435 7524 90
YourSeq 31 1886 1917 3000 100.0% chr16 + 60549469 60549501 33
YourSeq 30 1954 1985 3000 96.9% chr10 - 80338055 80338086 32
YourSeq 30 1954 1985 3000 96.9% chr10 + 60715238 60715269 32

Note: The 3000 bp section upstream of exon 3 was searched against the genome with BLAT. No significant similarity was found.

BLAT Search Results (down)

QUERY SCORE Q_START Q_END QSIZE IDENTITY CHROM STRAND T_START T_END SPAN
------------------------------------------------------------------------
YourSeq 3000 1 3000 3000 100.0% chr5 + 124451625 124454624 3000
YourSeq 184 455 844 3000 87.0% chr16 + 48693722 48694101 380
YourSeq 183 454 844 3000 91.1% chr14 - 34276734 34277203 470
YourSeq 181 450 885 3000 93.0% chr16 + 18512529 18684753 172225
YourSeq 176 429 844 3000 91.6% chr7 + 79996612 79997057 446
YourSeq 174 445 844 3000 91.9% chr10 + 88737737 88738367 631
YourSeq 163 448 821 3000 91.4% chr15 + 99794969 99795385 417
YourSeq 160 423 606 3000 94.0% chr1 - 128110461 128110671 211
YourSeq 160 442 1023 3000 84.9% chr14 + 76501410 76501806 397
YourSeq 150 431 603 3000 92.9% chr11 + 61780839 61781009 171
YourSeq 149 428 605 3000 92.2% chr14 + 34978816 34978997 182
YourSeq 148 435 602 3000 94.1% chr10 - 116334818 116334985 168
YourSeq 148 431 604 3000 92.6% chr10 - 79825311 79825484 174
YourSeq 148 431 605 3000 93.1% chr16 + 42270508 42270682 175
YourSeq 148 431 602 3000 93.1% chr11 + 57978834 57979005 172
YourSeq 148 439 603 3000 95.2% chr1 + 118400943 118401176 234
YourSeq 147 428 603 3000 89.9% chr12 + 100189208 100189376 169
YourSeq 146 439 603 3000 94.6% chr13 - 12637161 12637327 167
YourSeq 146 438 601 3000 94.6% chr5 + 32811498 32811661 164
YourSeq 145 442 602 3000 95.1% chr8 - 34162119 34162279 161

Note: The 3000 bp section downstream of exon 4 was searched against the genome with BLAT. No significant similarity was found.
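One way to read the BLAT tables above is to ask how much of the 3000 bp query each hit covers. The sketch below parses a few rows from the downstream table and keeps only hits that cover at least half of the query. The 0.5 coverage cut-off, the class name and the function names are assumptions for illustration, as the report does not state its criterion for "significant similarity".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(line):
    """Parse one whitespace-separated BLAT result row.

    Any leading 'browser details' link text is dropped first.
    """
    f = line.replace("browser details", "").split()
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def query_coverage(hit):
    """Fraction of the query spanned by the hit."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the downstream BLAT table in this report.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr5 + 124451625 124454624 3000",
    "YourSeq 184 455 844 3000 87.0% chr16 + 48693722 48694101 380",
    "YourSeq 183 454 844 3000 91.1% chr14 - 34276734 34277203 470",
]
hits = [parse_blat_row(r) for r in rows]

MIN_COVERAGE = 0.5  # assumed cut-off, not from the report
substantial = [h for h in hits if query_coverage(h) >= MIN_COVERAGE]
for h in substantial:
    print(h.chrom, h.t_start, h.t_end, h.score)
```

With these rows, only the self-hit on chr5 passes the assumed cut-off. The other two hits span about 13% of the query.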


Gene and protein information:

Kmt5a lysine methyltransferase 5A [Mus musculus (house mouse)]
Gene ID: 67956, updated on 26-Jun-2020

Gene summary

Official Symbol: Kmt5a (provided by MGI)
Official Full Name: lysine methyltransferase 5A (provided by MGI)
Primary source: MGI:MGI:1915206
See related: Ensembl:ENSMUSG00000049327
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Setd8; PR-SET7; AA617402; AW536475; PR/SET07; 2410195B05Rik
Expression: Ubiquitous expression in placenta adult (RPKM 32.5), heart adult (RPKM 26.1) and 28 other tissues
Orthologs: human; all

Genomic context

Location: 5; 5 F

Exon count: 11

Annotation release | Status | Assembly | Chr | Location
108.20200622 | current | GRCm38.p6 (GCF_000001635.26) | 5 | NC_000071.6 (124439930..124462311)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 5 | NC_000071.5 (124889939..124912316)

[Figure: genomic context of Kmt5a on Chromosome 5 - NC_000071.6.]


Transcript information: This gene has 13 transcripts

Gene: Kmt5a ENSMUSG00000049327

Description: lysine methyltransferase 5A [Source:MGI Symbol;Acc:MGI:1915206]
Gene Synonyms: 2410195B05Rik, PR-SET7, Setd8
Location: Chromosome 5: 124,439,930-124,462,308 forward strand. GRCm38:CM000998.2
About this gene: This gene has 13 transcripts (splice variants), 122 orthologues, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Kmt5a-201 | ENSMUST00000059580.10 | 2739 | 350aa | ENSMUSP00000052953.4 | Protein coding | CCDS19675 | E9QNB8 | TSL:1, GENCODE basic, APPRIS P3
Kmt5a-202 | ENSMUST00000100709.6 | 1425 | 364aa | ENSMUSP00000098275.2 | Protein coding | CCDS80410 | D3YXI6 | TSL:1, GENCODE basic
Kmt5a-212 | ENSMUST00000198451.1 | 1032 | 295aa | ENSMUSP00000143207.1 | Protein coding | CCDS80411 | A0A0G2JFK5 | TSL:3, GENCODE basic, APPRIS ALT1
Kmt5a-213 | ENSMUST00000199798.4 | 1789 | 38aa | ENSMUSP00000143765.1 | Nonsense mediated decay | - | A0A0G2JGZ8 | TSL:1
Kmt5a-209 | ENSMUST00000147969.7 | 2112 | No protein | - | Processed transcript | - | - | TSL:5
Kmt5a-210 | ENSMUST00000154031.7 | 903 | No protein | - | Processed transcript | - | - | TSL:5
Kmt5a-206 | ENSMUST00000138766.7 | 731 | No protein | - | Processed transcript | - | - | TSL:3
Kmt5a-207 | ENSMUST00000143722.7 | 548 | No protein | - | Processed transcript | - | - | TSL:5
Kmt5a-208 | ENSMUST00000147692.1 | 370 | No protein | - | Processed transcript | - | - | TSL:3
Kmt5a-211 | ENSMUST00000196860.1 | 1271 | No protein | - | Retained intron | - | - | TSL:NA
Kmt5a-205 | ENSMUST00000135667.7 | 814 | No protein | - | Retained intron | - | - | TSL:2
Kmt5a-203 | ENSMUST00000125588.1 | 804 | No protein | - | Retained intron | - | - | TSL:2
Kmt5a-204 | ENSMUST00000134826.7 | 696 | No protein | - | Retained intron | - | - | TSL:3


[Figure: Ensembl genome browser view of the Kmt5a locus (42.38 kb, 124.43 Mb to 124.47 Mb). Forward strand, genes (Comprehensive set): Kmt5a-201 and Kmt5a-202 (protein coding), Kmt5a-203, Kmt5a-204 and Kmt5a-205 (retained intron), Kmt5a-206, Kmt5a-207, Kmt5a-208, Kmt5a-209 and Kmt5a-210 (processed transcript), Kmt5a-211 (retained intron), Kmt5a-212 (protein coding), Kmt5a-213 (nonsense mediated decay); contig AC127339.3. Reverse strand, genes (Comprehensive set): Rilpl2-201 (protein coding). Regulatory Build legend: CTCF; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000100709

[Figure: Ensembl protein view of Kmt5a-202 (protein coding; 15.47 kb, forward strand), ENSMUSP00000098..., with a scale bar from 0 to 364 aa. Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily SSF82199; SMART SET domain; Pfam SET domain; PROSITE profiles SET domain and Class V SAM-dependent methyltransferases; PIRSF Class V SAM-dependent methyltransferases; PANTHER PTHR46167; Gene3D 2.170.270.10; sequence variants (dbSNP and all other sources). Variant legend: stop gained; missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
