https://www.alphaknockout.com

Mouse Mkx Knockout Project (CRISPR/Cas9)

Objective: To create an Mkx knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mkx gene (NCBI Reference Sequence: NM_177595; Ensembl: ENSMUSG00000061013) is located on mouse chromosome 18. Seven exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 7 (Transcript: ENSMUST00000079788). Exons 2~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit thin, hypoplastic tendons with reduced tensile strength.

The coding region starts in exon 2. Exons 2~4 cover 47.55% of the coding region. The size of the effective KO region is ~9846 bp. The KO region does not contain any other known gene.
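As a rough sanity check on the coverage figure, the percentage can be converted into a base count. This is only a sketch: it assumes the 47.55% is measured against the coding sequence of the 354 aa Mkx-201 protein listed in the transcript table of this report, with the stop codon not counted. The variable names are illustrative.

```python
# Protein length of Mkx-201 as listed in the transcript table of this report.
protein_aa = 354

# ASSUMPTION: the coding region is counted without the stop codon.
cds_bp = protein_aa * 3

# Fraction of the coding region covered by exons 2~4, as stated in the report.
covered_fraction = 0.4755

# Approximate number of coding bases removed by the knockout.
covered_bp = round(cds_bp * covered_fraction)

# Effective KO region size stated in the report (includes introns).
ko_region_bp = 9846

print(f"coding region: {cds_bp} bp")
print(f"coding bases in exons 2~4: about {covered_bp} bp")
print(f"non-coding share of KO region: about {ko_region_bp - covered_bp} bp")
```

Under these assumptions, only a small part of the ~9846 bp KO region is coding sequence; the remainder is mostly intronic.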


Overview of the Targeting Strategy

[Figure: Wildtype allele, 5' to 3', with two gRNA regions marked and exons labeled 1, 2, 3, 4 and 7. Legend: exon of mouse Mkx; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the upstream sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 1998 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the downstream sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 404 bp section downstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up) Window size: 300 bp


Summary: Full Length(1998bp) | A(20.12% 402) | C(25.13% 502) | T(25.48% 509) | G(29.28% 585)

Note: The 1998 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp


Summary: Full Length(404bp) | A(28.22% 114) | C(18.32% 74) | T(36.39% 147) | G(17.08% 69)

Note: The 404 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
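The two Summary lines give per-base counts, from which the overall GC content of each flanking section follows directly. The sketch below recomputes it from those counts; the dictionary and function names are illustrative and not part of the original report.

```python
# Base counts copied from the two Summary lines in this section.
upstream = {"A": 402, "C": 502, "T": 509, "G": 585}    # 1998 bp upstream of exon 2
downstream = {"A": 114, "C": 74, "T": 147, "G": 69}    # 404 bp downstream of exon 4


def gc_percent(counts):
    """Return the overall GC content (in percent) for a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


for name, counts in (("upstream", upstream), ("downstream", downstream)):
    length = sum(counts.values())
    print(f"{name}: length = {length} bp, GC = {gc_percent(counts):.2f}%")
```

The counts sum to the stated section lengths (1998 bp and 404 bp), giving an overall GC content of about 54.40% upstream and 35.40% downstream. Note that this is a whole-section average, whereas the plots use a 300 bp sliding window.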


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1998   1      1998  1998   100.0%    chr18  -       7002545    7004542    1998
YourSeq  101    790    902   1998   94.3%     chr10  +       72481976   72482084   109
YourSeq  80     790    870   1998   100.0%    chr10  -       77250442   77250526   85
YourSeq  58     790    850   1998   98.4%     chr1   +       29180866   29180934   69
YourSeq  55     786    842   1998   98.3%     chr2   +       19308266   19308322   57
YourSeq  52     791    842   1998   100.0%    chr11  -       46316700   46316751   52
YourSeq  51     789    842   1998   98.2%     chr1   +       118723764  118723819  56
YourSeq  50     790    841   1998   98.1%     chr10  +       25446216   25446267   52
YourSeq  48     791    838   1998   100.0%    chr1   +       133733926  133733973  48
YourSeq  45     798    845   1998   98.0%     chr6   +       72192228   72192277   50
YourSeq  32     790    824   1998   97.2%     chr1   -       130453610  130453657  48
YourSeq  26     1      47    1998   63.0%     chr10  -       22311295   22311321   27
YourSeq  22     1629   1651  1998   100.0%    chr10  -       47956145   47956170   26
YourSeq  20     1854   1873  1998   100.0%    chr10  +       80992085   80992104   20

Note: The 1998 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
--------------------------------------------------------------------------------------
YourSeq  404    1      404  404    100.0%    chr18  -       6992374    6992777    404
YourSeq  24     130    156  404    96.3%     chr13  +       110097608  110097636  29
YourSeq  23     269    294  404    96.0%     chr10  +       76263593   76263627   35
YourSeq  22     11     32   404    100.0%    chr4   -       57318906   57318927   22
YourSeq  22     126    147  404    100.0%    chr12  -       69803921   69803942   22
YourSeq  20     266    285  404    100.0%    chr17  -       23961006   23961025   20
YourSeq  20     277    296  404    100.0%    chr15  -       30177566   30177585   20
YourSeq  20     9      30   404    95.5%     chr13  +       114965261  114965282  22

Note: The 404 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
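To make the "no significant similarity" reading easier to check, the BLAT rows can be parsed and the strongest hit other than the full-length self-match on chr18 reported for each section. This is a minimal sketch: the field names and the helper functions `parse_hits` and `best_secondary` are illustrative, and no significance threshold is implied by the original report.

```python
# Rows copied from the two BLAT tables in this report (link text omitted).
UPSTREAM = """
YourSeq 1998 1 1998 1998 100.0% chr18 - 7002545 7004542 1998
YourSeq 101 790 902 1998 94.3% chr10 + 72481976 72482084 109
YourSeq 80 790 870 1998 100.0% chr10 - 77250442 77250526 85
YourSeq 58 790 850 1998 98.4% chr1 + 29180866 29180934 69
YourSeq 55 786 842 1998 98.3% chr2 + 19308266 19308322 57
YourSeq 52 791 842 1998 100.0% chr11 - 46316700 46316751 52
YourSeq 51 789 842 1998 98.2% chr1 + 118723764 118723819 56
YourSeq 50 790 841 1998 98.1% chr10 + 25446216 25446267 52
YourSeq 48 791 838 1998 100.0% chr1 + 133733926 133733973 48
YourSeq 45 798 845 1998 98.0% chr6 + 72192228 72192277 50
YourSeq 32 790 824 1998 97.2% chr1 - 130453610 130453657 48
YourSeq 26 1 47 1998 63.0% chr10 - 22311295 22311321 27
YourSeq 22 1629 1651 1998 100.0% chr10 - 47956145 47956170 26
YourSeq 20 1854 1873 1998 100.0% chr10 + 80992085 80992104 20
"""

DOWNSTREAM = """
YourSeq 404 1 404 404 100.0% chr18 - 6992374 6992777 404
YourSeq 24 130 156 404 96.3% chr13 + 110097608 110097636 29
YourSeq 23 269 294 404 96.0% chr10 + 76263593 76263627 35
YourSeq 22 11 32 404 100.0% chr4 - 57318906 57318927 22
YourSeq 22 126 147 404 100.0% chr12 - 69803921 69803942 22
YourSeq 20 266 285 404 100.0% chr17 - 23961006 23961025 20
YourSeq 20 277 296 404 100.0% chr15 - 30177566 30177585 20
YourSeq 20 9 30 404 95.5% chr13 + 114965261 114965282 22
"""

# Illustrative field names: q_* are query coordinates, t_* are genome coordinates.
FIELDS = ("score", "q_start", "q_end", "q_size", "identity",
          "chrom", "strand", "t_start", "t_end", "span")


def parse_hits(text):
    """Parse whitespace-separated BLAT rows into a list of dicts."""
    hits = []
    for line in text.strip().splitlines():
        cols = line.split()[1:]              # drop the QUERY name
        hit = dict(zip(FIELDS, cols))
        for key in FIELDS:
            if key == "identity":
                hit[key] = float(hit[key].rstrip("%"))
            elif key not in ("chrom", "strand"):
                hit[key] = int(hit[key])
        hits.append(hit)
    return hits


def best_secondary(hits):
    """Highest-scoring hit that is not the full-length self-match."""
    others = [h for h in hits if h["score"] != h["q_size"]]
    return max(others, key=lambda h: h["score"])


for name, text in (("upstream", UPSTREAM), ("downstream", DOWNSTREAM)):
    hit = best_secondary(parse_hits(text))
    aligned = hit["q_end"] - hit["q_start"] + 1
    print(f"{name}: best secondary hit on {hit['chrom']}, score {hit['score']}, "
          f"{aligned} of {hit['q_size']} query bases")
```

In both tables the strongest secondary hit spans only a small part of the query (113 of 1998 bases upstream, 27 of 404 bases downstream).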


Gene and information: Mkx mohawk [ Mus musculus (house mouse) ] Gene ID: 210719, updated on 12-Aug-2019

Gene summary

Official Symbol: Mkx (provided by MGI)
Official Full Name: mohawk homeobox (provided by MGI)
Primary source: MGI:MGI:2687286
See related: Ensembl:ENSMUSG00000061013
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Irxl1; 9430023B20Rik
Expression: Biased expression in limb E14.5 (RPKM 9.5), cortex adult (RPKM 5.3) and 8 other tissues
Orthologs: human, all

Genomic context

Location: 18 A1; 18 4.53 cM
Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (6934966..7004779, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (6934964..7004777, complement)

Chromosome 18 - NC_000084.6


Transcript information: This gene has 4 transcripts

Gene: Mkx ENSMUSG00000061013

Description: mohawk homeobox [Source:MGI Symbol;Acc:MGI:2687286]
Gene Synonyms: 9430023B20Rik, Irxl1
Location: Chromosome 18: 6,934,518-7,004,780 reverse strand. GRCm38:CM001011.2
About this gene: This gene has 4 transcripts (splice variants), 257 orthologues, 6 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name     Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Mkx-201  ENSMUST00000079788.6  3650  354aa       ENSMUSP00000078718.3  Protein coding   CCDS37728  B2RQ30      TSL:1 GENCODE basic APPRIS P1
Mkx-202  ENSMUST00000176608.1  722   101aa       ENSMUSP00000157351.1  Protein coding   -          A0A3Q4EH75  CDS 3' incomplete TSL:3
Mkx-204  ENSMUST00000188926.1  684   No protein  -                     Retained intron  -          -           TSL:3
Mkx-203  ENSMUST00000176757.1  425   No protein  -                     Retained intron  -          -           TSL:2

[Figure: Ensembl genome browser view of the Mkx locus, 90.26 kb, 6.94 Mb to 7.00 Mb. Forward strand: Gm46633-201 (lncRNA). Contig: AC148004.3. Reverse strand: Gm2350-201 (lncRNA), Mkx-201 (protein coding), Mkx-202 (protein coding), Mkx-203 (retained intron), Mkx-204 (retained intron). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Binding Site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000079788

[Figure: protein domain view of Mkx-201 (protein coding), reverse strand, 70.26 kb; protein ENSMUSP00000078... shown on a scale of 0 to 354 aa.]

Domain and feature annotations:
MobiDB lite; Low complexity (Seg)
Superfamily: Homeobox-like domain superfamily
SMART: Homeobox domain
Pfam: Homeobox KN domain
PROSITE profiles: Homeobox domain
PROSITE patterns: Homeobox, conserved site
PANTHER: PTHR11211:SF3; PTHR11211
Gene3D: 1.10.10.60
CDD: Homeobox domain
Sequence variants (dbSNP and all other sources); variant legend: missense variant, synonymous variant, stop retained variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
