https://www.alphaknockout.com

Mouse Mid2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Mid2 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mid2 gene (NCBI Reference Sequence: NM_011845; Ensembl: ENSMUSG00000000266) is located on mouse chromosome X. Nine exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 9 (Transcript: ENSMUST00000112993). Exon 5 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Mid2 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-340F6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 49.34% of the coding region. Knockout of exon 5 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion is 11103 bp, and the size of intron 5 for 3'-loxP site insertion is 1122 bp. The size of the effective cKO region is ~628 bp. The cKO region does not contain any other known gene.
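As a rough illustration of the note above, the following Python sketch estimates where exon 5 begins in the 685 aa protein (length taken from the Mid2-203 transcript entry) and shows the usual rule of thumb for predicting a frameshift. It assumes that the fraction of the coding region maps directly onto the protein length. The function names and the example exon lengths are hypothetical; the report does not state the coding length of exon 5.

```python
# Values taken from the report: 685 aa protein (Mid2-203) and exon 5
# beginning at about 49.34% of the coding region.
PROTEIN_LENGTH_AA = 685
EXON5_START_FRACTION = 0.4934


def approx_start_codon(protein_length_aa: int, fraction: float) -> int:
    """Approximate codon index at which an exon begins (assumes a linear mapping)."""
    return round(protein_length_aa * fraction)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


exon5_codon = approx_start_codon(PROTEIN_LENGTH_AA, EXON5_START_FRACTION)
print(f"Exon 5 begins near codon {exon5_codon} of {PROTEIN_LENGTH_AA}")

# Hypothetical exon lengths, for illustration only (not from the report).
for length_bp in (150, 151):
    print(length_bp, "bp deleted -> frameshift:", causes_frameshift(length_bp))
```

Under these assumptions, a frameshift beginning near the midpoint of the coding sequence would disrupt roughly the C-terminal half of the protein, which is consistent with the expected loss of function.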


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with 5' and 3' gRNA regions flanking exon 5; exons 1, 5, 6 and 9 labeled), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Mid2, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
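The self-alignment described in this note can be sketched in Python as follows. This is a minimal exact-match version with hypothetical function names, not the tool used to produce the report's dot plot, and the toy sequence is for illustration only. It records every pair of positions whose windows match in the forward orientation (off the main diagonal) or as reverse complements.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) exact window matches.

    forward: seq[i:i+window] == seq[j:j+window], with i != j
    reverse: seq[i:i+window] == reverse complement of seq[j:j+window]
    """
    seq = seq.upper()
    n = len(seq)
    # Index every window of the sequence by its start positions.
    index = {}
    for i in range(n - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward = [(i, j) for pos in index.values() for i in pos for j in pos if i != j]

    rc = reverse_complement(seq)
    reverse = []
    for k in range(n - window + 1):
        # Window k of the reverse complement covers seq[j:j+window] with j = n - window - k.
        j = n - window - k
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy example with a short window; the report uses a 10 bp window.
forward_hits, reverse_hits = self_dot_plot("ATGCATGC", window=4)
print("forward:", forward_hits)
print("reverse:", reverse_hits)
```

Long off-diagonal runs of forward hits would indicate tandem or dispersed repeats; an empty or sparse result corresponds to the "no significant tandem repeat" conclusion.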

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(7128bp) | A(29.98% 2137) | C(19.26% 1373) | T(31.24% 2227) | G(19.51% 1391)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
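The base composition in the summary can be checked with a few lines of Python. The counts below are copied from the summary line; the function names are hypothetical, and `windowed_gc` is only a sketch of how a 300 bp sliding-window profile might be computed, since the sequence itself is not included in the report.

```python
# Base counts copied from the summary line of the report.
BASE_COUNTS = {"A": 2137, "C": 1373, "T": 2227, "G": 1391}


def base_percentages(counts):
    """Percentage of each base relative to the total length."""
    total = sum(counts.values())
    return {base: 100 * n / total for base, n in counts.items()}


def gc_percent(counts):
    """Overall GC content as a percentage."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300):
    """GC percentage of every full-length window along a sequence (step of 1 bp)."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


print("total length:", sum(BASE_COUNTS.values()))
print({b: round(p, 2) for b, p in base_percentages(BASE_COUNTS).items()})
print("overall GC %:", round(gc_percent(BASE_COUNTS), 2))
```

The counts sum to the stated full length of 7128 bp, and the overall GC content works out to about 38.78%.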


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chrX   +       140746323  140749322  3000
YourSeq  187    962     2704  3000   92.8%     chr7   +       28778400   28861822   83423
YourSeq  149    1390    1652  3000   90.3%     chr11  -       43435394   43435879   486
YourSeq  141    1380    1563  3000   95.0%     chr11  +       60082606   60082810   205
YourSeq  139    1388    1563  3000   93.8%     chr17  -       65870086   65870271   186
YourSeq  139    1388    1567  3000   91.3%     chr1   -       87438011   87438202   192
YourSeq  137    944     1561  3000   82.3%     chr7   -       126442307  126442672  366
YourSeq  137    1387    1562  3000   94.2%     chr5   +       115116941  115117129  189
YourSeq  137    1389    1564  3000   95.4%     chr17  +       72594623   72594890   268
YourSeq  135    1388    1559  3000   93.6%     chr7   -       28415634   28415822   189
YourSeq  135    1388    1562  3000   92.5%     chr7   +       65242483   65242671   189
YourSeq  134    1388    1567  3000   94.2%     chr8   +       79728017   79728196   180
YourSeq  134    1388    1564  3000   92.5%     chr2   +       28869479   28869668   190
YourSeq  134    1388    1563  3000   92.9%     chr18  +       35521805   35521991   187
YourSeq  134    1388    1559  3000   91.5%     chr11  +       6557292    6557474    183
YourSeq  133    1391    1561  3000   93.0%     chr13  +       62917192   62917371   180
YourSeq  132    1391    1562  3000   93.5%     chr4   -       40541293   40541475   183
YourSeq  131    1388    1563  3000   93.4%     chr5   +       135906264  135906447  184
YourSeq  130    1393    1560  3000   95.2%     chr15  -       82297777   82298124   348
YourSeq  129    1388    1556  3000   92.8%     chr18  +       80237811   80237991   181

Note: The 3000 bp section upstream of Exon 5 was BLAT-searched against the genome. Apart from the expected self-match on chrX, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chrX   +       140749951  140752950  3000
YourSeq  75     893     995   3000   86.5%     chrY   +       90843899   90844001   103
YourSeq  40     210     335   3000   97.8%     chr1   -       174167912  174168045  134
YourSeq  36     1687    1725  3000   89.2%     chr13  +       27951280   27951316   37
YourSeq  35     1998    2129  3000   64.3%     chr6   -       79511304   79511401   98
YourSeq  34     1694    1731  3000   97.4%     chr3   -       13267339   13267384   46
YourSeq  30     2166    2195  3000   100.0%    chr7   -       124449702  124449731  30
YourSeq  30     1674    1714  3000   81.9%     chr19  -       3741064    3741101    38
YourSeq  29     29      69    3000   75.8%     chr13  +       111891911  111891946  36
YourSeq  28     2097    2127  3000   100.0%    chr4   +       47176603   47176645   43
YourSeq  28     1613    1647  3000   91.5%     chr10  +       64015696   64015743   48
YourSeq  27     2104    2130  3000   100.0%    chr8   +       31335940   31335966   27
YourSeq  25     2095    2122  3000   96.5%     chr8   -       89700659   89700687   29
YourSeq  25     1701    1726  3000   100.0%    chr4   -       139113501  139113529  29
YourSeq  22     2177    2198  3000   100.0%    chr11  +       107672288  107672309  22

Note: The 3000 bp section downstream of Exon 5 was BLAT-searched against the genome. Apart from the expected self-match on chrX, no significant similarity is found.
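The chrX self-matches in the two BLAT result tables can be used as a quick consistency check on the stated cKO region size. The Python sketch below assumes the reported coordinates are 1-based and inclusive (which agrees with the SPAN column); the variable and function names are hypothetical.

```python
# chrX self-match coordinates copied from the two BLAT result tables.
UPSTREAM_ARM = ("chrX", 140746323, 140749322)    # 3000 bp upstream of the cKO region
DOWNSTREAM_ARM = ("chrX", 140749951, 140752950)  # 3000 bp downstream of the cKO region


def span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(upstream_end: int, downstream_start: int) -> int:
    """Number of bases lying strictly between two intervals."""
    return downstream_start - upstream_end - 1


upstream_len = span(UPSTREAM_ARM[1], UPSTREAM_ARM[2])
downstream_len = span(DOWNSTREAM_ARM[1], DOWNSTREAM_ARM[2])
cko_len = gap_between(UPSTREAM_ARM[2], DOWNSTREAM_ARM[1])

print("upstream query length:", upstream_len)
print("downstream query length:", downstream_len)
print("region between the two queries:", cko_len)
```

The interval between the two queries comes to 628 bp, which matches the effective cKO region size of ~628 bp given in the strategy note.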


Gene and information: Mid2 midline 2 [Mus musculus (house mouse)]
Gene ID: 23947, updated on 14-Aug-2019

Gene summary

Official Symbol: Mid2 (provided by MGI)
Official Full Name: midline 2 (provided by MGI)
Primary source: MGI:MGI:1344333
See related: Ensembl:ENSMUSG00000000266
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: FXY2; Trim1
Expression: Ubiquitous expression in lung adult (RPKM 6.4), bladder adult (RPKM 4.0) and 26 other tissues

Genomic context

Location: X F1; X 61.35 cM

Exon count: 11

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  X    NC_000086.7 (140664304..140767715)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    X    NC_000086.6 (137212567..137302254)

Chromosome X - NC_000086.7


Transcript information: This gene has 5 transcripts

Gene: Mid2 ENSMUSG00000000266

Description: midline 2 [Source:MGI Symbol;Acc:MGI:1344333]
Gene Synonyms: FXY2, Trim1
Location: Chromosome X: 140,664,599-140,767,715 forward strand. GRCm38:CM001013.2
About this gene: This gene has 5 transcripts (splice variants), 195 orthologues, 73 paralogues, is a member of 2 Ensembl protein families and is associated with 8 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Mid2-203  ENSMUST00000112993.1  5916  685aa       ENSMUSP00000108617.1  Protein coding   CCDS41151  B1AVF4   TSL:1 GENCODE basic APPRIS P2
Mid2-202  ENSMUST00000112990.7  6271  685aa       ENSMUSP00000108614.1  Protein coding   -          B1AVF4   TSL:5 GENCODE basic APPRIS ALT2
Mid2-201  ENSMUST00000112988.7  2525  715aa       ENSMUSP00000108612.1  Protein coding   -          B1AVF5   TSL:5 GENCODE basic APPRIS ALT1
Mid2-204  ENSMUST00000128809.1  757   219aa       ENSMUSP00000123221.1  Protein coding   -          B1AVF6   CDS 3' incomplete TSL:5
Mid2-205  ENSMUST00000140144.1  2069  No protein  -                     Retained intron  -          -        TSL:1

[Figure: Ensembl genome browser view of a 123.12 kb region on chromosome X (140.66 Mb to 140.76 Mb), forward strand. Tracks: Mid2-204, Mid2-202, Mid2-201 and Mid2-203 (protein coding); Mid2-205 (retained intron); Gm23199-201 (snRNA); Eif2c5-201 (lncRNA); contigs AL683809.7 and BX470203.6; Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000112993

[Figure: Ensembl protein summary for Mid2-203 (ENSMUSP00000108...), 89.69 kb, forward strand, with a scale bar from 0 to 685 aa and a sequence variant track (dbSNP and all other sources; legend: missense variant, synonymous variant).]

Domain and feature annotations shown in the figure, by source:

Coiled-coils (Ncoils)
Superfamily: SSF57850; SSF57845; Concanavalin A-like lectin/glucanase domain superfamily; Fibronectin type III superfamily
SMART: Zinc finger, RING-type; B-box, C-terminal; Fibronectin type III; SPRY domain; B-box-type zinc finger
Prints: Butyrophylin-like, SPRY domain
Pfam: RING-type zinc-finger, LisH dimerisation motif; Fibronectin type III; SPRY domain; B-box-type zinc finger; Midline-1, COS domain
PROSITE profiles: Zinc finger, RING-type; B-box-type zinc finger; COS domain; B30.2/SPRY domain; Fibronectin type III
PROSITE patterns: Zinc finger, RING-type, conserved site
PANTHER: PTHR24099:SF12; PTHR24099
Gene3D: 3.30.40.90; 3.30.40.200; 2.60.120.920; Zinc finger, RING/FYVE/PHD-type; Immunoglobulin-like fold
CDD: cd16754; B-box-type zinc finger; Fibronectin type III; TRIM1, PRY/SPRY domain

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
