https://www.alphaknockout.com

Mouse Cmtm3 Knockout Project (CRISPR/Cas9)

Objective: To create a Cmtm3 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cmtm3 gene (NCBI Reference Sequence: NM_024217; Ensembl: ENSMUSG00000031875) is located on mouse chromosome 8. Five exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 5 (Transcript: ENSMUST00000034343). Exons 1~5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 starts from about 0.18% of the coding region. Exons 1~5 cover 100.0% of the coding region. The size of the effective KO region is ~6015 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wild-type allele drawn 5' to 3', showing exons 1 to 5 of mouse Cmtm3 with a gRNA region on each side. Legend: exon of mouse Cmtm3; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
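The self-alignment described in this note can be sketched as follows. This is only a minimal illustration, not the tool that produced the plot: it assumes a dot is drawn wherever two 15 bp windows match exactly, whereas the report does not state the matching rule. The function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=15):
    """Compare every window of seq with every other window of seq.

    Returns two lists of (i, j) start positions:
      forward: window at i equals window at j (main diagonal skipped)
      reverse: window at i equals the reverse complement of window at j
    Assumption: exact matches only; real dot-plot tools may allow mismatches.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1

    # Index every window by its sequence so lookups are O(1).
    index = {}
    for i in range(n_windows):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for j in range(n_windows):
        word = seq[j:j + window]
        for i in index.get(word, []):
            if i != j:  # the main diagonal is trivially a match
                forward.append((i, j))
        for i in index.get(reverse_complement(word), []):
            reverse.append((i, j))
    return forward, reverse
```

In such a plot a tandem repeat shows up as forward dots lying on diagonals parallel to the main diagonal, offset by a multiple of the repeat unit length.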

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(27.4% 548) | C(17.85% 357) | T(18.5% 370) | G(36.25% 725)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
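A sliding-window GC scan of the kind described in this note might look like the sketch below. The 300 bp window comes from the report; the 0.65 cutoff for "high GC" is a hypothetical value chosen for illustration, since the report does not state its threshold, and the function names are likewise assumptions.

```python
def gc_windows(seq, window=300):
    """Return (start, gc_fraction) for every window of the given size.

    The G+C count is updated incrementally as the window slides by 1 bp.
    """
    seq = seq.upper()
    if len(seq) < window:
        return []
    gc = sum(1 for base in seq[:window] if base in "GC")
    result = [(0, gc / window)]
    for start in range(1, len(seq) - window + 1):
        gc -= seq[start - 1] in "GC"            # base leaving the window
        gc += seq[start + window - 1] in "GC"   # base entering the window
        result.append((start, gc / window))
    return result


def high_gc_starts(seq, window=300, threshold=0.65):
    """Start positions of windows whose GC fraction reaches the threshold.

    threshold=0.65 is a hypothetical cutoff, not taken from the report.
    """
    return [start for start, frac in gc_windows(seq, window) if frac >= threshold]
```

A gRNA candidate could then be checked against the returned start positions to see whether it overlaps any flagged window.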

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(27.65% 553) | C(24.95% 499) | T(23.25% 465) | G(24.15% 483)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
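The two Summary lines give per-base counts, so the overall GC content of each 2000 bp flank can be derived directly. The short sketch below does that arithmetic with the counts copied from the report; the variable and function names are hypothetical. It gives 54.1% G+C for the upstream section and 49.1% for the downstream section.

```python
# Base counts copied from the two Summary lines of the report.
UPSTREAM = {"A": 548, "C": 357, "T": 370, "G": 725}
DOWNSTREAM = {"A": 553, "C": 499, "T": 465, "G": 483}


def gc_percent(counts):
    """Overall G+C percentage from a dict of per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


for name, counts in (("upstream", UPSTREAM), ("downstream", DOWNSTREAM)):
    print(name, sum(counts.values()), "bp,", round(gc_percent(counts), 2), "% GC")
```

These are averages over the full 2000 bp; the notes about high GC-content regions refer to local windows, which an average cannot show.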


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   +       104338750  104340749  2000
YourSeq  225    12     391   2000   86.5%     chr11  +       48701890   48702279   390
YourSeq  196    12     375   2000   83.2%     chr6   +       47934495   47934823   329
YourSeq  186    7      810   2000   87.5%     chr7   +       127409783  127410658  876
YourSeq  174    12     385   2000   89.2%     chr8   +       108186836  108187250  415
YourSeq  172    9      404   2000   87.3%     chr2   +       125237545  125237937  393
YourSeq  168    15     410   2000   88.0%     chr15  +       98624104   98624558   455
YourSeq  167    12     405   2000   84.6%     chr6   +       71472703   71473090   388
YourSeq  167    11     407   2000   85.8%     chr18  +       73670125   73670496   372
YourSeq  167    55     407   2000   84.4%     chr1   +       169835948  169836288  341
YourSeq  167    51     407   2000   83.1%     chr1   +       152918409  152918753  345
YourSeq  165    11     397   2000   80.9%     chr7   -       109666236  109666578  343
YourSeq  164    53     407   2000   82.8%     chr4   -       94612213   94612585   373
YourSeq  164    11     408   2000   86.1%     chr19  +       54461297   54461703   407
YourSeq  163    11     416   2000   84.5%     chr4   +       156206978  156207374  397
YourSeq  161    44     384   2000   90.9%     chr14  +       58375780   58376221   442
YourSeq  160    14     393   2000   80.1%     chr7   +       46035516   46035829   314
YourSeq  157    14     397   2000   80.6%     chr4   -       152682694  152683052  359
YourSeq  156    11     373   2000   82.2%     chr7   -       80267839   80268174   336
YourSeq  156    11     397   2000   88.3%     chr1   -       151725604  151725998  395

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   +       104346765  104348764  2000
YourSeq  41     1902   1960  2000   88.7%     chr4   +       153933769  153933828  60
YourSeq  34     1877   1929  2000   73.9%     chr9   -       72690999   72691041   43
YourSeq  31     1869   1904  2000   94.5%     chr1   -       88305792   88305828   37
YourSeq  29     1854   1894  2000   78.2%     chr1   +       59753535   59753571   37
YourSeq  28     1932   1965  2000   91.2%     chr7   -       122153561  122153594  34
YourSeq  28     1411   1458  2000   69.5%     chr4   -       66328952   66328991   40
YourSeq  27     1941   1970  2000   96.7%     chr8   -       13639779   13639809   31
YourSeq  26     1939   1964  2000   100.0%    chr2   -       126712986  126713011  26
YourSeq  24     1427   1450  2000   100.0%    chr14  -       76373560   76373583   24
YourSeq  22     1877   1898  2000   100.0%    chr17  -       33821086   33821107   22
YourSeq  22     1941   1962  2000   100.0%    chr10  -       70403165   70403186   22
YourSeq  22     729    752   2000   87.0%     chr1   +       60353799   60353821   23
YourSeq  22     1878   1899  2000   100.0%    chr1   +       56855482   56855503   22

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
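The chr8 coordinates of the two 100.0% self-hits in the BLAT results can be cross-checked against the ~6015 bp effective KO region quoted in the strategy summary. The sketch below assumes the reported coordinates are 1-based and inclusive; the names are hypothetical. Under that assumption the gap between the end of the upstream flank and the start of the downstream flank is 6015 bp, which agrees with the stated KO size, although the report does not say that this is how the figure was derived.

```python
# chr8 coordinates of the 100.0% self-hits, copied from the BLAT results.
# Assumption: coordinates are 1-based and inclusive.
UP_FLANK = (104338750, 104340749)    # 2000 bp upstream of the start codon
DOWN_FLANK = (104346765, 104348764)  # 2000 bp downstream of the stop codon


def interval_length(interval):
    """Length of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(upstream, downstream):
    """Number of bases strictly between two non-overlapping intervals."""
    return downstream[0] - upstream[1] - 1


print(interval_length(UP_FLANK), interval_length(DOWN_FLANK))
print(gap_between(UP_FLANK, DOWN_FLANK))
```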


Gene and information:

Cmtm3 CKLF-like MARVEL transmembrane domain containing 3 [ Mus musculus (house mouse) ]
Gene ID: 68119, updated on 12-Aug-2019

Gene summary

Official Symbol: Cmtm3 (provided by MGI)
Official Full Name: CKLF-like MARVEL transmembrane domain containing 3 (provided by MGI)
Primary source: MGI:MGI:2447162
See related: Ensembl:ENSMUSG00000031875
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: BNAS2; Cklfsf3; AI413895; 9430096L06Rik
Expression: Broad expression in ovary adult (RPKM 59.8), adrenal adult (RPKM 43.1) and 22 other tissues
Orthologs: human; all

Genomic context

Location: 8; 8 D3
Exon count: 15

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (104339383..104347672)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (106864494..106871572)

Chromosome 8 - NC_000074.6


Transcript information: This gene has 8 transcripts

Gene: Cmtm3 ENSMUSG00000031875

Description: CKLF-like MARVEL transmembrane domain containing 3 [Source:MGI Symbol;Acc:MGI:2447162]
Gene Synonyms: 9430096L06Rik, BNAS2, Cklfsf3
Location: Chromosome 8: 104,339,410-104,347,672 forward strand. GRCm38:CM001001.2
About this gene: This gene has 8 transcripts (splice variants), 196 orthologues, 17 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Cmtm3-201  ENSMUST00000034343.4  1616  184aa       ENSMUSP00000034343.4  Protein coding   CCDS22577  Q99LJ5      TSL:1 GENCODE basic APPRIS P1
Cmtm3-204  ENSMUST00000212081.1  850   184aa       ENSMUSP00000148682.1  Protein coding   CCDS22577  Q99LJ5      TSL:5 GENCODE basic APPRIS P1
Cmtm3-208  ENSMUST00000212948.1  523   142aa       ENSMUSP00000148628.1  Protein coding   -          A0A1D5RM50  CDS 3' incomplete TSL:3
Cmtm3-205  ENSMUST00000212139.1  502   118aa       ENSMUSP00000148338.1  Protein coding   -          A0A1D5RLE8  CDS 3' incomplete TSL:5
Cmtm3-202  ENSMUST00000211885.1  465   82aa        ENSMUSP00000148513.1  Protein coding   -          A0A1D5RLU9  CDS 3' incomplete TSL:3
Cmtm3-207  ENSMUST00000212734.1  867   No protein  -                     Retained intron  -          -           TSL:2
Cmtm3-206  ENSMUST00000212399.1  790   No protein  -                     Retained intron  -          -           TSL:2
Cmtm3-203  ENSMUST00000211996.1  630   No protein  -                     Retained intron  -          -           TSL:3


[Figure: Ensembl region view, 28.26 kb, from about 104.33 Mb to 104.35 Mb. Forward strand (comprehensive gene set): Cmtm2b-201 and Cmtm2b-202 (protein coding); Cmtm3-201, Cmtm3-202, Cmtm3-204, Cmtm3-205 and Cmtm3-208 (protein coding); Cmtm3-203, Cmtm3-206 and Cmtm3-207 (retained intron). Contig: AC121952.4. Reverse strand (comprehensive gene set): Cmtm4-201 (protein coding). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (processed transcript).]


Transcript: ENSMUST00000034343

[Figure: Ensembl view of Cmtm3-201 (protein coding), 7.08 kb, forward strand. Protein tracks for ENSMUSP00000034...: Transmembrane heli..., MobiDB lite, Low complexity (Seg), Pfam Marvel domain, PROSITE profiles Marvel domain, PANTHER PTHR22776 and PTHR22776:SF3. Variant track: All sequence SNPs/i... (sequence variants, dbSNP and all other sources). Variant legend: synonymous variant. Scale bar: 0 to 184.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
