https://www.alphaknockout.com

Mouse Csmd2 Knockout Project (CRISPR/Cas9)

Objective: To create a Csmd2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Csmd2 gene (NCBI Reference Sequence: NM_001281955; Ensembl: ENSMUSG00000028804) is located on mouse chromosome 4. 71 exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 70 (Transcript: ENSMUST00000184063). Exon 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 1.87% of the coding region and covers 2.0% of it. The size of the effective KO region is ~217 bp. The KO region does not contain any other known gene.
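The figures in the note can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it assumes the coding sequence length is (3611 + 1) x 3 bp, taken from the 3611 aa protein listed for ENSMUST00000184063 plus one stop codon, and that the whole ~217 bp KO region is coding. The function names are hypothetical.

```python
def causes_frameshift(deleted_bp: int) -> bool:
    """A deletion shifts the reading frame when its length is not a multiple of 3."""
    return deleted_bp % 3 != 0


def fraction_of_cds(deleted_bp: int, protein_aa: int) -> float:
    """Fraction of the coding sequence removed (CDS = protein codons + one stop codon)."""
    cds_bp = (protein_aa + 1) * 3  # assumption: a single stop codon, no other adjustments
    return deleted_bp / cds_bp


KO_REGION_BP = 217   # effective KO region size from the strategy summary
PROTEIN_AA = 3611    # protein length listed for ENSMUST00000184063

print(causes_frameshift(KO_REGION_BP))
print(f"{fraction_of_cds(KO_REGION_BP, PROTEIN_AA):.1%}")
```

Under these assumptions, 217 is not a multiple of three, so removing the region would be expected to shift the reading frame downstream of exon 2.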


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele showing exons 1, 2 and 71 of mouse Csmd2, with the 5' and 3' gRNA regions marked. Legend: exon of mouse Csmd2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the upstream section. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
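The self-alignment idea can be illustrated with a minimal sketch. It is not the tool used to produce the plot: it only reports exact forward-strand matches of 15 bp words (the reverse-complement comparison is omitted), and the function name and toy sequences are invented for illustration.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return off-diagonal (i, j) pairs, i < j, whose window-sized words match exactly."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Toy sequences (invented for illustration, not the Csmd2 flank).
repeat_unit = "ACGTTGCAAGCTTAGGCTA"   # 19 bp unit
with_repeat = repeat_unit * 2         # tandem duplication
without_repeat = repeat_unit + "C"    # no repeated 15 bp word

print(len(self_dot_plot(with_repeat)) > 0)
print(self_dot_plot(without_repeat))
```

A tandem repeat shows up as a run of dots parallel to the main diagonal, offset by the repeat length; an empty result corresponds to the "no significant tandem repeat" case.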

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the downstream section. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section downstream of Exon 2 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream section.]

Summary: Full Length(2000bp) | A(26.2% 524) | C(19.8% 396) | T(24.8% 496) | G(29.2% 584)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
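A sliding-window GC calculation of this kind can be sketched as follows. The helper names and the step size are assumptions for illustration, not the report's actual tooling; the base counts are copied from the upstream summary line.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC fraction of every full window along seq; empty if seq is shorter than window."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts reported in the summary for the upstream section.
counts = {"A": 524, "C": 396, "T": 496, "G": 584}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total

print(total, f"{overall_gc:.1%}")
```

The counts sum to the stated 2000 bp, and the overall GC fraction follows directly from the G and C counts; per-window values from `sliding_gc` would need the actual sequence, which is not included in this report.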

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream section.]

Summary: Full Length(2000bp) | A(24.35% 487) | C(22.5% 450) | T(30.35% 607) | G(22.8% 456)

Note: The 2000 bp section downstream of Exon 2 is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr4   +       128056963  128058962  2000
YourSeq  32     740     786   2000   84.8%     chr1   +       103302511  103302558  48
YourSeq  29     581     617   2000   87.1%     chr11  -       51452249   51452283   35
YourSeq  23     738     763   2000   83.4%     chr17  -       71562885   71562908   24
YourSeq  22     352     373   2000   100.0%    chr7   -       115162995  115163016  22
YourSeq  22     383     404   2000   100.0%    chr11  -       27247971   27247992   22

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr4   +       128059180  128061179  2000
YourSeq  625    1305    2000  2000   96.1%     chr1   +       23394569   23395265   697
YourSeq  624    1305    2000  2000   94.8%     chr5   +       33913354   33914047   694
YourSeq  623    1291    1999  2000   94.9%     chr4   +       149746227  149746938  712
YourSeq  622    1305    1999  2000   95.4%     chr9   +       81426244   81426947   704
YourSeq  620    1303    1999  2000   95.1%     chr5   +       28057819   28058524   706
YourSeq  619    1305    2000  2000   95.0%     chrX   -       121901285  121901990  706
YourSeq  618    1303    2000  2000   94.8%     chr16  +       5360293    5360997    705
YourSeq  615    1305    2000  2000   94.8%     chrX   +       12161950   12162647   698
YourSeq  615    1312    2000  2000   95.4%     chr8   +       124091145  124091833  689
YourSeq  615    1305    2000  2000   94.8%     chr5   +       27019264   27019970   707
YourSeq  613    1305    2000  2000   95.1%     chr8   +       10729109   10729811   703
YourSeq  613    1305    2000  2000   94.3%     chr14  +       25568996   25569682   687
YourSeq  612    1304    2000  2000   95.1%     chr10  +       38706085   38706786   702
YourSeq  611    1304    2000  2000   94.3%     chr4   -       17998454   17999152   699
YourSeq  611    1295    2000  2000   93.8%     chr11  -       3619250    3619979    730
YourSeq  610    1305    2000  2000   94.7%     chr5   -       116988988  116989694  707
YourSeq  610    1305    2000  2000   95.6%     chr10  -       69122077   69122774   698
YourSeq  609    1304    2000  2000   94.3%     chr10  -       117428661  117429359  699
YourSeq  609    1304    2000  2000   93.8%     chr15  +       4964322    4965016    695

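Note: The 2000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found for roughly the first 1290 bp of the query; the remaining ~700 bp (query positions 1291-2000) match many other loci at 93.8-96.1% identity, as listed in the results above.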
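The BLAT rows can be screened programmatically. The sketch below copies a few rows from the two result listings, measures the interval between the two chr4 self-hits, and counts secondary hits above a score cutoff. The `Hit` type, the `secondary_hits` helper and the cutoff of 100 are illustrative assumptions, not thresholds stated in this report.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    identity: float
    chrom: str
    t_start: int
    t_end: int


def secondary_hits(hits: list[Hit], min_score: int = 100) -> list[Hit]:
    """Drop the top-scoring hit (the query's own locus); keep the rest at or above min_score."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    return [h for h in ranked[1:] if h.score >= min_score]


# A few rows copied from the upstream and downstream BLAT results.
upstream = [
    Hit(2000, 1, 2000, 100.0, "chr4", 128056963, 128058962),
    Hit(32, 740, 786, 84.8, "chr1", 103302511, 103302558),
]
downstream = [
    Hit(2000, 1, 2000, 100.0, "chr4", 128059180, 128061179),
    Hit(625, 1305, 2000, 96.1, "chr1", 23394569, 23395265),
    Hit(624, 1305, 2000, 94.8, "chr5", 33913354, 33914047),
]

# Bases lying strictly between the two self-hits on chr4.
gap_bp = downstream[0].t_start - upstream[0].t_end - 1
print(gap_bp)
print(len(secondary_hits(upstream)), len(secondary_hits(downstream)))
```

The interval between the two flanks works out to 217 bp, which agrees with the effective KO region size given in the strategy summary.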


Gene and information:

Csmd2 CUB and Sushi multiple domains 2 [Mus musculus (house mouse)]
Gene ID: 329942, updated on 12-Aug-2019

Gene summary

Official Symbol: Csmd2 (provided by MGI)
Official Full Name: CUB and Sushi multiple domains 2 (provided by MGI)
Primary source: MGI:MGI:2386401
See related: Ensembl:ENSMUSG00000028804
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm139
Expression: Biased expression in CNS E18 (RPKM 4.4), whole brain E14.5 (RPKM 3.9) and 6 other tissues
Orthologs: human, all

Genomic context

Location: 4; 4 D2.2
Exon count: 71

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (127986505..128567660)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (127665294..128245100)

[Figure: genomic context on Chromosome 4 - NC_000070.6.]


Transcript information: This gene has 6 transcripts

Gene: Csmd2 ENSMUSG00000028804

Description: CUB and Sushi multiple domains 2 [Source:MGI Symbol;Acc:MGI:2386401]
Location: Chromosome 4: 127,987,857-128,567,656 forward strand. GRCm38:CM000997.2
About this gene: This gene has 6 transcripts (splice variants), 140 orthologues, 38 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp     Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Csmd2-205  ENSMUST00000184063.2  13555  3611aa      ENSMUSP00000138958.2  Protein coding   CCDS84795  V9GX34      TSL:5 GENCODE basic APPRIS P1
Csmd2-206  ENSMUST00000221199.1  3601   302aa       ENSMUSP00000152795.1  Protein coding   -          A0A1Y7VP35  CDS 5' incomplete TSL:5
Csmd2-203  ENSMUST00000144298.1  4104   No protein  -                     Retained intron  -          -           TSL:1
Csmd2-201  ENSMUST00000129619.7  3034   No protein  -                     Retained intron  -          -           TSL:1
Csmd2-204  ENSMUST00000148247.1  3320   No protein  -                     lncRNA           -          -           TSL:1
Csmd2-202  ENSMUST00000139561.1  686    No protein  -                     lncRNA           -          -           TSL:3

[Figure: Ensembl gene view of the Csmd2 locus, 599.80 kb, chromosome 4 from 128.0 Mb to 128.5 Mb. Forward strand: Csmd2-201 (retained intron), Csmd2-204 (lncRNA), Csmd2-205 (protein coding), Csmd2-203 (retained intron), Csmd2-206 (protein coding), Hmgb4os-201 (lncRNA), Csmd2-202 (lncRNA). Contigs: AL611934.18, AL807807.11, AL627093.17. Reverse strand: Csmd2os-202 (lncRNA), Hmgb4-201 (protein coding), Csmd2os-201 (lncRNA). Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000184063

[Figure: protein feature view of Csmd2-205 (protein coding), 579.61 kb, forward strand. Annotated features: transmembrane helices, low complexity (Seg), CUB domains and Sushi/SCR/CCP domains. Annotation sources shown: Superfamily (Spermadhesin, CUB domain superfamily; Sushi/SCR/CCP superfamily), SMART, Pfam, PROSITE profiles, PANTHER (PTHR45656, PTHR45656:SF6), Gene3D (2.10.70.10), CDD. Sequence variants (dbSNP and all other sources); variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 3611 aa.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
