https://www.alphaknockout.com

Mouse Clec2d Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Clec2d conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Clec2d gene (NCBI Reference Sequence: NM_053109; Ensembl: ENSMUSG00000030157) is located on mouse chromosome 6. Five exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 5 (Transcript: ENSMUST00000032260). Exons 2 to 4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Clec2d gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-259F7 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Enhanced osteoclastic activity in the bone of homozygous null mice leads to osteopenia and high serum calcium levels.

Exon 2 starts at about 10.95% of the coding region. Knocking out exons 2 to 4 will result in a frameshift of the gene. Intron 1, which receives the 5'-loxP site, is 2344 bp; intron 4, which receives the 3'-loxP site, is 982 bp. The effective cKO region is ~2293 bp and contains no other known gene.
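The percentage above can be turned into an approximate coding position with a few lines of arithmetic. The sketch below is illustrative only: it assumes the coding length follows from the 207 aa protein listed for Clec2d-201 later in this report, and that the stop codon is not counted (an assumption, not something the report states). The function names are hypothetical. The frameshift helper encodes only the general rule that removing a number of coding nucleotides not divisible by 3 shifts the reading frame; the coding length of exons 2 to 4 is not given here, so no value is supplied for it.

```python
# Minimal sketch: approximate where exon 2 begins within the coding sequence.
# Assumptions (not stated in the report): CDS length = protein length * 3,
# with the stop codon excluded by default.

PROTEIN_LENGTH_AA = 207        # Clec2d-201 protein length from the transcript table
EXON2_START_FRACTION = 0.1095  # "about 10.95% of the coding region"


def cds_length_nt(protein_aa: int, include_stop: bool = False) -> int:
    """Coding length in nucleotides, optionally counting the stop codon."""
    return protein_aa * 3 + (3 if include_stop else 0)


def approx_cds_position(fraction: float, cds_nt: int) -> float:
    """Approximate coding nucleotide at which a given fraction is reached."""
    return fraction * cds_nt


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


cds_nt = cds_length_nt(PROTEIN_LENGTH_AA)
exon2_start = approx_cds_position(EXON2_START_FRACTION, cds_nt)
print(f"CDS length: {cds_nt} nt; exon 2 begins near coding nt {exon2_start:.1f}")
```

Under these assumptions exon 2 begins within the first few dozen codons, so most of the coding sequence lies at or after the start of the cKO region.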


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy, drawn 5' to 3'. Panels: wildtype allele (exons 1 to 5, with two gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Clec2d, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, with forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
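The self-alignment idea behind the dot plot can be sketched as follows. This is a simplified illustration, not the tool used to produce the figure: it records only exact forward-strand matches of a fixed window (10 bp by default, matching the window size above) and ignores reverse-complement matches. The function name and the toy sequence are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 10) -> list[tuple[int, int]]:
    """Return (i, j) pairs with i < j where the window at i exactly equals
    the window at j. Each pair is one off-diagonal dot in a self dot plot."""
    seq = seq.upper()
    positions = defaultdict(list)
    # Index every window of the sequence by its content.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for hits in positions.values():
        # Any window seen more than once yields off-diagonal dots.
        for a in hits:
            for b in hits:
                if a < b:
                    dots.append((a, b))
    return sorted(dots)


# Toy input (hypothetical): the same 10-mer placed at both ends of the sequence.
toy = "ACGTACGTAC" + "TTTTGGGGCC" + "ACGTACGTAC"
for i, j in self_dot_plot(toy):
    print(f"window at {i} repeats at {j} (offset {j - i})")
```

Runs of dots parallel to the main diagonal at a short offset are the signature of tandem repeats, which is the pattern the note above refers to.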

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(8793bp) | A(27.95% 2458) | C(20.18% 1774) | T(30.42% 2675) | G(21.45% 1886)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr6   +       129179843  129182842  3000
YourSeq  1121   775     3000  3000   90.6%     chr6   +       128934353  128946555  12203
YourSeq  707    751     1752  3000   89.9%     chr6   +       128992953  128993929  977
YourSeq  420    1       527   3000   92.4%     chr12  +       11452975   11998167   545193
YourSeq  359    19      522   3000   91.1%     chr19  +       56166175   56166694   520
YourSeq  346    19      528   3000   92.9%     chr8   -       85966441   85966952   512
YourSeq  331    19      511   3000   89.3%     chr14  +       80151064   80151649   586
YourSeq  322    19      502   3000   92.0%     chr12  -       37509345   37509832   488
YourSeq  313    1       528   3000   87.3%     chr10  -       79561164   79561761   598
YourSeq  312    1       515   3000   86.3%     chr9   -       49322486   49323082   597
YourSeq  308    1       505   3000   88.6%     chr2   +       164217124  164217714  591
YourSeq  307    3       516   3000   85.1%     chr5   +       29828582   29829165   584
YourSeq  306    103     513   3000   89.3%     chr13  +       3286195    3286597    403
YourSeq  305    153     526   3000   91.2%     chrX   -       165885898  165886268  371
YourSeq  300    103     514   3000   90.2%     chr2   +       142155410  142155822  413
YourSeq  299    2       503   3000   87.7%     chrX   -       96013060   96013643   584
YourSeq  299    6       505   3000   85.3%     chr8   -       22346287   22346864   578
YourSeq  297    1       503   3000   84.8%     chr1   -       76388603   76389186   584
YourSeq  296    1       514   3000   86.3%     chr1   +       185900629  185901221  593
YourSeq  295    2       507   3000   89.5%     chr15  -       44848527   44849114   588

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr6   +       129185136  129188135  3000
YourSeq  1673   1       2128  3000   92.5%     chr6   +       128982219  128985445  3227
YourSeq  1160   1       1591  3000   92.8%     chr6   +       128895684  128898361  2678
YourSeq  241    2440    3000  3000   86.0%     chr15  -       98385716   98386047   332
YourSeq  214    2699    2988  3000   92.2%     chr13  +       52366804   52367206   403
YourSeq  197    2752    3000  3000   90.3%     chr10  -       26066271   26066520   250
YourSeq  197    2756    2997  3000   92.0%     chr11  +       62190454   62190698   245
YourSeq  196    2756    3000  3000   90.3%     chr1   +       58385094   58385339   246
YourSeq  195    2766    2997  3000   93.4%     chr11  -       44388623   44388856   234
YourSeq  193    2756    3000  3000   89.9%     chr7   -       135386088  135386331  244
YourSeq  191    2752    3000  3000   89.4%     chr7   +       12642770   12643019   250
YourSeq  190    2756    3000  3000   89.1%     chr12  -       108523486  108523729  244
YourSeq  190    2767    3000  3000   91.7%     chr15  +       67754750   67754984   235
YourSeq  189    2756    3000  3000   88.6%     chr11  -       84998280   84998524   245
YourSeq  188    2756    3000  3000   90.4%     chr8   -       106458400  106458643  244
YourSeq  188    2752    3000  3000   90.6%     chr7   -       63783898   63784147   250
YourSeq  188    2752    2988  3000   92.4%     chr4   -       89725091   89725327   237
YourSeq  188    2766    3000  3000   90.6%     chr16  -       78508599   78508831   233
YourSeq  188    2752    3000  3000   90.6%     chrX   +       140328684  140328933  250
YourSeq  188    2774    2997  3000   93.2%     chr7   +       97367994   97368218   225

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
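For readers who want to review the BLAT rows programmatically, the sketch below parses whitespace-separated rows of the kind shown in the two tables and lists the partial hits on a chosen chromosome. It is an illustration only: the `BlatHit` type and both function names are hypothetical, the parser assumes each hit is introduced by the token `YourSeq` followed by ten fields, and the three sample rows are copied from the downstream table.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_rows(text: str) -> list[BlatHit]:
    """Parse hits from whitespace-separated text; each hit is the token
    'YourSeq' followed by ten fields (an assumption about the layout)."""
    tokens = text.split()
    hits = []
    for i, token in enumerate(tokens):
        if token != "YourSeq":
            continue
        f = tokens[i + 1:i + 11]
        if len(f) < 10:
            break  # truncated final row
        hits.append(BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                            float(f[4].rstrip("%")), f[5], f[6],
                            int(f[7]), int(f[8]), int(f[9])))
    return hits


def partial_hits_on(hits: list[BlatHit], chrom: str) -> list[BlatHit]:
    """Hits on one chromosome, excluding full-length 100% identity matches."""
    return [h for h in hits
            if h.chrom == chrom
            and not (h.identity == 100.0 and h.q_start == 1 and h.q_end == h.q_size)]


sample = """
YourSeq 3000 1 3000 3000 100.0% chr6 + 129185136 129188135 3000
YourSeq 1673 1 2128 3000 92.5% chr6 + 128982219 128985445 3227
YourSeq 1160 1 1591 3000 92.8% chr6 + 128895684 128898361 2678
"""
for hit in partial_hits_on(parse_blat_rows(sample), "chr6"):
    print(hit.score, hit.identity, hit.t_start, hit.t_end)
```

Keying on `YourSeq` rather than on line breaks lets the same parser handle rows that have been run together on one line.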


Gene and information: Clec2d C-type lectin domain family 2, member d [ Mus musculus (house mouse) ] Gene ID: 93694, updated on 12-Aug-2019

Gene summary

Official Symbol: Clec2d (provided by MGI)
Official Full Name: C-type lectin domain family 2, member d (provided by MGI)
Primary source: MGI:MGI:2135589
See related: Ensembl:ENSMUSG00000030157
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Clrb; Ocil; Clr-b
Expression: Broad expression in lung adult (RPKM 34.9), large intestine adult (RPKM 31.4) and 17 other tissues

Genomic context

Location: 6; 6 F3
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (129180615..129186535)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (129130640..129136552)

[Figure: genomic context of Clec2d on Chromosome 6 - NC_000072.6.]


Transcript information: This gene has 1 transcript

Gene: Clec2d ENSMUSG00000030157

Description: C-type lectin domain family 2, member d [Source:MGI Symbol;Acc:MGI:2135589]
Gene Synonyms: Clr-b, Clrb, Ocil
Location: Chromosome 6: 129,180,615-129,186,534 forward strand. GRCm38:CM000999.2
About this gene: This gene has 1 transcript (splice variant), 284 orthologues, 44 paralogues, is a member of 1 Ensembl protein family and is associated with 13 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt        Flags
Clec2d-201  ENSMUST00000032260.5  1213  207aa    ENSMUSP00000032260.5  Protein coding  CCDS20581  F5BFH0 Q91V08  TSL:1 GENCODE basic APPRIS P1

[Figure: Ensembl genome browser view of a 25.92 kb region of chromosome 6 (129.175 Mb to 129.195 Mb), forward and reverse strands. Tracks: genes (Comprehensive set) showing Gm26160-201 (snRNA), Clec2d-201 (protein coding) and Gm27514-201 (miRNA); contig AC142191.5; Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: Protein Coding (merged Ensembl/Havana), Non-Protein Coding (RNA gene).]


Transcript: ENSMUST00000032260

[Figure: protein domain view of Clec2d-201 (protein coding; 5.92 kb, forward strand) for ENSMUSP00000032..., scale 0 to 207. Tracks: Transmembrane heli...; PDB-ENSP mappings; Low complexity (Seg); Superfamily: C-type lectin fold; SMART: C-type lectin-like; Pfam: C-type lectin-like; PROSITE profiles: C-type lectin-like; PANTHER: PTHR45710, PTHR45710:SF4; Gene3D: C-type lectin-like/link domain superfamily; CDD: Natural killer cell receptor-like, C-type lectin-like domain; sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
