https://www.alphaknockout.com

Mouse Dcbld2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Dcbld2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Dcbld2 gene (NCBI Reference Sequence: NM_028523; Ensembl: ENSMUSG00000035107) is located on mouse chromosome 16. Sixteen exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 16 (Transcript: ENSMUST00000046663). Exons 5-7 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Dcbld2 gene. To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP23-86O4 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit reduced postnatal angiogenesis and impaired recovery from femoral artery ligation, with impaired blood flow and decreased capillary density.

Exon 5 starts at about 26.66% of the way into the coding region. Knockout of exons 5-7 will result in a frameshift of the gene. Intron 4, which receives the 5'-loxP site, is 3059 bp long; intron 7, which receives the 3'-loxP site, is 884 bp long. The effective cKO region is ~1888 bp and does not contain any other known gene.
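The frameshift reasoning above can be sketched as a simple rule: removing coding sequence whose total length is not a multiple of 3 shifts the downstream reading frame. The report does not list the coding lengths of exons 5-7, so the lengths below are hypothetical placeholders, and both function names are illustrative. The position estimate assumes a CDS of 3 nt per residue of the 769 aa protein, ignoring the stop codon.

```python
def causes_frameshift(deleted_coding_lengths):
    """Return True if deleting these coding segments shifts the reading frame."""
    return sum(deleted_coding_lengths) % 3 != 0


def approx_cds_position(fraction, protein_length_aa):
    """Approximate CDS nucleotide offset for a fractional position.

    Assumes the CDS is 3 nt per amino acid (stop codon ignored).
    """
    cds_length = 3 * protein_length_aa
    return round(fraction * cds_length)


# HYPOTHETICAL coding lengths (bp) for three deleted exons; not from the report.
example_lengths = [100, 150, 120]

print("frameshift:", causes_frameshift(example_lengths))
print("approx. CDS offset of exon 5 start:", approx_cds_position(0.2666, 769))
```

With the real exon coordinates from the Ensembl transcript record, the same check would confirm whether the deletion is out of frame.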

Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', exons 1, 5, 6, 7, 8, 9 and 16 labeled, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Dcbld2; homology arm; cKO region; loxP site.]

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region was aligned against itself to check for tandem repeats. No significant tandem repeat was found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
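A self-alignment dot plot of this kind can be approximated by recording every pair of positions whose 10 bp windows are identical, either directly or as reverse complements. The sketch below is a minimal exact-match version under that assumption; the tool used for the report is not stated, and the function names and toy sequence are illustrative only.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) lists of (i, j) window start pairs.

    forward: windows at i and j are identical (i != j).
    reverse: window at j equals the reverse complement of the window at i.
    """
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j != i)
        reverse.extend((i, j) for j in index.get(revcomp(word), []))
    return forward, reverse


# Toy sequence (not from the report): one 10 bp unit repeated in tandem.
toy = "AAACCCGGGT" * 2
fwd, rev = self_dot_plot(toy, window=10)
print(fwd, rev)
```

Tandem repeats appear as runs of forward matches parallel to the main diagonal; an absence of such runs corresponds to the "no significant tandem repeat" conclusion.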

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(8388bp) | A(26.13% 2192) | C(19.87% 1667) | T(32.61% 2735) | G(21.39% 1794)

Note: The sequence of the homology arms and cKO region was analyzed to determine its GC content. No region of significantly high GC content was found, so this region is suitable for PCR screening or sequencing analysis.
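The base counts in the summary line can be checked for consistency: they should sum to the stated 8388 bp, and they imply an overall GC content of roughly 41%. The sketch below performs that check and adds a simple sliding-window GC function; the 300 bp default mirrors the window size stated above, while the function name and step parameter are assumptions for illustration.

```python
# Base counts as reported in the summary line for the 8388 bp sequence.
counts = {"A": 2192, "C": 1667, "T": 2735, "G": 1794}

total = sum(counts.values())
overall_gc = 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300, step=1):
    """GC percentage of each full-length window along seq."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


print(f"total = {total} bp, overall GC = {overall_gc:.2f}%")
```

Applied to the actual 8388 bp sequence, `windowed_gc` would give the kind of profile shown in the distribution plot, where windows far above the overall value would mark GC-rich stretches.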

BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr16  +       58445203   58448202   3000
YourSeq  149    825     1025  3000   90.1%     chr5   -       147769073  147769252  180
YourSeq  147    825     991   3000   96.9%     chr4   -       59291648   59291816   169
YourSeq  142    825     991   3000   93.4%     chr10  +       16053684   16053844   161
YourSeq  139    825     991   3000   95.9%     chr17  -       51968195   51968359   165
YourSeq  138    825     991   3000   95.2%     chr3   -       140743622  140743786  165
YourSeq  137    845     991   3000   98.0%     chr11  +       112734954  112735113  160
YourSeq  135    825     991   3000   91.6%     chr9   -       47995644   47995796   153
YourSeq  135    845     991   3000   97.9%     chr4   -       58322789   58323034   246
YourSeq  134    825     991   3000   90.8%     chr1   -       22327761   22327909   149
YourSeq  133    845     991   3000   97.2%     chr8   -       95515476   95515643   168
YourSeq  133    845     991   3000   93.4%     chr10  +       66495089   66495224   136
YourSeq  132    845     991   3000   98.6%     chr9   +       48217646   48217817   172
YourSeq  128    825     991   3000   97.1%     chr6   +       32110081   32110253   173
YourSeq  126    825     991   3000   94.4%     chrY   +       77732162   77732378   217
YourSeq  126    825     991   3000   94.4%     chrY   +       65416094   65416318   225
YourSeq  126    825     991   3000   94.4%     chrY   +       54693978   54694194   217
YourSeq  125    853     991   3000   97.8%     chr8   -       77820822   77820975   154
YourSeq  122    1827    1989  3000   90.1%     chr7   -       58629345   58629508   164
YourSeq  122    871     1003  3000   97.7%     chr6   -       138050451  138050628  178

Note: The 3000 bp section upstream of exon 5 was BLAT-searched against the genome. Apart from the expected full-length match to its own location on chr16, no significant similarity was found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr16  +       58450091   58453090   3000
YourSeq  44     2483    2536  3000   95.9%     chr6   +       99504441   99504498   58
YourSeq  42     2486    2552  3000   74.0%     chr15  +       36196369   36196417   49
YourSeq  31     2790    2820  3000   100.0%    chr18  +       86972892   86972922   31
YourSeq  31     2473    2508  3000   97.2%     chr10  +       113763670  113763719  50

Note: The 3000 bp section downstream of exon 7 was BLAT-searched against the genome. Apart from the expected full-length match to its own location on chr16, no significant similarity was found.
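The chr16 coordinates of the two full-length BLAT hits can be cross-checked against the stated cKO region size. Assuming the upstream and downstream 3000 bp queries directly flank the cKO region (an assumption, though the arithmetic supports it), the gap between them should be about 1888 bp. The variable and function names below are illustrative.

```python
# chr16 coordinates (1-based, inclusive) of the full-length self-matches
# reported in the upstream and downstream BLAT results.
upstream = (58445203, 58448202)
downstream = (58450091, 58453090)


def span(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


# Region lying strictly between the two flanks.
gap_start = upstream[1] + 1
gap_end = downstream[0] - 1
cko_size = span(gap_start, gap_end)

print("upstream flank:", span(*upstream), "bp")
print("downstream flank:", span(*downstream), "bp")
print(f"region between flanks: chr16:{gap_start}-{gap_end} ({cko_size} bp)")
```

Under that assumption the interval between the flanks agrees with the ~1888 bp effective cKO region given in the strategy summary.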

Gene and information:

Dcbld2 discoidin, CUB and LCCL domain containing 2 [Mus musculus (house mouse)]
Gene ID: 73379, updated on 14-Aug-2019

Gene summary

Official Symbol: Dcbld2 (provided by MGI)
Official Full Name: discoidin, CUB and LCCL domain containing 2 (provided by MGI)
Primary source: MGI:MGI:1920629
See related: Ensembl:ENSMUSG00000035107
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Esdn; CLCP1; AW146002; 1700055P21Rik
Expression: Ubiquitous expression in bladder adult (RPKM 10.6), CNS E11.5 (RPKM 10.5) and 27 other tissues
Orthologs: human; all

Genomic context

Location: 16; 16 C1.2

Exon count: 16

Annotation release  Status             Assembly                     Chr  Location
----------------------------------------------------------------------------------------------------
108                 current            GRCm38.p6 (GCF_000001635.26)  16   NC_000082.6 (58408535..58469745)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    16   NC_000082.5 (58408648..58469858)

[Figure: genomic context of Dcbld2 on Chromosome 16 - NC_000082.6.]

Transcript information: This gene has 7 transcripts

Gene: Dcbld2 ENSMUSG00000035107

Description: discoidin, CUB and LCCL domain containing 2 [Source:MGI Symbol;Acc:MGI:1920629]
Gene Synonyms: 1700055P21Rik, CLCP1, Esdn
Location: Chromosome 16: 58,408,443-58,469,727 forward strand. GRCm38:CM001009.2
About this gene: This gene has 7 transcripts (splice variants), 208 orthologues, 35 paralogues, is a member of 1 Ensembl protein family and is associated with 12 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
----------------------------------------------------------------------------------------------------------------------------------------
Dcbld2-201  ENSMUST00000046663.7  6561  769aa       ENSMUSP00000039915.7  Protein coding   CCDS28229  Q91ZV3   TSL:1 GENCODE basic APPRIS P1
Dcbld2-202  ENSMUST00000130409.7  2396  No protein  -                     Retained intron  -          -        TSL:1
Dcbld2-204  ENSMUST00000135415.1  2359  No protein  -                     Retained intron  -          -        TSL:1
Dcbld2-203  ENSMUST00000134324.7  891   No protein  -                     Retained intron  -          -        TSL:3
Dcbld2-207  ENSMUST00000150817.7  1270  No protein  -                     lncRNA           -          -        TSL:1
Dcbld2-205  ENSMUST00000142830.7  627   No protein  -                     lncRNA           -          -        TSL:3
Dcbld2-206  ENSMUST00000149321.1  576   No protein  -                     lncRNA           -          -        TSL:3

[Figure: Ensembl genome browser view of an 81.28 kb region (58.40Mb to 58.46Mb). Forward strand: Dcbld2-201 (protein coding); Dcbld2-205, Dcbld2-206 and Dcbld2-207 (lncRNA); Dcbld2-202, Dcbld2-203 and Dcbld2-204 (retained intron). Contig: CT027564.7. Reverse strand: 4930461C15Rik-201 (lncRNA); St3gal6-207, St3gal6-202 and St3gal6-201 (protein coding); St3gal6-212 (lncRNA). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000046663

[Figure: Ensembl protein summary for Dcbld2-201 (protein coding; 61.28 kb, forward strand), translation ENSMUSP00000039..., on a scale bar from 0 to 769. Tracks shown:
Transmembrane heli...; MobiDB lite; Low complexity (Seg)
Superfamily: Galactose-binding-like domain superfamily; LCCL domain superfamily; Spermadhesin, CUB domain superfamily
SMART: CUB domain; LCCL domain; Coagulation factor 5/8 C-terminal domain
Pfam: CUB domain; LCCL domain; Coagulation factor 5/8 C-terminal domain
PROSITE profiles: CUB domain; LCCL domain; Coagulation factor 5/8 C-terminal domain
PROSITE patterns: Coagulation factor 5/8 C-terminal domain
PANTHER: PTHR46806:SF3; PTHR46806
Gene3D: Spermadhesin, CUB domain superfamily; Galactose-binding-like domain superfamily; LCCL domain superfamily
CDD: CUB domain; Coagulation factor 5/8 C-terminal domain
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: inframe insertion; missense variant; splice region variant; synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
