https://www.alphaknockout.com

Mouse Cdhr5 Knockout Project (CRISPR/Cas9)

Objective: To create a Cdhr5 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cdhr5 gene (NCBI Reference Sequence: NM_001114322; Ensembl: ENSMUSG00000025497) is located on mouse chromosome 7. Fifteen exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 15 (Transcript: ENSMUST00000167263). Exons 1~15 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 starts from about 0.04% of the coding region. Exons 1~15 cover 100.0% of the coding region. The size of the effective KO region is ~7564 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: Wildtype allele drawn 5' to 3', showing exons 1 to 15 of mouse Cdhr5 with two gRNA regions marked. Legend: exon of mouse Cdhr5; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self dot plot of the 2000 bp upstream sequence. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether it contains tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self dot plot of the 2000 bp downstream sequence. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
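The self-alignment behind a dot plot can be sketched as follows. This is a minimal illustration, not the tool used to produce the plots in this report: it assumes exact matches of 15 bp windows on the forward strand only, and the function names `self_dot_plot` and `repeat_offsets` are hypothetical. Tandem repeats show up as many dots sharing the same offset from the main diagonal.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the windows starting at
    positions i and j of the same sequence are identical (forward strand only)."""
    seq = seq.upper()
    positions = defaultdict(list)
    # Index every window of the given size by its sequence.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    # Every pair of positions sharing a window is one off-diagonal dot.
    for hits in positions.values():
        for a in range(len(hits)):
            for b in range(a + 1, len(hits)):
                dots.append((hits[a], hits[b]))
    return sorted(dots)


def repeat_offsets(dots):
    """Count dots per diagonal offset (j - i). A tandem repeat with period p
    produces a run of dots at offset p (and at multiples of p)."""
    counts = defaultdict(int)
    for i, j in dots:
        counts[j - i] += 1
    return dict(counts)


if __name__ == "__main__":
    # Toy input: a 4 bp unit repeated three times, scanned with a 4 bp window.
    toy = "ACGT" * 3
    print(repeat_offsets(self_dot_plot(toy, window=4)))
```

A region with no dots away from the main diagonal, as reported for the downstream section, would return an empty list from `self_dot_plot`.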


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot of the 2000 bp upstream sequence.]

Summary: Full Length(2000bp) | A(21.6% 432) | C(32.4% 648) | T(21.15% 423) | G(24.85% 497)

Note: The 2000 bp section upstream of the start codon is analyzed to determine its GC content. Regions of significantly high GC content are found. The gRNA site is selected outside of these high GC-content regions.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot of the 2000 bp downstream sequence.]

Summary: Full Length(2000bp) | A(24.9% 498) | C(23.1% 462) | T(28.75% 575) | G(23.25% 465)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
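The base counts in the two Summary lines give the overall GC content directly: (648 + 497) / 2000 = 57.25% upstream and (462 + 465) / 2000 = 46.35% downstream. The sketch below reproduces that arithmetic and shows one way a 300 bp sliding-window profile could be computed. The function names and the `step` parameter are assumptions for illustration; the report does not state how its plot was generated or what threshold defines a high GC-content region.

```python
def gc_from_counts(counts):
    """Overall GC fraction from a dict of base counts."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def gc_fraction(seq):
    """GC fraction of a sequence string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window of the given size, moving by `step` bases."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the Summary lines of this report.
UPSTREAM = {"A": 432, "C": 648, "T": 423, "G": 497}
DOWNSTREAM = {"A": 498, "C": 462, "T": 575, "G": 465}

if __name__ == "__main__":
    print(f"upstream GC:   {gc_from_counts(UPSTREAM):.2%}")
    print(f"downstream GC: {gc_from_counts(DOWNSTREAM):.2%}")
```

The windowed profile needs the actual 2000 bp sequences, which are not included in this report, so only the overall figures can be checked here.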


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   -       141276719  141278718  2000
YourSeq  100    1381   1621  2000   92.4%     chr13  +       58912538   58912803   266
YourSeq  97     1378   1623  2000   84.4%     chr7   +       51741021   51741234   214
YourSeq  84     1380   1597  2000   92.2%     chr11  -       22506824   22507210   387
YourSeq  78     1404   1630  2000   77.8%     chr11  -       80585716   80585853   138
YourSeq  72     1429   1621  2000   93.8%     chr13  +       58912529   58912874   346
YourSeq  71     1428   1624  2000   92.8%     chr9   +       89880370   89880863   494
YourSeq  70     1407   1623  2000   90.6%     chr11  +       104121341  104121561  221
YourSeq  59     1408   1627  2000   95.4%     chr12  +       113312179  113312929  751
YourSeq  56     1401   1602  2000   74.2%     chr7   +       51741106   51741234   129
YourSeq  56     1382   1619  2000   71.3%     chr11  +       34767377   34767499   123
YourSeq  56     1445   1619  2000   74.3%     chr11  +       34767216   34767342   127
YourSeq  53     1407   1561  2000   92.1%     chr9   +       89880571   89881023   453
YourSeq  47     1546   1630  2000   73.6%     chr11  -       80585795   80585853   59
YourSeq  43     1379   1455  2000   93.9%     chr11  -       22506890   22507134   245
YourSeq  43     1381   1537  2000   67.4%     chr13  +       58912568   58912642   75
YourSeq  40     1385   1457  2000   76.1%     chr17  +       44609913   44609972   60
YourSeq  28     1461   1511  2000   71.9%     chr10  +       81559837   81559876   40
YourSeq  27     1427   1458  2000   93.8%     chr3   +       69902591   69902623   33
YourSeq  26     1549   1585  2000   96.5%     chr11  +       104121498  104121539  42

Note: The 2000 bp section upstream of the start codon is searched against the genome with BLAT. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   -       141267153  141269152  2000
YourSeq  428    699    2000  2000   92.9%     chr2   +       153311818  153584239  272422
YourSeq  235    1403   2000  2000   88.4%     chr17  -       35890263   35890730   468
YourSeq  212    1406   1997  2000   89.9%     chr7   -       120872251  120872782  532
YourSeq  209    367    930   2000   84.0%     chr8   +       13765633   13766067   435
YourSeq  199    734    1263  2000   89.7%     chr18  -       35454740   35731354   276615
YourSeq  193    1123   1997  2000   78.6%     chr7   -       126349492  126349858  367
YourSeq  191    1407   1997  2000   84.5%     chr4   -       136022739  136022971  233
YourSeq  187    723    1249  2000   83.1%     chr12  -       85148893   85149229   337
YourSeq  186    1299   1717  2000   89.1%     chr11  +       119272283  119272789  507
YourSeq  178    1403   1722  2000   83.8%     chr7   -       34336652   34336925   274
YourSeq  172    705    1250  2000   80.1%     chr7   -       81518457   81518790   334
YourSeq  167    707    1227  2000   80.3%     chr13  -       43640159   43640461   303
YourSeq  165    367    896   2000   76.7%     chr19  +       3827463    3827786    324
YourSeq  165    697    895   2000   90.6%     chr11  +       21734471   21734663   193
YourSeq  162    707    893   2000   91.3%     chrX   +       152087157  152087339  183
YourSeq  161    684    895   2000   90.0%     chr10  +       22634280   22634505   226
YourSeq  159    703    911   2000   90.0%     chr8   -       13407011   13407258   248
YourSeq  159    705    899   2000   88.8%     chr2   -       157196447  157196633  187
YourSeq  156    700    896   2000   92.4%     chr16  +       48292071   48292491   421

Note: The 2000 bp section downstream of the stop codon is searched against the genome with BLAT. No significant similarity is found.
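For readers who want to inspect the BLAT hits programmatically, the sketch below parses result rows into records and separates the query's own locus from the remaining hits. It is an illustration only: the `BlatHit` class, the helper names and the rule used to recognise the self-hit (full-length alignment at 100.0% identity) are assumptions, and the report does not state which criterion was used to judge similarity as significant. The three sample rows are copied from the upstream results.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self):
        """Fraction of the query covered by this alignment."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_blat_row(line):
    """Parse one result row. Only the last 11 whitespace-separated fields are
    used, so a leading 'browser details' link label is ignored."""
    f = line.split()[-11:]
    return BlatHit(
        f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
        float(f[5].rstrip("%")), f[6], f[7], int(f[8]), int(f[9]), int(f[10]),
    )


def is_self_hit(hit):
    """Assumed rule: the query's own locus aligns full length at 100% identity."""
    return hit.q_start == 1 and hit.q_end == hit.q_size and hit.identity == 100.0


ROWS = [
    "browser details YourSeq 2000 1 2000 2000 100.0% chr7 - 141276719 141278718 2000",
    "YourSeq 100 1381 1621 2000 92.4% chr13 + 58912538 58912803 266",
    "YourSeq 97 1378 1623 2000 84.4% chr7 + 51741021 51741234 214",
]

hits = [parse_blat_row(row) for row in ROWS]
# Remaining hits, strongest first.
off_target = sorted(
    (h for h in hits if not is_self_hit(h)), key=lambda h: h.score, reverse=True
)

if __name__ == "__main__":
    for h in off_target:
        print(h.chrom, h.strand, h.score, f"{h.query_coverage:.1%}")
```

Sorting by score and reporting query coverage makes it easier to see how much of the 2000 bp section each secondary hit actually spans.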


Gene and information: Cdhr5 cadherin-related family member 5 [ Mus musculus (house mouse) ] Gene ID: 72040, updated on 12-Aug-2019

Gene summary

Official Symbol: Cdhr5 (provided by MGI)
Official Full Name: cadherin-related family member 5 (provided by MGI)
Primary source: MGI:MGI:1919290
See related: Ensembl:ENSMUSG00000025497
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Mucdhl; Mupcdh; AI481143; 1810074H01Rik
Expression: Biased expression in duodenum adult (RPKM 477.0), small intestine adult (RPKM 353.9) and 3 other tissues
Orthologs: human

Genomic context

Location: 7; 7 F5
Exon count: 15

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (141269080..141276795, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (148454984..148462685, complement)

Chromosome 7 - NC_000073.6


Transcript information: This gene has 5 transcripts

Gene: Cdhr5 ENSMUSG00000025497

Description: cadherin-related family member 5 [Source:MGI Symbol;Acc:MGI:1919290]
Gene Synonyms: 1810074H01Rik, Mucdhl, Mupcdh
Location: Chromosome 7: 141,269,083-141,276,786 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 5 transcripts (splice variants), 90 orthologues, 33 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Cdhr5-202  ENSMUST00000167263.8  2628  831aa       ENSMUSP00000127292.1  Protein coding           CCDS52442  A0PJK7      TSL:1, GENCODE basic, APPRIS ALT2
Cdhr5-201  ENSMUST00000080654.6  2133  669aa       ENSMUSP00000079484.5  Protein coding           CCDS22006  Q8VHF2      TSL:1, GENCODE basic, APPRIS P3
Cdhr5-203  ENSMUST00000210124.1  1001  333aa       ENSMUSP00000148123.1  Protein coding           -          A0A1B0GSY5  CDS 5' and 3' incomplete, TSL:5
Cdhr5-205  ENSMUST00000210773.1  554   131aa       ENSMUSP00000147472.1  Nonsense mediated decay  -          A0A1B0GRD3  CDS 5' incomplete, TSL:5
Cdhr5-204  ENSMUST00000210386.1  755   No protein  -                     Retained intron          -          -           TSL:3


[Figure: Ensembl genome browser view of a 27.70 kb region, 141.26 Mb to 141.28 Mb. Forward strand: Phrf1 transcripts (Phrf1-201, -202, -204 and -207 protein coding; Phrf1-208 retained intron; Phrf1-209 nonsense mediated decay). Contig: AC163434.5. Reverse strand: Irf7 transcripts (Irf7-201 to Irf7-213), Cdhr5 transcripts (Cdhr5-201, -202 and -203 protein coding; Cdhr5-204 retained intron; Cdhr5-205 nonsense mediated decay) and Sct transcripts (Sct-201, -202 and -204 protein coding; Sct-203 retained intron). Regulatory Build legend: CTCF, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000167263

[Figure: Protein feature view of Cdhr5-202 (protein coding; reverse strand, 7.70 kb), translation ENSMUSP00000127..., scale bar 0 to 831. Feature tracks: Transmembrane heli...; MobiDB lite; Low complexity (Seg); Cleavage site (Sign...; Superfamily: Cadherin-like superfamily; SMART: Cadherin-like; PROSITE profiles: PS50268; PROSITE patterns: Cadherin conserved site; PANTHER: PTHR24028, Cadherin-related family member 5; Gene3D: 2.60.40.60; CDD: cd11304. Variant tracks: All sequence SNPs/i..., Sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
