https://www.alphaknockout.com

Mouse Cdc42ep3 Knockout Project (CRISPR/Cas9)

Objective: To create a Cdc42ep3 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cdc42ep3 gene (NCBI Reference Sequence: NM_026514; Ensembl: ENSMUSG00000036533) is located on mouse chromosome 17. Two exons are identified, with both the ATG start codon and the TAG stop codon in exon 2 (Transcript: ENSMUST00000068958). Exon 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 2 starts from about 0.13% of the coding region. Exon 2 covers 100.0% of the coding region. The size of the effective KO region: ~760 bp. The KO region does not contain any other known gene.
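As a rough cross-check of the ~760 bp figure, the sketch below compares two numbers that appear elsewhere in this report: the gap between the 2000 bp flanks listed in the BLAT results (GRCm38, chr17, minus strand) and the coding length implied by the 254 aa protein of Cdc42ep3-201. Treating the gap between the two flanks as the KO region is an assumption made for illustration; the function names are hypothetical and not part of any vendor tool.

```python
# Flank coordinates copied from the BLAT self-hits in this report (1-based, inclusive).
# The gene is on the minus strand, so the upstream flank has the higher coordinates.
UP_FLANK = (79335490, 79337489)    # 2000 bp upstream of the start codon
DOWN_FLANK = (79332728, 79334727)  # 2000 bp downstream of the stop codon
PROTEIN_LENGTH_AA = 254            # Cdc42ep3-201 (ENSMUST00000068958)


def interval_between(down_flank, up_flank):
    """Return the 1-based inclusive interval lying strictly between two flanks."""
    start = down_flank[1] + 1
    end = up_flank[0] - 1
    return start, end


def cds_length_bp(protein_length_aa, include_stop=True):
    """Coding-sequence length implied by a protein length, in base pairs."""
    return 3 * protein_length_aa + (3 if include_stop else 0)


start, end = interval_between(DOWN_FLANK, UP_FLANK)
gap_bp = end - start + 1
print(f"chr17:{start}-{end} spans {gap_bp} bp")
print(f"CDS including stop codon: {cds_length_bp(PROTEIN_LENGTH_AA)} bp")
```

Both values (762 bp and 765 bp) are consistent with the stated effective KO region of about 760 bp.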


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1 and 2 of mouse Cdc42ep3 with two gRNA regions marked. Legend: exon of mouse Cdc42ep3; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence (axis label "Sequence 12") aligned against itself. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
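The self-alignment idea behind a dot plot can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: it assumes exact matching of 15 bp windows (real dot plot programs often tolerate mismatches), and the function names are hypothetical. Any off-diagonal match on either strand marks a candidate repeat.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def dot_plot_matches(seq, window=15):
    """Return sorted off-diagonal exact matches as (i, j, strand) with i < j.

    "+" means seq[i:i+window] equals seq[j:j+window].
    "-" means seq[i:i+window] equals the reverse complement of seq[j:j+window].
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    matches = []
    for kmer, starts in positions.items():
        # Forward-strand repeats: the same window occurs at two positions.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b], "+"))
        # Reverse-complement repeats: report each pair once, with i < j.
        for i in starts:
            for j in positions.get(reverse_complement(kmer), []):
                if i < j:
                    matches.append((i, j, "-"))
    return sorted(matches)


# Toy sequences (not from the report), using a 7 bp window for brevity.
print(dot_plot_matches("GATTACACCGATTACA", window=7))
print(dot_plot_matches("GATTACACCTGTAATC", window=7))
```

An empty result for a 2000 bp flank would correspond to a dot plot showing only the main diagonal.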

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence (axis label "Sequence 12") aligned against itself. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence (axis label "Sequence 12").]

Summary: Full Length(2000bp) | A(24.0% 480) | C(23.25% 465) | T(28.55% 571) | G(24.2% 484)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence (axis label "Sequence 12").]

Summary: Full Length(2000bp) | A(30.35% 607) | C(19.1% 382) | T(27.65% 553) | G(22.9% 458)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
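The overall GC fraction of each flank follows directly from the base counts in the two summaries above, and a windowed profile can be computed in the same way. The sketch below is illustrative only: the function names are hypothetical, and the report does not state the step size or the threshold it uses to call a region "high GC".

```python
def gc_fraction_from_counts(a, c, t, g):
    """Overall GC fraction from per-base counts."""
    return (g + c) / (a + c + t + g)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window of length `window`, advancing by `step`."""
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


# Base counts copied from the two summaries in this report.
upstream_gc = gc_fraction_from_counts(a=480, c=465, t=571, g=484)
downstream_gc = gc_fraction_from_counts(a=607, c=382, t=553, g=458)
print(f"upstream GC:   {upstream_gc:.2%}")
print(f"downstream GC: {downstream_gc:.2%}")

# Toy sequence (not from the report) to show the windowed profile.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

From the reported counts, the upstream flank is about 47.45% GC and the downstream flank 42.00% GC, neither of which is extreme for PCR.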


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr17  -       79335490   79337489   2000
YourSeq  160    1308   1497  2000   93.1%     chr8   -       121937195  121937389  195
YourSeq  159    1300   1496  2000   89.2%     chr5   -       142491212  142491398  187
YourSeq  156    1302   1519  2000   84.1%     chr19  -       6039900    6040107    208
YourSeq  155    1302   1494  2000   90.2%     chr7   +       98598808   98598986   179
YourSeq  154    1307   1496  2000   88.0%     chr1   -       73892242   73892423   182
YourSeq  154    1301   1496  2000   88.9%     chr6   +       119295465  119295653  189
YourSeq  154    1302   1518  2000   86.2%     chr17  +       30560147   30560331   185
YourSeq  153    1302   1496  2000   87.4%     chr7   -       45472468   45472650   183
YourSeq  153    1302   1498  2000   86.3%     chr7   -       6174451    6174639    189
YourSeq  153    1302   1494  2000   88.2%     chr4   +       140997164  140997340  177
YourSeq  151    1320   1496  2000   92.7%     chr11  -       100985076  100985252  177
YourSeq  151    1302   1498  2000   90.0%     chr10  -       120346919  120347120  202
YourSeq  151    1315   1516  2000   89.6%     chr1   -       128265581  128265791  211
YourSeq  151    1135   1494  2000   91.3%     chr10  +       67114427   67114904   478
YourSeq  151    1302   1497  2000   90.4%     chr1   +       133957773  133958361  589
YourSeq  150    1302   1497  2000   88.3%     chr8   -       88251156   88251345   190
YourSeq  149    1302   1496  2000   92.2%     chr16  -       5208955    5209167    213
YourSeq  149    1302   1496  2000   87.3%     chr1   -       33723129   33723315   187
YourSeq  149    1327   1496  2000   94.2%     chr5   +       35988859   35989031   173

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr17  -       79332728   79334727   2000
YourSeq  62     1181   1389  2000   76.4%     chr18  +       50745354   50745511   158
YourSeq  56     1225   1400  2000   92.5%     chr13  -       9321285    9321529    245
YourSeq  44     1324   1400  2000   78.2%     chr6   +       97454039   97454105   67
YourSeq  43     1324   1400  2000   85.8%     chr2   -       50366829   50366904   76
YourSeq  43     1226   1400  2000   63.3%     chr19  -       34561942   34561993   52
YourSeq  39     208    253   2000   95.6%     chr2   -       100216962  100217010  49
YourSeq  38     1320   1391  2000   68.8%     chr2   +       4932705    4932757    53
YourSeq  34     1230   1280  2000   94.9%     chr19  -       57319271   57319322   52
YourSeq  34     1225   1338  2000   68.5%     chr1   -       170854020  170854100  81
YourSeq  31     1240   1280  2000   97.2%     chr13  -       53883968   53884012   45
YourSeq  31     1355   1392  2000   76.5%     chr18  +       43401667   43401700   34
YourSeq  31     1225   1277  2000   94.3%     chr13  +       52334639   52334713   75
YourSeq  30     1360   1394  2000   94.2%     chr17  -       37303128   37303162   35
YourSeq  30     1230   1278  2000   65.7%     chr15  -       7354311    7354342    32
YourSeq  30     1235   1268  2000   97.0%     chr15  +       54218848   54218884   37
YourSeq  29     225    254   2000   100.0%    chr15  +       10471534   10471846   313
YourSeq  28     903    1157  2000   44.8%     chr10  -       99454951   99454989   39
YourSeq  28     1224   1268  2000   70.6%     chr16  +       90892697   90892733   37
YourSeq  27     1258   1298  2000   85.4%     chr17  -       11686059   11686101   43

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
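BLAT rows like those above can be screened programmatically. The sketch below parses whitespace-separated rows in the column order shown (QUERY, SCORE, query START/END, QSIZE, IDENTITY, CHROM, STRAND, target START/END, SPAN) and reports how much of the 2000 bp query each secondary hit covers. The identity and coverage thresholds are hypothetical placeholders, since the report does not state its criterion for "significant similarity"; the class and function names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self):
        """Fraction of the query covered by this hit (1-based inclusive coordinates)."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_hit(row):
    """Parse one row; the first field (query name) is ignored."""
    f = row.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def flagged_hits(hits, min_identity=90.0, min_coverage=0.5):
    """Secondary hits exceeding both thresholds (threshold values are assumptions)."""
    best = max(hits, key=lambda h: h.score)  # the self-hit at the target locus
    return [h for h in hits
            if h is not best
            and h.identity >= min_identity
            and h.query_coverage >= min_coverage]


# First three rows of the upstream BLAT table in this report.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr17 - 79335490 79337489 2000",
    "YourSeq 160 1308 1497 2000 93.1% chr8 - 121937195 121937389 195",
    "YourSeq 159 1300 1496 2000 89.2% chr5 - 142491212 142491398 187",
]
hits = [parse_hit(r) for r in rows]
for h in hits:
    print(h.chrom, h.score, f"{h.identity}%", f"coverage={h.query_coverage:.1%}")
print("flagged:", flagged_hits(hits))
```

With these example rows, the secondary hits cover under 10% of the query, so nothing is flagged under the assumed thresholds.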


Gene and information:

Cdc42ep3 CDC42 effector protein (Rho GTPase binding) 3 [Mus musculus (house mouse)]
Gene ID: 260409, updated on 12-Aug-2019

Gene summary

Official Symbol: Cdc42ep3 (provided by MGI)
Official Full Name: CDC42 effector protein (Rho GTPase binding) 3 (provided by MGI)
Primary source: MGI:MGI:2384718
See related: Ensembl:ENSMUSG00000036533
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: UB1; Cep3; Borg2; AA986861; 3200001F04Rik
Expression: Broad expression in bladder adult (RPKM 61.2), testis adult (RPKM 25.9) and 20 other tissues
Orthologs: human

Genomic context

Location: 17; 17 E3
Exon count: 3

Annotation release  Status             Assembly                    Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (79333723..79355091, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (79733365..79754431, complement)

Chromosome 17 - NC_000083.6


Transcript information: This gene has 2 transcripts

Gene: Cdc42ep3 ENSMUSG00000036533

Description: CDC42 effector protein (Rho GTPase binding) 3 [Source:MGI Symbol;Acc:MGI:2384718]
Gene Synonyms: 3200001F04Rik, Borg2, Cep3, UB1
Location: Chromosome 17: 79,333,727-79,355,091 reverse strand. GRCm38:CM001010.2
About this gene: This gene has 2 transcripts (splice variants), 192 orthologues, 5 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name          Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt     Flags
Cdc42ep3-201  ENSMUST00000068958.8  2265  254aa    ENSMUSP00000067217.7  Protein coding  CCDS28984  Q9CQC5      TSL:1 GENCODE basic APPRIS P1
Cdc42ep3-202  ENSMUST00000233363.1  564   97aa     ENSMUSP00000156957.1  Protein coding  -          A0A3B2WBL9  CDS 3' incomplete

[Figure: Ensembl genome browser view of a 41.37 kb region of chromosome 17 (79.33 Mb to 79.36 Mb), showing contigs AC091332.8 and CT030740.8, the reverse-strand transcripts Cdc42ep3-201 and Cdc42ep3-202 (both protein coding), and the Regulatory Build track. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding).]


Transcript: ENSMUST00000068958

[Figure: Ensembl protein summary for Cdc42ep3-201 (protein coding; reverse strand, 21.36 kb), with a protein scale bar from 0 to 254 aa. Tracks: MobiDB lite; Low complexity (Seg); SMART (CRIB domain); Pfam (CRIB domain, Cdc42 effector); PROSITE profiles (CRIB domain); PANTHER (PTHR15344, PTHR15344:SF3); sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
