https://www.alphaknockout.com

Mouse Rabepk Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Rabepk conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Rabepk gene (NCBI Reference Sequence: NM_145522; Ensembl: ENSMUSG00000070953) is located on mouse chromosome 2. Eight exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 8 (Transcript: ENSMUST00000145903). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Rabepk gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-347M2 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 6.84% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2, for 5'-loxP site insertion, is 2045 bp, and the size of intron 3, for 3'-loxP site insertion, is 4516 bp. The size of the effective cKO region is ~658 bp. The cKO region does not contain any other known gene.
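The frameshift statement in the note comes down to simple arithmetic: deleting an exon shifts the reading frame when the exon length is not a multiple of three. The following is a minimal sketch of that check, not the method used to design this project. The exon length in the example is a hypothetical placeholder, because the report gives only the ~658 bp cKO region (which includes flanking intronic sequence) and not the length of exon 3 itself. The helper names are also assumptions introduced for illustration.

```python
def causes_frameshift(exon_length_bp: int) -> bool:
    """Return True if deleting an exon of this length shifts the reading frame."""
    if exon_length_bp <= 0:
        raise ValueError("exon length must be positive")
    return exon_length_bp % 3 != 0


def approx_cds_offset(fraction: float, protein_length_aa: int) -> int:
    """Approximate CDS nucleotide position for a fractional offset.

    Assumes the CDS length is 3 * protein length plus one stop codon.
    """
    cds_length = 3 * protein_length_aa + 3
    return round(fraction * cds_length)


# Hypothetical exon length, for illustration only (not taken from the report).
example_exon_length = 100
print(causes_frameshift(example_exon_length))

# The 6.84% offset and the 380 aa protein length are from the report
# (transcript ENSMUST00000145903); the result is only an estimate.
print(approx_cds_offset(0.0684, 380))
```

Under these assumptions, the second call estimates where in the coding sequence exon 3 begins; the exact position should be taken from the transcript annotation.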


Overview of the Targeting Strategy

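[Figure: schematic of the targeting strategy, showing the wildtype allele (exons 1, 2, 3 and 8 labelled, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Rabepk, homology arm, cKO region, loxP site.]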


Overview of the Dot Plot

[Figure: dot plot of the sequence aligned against itself, with forward and reverse complement matches. Window size: 10 bp.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
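A self-alignment dot plot of this kind can be sketched as follows. This is an illustrative exact-match version under stated assumptions, not the tool used to produce the plot in this report: it records every pair of positions where a window of the sequence matches another window exactly (forward) or matches its reverse complement (reverse). The function names and the toy sequence are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window matches of seq against itself.

    forward: seq[i:i+window] == seq[j:j+window] with i < j (main diagonal excluded).
    reverse: seq[i:i+window] equals the reverse complement of seq[j:j+window].
    """
    seq = seq.upper()
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for kmer, starts in positions.items():
        # Off-diagonal forward matches indicate direct repeats.
        forward.extend((i, j) for i in starts for j in starts if i < j)
        # Matches to the reverse complement indicate inverted repeats.
        for j in positions.get(reverse_complement(kmer), []):
            reverse.extend((i, j) for i in starts)
    return forward, reverse


# Toy sequence (hypothetical): a 10 bp unit repeated after a 4 bp spacer.
toy = "AAACCCGGGT" + "TTTT" + "AAACCCGGGT"
forward_hits, reverse_hits = self_dot_plot(toy, window=10)
print(forward_hits)
print(reverse_hits)
```

In a plot, long runs of forward hits parallel to the main diagonal would suggest tandem or direct repeats; an empty or sparse result corresponds to the "no significant tandem repeat" reading given in the note.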

Overview of the GC Content Distribution

[Figure: GC content distribution along the sequence. Window size: 300 bp.]

Summary: Full Length(7158bp) | A(27.06% 1937) | C(22.32% 1598) | T(27.83% 1992) | G(22.79% 1631)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
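The overall GC content follows directly from the base counts in the summary line, and a windowed profile can be computed in the same way. The sketch below is illustrative only: the function names are assumptions, and the 300 bp window simply mirrors the window size stated above rather than reproducing the exact method behind the figure.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """Return the GC fraction of each full window along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the summary line of the report.
counts = {"A": 1937, "C": 1598, "T": 1992, "G": 1631}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(f"{total} bp, GC = {overall_gc:.2%}")  # 7158 bp, GC = 45.11%
```

The counts sum to the reported full length of 7158 bp, and the overall GC content of about 45% is consistent with the note that no region of significantly high GC content was found.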


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       34795580   34798579   3000
YourSeq  143    210    2641  3000   88.7%     chr17  +       80163711   80212998   49288
YourSeq  127    208    565   3000   92.2%     chr1   +       90926800   91007556   80757
YourSeq  124    402    579   3000   84.3%     chr19  -       34892759   34892926   168
YourSeq  124    413    578   3000   87.9%     chr10  -       64499971   64500141   171
YourSeq  111    149    548   3000   88.9%     chr1   -       165619062  165791893  172832
YourSeq  105    207    578   3000   88.9%     chr15  +       73444296   73681967   237672
YourSeq  103    425    570   3000   88.8%     chr15  -       55188469   55188614   146
YourSeq  98     413    548   3000   89.7%     chr9   +       14310992   14342836   31845
YourSeq  95     431    576   3000   83.6%     chr16  +       89260275   89260417   143
YourSeq  94     418    577   3000   94.4%     chr14  -       16296454   16296615   162
YourSeq  93     389    548   3000   90.5%     chr1   +       95151008   95151169   162
YourSeq  88     461    570   3000   88.1%     chr1   -       52789135   52789243   109
YourSeq  88     418    579   3000   89.4%     chr11  +       4521749    4521912    164
YourSeq  88     466    579   3000   88.9%     chr1   +       155522161  155522273  113
YourSeq  85     426    548   3000   87.5%     chr6   -       47869438   47869560   123
YourSeq  85     461    580   3000   85.6%     chr10  -       31399645   31400189   545
YourSeq  82     465    571   3000   88.7%     chr1   +       59908946   59909053   108
YourSeq  82     461    588   3000   85.6%     chr1   +       34428991   34429115   125
YourSeq  81     427    581   3000   85.9%     chr11  -       63295265   63295417   153

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       34791922   34794921   3000
YourSeq  370    1      716   3000   87.4%     chr8   -       68110118   68347705   237588
YourSeq  247    2291   2602  3000   92.5%     chr15  +       76665495   76665870   376
YourSeq  244    2067   2604  3000   94.6%     chr11  +       23502136   23502808   673
YourSeq  234    2350   2745  3000   93.7%     chr11  -       72555273   72555698   426
YourSeq  233    21     594   3000   83.8%     chr5   +       91704321   91704923   603
YourSeq  230    19     754   3000   81.6%     chr1   -       10177753   10178267   515
YourSeq  229    1      318   3000   87.3%     chr3   -       66329908   66330228   321
YourSeq  213    1      311   3000   85.0%     chr13  -       106673684  106673975  292
YourSeq  195    1      305   3000   85.1%     chr1   -       97565280   97565587   308
YourSeq  194    1      318   3000   84.7%     chr5   +       112176663  112177022  360
YourSeq  193    1      318   3000   85.2%     chr14  -       33584925   33585222   298
YourSeq  192    1      705   3000   83.3%     chr16  -       18742198   18742841   644
YourSeq  191    1      318   3000   82.7%     chr5   -       113594046  113677020  82975
YourSeq  190    1971   2394  3000   95.7%     chr9   +       78619762   78620212   451
YourSeq  189    1      305   3000   83.0%     chr8   +       22841032   22841337   306
YourSeq  188    1      310   3000   82.2%     chr14  -       60134102   60134393   292
YourSeq  188    2413   2605  3000   99.0%     chr3   +       19202731   19202936   206
YourSeq  187    11     303   3000   86.6%     chr9   -       15364253   15364525   273
YourSeq  187    2414   2603  3000   99.5%     chr16  -       17866768   17866958   191

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
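One way to make the "no significant similarity" reading concrete is to compare each secondary hit's score with the query size. The sketch below is illustrative only: the report does not state what cutoff defines a significant hit, so the 0.5 score fraction used here is a hypothetical threshold, and the `BlatHit` type and `secondary_hits` helper are names introduced for this example. The three rows are copied from the downstream table above.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str


def secondary_hits(hits, min_score_fraction):
    """Return hits other than the top-scoring one whose score is at least
    the given fraction of the query size."""
    best = max(hits, key=lambda h: h.score)
    return [
        h for h in hits
        if h is not best and h.score >= min_score_fraction * h.q_size
    ]


# First three rows of the downstream BLAT table (self hit plus two others).
hits = [
    BlatHit(3000, 1, 3000, 3000, 100.0, "chr2"),
    BlatHit(370, 1, 716, 3000, 87.4, "chr8"),
    BlatHit(247, 2291, 2602, 3000, 92.5, "chr15"),
]

# 0.5 is a hypothetical cutoff chosen for illustration, not from the report.
print(secondary_hits(hits, 0.5))  # []
```

With this assumed cutoff, the best non-self hit (score 370 against a 3000 bp query) falls well below the threshold, which matches the conclusion stated in the note.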


Gene and information:

Rabepk Rab9 effector protein with kelch motifs [ Mus musculus (house mouse) ]
Gene ID: 227746, updated on 9-Oct-2019

Gene summary

Official Symbol: Rabepk (provided by MGI)
Official Full Name: Rab9 effector protein with kelch motifs (provided by MGI)
Primary source: MGI:MGI:2139530
See related: Ensembl:ENSMUSG00000070953
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: C87311; Rab9p40; AV073337; 8430412M01Rik; 9530020D24Rik
Expression: Broad expression in testis adult (RPKM 48.4), genital fat pad adult (RPKM 5.2) and 18 other tissues

Genomic context

Location: 2; 2 B

Exon count: 9

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (34774609..34800196, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (34634186..34655311, complement)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 7 transcripts

Gene: Rabepk ENSMUSG00000070953

Description: Rab9 effector protein with kelch motifs [Source:MGI Symbol;Acc:MGI:2139530]
Gene Synonyms: 8430412M01Rik, 9530020D24Rik
Location: Chromosome 2: 34,777,556-34,799,912 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 7 transcripts (splice variants), 208 orthologues, 10 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Rabepk-207  ENSMUST00000145903.7  2968  380aa       ENSMUSP00000122360.1  Protein coding           CCDS15951  Q8VCH5   TSL:1, GENCODE basic
Rabepk-203  ENSMUST00000118108.1  1431  372aa       ENSMUSP00000113099.1  Protein coding           -          B0R0S4   TSL:5, GENCODE basic, APPRIS P1
Rabepk-202  ENSMUST00000113086.8  1106  321aa       ENSMUSP00000108709.2  Protein coding           -          A2AUF7   TSL:5, GENCODE basic
Rabepk-204  ENSMUST00000140663.7  1223  52aa        ENSMUSP00000122119.1  Nonsense mediated decay  -          D6RGR6   TSL:2
Rabepk-201  ENSMUST00000047963.6  763   64aa        ENSMUSP00000037746.6  Nonsense mediated decay  -          G8JL41   CDS 5' incomplete, TSL:3
Rabepk-205  ENSMUST00000141099.1  542   76aa        ENSMUSP00000120887.1  Nonsense mediated decay  -          D6RHN6   TSL:3
Rabepk-206  ENSMUST00000144113.1  600   No protein  -                     Retained intron          -          -        TSL:2


[Figure: Ensembl genome browser view of a 42.36 kb region of chromosome 2 (34.77 Mb to 34.80 Mb), contig AL929106.5.
Forward strand: Hspa5-201 and Hspa5-202 (protein coding), Hspa5-204 (nonsense mediated decay), Hspa5-205, Hspa5-206 and Hspa5-203 (retained intron).
Reverse strand: Rabepk-207, Rabepk-202 and Rabepk-203 (protein coding), Rabepk-201, Rabepk-205 and Rabepk-204 (nonsense mediated decay), Rabepk-206 (retained intron); Fbxw2-201, Fbxw2-205, Fbxw2-202, Fbxw2-206, Fbxw2-204 and Fbxw2-203 (protein coding), Fbxw2-213 and Fbxw2-210 (lncRNA).
Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000145903

[Figure: Rabepk-207 (protein coding), reverse strand, 22.36 kb, with protein domain tracks for ENSMUSP00000122...: low complexity (Seg); Superfamily: Kelch-type beta propeller; Pfam: Kelch repeat type 2, PF13415, Kelch repeat type 1; PANTHER: PTHR46647; Gene3D: Kelch-type beta propeller. Sequence variants (dbSNP and all other sources) are shown as missense and synonymous variants. Scale bar: 0 to 380.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
