https://www.alphaknockout.com

Mouse Rrh Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Rrh conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Rrh gene (NCBI Reference Sequence: NM_009102; Ensembl: ENSMUSG00000028012) is located on mouse chromosome 3. Seven exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000196902). Exons 3-4 are selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Rrh gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-86E18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 29.48% of the coding region. Knockout of exons 3-4 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 2210 bp, and the size of intron 4 for 3'-loxP site insertion is 828 bp. The size of the effective cKO region is ~1242 bp. The cKO region does not contain any other known gene.
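The arithmetic behind the note can be sketched in a few lines of Python. This is only an illustration: the 337 aa protein length comes from the transcript table in this report, counting the stop codon as part of the coding region is an assumption, and the example deletion sizes passed to `causes_frameshift` are hypothetical because the report does not list the coding lengths of exons 3 and 4.

```python
# Values taken from this report.
PROTEIN_LENGTH_AA = 337        # translation length of ENSMUST00000196902
EXON3_START_FRACTION = 0.2948  # "Exon 3 starts at about 29.48% of the coding region"


def coding_length_bp(protein_aa: int, include_stop: bool = True) -> int:
    """Coding length in bp; including the stop codon is an assumption."""
    return 3 * (protein_aa + (1 if include_stop else 0))


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


cds_bp = coding_length_bp(PROTEIN_LENGTH_AA)
approx_exon3_start = round(EXON3_START_FRACTION * cds_bp)
print(cds_bp, approx_exon3_start)

# Hypothetical deletion sizes, for illustration only.
print(causes_frameshift(200), causes_frameshift(300))
```

Under these assumptions the first coding base of exon 3 falls near position 299 of a 1014 bp coding region, so a frameshifting deletion would disrupt roughly the last 70% of the reading frame.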


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels: wildtype allele (exons 1-7, drawn 5' to 3', with the two gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Rrh, homology arm, cKO region, loxP site.]
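The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model of Cre recombination. This is a minimal sketch, assuming the two loxP sites are in the same orientation (so the floxed segment is excised and one loxP site remains) and using the canonical 34 bp loxP sequence; the report does not give the exact site sequences used in the vector, and the flanking sequences below are placeholders.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site (assumed)


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Remove the segment between the first two directly repeated sites,
    leaving a single site behind. Returns the allele unchanged if fewer
    than two sites are present."""
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[: first + len(site)] + allele[second + len(site):]


# Placeholder sequences standing in for intron 2, exons 3-4 and intron 4.
upstream, floxed, downstream = "AAAA", "GGGGCCCC", "TTTT"
targeted = upstream + LOXP + floxed + LOXP + downstream
ko_allele = cre_excise(targeted)
print(len(targeted) - len(ko_allele))
```

In this toy model the KO allele is shorter than the targeted allele by the floxed segment plus one loxP site, which is the kind of size difference a genotyping PCR across the region would detect.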


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
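A self-comparison of this kind can be approximated by indexing every 10 bp window and reporting pairs of positions that share the same window. The sketch below is not the tool used to produce the report's dot plot; it covers forward-strand exact matches only, and the function name and toy sequence are illustrative.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 10) -> list[tuple[int, int]]:
    """Return off-diagonal (i, j) pairs, i < j, where the window starting
    at i is identical to the window starting at j (forward strand only)."""
    positions: dict[str, list[int]] = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Toy sequence: one 10 bp unit repeated twice in tandem.
toy = "TTGACCAGTA" * 2
print(self_dot_plot(toy, window=10))
```

A tandem repeat shows up as dots at a constant offset from the main diagonal; runs of such dots are what make PCR amplification and assembly of the vector harder. The pair enumeration is quadratic in the number of copies of a window, which is acceptable for a sequence of a few kilobases.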

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7742bp) | A(26.39% 2043) | C(22.26% 1723) | T(28.65% 2218) | G(22.71% 1758)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
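The summary figures can be checked, and a windowed GC profile computed, with a short script. This is a sketch only: the base counts are copied from the summary line above, the function names are illustrative, and the report does not state the step size used for its 300 bp windows, so the step is left as a parameter.

```python
# Base counts copied from the summary line of this report.
COUNTS = {"A": 2043, "C": 1723, "T": 2218, "G": 1758}


def gc_percent_from_counts(counts: dict[str, int]) -> float:
    """Overall GC content as a percentage of all counted bases."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage for each window of the given size along the sequence."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


print(sum(COUNTS.values()), round(gc_percent_from_counts(COUNTS), 2))

# Toy sequence with a small window, for illustration only.
print(gc_windows("GGCCAATT", window=4, step=2))
```

The counts sum to the stated full length of 7742 bp and give an overall GC content of about 45%, consistent with the note that no high-GC region is present.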


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  3000   1      3000  3000   100.0%    chr3   -       129813600  129816599  3000
YourSeq  234    27     427   3000   83.5%     chr5   -       35309146   35309594   449
YourSeq  232    14     421   3000   83.8%     chr10  +       115135784  115136196  413
YourSeq  231    37     418   3000   85.8%     chr1   -       37629114   37629496   383
YourSeq  231    29     416   3000   80.7%     chr5   +       45003518   45003918   401
YourSeq  230    27     427   3000   79.6%     chr7   -       122795983  122796390  408
YourSeq  230    43     420   3000   85.3%     chr9   +       21297354   21297749   396
YourSeq  228    58     397   3000   84.4%     chr12  -       32550257   32550595   339
YourSeq  227    27     426   3000   79.0%     chr5   -       54263362   54263769   408
YourSeq  225    27     415   3000   80.9%     chr6   -       39063836   39064233   398
YourSeq  225    82     420   3000   82.6%     chrX   +       36782118   36782455   338
YourSeq  220    32     416   3000   87.2%     chr11  -       106390804  106391198  395
YourSeq  220    27     428   3000   86.5%     chr19  +       43959480   43959848   369
YourSeq  219    83     427   3000   82.4%     chr10  -       117029221  117029566  346
YourSeq  214    28     397   3000   79.1%     chr15  -       95683681   95684046   366
YourSeq  214    58     429   3000   88.5%     chr15  +       83152705   83153081   377
YourSeq  213    58     421   3000   88.3%     chr4   -       124161690  124162260  571
YourSeq  212    27     379   3000   81.3%     chr9   -       49496016   49496379   364
YourSeq  212    58     420   3000   83.6%     chr13  -       17675299   17675661   363
YourSeq  211    77     379   3000   85.3%     chr15  -       88602292   88602594   303

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  3000   1      3000  3000   100.0%    chr3   -       129809358  129812357  3000
YourSeq  105    1968   2275  3000   85.7%     chr2   -       71042704   71043518   815
YourSeq  104    1952   2301  3000   83.4%     chr6   +       112247470  112247862  393
YourSeq  90     1952   2303  3000   86.2%     chr16  -       34375263   34375622   360
YourSeq  89     2006   2316  3000   82.5%     chr13  -       85356066   85356363   298
YourSeq  84     1968   2265  3000   91.2%     chr2   -       77293689   77294047   359
YourSeq  84     1966   2244  3000   85.6%     chr6   +       120393160  120393458  299
YourSeq  83     1946   2303  3000   89.5%     chr1   +       119148850  119383202  234353
YourSeq  81     1960   2320  3000   91.1%     chr6   -       109204348  109204899  552
YourSeq  81     1969   2302  3000   89.9%     chr4   +       45781221   45781574   354
YourSeq  80     1968   2303  3000   91.5%     chr17  -       68901383   68901733   351
YourSeq  79     1968   2316  3000   81.6%     chr6   +       115904971  115905309  339
YourSeq  77     2050   2321  3000   89.8%     chr11  +       20751812   20752106   295
YourSeq  76     1969   2318  3000   84.6%     chr1   -       21067372   21067731   360
YourSeq  76     2173   2321  3000   82.2%     chr12  +       40652671   40652818   148
YourSeq  75     2104   2322  3000   89.5%     chrX   -       111655409  111655706  298
YourSeq  75     1952   2309  3000   86.3%     chr10  -       127122772  127123137  366
YourSeq  74     1968   2302  3000   85.0%     chr6   +       27773625   27773968   344
YourSeq  74     2049   2321  3000   87.0%     chr1   +       15705080   15705359   280
YourSeq  73     1988   2277  3000   84.6%     chr8   +       25826621   25826901   281

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
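BLAT result rows like these can be screened programmatically by separating the self-hit from all other hits and reporting the strongest remaining one. The sketch below is illustrative only: the three rows are copied from the downstream table in this report, the `BlatHit` class and function names are hypothetical, and no significance threshold is applied because the report does not state one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated row; the first field is the query name."""
    f = row.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def best_off_target(hits: list[BlatHit]) -> Optional[BlatHit]:
    """Highest-scoring hit that is not the full-length, 100% identity self-hit."""
    others = [h for h in hits
              if not (h.score == h.q_size and h.identity == 100.0)]
    return max(others, key=lambda h: h.score, default=None)


ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr3 - 129809358 129812357 3000",
    "YourSeq 105 1968 2275 3000 85.7% chr2 - 71042704 71043518 815",
    "YourSeq 104 1952 2301 3000 83.4% chr6 + 112247470 112247862 393",
]
hits = [parse_blat_row(r) for r in ROWS]
best = best_off_target(hits)
print(best.chrom, best.score, round(best.score / best.q_size, 3))
```

Comparing the best non-self score with the query size gives a quick sense of how far the off-target hits are from a full-length match.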


Gene and information: Rrh retinal pigment epithelium derived rhodopsin homolog [ Mus musculus (house mouse) ] Gene ID: 20132, updated on 12-Aug-2019

Gene summary

Official Symbol: Rrh (provided by MGI)
Official Full Name: retinal pigment epithelium derived rhodopsin homolog (provided by MGI)
Primary source: MGI:MGI:1097709
See related: Ensembl:ENSMUSG00000028012
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Expression: Biased expression in kidney adult (RPKM 1.6), testis adult (RPKM 0.2) and 6 other tissues
Orthologs: human; all

Genomic context

Location: 3 G3; 3 59.09 cM
Exon count: 7

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (129808575..129822505, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (129511493..129525423, complement)

[Figure: genomic context of Rrh on Chromosome 3 - NC_000069.6.]


Transcript information: This gene has 8 transcripts

Gene: Rrh ENSMUSG00000028012

Description: retinal pigment epithelium derived rhodopsin homolog [Source:MGI Symbol;Acc:MGI:1097709]
Gene Synonyms: Peropsin
Location: Chromosome 3: 129,804,408-129,822,587 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 8 transcripts (splice variants), 195 orthologues, 6 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name     Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt         Flags
Rrh-202  ENSMUST00000171313.7   1759  337aa       ENSMUSP00000132360.1  Protein coding           CCDS38633  O35214, Q543W9  TSL:1, GENCODE basic, APPRIS P2
Rrh-205  ENSMUST00000196902.4   1250  337aa       ENSMUSP00000143093.1  Protein coding           CCDS38633  O35214, Q543W9  TSL:1, GENCODE basic, APPRIS P2
Rrh-201  ENSMUST00000029648.10  1506  379aa       ENSMUSP00000029648.7  Protein coding           -          Q80XL3          TSL:1, GENCODE basic, APPRIS ALT2
Rrh-208  ENSMUST00000200079.4   1106  240aa       ENSMUSP00000143054.1  Protein coding           -          A0A0G2JF76      TSL:1, GENCODE basic
Rrh-207  ENSMUST00000197535.1   763   130aa       ENSMUSP00000143245.1  Nonsense mediated decay  -          A0A0G2JFN6      CDS 5' incomplete, TSL:3
Rrh-203  ENSMUST00000196281.1   3005  No protein  -                     Retained intron          -          -               TSL:NA
Rrh-204  ENSMUST00000196317.4   1759  No protein  -                     lncRNA                   -          -               TSL:1
Rrh-206  ENSMUST00000197295.1   1207  No protein  -                     lncRNA                   -          -               TSL:3


[Figure: Ensembl region view of chromosome 3, 129.80-129.83 Mb (38.18 kb), contig AC111097.10. Transcripts shown, all drawn with reverse-strand markers (<): Lrit3-201 and Lrit3-202 (protein coding), Lrit3-203 (retained intron); Rrh-201, Rrh-202, Rrh-205 and Rrh-208 (protein coding), Rrh-204 and Rrh-206 (lncRNA), Rrh-207 (nonsense mediated decay), Rrh-203 (retained intron); Gar1-201, Gar1-204 and Gar1-206 (protein coding), Gar1-202 and Gar1-205 (retained intron), Gar1-203 (lncRNA). Regulatory Build legend: CTCF, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000196902

[Figure: protein domain view of Rrh-205 (protein coding; reverse strand, 13.79 kb), with a scale bar from 0 to 337 aa. Tracks and labels: transmembrane helices; low complexity (Seg); Superfamily SSF81321; Prints (G protein-coupled receptor, rhodopsin-like; Peropsin); Pfam (G protein-coupled receptor, rhodopsin-like); PROSITE profiles (GPCR, rhodopsin-like, 7TM); PROSITE patterns (G protein-coupled receptor, rhodopsin-like; Visual pigments (opsins) retinal binding site); PANTHER PTHR24240 (Visual pigment-like receptor peropsin); Gene3D 1.20.1070.10; CDD cd15073; sequence variants (dbSNP and all other sources), with missense and synonymous variants in the variant legend.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
