https://www.alphaknockout.com

Mouse Rhbdf2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Rhbdf2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Rhbdf2 gene (NCBI Reference Sequence: NM_172572; Ensembl: ENSMUSG00000020806) is located on mouse chromosome 11. 19 exons are identified, with the ATG start codon in exon 3 and the TAA stop codon in exon 19 (Transcript: ENSMUST00000103028). Exon 4 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Rhbdf2 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP24-245D9 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a null mutation display impaired TNF secretion and increased sensitivity to mortality induced by bacterial infection.

Exon 4 starts at about 5.93% of the coding region. Knockout of exon 4 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion is 995 bp, and the size of intron 4 for 3'-loxP site insertion is 669 bp. The size of the effective cKO region is ~625 bp. The cKO region does not contain any other known gene.
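As a rough illustration of the two statements above, the following sketch estimates where exon 4 begins within the coding sequence and states the usual frameshift rule. It assumes that the 5.93% figure is measured against a coding sequence of 827 codons plus a stop codon (the 827 aa protein length is taken from the transcript table for Rhbdf2-201), and the deletion lengths passed to `deletion_shifts_frame` are hypothetical, because the coding length of exon 4 is not stated in this report.

```python
PROTEIN_LENGTH_AA = 827        # Rhbdf2-201 protein length from the transcript table
EXON4_START_FRACTION = 0.0593  # "about 5.93% of the coding region"

# Assumption: coding region = sense codons plus one stop codon.
cds_length_nt = PROTEIN_LENGTH_AA * 3 + 3

# Approximate nucleotide offset and codon at which exon 4 begins.
approx_offset_nt = round(cds_length_nt * EXON4_START_FRACTION)
approx_codon = approx_offset_nt // 3 + 1


def deletion_shifts_frame(deleted_coding_nt):
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


if __name__ == "__main__":
    print("CDS length (nt):", cds_length_nt)
    print("Exon 4 starts near nt", approx_offset_nt, "(codon ~%d)" % approx_codon)
    # Hypothetical coding lengths, for illustration only.
    for length in (99, 100):
        print(length, "nt deleted -> frameshift:", deletion_shifts_frame(length))
```

Because exon 4 lies this early in the coding sequence, a frameshift introduced there would be expected to affect most of the downstream protein.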


Overview of the Targeting Strategy

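[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele with the 5' and 3' gRNA regions (exons 1, 3, 4, 5, 6, 7, 8 and 19 labeled), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Rhbdf2, homology arm, cKO region, loxP site.]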


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself (axes labeled "Sequence 12"), showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
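The self-alignment idea behind the dot plot can be sketched as follows. This is a minimal illustration, not the tool used to produce the figure: it reports only exact forward matches of a fixed word length (reverse-complement matches are omitted), and the function name `self_dotplot_matches` and the toy sequence are hypothetical.

```python
from collections import defaultdict


def self_dotplot_matches(seq, window=10):
    """Return sorted (i, j) pairs, i < j, where the window-length words
    starting at positions i and j of seq are identical.

    Each pair corresponds to one off-diagonal dot in a self dot plot.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


if __name__ == "__main__":
    # Toy sequence with "ACGT" repeated at positions 0 and 6.
    toy = "ACGTTTACGT"
    print(self_dotplot_matches(toy, window=4))
```

Runs of such pairs lying on a line parallel to the main diagonal indicate a repeated segment; a sequence without them, as reported here, has no obvious tandem repeats at the chosen window size.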

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (axis labeled "Sequence 12").]

Summary: Full Length(7125bp) | A(21.94% 1563) | C(28.35% 2020) | T(21.54% 1535) | G(28.17% 2007)

Note: The sequence of the homologous arms and cKO region is analyzed to determine its GC content. Regions of significantly high GC content are found, so constructing this targeting vector may be difficult.
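The overall GC content implied by the summary line, and a windowed profile of the kind plotted above, can be computed along these lines. This is a sketch under stated assumptions: the base counts are copied from the summary, while `windowed_gc` and its one-base step are illustrative choices, since the report gives only the 300 bp window size.

```python
# Base counts copied from the summary line (full length 7125 bp).
BASE_COUNTS = {"A": 1563, "C": 2020, "T": 1535, "G": 2007}


def gc_percent_from_counts(counts):
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300):
    """GC percentage of every full window, sliding one base at a time."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    gc = sum(1 for base in seq[:window] if base in "GC")
    profile = [100.0 * gc / window]
    for i in range(window, len(seq)):
        # Add the base entering the window, drop the base leaving it.
        gc += (seq[i] in "GC") - (seq[i - window] in "GC")
        profile.append(100.0 * gc / window)
    return profile


if __name__ == "__main__":
    print("Total length:", sum(BASE_COUNTS.values()))
    print("Overall GC%%: %.2f" % gc_percent_from_counts(BASE_COUNTS))
    # Toy sequence with a small window, for illustration only.
    print(windowed_gc("GGCCAATT", window=4))
```

With the counts above, G plus C accounts for 4027 of 7125 bases, or about 56.5% overall; the windowed profile is what reveals locally GC-rich stretches.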


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  -       116606483  116609482  3000
YourSeq  145    74     369   3000   94.5%     chr17  -       66491035   66491422   388
YourSeq  143    33     213   3000   95.6%     chr14  +       54445092   54445331   240
YourSeq  140    56     217   3000   95.5%     chr10  +       78418297   78418460   164
YourSeq  136    74     385   3000   91.0%     chr10  -       82852417   82852946   530
YourSeq  135    74     227   3000   94.2%     chr19  -       53728104   53728429   326
YourSeq  134    74     217   3000   96.6%     chr12  +       112582917  112583060  144
YourSeq  133    74     216   3000   96.6%     chr5   -       107215528  107215670  143
YourSeq  133    62     217   3000   90.0%     chr4   -       53798759   53798908   150
YourSeq  132    74     217   3000   95.9%     chr2   -       152168649  152168792  144
YourSeq  131    74     217   3000   95.9%     chr13  -       14044985   14045129   145
YourSeq  131    75     217   3000   95.9%     chr14  +       76480876   76481018   143
YourSeq  130    74     217   3000   95.8%     chr11  -       53356241   53356827   587
YourSeq  130    74     217   3000   95.2%     chr5   +       130125458  130125601  144
YourSeq  130    61     217   3000   92.3%     chr17  +       29424870   29425026   157
YourSeq  130    73     217   3000   95.2%     chr16  +       17421108   17421253   146
YourSeq  130    72     211   3000   96.5%     chr10  +       41228493   41228632   140
YourSeq  129    74     216   3000   95.2%     chr10  -       61573386   61573528   143
YourSeq  129    74     217   3000   95.2%     chr2   +       71230120   71646075   415956
YourSeq  129    73     211   3000   96.5%     chr17  +       27670761   27670899   139

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  -       116602858  116605857  3000
YourSeq  73     2585   2796  3000   86.9%     chr2   -       20567929   20568151   223
YourSeq  48     2732   2804  3000   83.9%     chr15  -       102428226  102428296  71
YourSeq  32     2588   2632  3000   91.2%     chr4   -       138577554  138577597  44
YourSeq  31     2758   2800  3000   86.1%     chr2   -       105085290  105085332  43
YourSeq  31     2443   2473  3000   100.0%    chr11  -       65775059   65775089   31
YourSeq  31     2443   2473  3000   100.0%    chr12  +       15587417   15587447   31
YourSeq  28     2764   2804  3000   86.7%     chr2   -       80695955   80695993   39
YourSeq  28     2588   2644  3000   96.7%     chrX   +       102386371  102386428  58
YourSeq  25     2614   2644  3000   85.2%     chr14  -       121258568  121258596  29
YourSeq  20     2454   2473  3000   100.0%    chr1   +       118802126  118802145  20

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
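As a consistency check on the two BLAT searches, the top-scoring (100% identity) hits on chr11 can be compared directly. The sketch below uses only the coordinates listed in the results above; the helper names `hit_span` and `gap_between` are hypothetical. Since Rhbdf2 lies on the reverse strand, the upstream query maps to the higher genomic coordinates.

```python
# Top-scoring hits copied from the upstream and downstream BLAT results.
UPSTREAM_HIT = {"chrom": "chr11", "strand": "-", "start": 116606483, "end": 116609482}
DOWNSTREAM_HIT = {"chrom": "chr11", "strand": "-", "start": 116602858, "end": 116605857}


def hit_span(hit):
    """Length of a hit in bases, with inclusive start and end coordinates."""
    return hit["end"] - hit["start"] + 1


def gap_between(hit_a, hit_b):
    """Number of bases strictly between two non-overlapping hits on one chromosome."""
    if hit_a["chrom"] != hit_b["chrom"]:
        raise ValueError("hits are on different chromosomes")
    low, high = sorted((hit_a, hit_b), key=lambda h: h["start"])
    return high["start"] - low["end"] - 1


if __name__ == "__main__":
    print("Upstream span:", hit_span(UPSTREAM_HIT))
    print("Downstream span:", hit_span(DOWNSTREAM_HIT))
    print("Bases between the two hits:", gap_between(UPSTREAM_HIT, DOWNSTREAM_HIT))
```

The two 3000 bp queries are separated by 625 bases on chr11, which appears consistent with the ~625 bp effective cKO region stated in the strategy summary.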


Gene and information: Rhbdf2 rhomboid 5 homolog 2 [ Mus musculus (house mouse) ] Gene ID: 217344, updated on 21-Sep-2019

Gene summary

Official Symbol: Rhbdf2 (provided by MGI)
Official Full Name: rhomboid 5 homolog 2 (provided by MGI)
Primary source: MGI:MGI:2442473
See related: Ensembl:ENSMUSG00000020806
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: cub; Rhbdl6; 4732465I17Rik
Expression: Broad expression in ovary adult (RPKM 35.3), spleen adult (RPKM 33.7) and 21 other tissues
Orthologs: human

Genomic context

Location: 11; 11 E2

Exon count: 20

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (116598148..116627019, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (116459479..116488333, complement)

Chromosome 11 - NC_000077.6


Transcript information: This gene has 4 transcripts

Gene: Rhbdf2 ENSMUSG00000020806

Description: rhomboid 5 homolog 2 [Source:MGI Symbol;Acc:MGI:2442473]
Gene Synonyms: 4732465I17Rik, Rhbdl6, cub, iRhom2
Location: Chromosome 11: 116,598,165-116,627,019 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 4 transcripts (splice variants), 194 orthologues, 5 paralogues, is a member of 1 Ensembl protein family and is associated with 29 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Rhbdf2-201  ENSMUST00000103028.7  3855  827aa       ENSMUSP00000099317.1  Protein coding   CCDS25672  Q80WQ6   TSL:1 GENCODE basic APPRIS P1
Rhbdf2-202  ENSMUST00000103029.9  3593  827aa       ENSMUSP00000099318.3  Protein coding   CCDS25672  Q80WQ6   TSL:1 GENCODE basic APPRIS P1
Rhbdf2-203  ENSMUST00000126819.1  2489  No protein  -                     Retained intron  -          -        TSL:2
Rhbdf2-204  ENSMUST00000138125.1  815   No protein  -                     lncRNA           -          -        TSL:5


[Figure: Ensembl genome browser view of the Rhbdf2 locus on chromosome 11, 116.59 Mb to 116.63 Mb (48.85 kb). Forward strand: Aanat-201 (nonsense mediated decay), Aanat-202 (nonsense mediated decay), Aanat-203 (lncRNA), Aanat-204 (lncRNA), Aanat-205 (protein coding). Contig: AL607039.25. Reverse strand: Rhbdf2-201 (protein coding), Rhbdf2-202 (protein coding), Rhbdf2-203 (retained intron), Rhbdf2-204 (lncRNA). Regulatory Build track with legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000103028

[Figure: protein feature view of Rhbdf2-201 (protein coding, reverse strand, 28.85 kb), translation ENSMUSP00000099... Tracks: Transmembrane heli...; MobiDB lite; Low complexity (Seg); Superfamily SSF144091; Pfam (Rhomboid serine protease; Peptidase S54, rhomboid domain); PANTHER PTHR45965:SF2 and PTHR45965; Gene3D (Rhomboid-like superfamily); CDD cd06173; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 827 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
