https://www.alphaknockout.com Mouse Rc3h1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Rc3h1 conditional knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Rc3h1 gene (NCBI Reference Sequence: NM_001024952; Ensembl: ENSMUSG00000040423) is located on mouse chromosome 1. Twenty exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 20 (Transcript: ENSMUST00000161609). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Rc3h1 gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-162J18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: A single recessive mutation in this gene resulted in severe autoimmune disease with a phenotype resembling human systemic lupus erythematosus.

Exon 3 starts at about 6.84% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 7767 bp, and the size of intron 3 for 3'-loxP site insertion is 1762 bp. The size of the effective cKO region is ~621 bp. The cKO region does not contain any other known gene.
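As a rough illustration of what "about 6.84% of the coding region" means in nucleotide terms, the sketch below combines that figure with the 1130 aa protein length listed for ENSMUST00000161609. It assumes the coding region is counted as all codons plus one stop codon, which the report does not state. The exon 3 length is not given in the report, so the frameshift check is shown only as a generic helper (`causes_frameshift` is a hypothetical name).

```python
# Values quoted in the report.
PROTEIN_LENGTH_AA = 1130        # Rc3h1-202 (ENSMUST00000161609) protein length
EXON3_START_FRACTION = 0.0684   # "about 6.84% of the coding region"

# Assumption: coding region = all codons plus one stop codon.
cds_length_nt = (PROTEIN_LENGTH_AA + 1) * 3

# Approximate first coding nucleotide contributed by exon 3 (1-based).
exon3_cds_start_nt = round(cds_length_nt * EXON3_START_FRACTION)

# Codon (1-based) that contains that nucleotide.
exon3_first_codon = (exon3_cds_start_nt - 1) // 3 + 1


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length
    is not a multiple of three."""
    return deleted_coding_nt % 3 != 0


print(cds_length_nt, exon3_cds_start_nt, exon3_first_codon)
```

Under these assumptions exon 3 begins near the start of the protein, which is consistent with choosing it as an early frameshifting deletion.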


Overview of the Targeting Strategy

[Figure: schematic with four rows: wildtype allele (with the 5' and 3' gRNA regions marked; exons 1, 3, 4 and 20 labelled), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Rc3h1; homology arm; cKO region; loxP site.]
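The step from the targeted allele to the constitutive KO allele (after Cre recombination) can be pictured with a small string model. This is only a sketch: the bracketed exon labels are placeholders rather than real sequence, `cre_recombine` is a hypothetical helper, and the 34 bp site used is the standard wild-type loxP sequence, which the report itself does not list. It assumes the two loxP sites are in the same orientation, so that recombination excises the intervening segment and leaves one site behind.

```python
# Standard wild-type loxP site (34 bp); not taken from the report.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_recombine(allele: str, site: str = LOXP) -> str:
    """Excise the segment between the first two same-orientation sites,
    leaving a single site. Alleles with fewer than two sites are unchanged."""
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[:first + len(site)] + allele[second + len(site):]


# Placeholder alleles: labels stand in for real sequence.
wildtype = "[exon2][exon3][exon4]"
targeted = "[exon2]" + LOXP + "[exon3]" + LOXP + "[exon4]"
knockout = cre_recombine(targeted)

print(knockout == "[exon2]" + LOXP + "[exon4]")
```

The wildtype placeholder has no loxP sites, so the same function leaves it untouched, which mirrors the fact that only the floxed allele responds to Cre.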


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself; panels: Forward, Reverse Complement.]

Note: The sequence of the homology arms and cKO region is aligned with itself to detect tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
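A self-comparison of this kind can be sketched as an exact word match with a 10 bp window. The code below is a minimal illustration, not the tool used for the report: `self_dot_plot` is a hypothetical helper, the toy sequence is invented, and only the forward strand is compared. Off-diagonal dots that line up at a constant offset are the signature of a tandem repeat.

```python
from collections import defaultdict


def self_dot_plot(seq, window=10):
    """Return sorted (i, j) pairs, i < j, where the window-length words
    starting at positions i and j are identical (forward strand only)."""
    starts_by_word = defaultdict(list)
    for i in range(len(seq) - window + 1):
        starts_by_word[seq[i:i + window]].append(i)
    dots = []
    for starts in starts_by_word.values():
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                dots.append((i, j))
    return sorted(dots)


# Invented toy sequence: a 7 bp unit repeated three times between unique flanks.
toy = "ACGTTGCAAC" + "GATTACA" * 3 + "TTGACCGTAA"
dots = self_dot_plot(toy, window=10)
offsets = {j - i for i, j in dots}

print(dots)
print(offsets)
```

In the toy case every dot sits 7 positions off the main diagonal, matching the length of the repeated unit.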

Overview of the GC Content Distribution

Window size: 300 bp

Summary: Full Length(7121bp) | A(28.31% 2016) | C(19.06% 1357) | T(31.86% 2269) | G(20.77% 1479)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
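The base counts in the summary are enough to recover the overall GC content, and a windowed scan of the kind described can be sketched in a few lines. The helper `windowed_gc` is hypothetical, and whether the report's 300 bp windows tile or slide is not stated; the sketch exposes that as a `step` parameter.

```python
# Base counts quoted in the summary line.
counts = {"A": 2016, "C": 1357, "T": 2269, "G": 1479}

total_bp = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total_bp


def windowed_gc(seq, window=300, step=300):
    """GC fraction of each full window of `seq`, advancing by `step`."""
    seq = seq.upper()
    fractions = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


print(total_bp, round(gc_percent, 2))
print(windowed_gc("GGCCAATT", window=4, step=4))
```

The counts sum to the stated full length of 7121 bp, and the overall GC content comes out just under 40%.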


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   +       160934863  160937862  3000
YourSeq  252    378    996   3000   86.6%     chr14  +       75705366   75705857   492
YourSeq  240    360    985   3000   84.6%     chr16  -       89953339   89953692   354
YourSeq  237    369    998   3000   91.7%     chr19  +       46333312   46334064   753
YourSeq  232    357    985   3000   86.1%     chr5   +       29697943   29698331   389
YourSeq  220    388    985   3000   85.9%     chr16  -       44187784   44188313   530
YourSeq  220    369    985   3000   86.6%     chr4   +       46372209   46372741   533
YourSeq  189    453    999   3000   84.4%     chr3   +       138863477  138863772  296
YourSeq  179    426    997   3000   88.7%     chr14  -       73554196   73554829   634
YourSeq  166    379    996   3000   83.6%     chr12  +       85634637   85635158   522
YourSeq  155    385    993   3000   86.4%     chrX   -       8220697    8221356    660
YourSeq  146    777    1000  3000   92.5%     chr8   +       104379796  104380078  283
YourSeq  145    814    996   3000   90.7%     chr19  -       56694303   56694483   181
YourSeq  144    362    543   3000   91.0%     chr7   +       46794849   46795056   208
YourSeq  141    814    997   3000   93.9%     chr9   -       122919472  122919661  190
YourSeq  141    815    1087  3000   90.8%     chr9   -       55183543   55183902   360
YourSeq  141    814    996   3000   90.2%     chr5   +       121508890  121509068  179
YourSeq  140    814    996   3000   91.2%     chr6   +       118463426  118463618  193
YourSeq  140    369    530   3000   93.8%     chr19  +       9817445    9817607    163
YourSeq  138    369    521   3000   96.1%     chr2   -       157969607  157970094  488

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   +       160938484  160941483  3000
YourSeq  335    188    737   3000   87.9%     chr2   -       173828931  173829403  473
YourSeq  317    188    565   3000   92.8%     chr4   -       138918799  138919182  384
YourSeq  313    188    589   3000   91.0%     chr6   +       120826284  120826839  556
YourSeq  309    186    565   3000   91.7%     chr1   -       151302962  151303348  387
YourSeq  309    188    763   3000   92.7%     chr2   +       130140060  130140647  588
YourSeq  308    185    575   3000   90.4%     chr8   -       44783208   44783601   394
YourSeq  308    188    565   3000   91.5%     chr16  -       36763217   36763596   380
YourSeq  308    188    559   3000   93.1%     chr15  -       84240125   84240497   373
YourSeq  307    188    565   3000   92.1%     chr15  -       10961567   10961946   380
YourSeq  307    187    572   3000   90.8%     chr2   +       153280746  153281147  402
YourSeq  307    187    565   3000   92.4%     chr14  +       10820916   10821297   382
YourSeq  306    187    565   3000   91.4%     chr6   -       77971281   77971662   382
YourSeq  306    185    565   3000   91.2%     chr5   -       50811532   50811915   384
YourSeq  306    187    565   3000   91.4%     chr1   -       127283522  127283903  382
YourSeq  306    188    565   3000   92.1%     chrX   +       10949299   10949679   381
YourSeq  306    181    565   3000   91.0%     chr18  +       48750474   48750860   387
YourSeq  305    188    565   3000   91.6%     chr3   -       126125612  126125993  382
YourSeq  305    185    565   3000   90.8%     chr6   +       53465697   53466082   386
YourSeq  305    168    565   3000   89.9%     chr3   +       4609358    4609747    390

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
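The top hit in each BLAT table gives the genomic position of the query itself, so the two tables together allow a consistency check on the stated cKO region size. The sketch below parses one result row per table (`parse_blat_row` is a hypothetical helper written for this column layout) and measures the gap between the two 3000 bp queries. It assumes the queries directly abut the cKO region, which the report implies but does not state.

```python
def parse_blat_row(row):
    """Parse one whitespace-separated BLAT browser row into a dict.
    Column order: QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN."""
    f = row.split()
    return {
        "query": f[0],
        "score": int(f[1]),
        "q_start": int(f[2]),
        "q_end": int(f[3]),
        "q_size": int(f[4]),
        "identity": float(f[5].rstrip("%")),
        "chrom": f[6],
        "strand": f[7],
        "t_start": int(f[8]),
        "t_end": int(f[9]),
        "span": int(f[10]),
    }


# Self-hits copied from the upstream and downstream tables.
upstream = parse_blat_row(
    "YourSeq 3000 1 3000 3000 100.0% chr1 + 160934863 160937862 3000")
downstream = parse_blat_row(
    "YourSeq 3000 1 3000 3000 100.0% chr1 + 160938484 160941483 3000")

# Bases lying strictly between the two queries.
gap_bp = downstream["t_start"] - upstream["t_end"] - 1

print(gap_bp)
```

The gap between the two self-hits agrees with the ~621 bp effective cKO region quoted in the strategy summary.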

Gene and information:

Rc3h1 RING CCCH (C3H) domains 1 [Mus musculus (house mouse)]
Gene ID: 381305, updated on 9-Jul-2017

Gene summary

Official Symbol: Rc3h1 (provided by MGI)
Official Full Name: RING CCCH (C3H) domains 1 (provided by MGI)
Primary source: MGI:MGI:2685397
See related: Ensembl:ENSMUSG00000040423; Vega:OTTMUSG00000034799
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm551; N28103; mKIAA2025; 5730557L09Rik
Orthologs: human; all

Genomic context

Location: 1; 1 H2.1
Exon count: 21

Annotation release  Status             Assembly                      Chr  Location
106                 current            GRCm38.p4 (GCF_000001635.24)  1    NC_000067.6 (160906411..160974976)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (162836542..162905107)
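The two assembly rows can be cross-checked with simple interval arithmetic. The sketch below treats the coordinates as 1-based and inclusive, which is the usual convention for this notation but is an assumption here, and `span_bp` is a hypothetical helper.

```python
def span_bp(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


# Coordinates copied from the annotation table.
grcm38 = (160906411, 160974976)    # NC_000067.6, GRCm38.p4
mgscv37 = (162836542, 162905107)   # NC_000067.5, MGSCv37

grcm38_span = span_bp(*grcm38)
mgscv37_span = span_bp(*mgscv37)

# Shift of the gene between assemblies, measured at each end.
start_shift = mgscv37[0] - grcm38[0]
end_shift = mgscv37[1] - grcm38[1]

print(grcm38_span, mgscv37_span, start_shift == end_shift)
```

Both assemblies give the same gene span of about 68.57 kb, and the start and end move by the same amount, so the locus was repositioned between assemblies without a change in length.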

Chromosome 1 - NC_000067.6


Transcript information: This gene has 3 transcripts

Gene: Rc3h1 ENSMUSG00000040423

Description: RING CCCH (C3H) domains 1 [Source:MGI Symbol;Acc:MGI:2685397]
Gene Synonyms: 5730557L09Rik, roquin
Location: 160,906,418-160,974,978 forward strand. GRCm38:CM000994.2
About this gene: This gene has 3 transcripts (splice variants), 189 orthologues, 4 paralogues, is a member of 1 Ensembl protein family and is associated with 40 phenotypes.

Transcripts

Name       Transcript ID         bp     Protein     Translation ID      Biotype          CCDS       UniProt  Flags
Rc3h1-202  ENSMUST00000161609.7  11006  1130aa      ENSMUSP00000124871  Protein coding   CCDS15410  Q4VGL6   TSL:1 GENCODE basic APPRIS P2
Rc3h1-201  ENSMUST00000035911.4  3476   1121aa      ENSMUSP00000037178  Protein coding   -          H7BX02   TSL:5 GENCODE basic APPRIS ALT2
Rc3h1-203  ENSMUST00000161708.1  663    No protein  -                   Retained intron  -          -        TSL:3

[Figure: Ensembl region view, 88.56 kb, forward and reverse strands, axis from 160.90 Mb to 160.98 Mb. Transcripts shown: Rc3h1-202 and Rc3h1-201 (protein coding), Rc3h1-203 (retained intron); Gm37052-201, Gm37809-201 and Gm37653-201 (TEC); Serpinc1-201, Serpinc1-202, Serpinc1-205 and Serpinc1-208 (protein coding), Serpinc1-203 (retained intron), Serpinc1-207 (nonsense mediated decay). Other tracks: contig AC119204.7; Regulatory Build. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000161609

[Figure: protein domain view of Rc3h1-202 (protein coding; 68.56 kb, forward strand), protein ENSMUSP00000124..., scale bar 0 to 1130. Annotation sources shown: SIFTS import; MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily; SMART; Pfam; PROSITE profiles; PROSITE patterns; PANTHER; Gene3D; CDD. Feature labels shown: SSF57850; CCCH-type superfamily; Zinc finger, RING-type; Zinc finger, CCCH-type; Roquin II; RING-type zinc-finger, LisH dimerisation motif; Zinc finger, RING-type, conserved site; PTHR13139; Roquin-1; 1.20.120.1790; 4.10.1000.10; Zinc finger, RING/FYVE/PHD-type; cd16638. Variant track: sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
