https://www.alphaknockout.com Mouse Riok1 Knockout Project (CRISPR/Cas9)

Objective: To create a Riok1 knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Riok1 gene (NCBI Reference Sequence: NM_024242.3; Ensembl: ENSMUSG00000021428) is located on mouse chromosome 13. Seventeen exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 17 (Transcript: ENSMUST00000021866). Exons 2~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 4.23% of the coding region. Exons 2~3 cover 17.23% of the coding region. The size of the effective KO region is ~3069 bp. The KO region does not contain any other known gene.
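Note: As a rough cross-check of the ~3069 bp figure, the sketch below assumes that the 2000 bp upstream and 767 bp downstream sections listed under BLAT Search Results directly abut the KO region, so that the KO region is the gap between them. The coordinates are the chr13 self-hits from those tables; the function name is illustrative only.

```python
# Coordinates of the chr13 self-hits reported in the BLAT tables of this report.
UPSTREAM_FLANK = (38038074, 38040073)    # 2000 bp section upstream of exon 2
DOWNSTREAM_FLANK = (38043143, 38043909)  # 767 bp section downstream of exon 3


def gap_between(left, right):
    """Return (start, end, size) of the 1-based inclusive interval between two flanks.

    Assumption: both flanks are adjacent to the KO region, with no spacer.
    """
    start = left[1] + 1
    end = right[0] - 1
    return start, end, end - start + 1


ko_start, ko_end, ko_size = gap_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK)
print(f"chr13:{ko_start}-{ko_end} ({ko_size} bp)")
```

Under that assumption the gap is 3069 bp, which agrees with the stated size of the effective KO region.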


Overview of the Targeting Strategy

[Figure: wild-type allele drawn 5' to 3', with exons 1, 2, 3 and 17 labeled and two gRNA regions marked. Legend: exon of mouse Riok1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot. Legend: Forward; Reverse Complement.]

Note: The 767 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
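Note: The self-alignment described for the two flanking sections can be sketched as follows. This is a minimal sketch, not the tool used to produce the plots. It assumes exact matching of 15 bp windows (the stated window size), whereas dot plot programs may tolerate mismatches. The function names are illustrative.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window matches.

    Returns two lists of (i, j) start positions (0-based):
      forward: windows at i and j are identical, with j > i (main diagonal skipped)
      reverse: the window at j is the reverse complement of the window at i
    """
    seq = seq.upper()
    last = len(seq) - window + 1

    # Index every window by its sequence so lookups are O(1).
    index = defaultdict(list)
    for i in range(last):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse
```

Each (i, j) pair corresponds to one dot. A tandem repeat would appear as forward pairs lying along a line parallel to the main diagonal, so an empty or sparse forward list is consistent with the conclusion that no significant tandem repeat is present.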


Overview of the GC Content Distribution (up)

Window size: 300 bp

Summary: Full Length (2000 bp) | A (29.4%, 588) | C (18.0%, 360) | T (28.05%, 561) | G (24.55%, 491)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

Summary: Full Length (767 bp) | A (28.94%, 222) | C (15.78%, 121) | T (35.2%, 270) | G (20.08%, 154)

Note: The 767 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
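Note: The overall GC content of each section follows directly from the base counts in the two summaries. The sketch below computes it, together with a sliding-window profile of the kind plotted above. The 300 bp window matches the stated window size; the step size and function names are assumptions made for illustration.

```python
def gc_percent(counts):
    """GC content (%) from a dict of base counts with keys A, C, G, T."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300, step=1):
    """GC content (%) of each window of length `window`, moved by `step`."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append(100.0 * gc / window)
    return profile


# Base counts copied from the two summary lines of this report.
UPSTREAM = {"A": 588, "C": 360, "T": 561, "G": 491}
DOWNSTREAM = {"A": 222, "C": 121, "T": 270, "G": 154}

print(f"upstream GC:   {gc_percent(UPSTREAM):.2f}%")
print(f"downstream GC: {gc_percent(DOWNSTREAM):.2f}%")
```

From the reported counts, the overall GC content is 42.55% for the upstream section and 35.85% for the downstream section.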


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr13  +       38038074   38040073   2000
YourSeq  55     957    1224  2000   71.5%     chr16  -       4524425    4524592    168
YourSeq  54     957    1109  2000   96.8%     chr1   +       55070703   55185474   114772
YourSeq  50     964    1237  2000   69.0%     chr16  +       8784255    8784413    159
YourSeq  47     957    1109  2000   94.5%     chr11  -       3626741    3626893    153
YourSeq  44     961    1109  2000   94.2%     chr19  -       54336778   54336926   149
YourSeq  44     957    1110  2000   81.7%     chr7   +       29546531   29546678   148
YourSeq  41     957    1067  2000   91.2%     chr12  +       101102223  101102332  110
YourSeq  39     958    1095  2000   95.4%     chr15  -       56873400   56873538   139
YourSeq  38     958    1095  2000   91.4%     chr16  +       32075984   32076122   139
YourSeq  37     957    1007  2000   88.4%     chr19  -       45979405   45979454   50
YourSeq  35     956    1009  2000   86.1%     chr14  -       122316706  122316758  53
YourSeq  35     957    995   2000   89.5%     chr1   -       58432124   58432161   38
YourSeq  35     957    995   2000   89.5%     chr12  +       4682352    4682389    38
YourSeq  34     958    995   2000   89.2%     chr14  -       89249393   89249429   37
YourSeq  34     957    994   2000   89.2%     chr1   -       158213248  158213284  37
YourSeq  34     964    1067  2000   89.5%     chr16  +       59358374   59358476   103
YourSeq  33     957    1009  2000   83.8%     chr12  -       91695110   91695161   52
YourSeq  33     957    1001  2000   83.4%     chr12  -       19602632   19602673   42
YourSeq  33     964    1005  2000   89.2%     chr3   +       108843506  108843546  41

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
--------------------------------------------------------------------------------------
YourSeq  767    1      767  767    100.0%    chr13  +       38043143   38043909   767
YourSeq  26     444    476  767    96.6%     chr10  -       22915231   22915266   36
YourSeq  23     225    248  767    100.0%    chr11  -       112971641  112971669  29
YourSeq  22     251    272  767    100.0%    chr2   -       44248571   44248592   22
YourSeq  22     739    763  767    95.9%     chr11  -       110471647  110471672  26
YourSeq  20     481    500  767    100.0%    chr12  -       93223343   93223362   20


Note: The 767 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
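Note: One way to put a number on "no significant similarity" is to compare the best non-self BLAT score with the query length. The sketch below does this with the SCORE values from the two tables. Treating score divided by query size as a rough measure of how much of the query a secondary hit covers is an assumption made here for illustration, not a criterion stated in this report, and the function name is hypothetical.

```python
# SCORE columns copied from the two BLAT tables of this report.
UP_SCORES = [2000, 55, 54, 50, 47, 44, 44, 41, 39, 38,
             37, 35, 35, 35, 34, 34, 34, 33, 33, 33]
DOWN_SCORES = [767, 26, 23, 22, 22, 20]


def best_secondary_fraction(scores, qsize):
    """Best non-self score as a fraction of the query size.

    Assumption: the highest score in `scores` is the query matching itself,
    so it is dropped before taking the maximum.
    """
    secondary = sorted(scores, reverse=True)[1:]
    if not secondary:
        return 0.0
    return secondary[0] / qsize


print(f"upstream:   {best_secondary_fraction(UP_SCORES, 2000):.2%}")
print(f"downstream: {best_secondary_fraction(DOWN_SCORES, 767):.2%}")
```

For both sections the best secondary hit scores below 4% of the query length (55 of 2000 and 26 of 767).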

Gene and information:

Riok1 RIO kinase 1 [ Mus musculus (house mouse) ]
Gene ID: 71340, updated on 4-Feb-2018

Gene summary

Official Symbol: Riok1 (provided by MGI)
Official Full Name: RIO kinase 1 (provided by MGI)
Primary source: MGI:MGI:1918590
See related: Ensembl:ENSMUSG00000021428
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ad034; 3110046C13Rik; 5430416A05Rik
Summary: This gene encodes a member of the RIO family of atypical serine protein kinases. A similar protein in humans is a component of the protein arginine methyltransferase 5 complex that specifically recruits the RNA-binding protein as a methylation substrate. [provided by RefSeq, Feb 2011]
Expression: Ubiquitous expression in CNS E11.5 (RPKM 10.2), CNS E14 (RPKM 8.0) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 13; 13 A3.3
Exon count: 17

Annotation release  Status             Assembly                      Chr  Location
106                 current            GRCm38.p4 (GCF_000001635.24)  13   NC_000079.6 (38036989..38061433)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    13   NC_000079.5 (38129164..38153298)

[Figure: genomic context, Chromosome 13 - NC_000079.6]


Transcript information: This gene has 14 transcripts

Gene: Riok1 ENSMUSG00000021428

Description: RIO kinase 1 (yeast) [Source:MGI Symbol;Acc:MGI:1918590]
Synonyms: 5430416A05Rik, 3110046C13Rik
Location: Chromosome 13: 38,036,995-38,061,433 forward strand. GRCm38:CM001006.2
About this gene: This gene has 14 transcripts (splice variants), 93 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID      Biotype                  CCDS       UniProt  RefSeq                Flags
Riok1-201  ENSMUST00000021866.9  2738  567aa       ENSMUSP00000021866  Protein coding           CCDS26461  Q922Q2   NM_024242, NP_077204  TSL:1, GENCODE basic, APPRIS P1
Riok1-206  ENSMUST00000224956.1  619   206aa       ENSMUSP00000152922  Protein coding           -          -        -                     CDS 5' and 3' incomplete
Riok1-214  ENSMUST00000226110.1  386   113aa       ENSMUSP00000153217  Protein coding           -          -        -                     CDS 3' incomplete
Riok1-202  ENSMUST00000223656.1  2373  85aa        ENSMUSP00000153494  Nonsense mediated decay  -          -        -
Riok1-204  ENSMUST00000224477.1  886   152aa       ENSMUSP00000153030  Nonsense mediated decay  -          -        -
Riok1-211  ENSMUST00000225954.1  569   No protein  -                   Processed transcript     -          -        -
Riok1-203  ENSMUST00000223910.1  2509  No protein  -                   Retained intron          -          -        -
Riok1-212  ENSMUST00000226006.1  2307  No protein  -                   Retained intron          -          -        -
Riok1-208  ENSMUST00000225174.1  2006  No protein  -                   Retained intron          -          -        -
Riok1-210  ENSMUST00000225816.1  1791  No protein  -                   Retained intron          -          -        -
Riok1-207  ENSMUST00000224962.1  1464  No protein  -                   Retained intron          -          -        -
Riok1-213  ENSMUST00000226056.1  1198  No protein  -                   Retained intron          -          -        -
Riok1-209  ENSMUST00000225418.1  734   No protein  -                   Retained intron          -          -        -
Riok1-205  ENSMUST00000224683.1  619   No protein  -                   Retained intron          -          -        -


[Figure: genome browser view of the Riok1 locus, 44.44 kb, 38.03 Mb to 38.07 Mb. Forward strand (Comprehensive set): Riok1-201, Riok1-206 and Riok1-214 (protein coding); Riok1-202 and Riok1-204 (nonsense mediated decay); Riok1-211 (processed transcript); Riok1-203, Riok1-205, Riok1-207, Riok1-208, Riok1-209, Riok1-210, Riok1-212 and Riok1-213 (retained intron). Contigs: CT010477.16, AC140331.2. Reverse strand (Comprehensive set): Cage1-201, Cage1-202, Cage1-203 and Cage1-204 (protein coding). Additional track: Regulatory Build. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript). Regulation legend: CTCF; enhancer; promoter; promoter flank; motif feature.]


Transcript: ENSMUST00000021866

[Figure: Riok1-201 (protein coding), 24.44 kb, forward strand, with protein feature tracks for ENSMUSP00000021...: low complexity (Seg); coiled-coils (Ncoils); hmmpanther PTHR10593 and PTHR10593:SF85; Superfamily domains: protein kinase-like domain; SMART domains: RIO kinase; Pfam domain: PF01163; PROSITE patterns: RIO kinase, conserved site; PIRSF domain: serine/threonine-protein kinase Rio1; Gene3D: 3.30.200.20, 1.10.510.10; sequence variants (dbSNP and all other sources). Variant legend: inframe deletion; missense variant; synonymous variant. Scale bar: 0 to 567.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
