http://www.alphaknockout.com/ Mouse Rbm6 Knockout Project (CRISPR/Cas9)

Objective: To create an Rbm6 knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Rbm6 gene (NCBI Reference Sequence: NM_029169; Ensembl: ENSMUSG00000032582) is located on mouse chromosome 9. 21 exons are identified, with the ATG start codon in exon 3 and the TAA stop codon in exon 21 (Transcript: ENSMUST00000035201). Exons 3~17 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts from about 0.1% of the coding region. Exons 3~17 cover 85.6% of the coding region. The size of the effective KO region is ~71065 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele of mouse Rbm6 showing exons 1 to 21, with the 5' gRNA region and the 3' gRNA region marked. Legend: exon of mouse Rbm6; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section of Exon 17 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
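The self-alignment behind a dot plot can be sketched in a few lines. This is a minimal illustration, not the tool used to produce the plots in this report: it assumes an exact-match criterion on 15 bp windows, covers the forward strand only (the reverse-complement comparison is omitted), and the function name `self_dot_plot` is hypothetical.

```python
def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the window-length words
    starting at positions i and j of seq are identical (forward strand only).

    Each pair is one off-diagonal dot in a self dot plot; runs of dots
    parallel to the main diagonal indicate repeated segments.
    """
    seq = seq.upper()
    # Index every window-length word by its start positions.
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    # Any word that occurs more than once contributes off-diagonal dots.
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy example: a 15 bp unit repeated twice in tandem gives one dot at (0, 15).
unit = "ACGTACGGTCATGCA"
print(self_dot_plot(unit + unit))
```

An empty result for a 2000 bp section would correspond to a plot showing only the main diagonal. A real tandem repeat would appear as many consecutive pairs with a constant offset `j - i`.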


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(29.85% 597) | C(17.2% 344) | G(23.75% 475) | T(29.2% 584)

Note: The 2000 bp section of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(26.65% 533) | C(19.45% 389) | G(20.85% 417) | T(33.05% 661)

Note: The 2000 bp section of Exon 17 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
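The overall GC content of each 2000 bp section follows directly from the base counts in the two summaries, and the windowed profile can be sketched as a sliding average. The code below is an illustrative sketch only: the function names are hypothetical, the 300 bp window matches the report, and the step size of 1 bp is an assumption because the report does not state it.

```python
def gc_from_counts(a, c, g, t):
    """Overall GC fraction from per-base counts."""
    return (g + c) / (a + c + g + t)


def gc_fraction(seq):
    """GC fraction of a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def sliding_gc(seq, window=300, step=1):
    """GC fraction of every full window of the given size along seq."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts taken from the two summaries above.
up = gc_from_counts(a=597, c=344, g=475, t=584)     # Exon 3 section
down = gc_from_counts(a=533, c=389, g=417, t=661)   # Exon 17 section
print(f"up: {up:.2%}, down: {down:.2%}")
```

From the reported counts, the Exon 3 section is (344 + 475) / 2000 = 40.95% GC and the Exon 17 section is (389 + 417) / 2000 = 40.30% GC.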


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr9   -       107851674  107853673  2000
YourSeq  1403   1      1550  2000   97.1%     chr1   -       17399426   17401061   1636
YourSeq  1259   133    1550  2000   95.1%     chr1   -       144222211  144223608  1398
YourSeq  25     545    578   2000   88.9%     chr19  +       52398256   52398288   33
YourSeq  23     1402   1428  2000   79.2%     chr1   -       97735929   97735952   24
YourSeq  22     102    126   2000   96.0%     chr12  -       92858086   92858111   26
YourSeq  21     899    919   2000   100.0%    chr4   +       13945656   13945676   21
YourSeq  20     124    143   2000   100.0%    chr1   -       90891650   90891669   20
YourSeq  20     853    878   2000   88.5%     chr1   +       57894706   57894731   26

Note: The 2000 bp section of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr9   -       107782611  107784610  2000
YourSeq  313    824    2000  2000   89.8%     chr1   -       17397839   17398182   344
YourSeq  171    1136   1324  2000   96.3%     chr11  +       117944015  117944217  203
YourSeq  168    1141   1325  2000   95.7%     chr5   -       22030312   22030509   198
YourSeq  168    1131   1325  2000   95.8%     chr13  -       65885720   65885929   210
YourSeq  168    1136   1328  2000   94.3%     chr1   +       59626743   59627377   635
YourSeq  167    1147   1328  2000   96.2%     chr5   -       135813866  135814061  196
YourSeq  166    1152   1334  2000   97.3%     chr8   +       14119430   14119631   202
YourSeq  166    1152   1326  2000   97.8%     chr10  +       4459920    4460108    189
YourSeq  164    1157   1325  2000   98.9%     chr7   -       122025290  122098894  73605
YourSeq  164    1141   1326  2000   95.1%     chr7   -       4738148    4738346    199
YourSeq  164    1135   1326  2000   95.7%     chr5   +       144006536  144006746  211
YourSeq  164    1157   1358  2000   95.2%     chr14  +       60725227   60725505   279
YourSeq  163    1136   1324  2000   94.1%     chr8   -       82422088   82422285   198
YourSeq  163    1147   1325  2000   96.1%     chr5   -       138015442  138015635  194
YourSeq  163    1152   1325  2000   97.2%     chr5   +       100878629  100878816  188
YourSeq  162    1147   1326  2000   96.1%     chr2   -       157568023  157568404  382
YourSeq  162    1147   1325  2000   95.6%     chr12  +       85784539   85784730   192
YourSeq  161    1157   1325  2000   98.3%     chr14  -       121394905  121395088  184
YourSeq  161    1151   1325  2000   96.6%     chr13  -       97014023   97014430   408

Note: The 2000 bp section of Exon 17 is BLAT searched against the genome. No significant similarity is found.
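For readers who want to inspect the BLAT hits programmatically, the rows can be parsed into records and ranked. This is a hedged sketch: the names `BlatHit`, `parse_blat_row`, `secondary_hits` and `query_coverage` are hypothetical, the rows are copied from the upstream (Exon 3) table, and no significance threshold is applied because the report does not state one.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row.

    A leading "browser details" (link text from the web output) is tolerated.
    """
    fields = row.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    if len(fields) != 11:
        raise ValueError(f"unexpected BLAT row: {row!r}")
    q, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(q, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def secondary_hits(hits: List[BlatHit]) -> List[BlatHit]:
    """All hits except the top-scoring one, ranked by descending score."""
    return sorted(hits, key=lambda h: h.score, reverse=True)[1:]


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the aligned query interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream (Exon 3) table.
EXAMPLE_ROWS = [
    "YourSeq 2000 1 2000 2000 100.0% chr9 - 107851674 107853673 2000",
    "YourSeq 1403 1 1550 2000 97.1% chr1 - 17399426 17401061 1636",
    "YourSeq 1259 133 1550 2000 95.1% chr1 - 144222211 144223608 1398",
]

hits = [parse_blat_row(r) for r in EXAMPLE_ROWS]
for h in secondary_hits(hits):
    print(h.chrom, h.score, f"{h.identity}%", f"{query_coverage(h):.1%}")
```

Ranking the non-self hits by score and query coverage makes it easy to see which off-target loci share the longest stretches with the query before primers are designed.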

Gene and information:

Rbm6 RNA binding motif protein 6 [ Mus musculus (house mouse) ] Gene ID: 19654, updated on 12-Aug-2019

Gene summary

Official Symbol: Rbm6 (provided by MGI)
Official Full Name: RNA binding motif protein 6 (provided by MGI)
Primary source: MGI:MGI:1338037
See related: Ensembl:ENSMUSG00000032582
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: g16; Def-3; NY-LU-12; mKIAA4015; 4930506F14Rik
Expression: Broad expression in CNS E11.5 (RPKM 17.5), CNS E14 (RPKM 14.4) and 22 other tissues
Orthologs: human, all

Genomic context

Location: 9; 9 F1
Exon count: 25

Annotation release  Status             Assembly                        Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)    9    NC_000075.6 (107773559..107873358, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)      9    NC_000075.5 (107675890..107775150, complement)

Chromosome 9 - NC_000075.6


Transcript information: This gene has 16 transcripts

Gene: Rbm6 ENSMUSG00000032582

Description: RNA binding motif protein 6 [Source:MGI Symbol;Acc:MGI:1338037]
Gene Synonyms: Def-3, NY-LU-12, g16
Location: Chromosome 9: 107,773,559-107,873,237 reverse strand. GRCm38:CM001002.2
About this gene: This gene has 16 transcripts (splice variants), 209 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Rbm6-201  ENSMUST00000035201.12  3999  986aa       ENSMUSP00000035201.5  Protein coding           CCDS23507  Q3ULB0      TSL:1 GENCODE basic APPRIS P3
Rbm6-207  ENSMUST00000183032.7   3716  1118aa      ENSMUSP00000138400.1  Protein coding           CCDS59749  S4R1W5      TSL:1 GENCODE basic APPRIS ALT2
Rbm6-216  ENSMUST00000195883.5   648   77aa        ENSMUSP00000141953.1  Protein coding           -          A0A0A6YXE2  CDS 3' incomplete TSL:2
Rbm6-214  ENSMUST00000194436.1   647   187aa       ENSMUSP00000142283.1  Protein coding           -          A0A0A6YY54  CDS 5' incomplete TSL:3
Rbm6-215  ENSMUST00000195866.5   447   59aa        ENSMUSP00000141622.1  Protein coding           -          A0A0A6YWN4  CDS 3' incomplete TSL:3
Rbm6-208  ENSMUST00000183035.1   412   42aa        ENSMUSP00000138236.1  Protein coding           -          S4R1I5      CDS 3' incomplete TSL:3
Rbm6-202  ENSMUST00000181986.7   2066  40aa        ENSMUSP00000138172.1  Nonsense mediated decay  -          S4R1D1      TSL:1
Rbm6-206  ENSMUST00000182445.1   4397  No protein  -                     Retained intron          -          -           TSL:NA
Rbm6-205  ENSMUST00000182301.6   3527  No protein  -                     Retained intron          -          -           TSL:1
Rbm6-204  ENSMUST00000182242.1   2282  No protein  -                     Retained intron          -          -           TSL:1
Rbm6-209  ENSMUST00000183152.2   800   No protein  -                     Retained intron          -          -           TSL:3
Rbm6-203  ENSMUST00000182092.1   665   No protein  -                     Retained intron          -          -           TSL:3
Rbm6-210  ENSMUST00000183179.1   645   No protein  -                     Retained intron          -          -           TSL:3
Rbm6-211  ENSMUST00000192474.1   656   No protein  -                     lncRNA                   -          -           TSL:2
Rbm6-213  ENSMUST00000194250.1   404   No protein  -                     lncRNA                   -          -           TSL:2
Rbm6-212  ENSMUST00000193957.5   383   No protein  -                     lncRNA                   -          -           TSL:2


[Figure: Ensembl genome browser view of a 119.68 kb region of chromosome 9 (107.78 Mb to 107.88 Mb). Forward strand: Gm37850-201 (antisense). Reverse strand: Rbm6 transcripts Rbm6-201 to Rbm6-216, neighbouring Rbm5 transcripts, and Gm37436-201 (TEC). Contigs: AC152718.6, AL672195.9. A Regulatory Build track is shown. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript). Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site.]


Transcript: ENSMUST00000035201

[Figure: protein domain view of Rbm6-201 (protein coding; reverse strand, 99.27 kb), protein ENSMUSP00000035..., scale bar 0 to 986 aa. Annotation sources: MobiDB lite, Low complexity (Seg), Coiled-coils (Ncoils), Superfamily, SMART, Pfam, PROSITE profiles, PANTHER, Gene3D, CDD. Annotated features: SSF141571; RNA-binding domain superfamily; RNA recognition motif domain; OCRE domain; Zinc finger C2H2-type; G-patch domain; PTHR13948; RNA-binding protein 6; 2.160.20.100; Nucleotide-binding alpha-beta plait domain superfamily; RBM6, RNA recognition motif 1; RBM6, OCRE domain; RBM6, RNA recognition motif 2. Sequence variants track: dbSNP and all other sources. Variant legend: frameshift variant, inframe insertion, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
