https://www.alphaknockout.com

Mouse Rbms1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Rbms1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Rbms1 gene (NCBI Reference Sequence: NM_020296; Ensembl: ENSMUSG00000026970) is located on mouse chromosome 2. Fourteen exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 13 (Transcript: ENSMUST00000028347). Exon 3 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Rbms1 gene. To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP24-172B13 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Only about half the expected number of mice homozygous for disruptions in this gene are produced in matings of heterozygotes. Embryo sizes are reduced. Females have smaller than normal uteri and decreased levels of progesterone during estrus.

Exon 3 starts at about 20.84% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. Intron 2, used for 5'-loxP site insertion, is 44554 bp long; intron 3, used for 3'-loxP site insertion, is 4903 bp long. The effective cKO region is ~559 bp. The cKO region does not contain any other known gene.
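The position and frameshift statements above come down to simple arithmetic. The sketch below is illustrative only. It assumes the coding region of ENSMUST00000028347 is 403 codons plus one stop codon, based on the 403 aa protein length listed for this transcript. The exact length of exon 3 is not stated in this report, so the frameshift check is written as a general helper and no specific exon length is filled in.

```python
# Assumption: coding region = 403 codons (403 aa protein) + 1 stop codon.
PROTEIN_LENGTH_AA = 403
CDS_LENGTH_BP = (PROTEIN_LENGTH_AA + 1) * 3


def cds_offset(fraction: float, cds_length: int = CDS_LENGTH_BP) -> int:
    """Approximate CDS nucleotide position reached at a given fraction of the CDS."""
    return round(fraction * cds_length)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# Exon 3 is reported to start at about 20.84% of the coding region.
exon3_start_in_cds = cds_offset(0.2084)
print(f"CDS length (assumed): {CDS_LENGTH_BP} bp")
print(f"Exon 3 starts near CDS position {exon3_start_in_cds}")
```

Under this assumption the start of exon 3 falls near CDS nucleotide 253. `causes_frameshift` can be applied once the exact exon 3 length is taken from the transcript annotation.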


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions; exons 1, 3 and 14 labeled), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Rbms1, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
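A self-alignment of this kind can be sketched as follows. This is a minimal illustration and not the tool used to produce the plot. It reports only exact forward-strand matches of 10 bp words, whereas the plot also shows reverse-complement matches. The function name and the toy sequence are hypothetical.

```python
def dot_plot_matches(seq: str, window: int = 10):
    """Return sorted (i, j) pairs with i < j where the window-length
    words starting at positions i and j (0-based) are identical."""
    seq = seq.upper()
    positions = {}
    # Index every word of length `window` by its start position.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    # Any word seen at more than one position yields off-diagonal dots.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy sequence (hypothetical): a 12 bp unit repeated twice in tandem.
unit = "ACGTTGCAAGCT"
toy = "GGATCC" + unit * 2 + "TTAGGA"
for i, j in dot_plot_matches(toy, window=10):
    print(f"word at {i} repeats at {j} (offset {j - i})")
```

A run of matches at a constant offset, as the toy tandem repeat produces, is what would appear as an off-diagonal line in the dot plot.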

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(7059bp) | A(28.01% 1977) | C(20.34% 1436) | T(28.11% 1984) | G(23.54% 1662)

Note: The sequence of the homology arms and cKO region is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
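The overall GC fraction follows directly from the base counts in the summary line. The sketch below computes it and adds a simple sliding-window helper in the spirit of the 300 bp window analysis. The helper and its toy input are illustrative assumptions, not the tool that generated the plot.

```python
# Base counts taken from the summary line above.
counts = {"A": 1977, "C": 1436, "T": 1984, "G": 1662}
total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """GC percentage of each full window of `window` bp, advancing by `step`."""
    seq = seq.upper()
    values = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        values.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return values


print(f"Total length: {total} bp")
print(f"Overall GC content: {gc_percent:.2f}%")

# Toy input (hypothetical): 300 bp of pure GC followed by 300 bp of pure AT.
print(sliding_gc("GC" * 150 + "AT" * 150, window=300, step=300))
```

The counts sum to the stated full length of 7059 bp and give an overall GC content of about 43.89%.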


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       60797993   60800992   3000
YourSeq  285    2351   2687  3000   93.4%     chr13  +       58164976   58165622   647
YourSeq  259    2354   2687  3000   90.6%     chr4   -       135295547  135295874  328
YourSeq  250    2240   2675  3000   91.2%     chr2   +       103472122  103472713  592
YourSeq  247    2340   2687  3000   87.2%     chr11  -       77090956   77091307   352
YourSeq  247    2352   2687  3000   93.0%     chr16  +       17128041   17553752   425712
YourSeq  246    2353   2684  3000   89.5%     chrX   -       162806541  162806862  322
YourSeq  242    2351   2683  3000   89.4%     chr11  -       69283197   69283506   310
YourSeq  239    2353   2686  3000   92.2%     chr10  -       80044896   80045265   370
YourSeq  234    2353   2687  3000   88.0%     chrX   -       162860363  162860687  325
YourSeq  219    2405   2688  3000   88.9%     chrX   +       152386176  152386456  281
YourSeq  215    2335   2607  3000   89.7%     chr12  -       51742508   51742771   264
YourSeq  215    2358   2684  3000   87.5%     chr19  +       34921186   34921491   306
YourSeq  210    2420   2687  3000   93.4%     chr10  +       62900046   62900700   655
YourSeq  205    2334   2687  3000   84.4%     chr2   +       84830550   84830858   309
YourSeq  203    2337   2661  3000   89.5%     chr9   -       67667325   67667660   336
YourSeq  201    2401   2685  3000   92.1%     chr3   -       40918365   40919006   642
YourSeq  187    2334   2620  3000   94.4%     chr17  +       46669080   46669375   296
YourSeq  187    2372   2687  3000   87.6%     chr11  +       105187535  105187875  341
YourSeq  182    2359   2638  3000   88.1%     chr15  -       98817628   98818045   418

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       60794434   60797433   3000
YourSeq  43     1626   1849  3000   66.7%     chr2   +       71654959   71655066   108
YourSeq  26     1002   1028  3000   100.0%    chr10  -       103814239  103814266  28
YourSeq  25     1008   1035  3000   84.7%     chr18  +       75876750   75876775   26
YourSeq  24     2536   2567  3000   85.8%     chr2   -       76595316   76595346   31
YourSeq  22     1001   1024  3000   95.9%     chr13  -       42962254   42962277   24

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
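The top hit in each BLAT table is the query itself on chr2, so the two sets of coordinates can be used as a consistency check on the design. The sketch below assumes that the two 3000 bp queries directly flank the cKO region, which this report does not state explicitly. Under that assumption the gap between them should equal the effective cKO region size. Variable names are illustrative.

```python
# Self-hit coordinates (chr2, minus strand) from the two BLAT tables above.
upstream_query = (60797993, 60800992)    # "up" search
downstream_query = (60794434, 60797433)  # "down" search


def span(start: int, end: int) -> int:
    """Length of a closed (inclusive) coordinate interval."""
    return end - start + 1


# Assumption: the interval between the two queries is the cKO region.
cko_start = downstream_query[1] + 1
cko_end = upstream_query[0] - 1
cko_size = span(cko_start, cko_end)

print(f"Upstream query span:   {span(*upstream_query)} bp")
print(f"Downstream query span: {span(*downstream_query)} bp")
print(f"Interval between them: chr2:{cko_start}-{cko_end} ({cko_size} bp)")
```

The interval comes out at 559 bp, which agrees with the ~559 bp effective cKO region quoted in the strategy summary.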


Gene and information: Rbms1

RNA binding motif, single stranded interacting protein 1 [Mus musculus (house mouse)]

Gene ID: 56878, updated on 10-Oct-2019

Gene summary

Official Symbol: Rbms1 (provided by MGI)
Official Full Name: RNA binding motif, single stranded interacting protein 1 (provided by MGI)
Primary source: MGI:MGI:1861774
See related: Ensembl:ENSMUSG00000026970
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: YC1; MSSP-1; MSSP-2; MSSP-3; AI255215; 2600014B10Rik
Expression: Ubiquitous expression in lung adult (RPKM 26.0), ovary adult (RPKM 17.9) and 28 other tissues

Genomic context

Location: 2; 2 C1.2

Exon count: 17

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (60750193..60963880, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (60590010..60801261, complement)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 9 transcripts

Gene: Rbms1 ENSMUSG00000026970

Description: RNA binding motif, single stranded interacting protein 1 [Source:MGI Symbol;Acc:MGI:1861774]
Gene Synonyms: 2600014B10Rik, MSSP-1, MSSP-2, MSSP-3, YC1
Location: Chromosome 2: 60,750,193-60,963,192 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 9 transcripts (splice variants), 251 orthologues, 23 paralogues, is a member of 1 Ensembl protein family and is associated with 11 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt         Flags
Rbms1-201  ENSMUST00000028347.12  4436  403aa       ENSMUSP00000028347.6  Protein coding  CCDS50591  Q91W59          TSL:1 GENCODE basic APPRIS ALT1
Rbms1-209  ENSMUST00000164147.7   2543  417aa       ENSMUSP00000131306.1  Protein coding  CCDS50590  E9PZ21          TSL:1 GENCODE basic APPRIS P4
Rbms1-202  ENSMUST00000112509.1   1806  386aa       ENSMUSP00000108128.1  Protein coding  -          Q3TTX8 Q91W59   TSL:1 GENCODE basic
Rbms1-208  ENSMUST00000153555.1   757   No protein  -                     lncRNA          -          -               TSL:3
Rbms1-203  ENSMUST00000123046.7   644   No protein  -                     lncRNA          -          -               TSL:3
Rbms1-207  ENSMUST00000151846.1   224   No protein  -                     lncRNA          -          -               TSL:5
Rbms1-206  ENSMUST00000132529.7   217   No protein  -                     lncRNA          -          -               TSL:5
Rbms1-204  ENSMUST00000125485.1   202   No protein  -                     lncRNA          -          -               TSL:5
Rbms1-205  ENSMUST00000128866.1   123   No protein  -                     lncRNA          -          -               TSL:5


[Figure: Ensembl genome browser view of the Rbms1 locus (233.00 kb, 60.75 Mb to 60.95 Mb, forward and reverse strands). Tracks: contigs AL929012.4 and AL928581.8; reverse-strand transcripts Rbms1-201, Rbms1-209 and Rbms1-202 (protein coding), Rbms1-203, Rbms1-204, Rbms1-205, Rbms1-206, Rbms1-207 and Rbms1-208 (lncRNA), and Gm13582-201 (lncRNA); Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana), non-protein coding (RNA gene).]


Transcript: ENSMUST00000028347

[Figure: protein domain and variant view for Rbms1-201 (protein coding, reverse strand, 213.00 kb), protein ENSMUSP00000028..., scale bar 0 to 403. Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily (RNA-binding domain superfamily); SMART, Pfam and PROSITE profiles (RNA recognition motif domain); Prints (Paraneoplastic encephalomyelitis antigen); PANTHER (PTHR24012, PTHR24012:SF745); Gene3D (Nucleotide-binding alpha-beta plait domain superfamily); CDD (MSSP-1, RNA recognition motif 1; cd12474); sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
