https://www.alphaknockout.com

Mouse Sf3b4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Sf3b4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Sf3b4 gene (NCBI Reference Sequence: NM_153053; Ensembl: ENSMUSG00000068856) is located on mouse chromosome 3. Six exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 6 (Transcript: ENSMUST00000076372). Exons 3~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Sf3b4 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP24-293K7 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exons 3~4 are not frameshift exons and cover 58.96% of the coding region. The size of intron 2 for 5'-loxP site insertion is 441 bp, and the size of intron 4 for 3'-loxP site insertion is 2105 bp. The size of the effective cKO region is ~1459 bp. The cKO region does not contain any other known gene.
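The "not frameshift" statement in the note can be illustrated with simple arithmetic. The sketch below is an illustration only: the report does not list exon lengths, so the coding length of exons 3~4 is back-calculated here from the 58.96% figure and the 424 aa protein length. Both the back-calculation and the choice to exclude the stop codon from the coding length are assumptions, and the function names are hypothetical.

```python
# Hypothetical back-calculation: the report does not state exon lengths.
PROTEIN_AA = 424            # protein length of Sf3b4-201 given in the report
COVERED_FRACTION = 0.5896   # "covers 58.96% of the coding region"


def estimated_deleted_coding_bp(protein_aa: int, fraction: float) -> int:
    """Estimate the coding bases removed (assumes CDS length excludes the stop codon)."""
    cds_bp = protein_aa * 3
    return round(cds_bp * fraction)


def preserves_reading_frame(deleted_bp: int) -> bool:
    """A deletion keeps the downstream reading frame only if its length is a multiple of 3."""
    return deleted_bp % 3 == 0


deleted = estimated_deleted_coding_bp(PROTEIN_AA, COVERED_FRACTION)
print(deleted, preserves_reading_frame(deleted))
```

Under these assumptions the estimate is 750 coding bases, a multiple of 3, which is consistent with exons 3~4 being described as not frameshift exons.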

Overview of the Targeting Strategy

[Figure: schematic of four alleles. Wildtype allele (5' to 3', with two gRNA regions marked and exons labeled 17, 1, 2, 3, 4, 5, 6); Targeting vector; Targeted allele; Constitutive KO allele (after Cre recombination). Legend: exon of mouse Mtmr11, homology arm, exon of mouse Sf3b4, cKO region, loxP site.]

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
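As a rough illustration of how a self dot plot exposes repeats, the sketch below records every pair of positions whose 10 bp words are identical. It is a minimal sketch, not the tool used for this report: it covers only forward (same-strand) exact matches, the function name and the toy sequence are hypothetical, and the real analysis also includes reverse-complement matches.

```python
def self_dot_plot(seq, window=10):
    """Return the set of (i, j) pairs, i < j, whose window-length words are identical."""
    positions = {}
    # Index every word of length `window` by its start position.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = set()
    # Each word seen more than once yields one off-diagonal dot per pair of positions.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.add((starts[a], starts[b]))
    return hits


# Toy sequence (not from the report): one 10 bp unit repeated in tandem.
toy = "ACGTTGCAAT" * 2
print(sorted(self_dot_plot(toy)))
```

An empty result for a real sequence would correspond to a dot plot with no off-diagonal signal at this window size, that is, no exact direct repeats of 10 bp or longer.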

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7929bp) | A(22.17% 1758) | C(24.87% 1972) | T(28.83% 2286) | G(24.13% 1913)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
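The overall GC content follows directly from the base counts in the summary line, and a windowed profile can be computed in the same way. The sketch below is illustrative: the base counts come from the report, while the function name, the `step` parameter, and the toy sequence are assumptions (the report states a 300 bp window but not the step).

```python
# Base counts copied from the summary line of the report.
counts = {"A": 1758, "C": 1972, "T": 2286, "G": 1913}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(total, round(gc_percent, 2))


def windowed_gc(seq, window=300, step=300):
    """GC percentage for each full window of `seq`; the step size is an assumption."""
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window].upper()
        values.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return values


# Toy sequence (not from the report): one all-GC window followed by one all-AT window.
print(windowed_gc("GGCC" + "AATT", window=4, step=4))
```

The counts sum to the stated full length of 7929 bp and give an overall GC content of about 49%, so the difficulty flagged in the note refers to local high-GC windows rather than to the average.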

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   +       96170322   96173321   3000
YourSeq  203    2209   2779  3000   90.0%     chr15  -       74081348   74081555   208
YourSeq  93     569    686   3000   93.7%     chr4   -       144115867  144116007  141
YourSeq  88     556    675   3000   88.5%     chr1   -       93702397   93702534   138
YourSeq  88     569    705   3000   88.6%     chr4   +       94505201   94505347   147
YourSeq  83     544    655   3000   91.0%     chr14  -       57696541   57696669   129
YourSeq  80     565    684   3000   87.7%     chr10  +       68228858   68229039   182
YourSeq  78     556    680   3000   86.2%     chrX   +       80811609   80811770   162
YourSeq  77     556    689   3000   89.6%     chr10  +       117613679  117613817  139
YourSeq  73     552    652   3000   86.8%     chr1   +       189537514  189537632  119
YourSeq  72     556    655   3000   86.6%     chr10  +       7172168    7172285    118
YourSeq  71     586    685   3000   93.8%     chr10  -       69905306   69905425   120
YourSeq  71     621    711   3000   93.8%     chr10  +       90986421   90986511   91
YourSeq  70     569    671   3000   86.8%     chr1   -       23778419   23778542   124
YourSeq  70     1623   1708  3000   91.8%     chr14  +       73347791   73347882   92
YourSeq  69     569    655   3000   90.6%     chr1   -       162347904  162347991  88
YourSeq  66     585    677   3000   94.6%     chr12  -       79277904   79278017   114
YourSeq  65     1614   1691  3000   92.4%     chrX   -       36027627   36027706   80
YourSeq  65     588    708   3000   90.4%     chr17  +       73084578   73084700   123
YourSeq  64     580    654   3000   95.9%     chr7   +       24793241   24793343   103

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   +       96174781   96177780   3000
YourSeq  584    1853   2613  3000   97.0%     chr15  -       74080006   74080600   595
YourSeq  375    1911   2606  3000   86.6%     chr17  -       62448567   62449095   529
YourSeq  310    520    1062  3000   91.2%     chr16  +       84768782   84769168   387
YourSeq  283    509    1065  3000   88.6%     chr10  -       78011329   78011763   435
YourSeq  267    538    1063  3000   89.0%     chr9   +       110058168  110058555  388
YourSeq  260    514    1062  3000   90.1%     chr11  +       88062720   88063154   435
YourSeq  257    511    1062  3000   88.7%     chr17  +       35497451   35497777   327
YourSeq  255    520    1062  3000   87.8%     chr10  -       77148653   77149032   380
YourSeq  231    536    1062  3000   94.6%     chr9   +       65704788   65705350   563
YourSeq  231    527    1073  3000   87.7%     chr14  +       52354578   52355018   441
YourSeq  228    525    1188  3000   88.4%     chr6   +       140639319  140639738  420
YourSeq  222    525    1055  3000   95.2%     chr19  +       5255912    5256543    632
YourSeq  220    553    1078  3000   88.8%     chr13  +       38203931   38204272   342
YourSeq  217    850    1135  3000   95.1%     chr19  +       6395361    6396046    686
YourSeq  215    866    1184  3000   94.3%     chr4   +       44304951   44305547   597
YourSeq  213    608    1063  3000   87.9%     chr13  -       98999698   98999968   271
YourSeq  212    862    1324  3000   88.6%     chr15  -       55303716   55304021   306
YourSeq  209    609    1065  3000   91.8%     chr15  +       84660923   84661362   440
YourSeq  206    524    1063  3000   88.5%     chr4   +       54983430   54983676   247

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
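The chr3 self-hits in the two BLAT searches (100.0% identity, span 3000) also give the genomic coordinates of the searched flanks, so the distance between them can be checked against the stated cKO region size. The sketch below assumes the coordinates are 1-based and inclusive as displayed; the variable and function names are hypothetical, and reading the gap as the effective cKO region is an inference from the numbers, not a statement made in the report.

```python
# Genomic coordinates of the 100% self-hits, copied from the two BLAT searches.
UP_HIT = ("chr3", 96170322, 96173321)     # 3000 bp section upstream of exon 3
DOWN_HIT = ("chr3", 96174781, 96177780)   # 3000 bp section downstream of exon 4


def interval_length(hit):
    """Length of a 1-based, inclusive interval (chrom, start, end)."""
    return hit[2] - hit[1] + 1


def gap_between(upstream, downstream):
    """Number of bases strictly between two intervals on the same chromosome."""
    if upstream[0] != downstream[0]:
        raise ValueError("intervals are on different chromosomes")
    return downstream[1] - upstream[2] - 1


print(interval_length(UP_HIT), interval_length(DOWN_HIT))
print(gap_between(UP_HIT, DOWN_HIT))
```

The gap comes out at 1459 bp, which matches the ~1459 bp effective cKO region quoted in the strategy note.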

Gene information: Sf3b4 splicing factor 3b, subunit 4 [ Mus musculus (house mouse) ]
Gene ID: 107701, updated on 10-Oct-2019

Gene summary

Official Symbol: Sf3b4 (provided by MGI)
Official Full Name: splicing factor 3b, subunit 4 (provided by MGI)
Primary source: MGI:MGI:109580
See related: Ensembl:ENSMUSG00000068856
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Sap49; SF3b49; SF3b50
Summary: This gene encodes one of the subunits of splicing factor 3B. A similar gene in human encodes a protein that cross-links to a region in the pre-mRNA immediately upstream of the branchpoint sequence in pre-mRNA in the prespliceosomal complex A. It also may be involved in the assembly of the B, C and E spliceosomal complexes, and also belongs with the minor U12-dependent spliceosome. [provided by RefSeq, May 2015]
Expression: Ubiquitous expression in adrenal adult (RPKM 84.1), thymus adult (RPKM 64.7) and 28 other tissues
Orthologs: human

Genomic context

Location: 3; 3 F2.1

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (96172506..96177564)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (95976473..95981487)

[Figure: genomic context on Chromosome 3 - NC_000069.6.]

Transcript information: This gene has 2 transcripts

Gene: Sf3b4 ENSMUSG00000068856

Description: splicing factor 3b, subunit 4 [Source:MGI Symbol;Acc:MGI:109580]
Gene Synonyms: 49kDa, SF3b49, Sap49
Location: Chromosome 3: 96,172,332-96,177,564 forward strand. GRCm38:CM000996.2
About this gene: This gene has 2 transcripts (splice variants), 213 orthologues, 23 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Sf3b4-201  ENSMUST00000076372.4  1915  424aa       ENSMUSP00000075709.4  Protein coding   CCDS17630  Q8QZY9   TSL:1 GENCODE basic APPRIS P1
Sf3b4-202  ENSMUST00000153977.1  437   No protein  -                     Retained intron  -          -        TSL:2

[Figure: Ensembl genomic region view, 25.23 kb, 96.165Mb to 96.185Mb. Forward-strand transcripts: Mtmr11-201 (protein coding), Mtmr11-202 and Mtmr11-204 (retained intron), Mtmr11-203 (nonsense mediated decay), Mtmr11-205, Mtmr11-206 and Mtmr11-207 (lncRNA); Sf3b4-201 (protein coding), Sf3b4-202 (retained intron); Gm22027-201 (snoRNA); Sv2a-201 (protein coding), Sv2a-202 (retained intron). Contigs: AC092094.19, AC125099.3. Reverse-strand transcripts: Gm17690-201 and Gm17690-202 (lncRNA). Regulatory Build legend: open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (processed transcript, RNA gene).]

Transcript: ENSMUST00000076372

[Figure: protein domain view of Sf3b4-201 (protein coding; 5.23 kb, forward strand; protein scale bar 0 to 424). Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily (RNA-binding domain superfamily); SMART (RNA recognition motif domain); Prints (PR01217); Pfam (RNA recognition motif domain); PROSITE profiles (RNA recognition motif domain); PANTHER (PTHR15241:SF183, PTHR15241); Gene3D (Nucleotide-binding alpha-beta plait domain superfamily); CDD (SF3B4, RNA recognition motif 2; SF3B4, RNA recognition motif 1). Variant track: sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
