https://www.alphaknockout.com

Mouse Utp15 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Utp15 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Utp15 gene (NCBI Reference Sequence: NM_178918; Ensembl: ENSMUSG00000041747) is located on mouse chromosome 13. Thirteen exons have been identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 13 (Transcript: ENSMUST00000040972). Exons 3~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Utp15 gene.

To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-267C22 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 3 starts at about 5.74% of the coding region.
Knockout of exons 3~4 will result in a frameshift of the gene.
Size of intron 2 (5'-loxP site insertion): 1332 bp.
Size of intron 4 (3'-loxP site insertion): 1090 bp.
Size of the effective cKO region: ~862 bp.
The cKO region does not contain any other known gene.
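The two coding-region statements in the note can be checked with simple arithmetic. The sketch below is illustrative only. It assumes the coding sequence length is (528 aa + 1 stop codon) x 3 = 1587 bp, derived from the 528 aa protein listed for ENSMUST00000040972. The report does not give the lengths of exons 3 and 4, so the frameshift check is shown with hypothetical values, and the function names are not part of the report.

```python
def coding_offset_bp(protein_aa: int, fraction: float, include_stop: bool = True) -> int:
    """Approximate coding-sequence position (bp) at a given fraction of the CDS.

    Assumes CDS length = (amino acids + optional stop codon) * 3.
    """
    cds_bp = (protein_aa + (1 if include_stop else 0)) * 3
    return round(cds_bp * fraction)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# Exon 3 starts at about 5.74% of the coding region of a 528 aa protein.
offset = coding_offset_bp(528, 0.0574)
print(f"Coding bases upstream of exon 3: ~{offset} bp")

# Hypothetical deleted coding lengths (the real exon 3~4 lengths are not in the report).
for length in (99, 100):
    print(length, "bp deleted -> frameshift:", causes_frameshift(length))
```

Under these assumptions roughly 91 bp of coding sequence (about 30 codons) lie upstream of exon 3, so only a short N-terminal fragment would be translated before the frameshift.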


Overview of the Targeting Strategy

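[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (with the 5' and 3' gRNA regions marked; exons 1 to 6 and 13 labeled), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination).]

Legends: exon of mouse Utp15; homology arm; cKO region; loxP site.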


Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot of Sequence 12, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region was aligned with itself to check for tandem repeats. No significant tandem repeats were found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
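For readers who want to reproduce this kind of check, the following is a minimal sketch of a self-alignment dot plot with a 10 bp window. It is not the tool used to produce the figure. It reports exact word matches only, and the function names are illustrative assumptions.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Find exact window-sized matches of a sequence against itself.

    Returns two sorted lists of (i, j) start positions (0-based):
    forward matches with i < j, and reverse-complement matches with i <= j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward = sorted(
        (i, j)
        for positions in index.values()
        for k, i in enumerate(positions)
        for j in positions[k + 1:]
    )
    reverse = sorted(
        (i, j)
        for word, positions in index.items()
        for i in positions
        for j in index.get(reverse_complement(word), [])
        if i <= j
    )
    return forward, reverse


# Toy sequence with one 10 bp direct repeat separated by a 4 bp spacer.
fwd, rev = self_dot_plot("ACGTTGCAAG" + "TTTT" + "ACGTTGCAAG")
print("forward matches:", fwd)
```

Off-diagonal points that line up into runs parallel to the main diagonal indicate direct repeats. Runs of reverse-complement matches indicate inverted repeats.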

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7362bp) | A(25.54% 1880) | C(19.65% 1447) | T(33.5% 2466) | G(21.31% 1569)

Note: The sequence of the homology arms and cKO region was analyzed to determine its GC content. No region of significantly high GC content was found, so this region is suitable for PCR screening or sequencing analysis.
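The summary figures can be cross-checked, and a GC profile of the same kind computed, with a few lines of Python. This is a sketch under stated assumptions: the base counts are copied from the summary line above, the sequence itself is not included in this report, and the helper names are illustrative.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction for each full window along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the summary line of this report.
counts = {"A": 1880, "C": 1447, "T": 2466, "G": 1569}
total = sum(counts.values())
gc_overall = (counts["C"] + counts["G"]) / total

print("total length:", total)
print(f"overall GC content: {gc_overall:.2%}")
```

The counts sum to the stated full length of 7362 bp, and the overall GC content is about 41%, consistent with the note that no high-GC region was found.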


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr13  -       98259717   98262716   3000
YourSeq  88     1425   1528  3000   96.9%     chr15  +       91620315   91620482   168
YourSeq  60     2294   2488  3000   86.8%     chr1   +       127685434  127900421  214988
YourSeq  47     1418   1501  3000   96.1%     chr10  +       11537557   11538109   553
YourSeq  46     1460   1518  3000   98.0%     chr2   +       89288507   89288671   165
YourSeq  46     2289   2351  3000   91.3%     chr15  +       62432390   62432454   65
YourSeq  45     1493   1558  3000   75.0%     chr3   -       36445372   36445423   52
YourSeq  40     2289   2340  3000   93.5%     chr10  +       95018901   95018954   54
YourSeq  38     1466   1557  3000   69.1%     chr2   +       161261882  161261939  58
YourSeq  32     1460   1510  3000   74.4%     chr3   -       92387528   92387570   43
YourSeq  30     1470   1510  3000   76.5%     chr1   +       153640881  153640915  35
YourSeq  29     2289   2320  3000   96.8%     chr5   -       146092444  146092476  33
YourSeq  29     2291   2320  3000   100.0%    chr12  +       102414331  102414361  31
YourSeq  28     2461   2506  3000   91.0%     chr12  -       21733356   21733401   46
YourSeq  28     2292   2320  3000   100.0%    chr4   +       138405807  138405836  30
YourSeq  28     2409   2454  3000   96.8%     chr2   +       119682765  119682816  52
YourSeq  28     715    749   3000   91.5%     chr15  +       32232719   32232754   36
YourSeq  27     2289   2318  3000   96.7%     chr4   -       121082281  121082311  31
YourSeq  27     2291   2320  3000   96.6%     chr10  -       31304463   31304493   31
YourSeq  27     2290   2320  3000   96.6%     chr7   +       116149340  116149371  32

Note: The 3000 bp section upstream of exon 3 was searched against the genome with BLAT. No significant similarity was found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr13  -       98255855   98258854   3000
YourSeq  162    2432   2680  3000   87.5%     chr11  +       68542028   68542218   191
YourSeq  151    2496   2681  3000   89.5%     chr4   +       41646759   41646932   174
YourSeq  150    2502   2675  3000   95.3%     chr17  +       46329442   46329618   177
YourSeq  150    2499   2675  3000   93.4%     chr11  +       88858068   88858242   175
YourSeq  149    2499   2673  3000   91.1%     chr3   +       37440843   37441010   168
YourSeq  148    2499   2679  3000   90.2%     chr12  -       24688321   24688496   176
YourSeq  148    2498   2665  3000   92.0%     chr11  -       106260035  106260195  161
YourSeq  148    2499   2681  3000   90.6%     chr1   -       173558552  173558729  178
YourSeq  148    2501   2680  3000   89.6%     chr15  +       100017950  100018122  173
YourSeq  148    2496   2650  3000   98.1%     chr14  +       57835025   57835184   160
YourSeq  147    2496   2679  3000   92.5%     chr11  -       62596545   62596728   184
YourSeq  147    2499   2675  3000   92.9%     chr3   +       146522408  146522583  176
YourSeq  146    2499   2654  3000   96.8%     chr2   +       53477003   53477158   156
YourSeq  146    2498   2665  3000   91.9%     chr11  +       106614062  106614222  161
YourSeq  144    2499   2666  3000   91.3%     chr2   +       129982663  129982823  161
YourSeq  144    2499   2666  3000   91.3%     chr17  +       26278698   26278858   161
YourSeq  143    2493   2654  3000   94.5%     chr11  -       73052189   73052355   167
YourSeq  143    2495   2654  3000   95.6%     chr12  +       69612018   69612188   171
YourSeq  142    2499   2665  3000   92.4%     chr12  -       76508606   76508767   162

Note: The 3000 bp section downstream of exon 4 was searched against the genome with BLAT. No significant similarity was found.
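BLAT result rows like those above can be screened programmatically, for example to see which parts of the query have secondary hits before placing primers. The sketch below is only an illustration. It parses rows in the column order shown (QUERY, SCORE, START, END, QSIZE, IDENTITY, CHROM, STRAND, START, END, SPAN), treats a hit whose score equals the query size as the on-target locus, and uses a hypothetical score cut-off of 50 that is not taken from this report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT summary row starting with the query name."""
    f = row.split()
    return BlatHit(
        int(f[1]), int(f[2]), int(f[3]), int(f[4]),
        float(f[5].rstrip("%")), f[6], f[7],
        int(f[8]), int(f[9]), int(f[10]),
    )


def secondary_hits(hits, min_score: int):
    """Hits other than the full-length on-target match, at or above min_score."""
    return [h for h in hits if h.score < h.q_size and h.score >= min_score]


# First three rows of the upstream search, copied from the table above.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr13 - 98259717 98262716 3000",
    "YourSeq 88 1425 1528 3000 96.9% chr15 + 91620315 91620482 168",
    "YourSeq 60 2294 2488 3000 86.8% chr1 + 127685434 127900421 214988",
]
hits = [parse_hit(r) for r in rows]

# min_score=50 is an arbitrary, hypothetical threshold for illustration.
for h in secondary_hits(hits, min_score=50):
    print(h.chrom, h.score, f"query {h.q_start}-{h.q_end}", f"{h.identity}%")
```

Listing the query intervals covered by secondary hits makes it easy to keep genotyping primers out of those stretches.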


Gene and information: Utp15 UTP15 small subunit processome component [ Mus musculus (house mouse) ] Gene ID: 105372, updated on 12-Aug-2019

Gene summary

Official Symbol: Utp15 (provided by MGI)
Official Full Name: UTP15 small subunit processome component (provided by MGI)
Primary source: MGI:MGI:2145443
See related: Ensembl:ENSMUSG00000041747
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AW544865
Expression: Ubiquitous expression in CNS E11.5 (RPKM 7.4), placenta adult (RPKM 5.9) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 13; 13 D1

Exon count: 13

Annotation release  Status             Assembly                       Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)   13   NC_000079.6 (98246845..98262992, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     13   NC_000079.5 (99016800..99032947, complement)

Chromosome 13 - NC_000079.6


Transcript information: This gene has 3 transcripts

Gene: Utp15 ENSMUSG00000041747

Description: UTP15 small subunit processome component [Source:MGI Symbol;Acc:MGI:2145443]
Location: Chromosome 13: 98,246,845-98,263,041 reverse strand. GRCm38:CM001006.2
About this gene: This gene has 3 transcripts (splice variants), 198 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Utp15-201  ENSMUST00000040972.3  4120  528aa       ENSMUSP00000048204.2  Protein coding   CCDS26712  Q8C7V3   TSL:1 GENCODE basic APPRIS P1
Utp15-202  ENSMUST00000224842.1  2328  No protein  -                     Retained intron  -          -        -
Utp15-203  ENSMUST00000225100.1  634   No protein  -                     Retained intron  -          -        -

[Figure: Ensembl genome browser view of a 36.20 kb region of chromosome 13 (98.24 Mb to 98.27 Mb).
Forward strand: Ankra2-201, Ankra2-202, Ankra2-203, Ankra2-204, Ankra2-205 and Ankra2-208 (protein coding); Ankra2-206 and Ankra2-207 (retained intron).
Contig: AC123056.4.
Reverse strand: Utp15-201 (protein coding); Utp15-202 and Utp15-203 (retained intron).
Regulation legend: CTCF; open chromatin; promoter; promoter flank.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000040972

[Figure: Ensembl protein summary for Utp15-201 (protein coding; reverse strand, 16.20 kb), protein ENSMUSP00000048..., scale bar 0 to 528 aa.
Annotation tracks shown:
MobiDB lite / Low complexity (Seg)
Superfamily: WD40-repeat-containing domain superfamily
SMART: WD40 repeat
Prints: G-protein beta WD-40 repeat
Pfam: WD40 repeat; U3 small nucleolar RNA-associated protein 15, C-terminal
PROSITE profiles: WD40-repeat-containing domain; WD40 repeat
PROSITE patterns: WD40 repeat, conserved site
PANTHER: PTHR19924:SF30; PTHR19924
Gene3D: WD40/YVTN repeat-like-containing domain superfamily
CDD: cd00200
Sequence variants (dbSNP and all other sources); variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
