https://www.alphaknockout.com

Mouse Utp15 Knockout Project (CRISPR/Cas9)

Objective: To create a Utp15 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Utp15 gene (NCBI Reference Sequence: NM_178918; Ensembl: ENSMUSG00000041747) is located on mouse chromosome 13. Thirteen exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 13 (Transcript: ENSMUST00000040972). Exons 2~10 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from the coding region. Exons 2~10 cover 72.35% of the coding region. The size of the effective KO region is ~9042 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

Figure: wildtype allele, drawn 5' to 3', showing the numbered exons of mouse Utp15 and the two gRNA regions. Legend: exon of mouse Utp15; knockout region.


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: dot plot of the upstream sequence aligned with itself. Legend: forward; reverse complement.

Note: The 1904 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

Figure: dot plot of the downstream sequence aligned with itself. Legend: forward; reverse complement.

Note: The 1010 bp section downstream of Exon 10 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
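The self-alignment idea behind the dot plots can be illustrated with a minimal Python sketch. It is not the tool used for this report: the function names, the toy sequence and the exact-match criterion are assumptions made for illustration, and only the 15 bp window size comes from the text above. The sketch indexes every 15 bp word of a sequence, pairs up identical words at different positions, and counts the pairs per offset. A strong peak at one offset corresponds to an off-diagonal line in a dot plot, which is the signature of a tandem repeat.

```python
from collections import defaultdict


def self_dotplot_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, whose window-sized words are identical."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


def repeat_offsets(seq, window=15):
    """Count off-diagonal matches per offset (j - i)."""
    counts = defaultdict(int)
    for i, j in self_dotplot_matches(seq, window):
        counts[j - i] += 1
    return dict(counts)


# Hypothetical toy data: a 20 bp unit repeated three times between two flanks.
UNIT = "ACGTTGCAAGCTTAGGCTAC"
TOY = "TTTTGGGCCC" + UNIT * 3 + "AACCGGTTAA"

offsets = repeat_offsets(TOY)
print(sorted(offsets.items()))
```

For the toy sequence the dominant offset equals the repeat unit length, whereas a sequence without internal repeats yields no off-diagonal matches at all.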


Overview of the GC Content Distribution (up)

Window size: 300 bp

Summary: Full Length(1904bp) | A(23.48% 447) | C(20.38% 388) | T(35.98% 685) | G(20.17% 384)

Note: The 1904 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
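A windowed GC profile of this kind can be sketched in Python as follows. This is an illustrative sketch rather than the software behind the report: the function names, the 65% threshold and the synthetic test sequence are assumptions, and only the 300 bp window size is taken from the text above. The function slides a fixed-size window along the sequence one base at a time and updates a running G+C count instead of recounting each window.

```python
def windowed_gc(seq, window=300):
    """Return the GC percentage of every full window, moving one base at a time."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [1 if base in "GC" else 0 for base in seq]
    running = sum(is_gc[:window])
    profile = [100.0 * running / window]
    for start in range(1, len(seq) - window + 1):
        # Add the base entering the window and drop the base leaving it.
        running += is_gc[start + window - 1] - is_gc[start - 1]
        profile.append(100.0 * running / window)
    return profile


def high_gc_windows(seq, window=300, threshold=65.0):
    """Return start positions of windows whose GC percentage exceeds the threshold.

    The threshold is a hypothetical value chosen for this example only.
    """
    return [i for i, gc in enumerate(windowed_gc(seq, window)) if gc > threshold]


# Synthetic sequence: 300 bp of AT followed by 300 bp of GC.
demo = "AT" * 150 + "GC" * 150
profile = windowed_gc(demo)
print(len(profile), profile[0], profile[-1])
```

Windows returned by `high_gc_windows` would be candidates to avoid when placing PCR or sequencing primers.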

Overview of the GC Content Distribution (down)

Window size: 300 bp

Summary: Full Length(1010bp) | A(24.65% 249) | C(18.71% 189) | T(30.3% 306) | G(26.34% 266)

Note: The 1010 bp section downstream of Exon 10 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
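The overall GC content of each flanking section follows directly from the base counts in the two summaries. The short Python sketch below recomputes it; the counts are copied from the summaries above, while the function and variable names are assumptions made for this example.

```python
def gc_percent(counts):
    """Overall GC percentage from a dict of per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


# Base counts as listed in the two summaries of this report.
UPSTREAM = {"A": 447, "C": 388, "T": 685, "G": 384}    # 1904 bp
DOWNSTREAM = {"A": 249, "C": 189, "T": 306, "G": 266}  # 1010 bp

for name, counts in (("upstream", UPSTREAM), ("downstream", DOWNSTREAM)):
    print(f"{name}: {sum(counts.values())} bp, GC = {gc_percent(counts):.2f}%")
```

Both sections come out at roughly 41% and 45% GC, which agrees with the notes that no significant high GC-content region is found.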


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  1904   1      1904  1904   100.0%    chr13  -       98260889   98262792   1904
YourSeq  88     1501   1604  1904   96.9%     chr15  +       91620315   91620482   168
YourSeq  56     1546   1635  1904   85.2%     chrX   -       69983449   69983559   111
YourSeq  53     1506   1581  1904   98.4%     chr7   -       37111689   37111788   100
YourSeq  47     1494   1577  1904   96.1%     chr10  +       11537557   11538109   553
YourSeq  46     1536   1594  1904   98.0%     chr2   +       89288507   89288671   165
YourSeq  45     1569   1634  1904   75.0%     chr3   -       36445372   36445423   52
YourSeq  38     1542   1588  1904   95.5%     chr3   -       92387419   92387465   47
YourSeq  38     1542   1633  1904   69.1%     chr2   +       161261882  161261939  58
YourSeq  30     1546   1586  1904   76.5%     chr1   +       153640881  153640915  35
YourSeq  28     787    817   1904   96.7%     chr1   -       16934770   16934804   35
YourSeq  28     791    825   1904   91.5%     chr15  +       32232719   32232754   36
YourSeq  27     1144   1175  1904   86.3%     chr7   +       37273022   37273051   30
YourSeq  26     1557   1586  1904   93.4%     chr10  -       89129989   89130018   30
YourSeq  26     791    817   1904   100.0%    chr2   +       108767334  108767361  28
YourSeq  25     1144   1173  1904   88.9%     chr1   -       12774484   12774512   29
YourSeq  25     1146   1176  1904   85.2%     chr18  +       61234605   61234633   29
YourSeq  24     791    815   1904   100.0%    chrX   -       164421391  164421416  26
YourSeq  24     794    817   1904   100.0%    chr9   -       6179885    6179908    24
YourSeq  23     794    816   1904   100.0%    chr10  -       116891764  116891786  23

Note: The 1904 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  1010   1      1010  1010   100.0%    chr13  -       98250919   98251928   1010
YourSeq  87     80     396   1010   87.2%     chr12  -       25435559   25790936   355378
YourSeq  78     118    447   1010   74.5%     chr5   -       104015944  104016154  211
YourSeq  69     118    385   1010   70.5%     chr6   -       51778067   51778197   131
YourSeq  67     132    393   1010   71.8%     chr13  +       73247904   73248062   159
YourSeq  66     77     396   1010   69.5%     chr1   +       72156405   72156621   217
YourSeq  65     314    407   1010   89.5%     chr10  +       53487082   53487223   142
YourSeq  62     323    403   1010   88.9%     chr3   -       98960563   98960646   84
YourSeq  60     314    396   1010   86.8%     chr5   +       126558453  126558549  97
YourSeq  60     323    407   1010   86.6%     chr17  +       84949220   84949322   103
YourSeq  60     323    407   1010   86.6%     chr1   +       63588941   63589043   103
YourSeq  59     314    390   1010   85.6%     chr8   -       13091368   13091443   76
YourSeq  58     314    407   1010   82.5%     chr10  -       122124002  122124137  136
YourSeq  58     323    407   1010   85.4%     chr16  +       95184633   95184735   103
YourSeq  58     314    396   1010   85.6%     chr14  +       113760585  113760691  107
YourSeq  57     314    396   1010   93.9%     chr2   -       79163469   79163564   96
YourSeq  57     118    396   1010   66.7%     chr19  -       29330603   29330772   170
YourSeq  57     323    407   1010   84.4%     chr8   +       125597837  125597939  103
YourSeq  57     321    407   1010   84.4%     chr15  +       101312109  101312214  106
YourSeq  56     314    535   1010   92.5%     chr10  -       66482528   66482781   254

Note: The 1010 bp section downstream of Exon 10 is BLAT searched against the genome. No significant similarity is found.
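One way to screen BLAT rows like these is sketched below in Python. The report does not state which criterion it uses for "significant similarity", so the thresholds here (at least 50% of the query covered and at least 90% identity) are assumptions, as are all function and class names. The sketch parses whitespace-separated rows in the column order shown above, sets aside the top-scoring hit as the query's own locus, and returns any remaining hit that passes both thresholds.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one row; field 0 is the query name and is ignored."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query sequence covered by the hit."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


def notable_secondary_hits(hits: List[BlatHit],
                           min_coverage: float = 0.5,
                           min_identity: float = 90.0) -> List[BlatHit]:
    """Return hits other than the top-scoring one that pass both (assumed) thresholds."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    return [h for h in ranked[1:]
            if query_coverage(h) >= min_coverage and h.identity >= min_identity]


# First three rows of the downstream table in this report.
DOWNSTREAM_ROWS = """\
YourSeq 1010 1 1010 1010 100.0% chr13 - 98250919 98251928 1010
YourSeq 87 80 396 1010 87.2% chr12 - 25435559 25790936 355378
YourSeq 78 118 447 1010 74.5% chr5 - 104015944 104016154 211
"""

hits = [parse_hit(row) for row in DOWNSTREAM_ROWS.splitlines()]
print(notable_secondary_hits(hits))
```

With these assumed thresholds no secondary hit is returned for the rows shown, since each covers well under half of the query.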


Gene and information: Utp15

UTP15 small subunit processome component [Mus musculus (house mouse)]

Gene ID: 105372, updated on 12-Aug-2019

Gene summary

Official Symbol: Utp15 (provided by MGI)
Official Full Name: UTP15 small subunit processome component (provided by MGI)
Primary source: MGI:MGI:2145443
See related: Ensembl:ENSMUSG00000041747
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AW544865
Expression: Ubiquitous expression in CNS E11.5 (RPKM 7.4), placenta adult (RPKM 5.9) and 28 other tissues
Orthologs: human

Genomic context

Location: 13; 13 D1
Exon count: 13

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  13   NC_000079.6 (98246845..98262992, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    13   NC_000079.5 (99016800..99032947, complement)
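The GRCm38 coordinates in this report can be cross-checked with simple interval arithmetic, sketched here in Python. The gene span is the current NCBI location listed above and the two flank intervals are the chr13 self-hits from the BLAT tables; the function names are assumptions. The final step assumes that each flanking section directly abuts exon 2 or exon 10, which the notes imply but do not state explicitly. The report does not give the exact gRNA cut sites, so the resulting interval is only an estimate of the exon 2~10 span, not the ~9042 bp effective KO region itself.

```python
def interval_length(start, end):
    """Length of a closed, 1-based interval."""
    return end - start + 1


def contains(outer, inner):
    """True if the inner (start, end) interval lies entirely within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# NC_000079.6 location of Utp15 (complement strand), from the genomic context table.
GENE = (98246845, 98262992)
# chr13 self-hits of the two flanking sections, from the BLAT tables.
UP_FLANK = (98260889, 98262792)    # 1904 bp section upstream of exon 2
DOWN_FLANK = (98250919, 98251928)  # 1010 bp section downstream of exon 10

gene_length = interval_length(*GENE)
# Utp15 is on the reverse strand, so the upstream flank has the higher coordinates.
between_flanks = interval_length(DOWN_FLANK[1] + 1, UP_FLANK[0] - 1)

print("gene span:", gene_length, "bp")
print("flanks inside gene span:", contains(GENE, UP_FLANK), contains(GENE, DOWN_FLANK))
print("interval between flanks:", between_flanks, "bp")
```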

Chromosome 13 - NC_000079.6


Transcript information: This gene has 3 transcripts

Gene: Utp15 ENSMUSG00000041747

Description: UTP15 small subunit processome component [Source:MGI Symbol;Acc:MGI:2145443]
Location: Chromosome 13: 98,246,845-98,263,041 reverse strand. GRCm38:CM001006.2
About this gene: This gene has 3 transcripts (splice variants), 198 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Utp15-201  ENSMUST00000040972.3  4120  528aa       ENSMUSP00000048204.2  Protein coding   CCDS26712  Q8C7V3   TSL:1 GENCODE basic APPRIS P1
Utp15-202  ENSMUST00000224842.1  2328  No protein  -                     Retained intron  -          -        -
Utp15-203  ENSMUST00000225100.1  634   No protein  -                     Retained intron  -          -        -

Figure: Ensembl genome browser view of a 36.20 kb region of chromosome 13 (98.24Mb to 98.27Mb). Forward strand: Ankra2-201 to Ankra2-208 (protein coding, except Ankra2-206 and Ankra2-207, which are retained intron). Contig: AC123056.4. Reverse strand: Utp15-201 (protein coding), Utp15-202 and Utp15-203 (retained intron). Regulatory Build legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript).


Transcript: ENSMUST00000040972

Figure: Utp15-201 (protein coding) on the reverse strand, 16.20 kb, with protein annotations for ENSMUSP00000048... on a scale bar from 0 to 528.

MobiDB lite; Low complexity (Seg)
Superfamily: WD40-repeat-containing domain superfamily
SMART: WD40 repeat
Prints: G-protein beta WD-40 repeat
Pfam: WD40 repeat; U3 small nucleolar RNA-associated protein 15, C-terminal
PROSITE profiles: WD40-repeat-containing domain; WD40 repeat
PROSITE patterns: WD40 repeat, conserved site
PANTHER: PTHR19924:SF30; PTHR19924
Gene3D: WD40/YVTN repeat-like-containing domain superfamily
CDD: cd00200
Sequence variants (dbSNP and all other sources): missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
