https://www.alphaknockout.com

Mouse Uty Knockout Project (CRISPR/Cas9)

Objective: To create a Uty knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Uty gene (NCBI Reference Sequence: NM_009484; Ensembl: ENSMUSG00000068457) is located on mouse chromosome Y. Twenty-seven exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 27 (Transcript: ENSMUST00000069309). Exons 6~7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mutant male mice hemizygous for a gene-trapped allele are viable and fertile.

Exon 6 starts at about 11.8% of the coding region. Exons 6~7 cover 4.84% of the coding region. The size of the effective KO region is ~7991 bp. The KO region does not contain any other known gene.
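The percentages above can be turned into approximate base-pair figures. The sketch below is only a rough check under stated assumptions: it takes the 1212 aa protein length listed for ENSMUST00000069309 in the transcript table, assumes the coding region is that many codons plus one stop codon, and rounds to the nearest base. The names `fraction_to_bp`, `exon6_start_offset` and `deleted_coding_bp` are illustrative, not part of any tool used for this project.

```python
# Rough arithmetic check; all values derived from rounded percentages.
PROTEIN_LENGTH_AA = 1212  # Uty-201 (ENSMUST00000069309), from the transcript table

# Assumption: coding region = one codon per residue plus one stop codon.
CDS_LENGTH_BP = (PROTEIN_LENGTH_AA + 1) * 3


def fraction_to_bp(fraction_percent, cds_length=CDS_LENGTH_BP):
    """Convert a percentage of the coding region into an approximate bp count."""
    return round(cds_length * fraction_percent / 100)


# Approximate position in the coding region where exon 6 begins (11.8%).
exon6_start_offset = fraction_to_bp(11.8)

# Approximate coding length removed by deleting exons 6~7 (4.84%).
deleted_coding_bp = fraction_to_bp(4.84)

# A remainder other than 0 suggests the deletion shifts the reading frame.
frame_remainder = deleted_coding_bp % 3

print(CDS_LENGTH_BP, exon6_start_offset, deleted_coding_bp, frame_remainder)
```

Because the inputs are rounded percentages, the results are estimates; exact exon coordinates should be taken from the Ensembl transcript record before drawing conclusions about the reading frame.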


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1 to 27 of mouse Uty, with gRNA regions flanking exons 6 and 7. Legend: exon of mouse Uty; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the upstream sequence, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 6 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
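A self-alignment of this kind can be sketched as follows. This is a simplified illustration, not the tool used for the report: it assumes exact matching of 15 bp windows on the forward strand only, whereas the actual dot plot also shows reverse-complement matches and may use a different scoring scheme. The function name `self_dot_plot` is hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs with i < j whose window-length
    substrings starting at i and j are identical (forward strand only)."""
    seq = seq.upper()
    positions = defaultdict(list)
    # Index every window by its sequence.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    hits = []
    # Any window sequence seen at more than one start is an off-diagonal dot.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# A 15 bp unit repeated in tandem produces a dot at offset 15.
unit = "ACGTACGGTCATGCA"
print(self_dot_plot(unit * 2))
```

Runs of dots parallel to the main diagonal indicate tandem repeats; a sequence with no repeated 15 bp window returns an empty list.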

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the downstream sequence, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 7 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(32.35% 647) | C(15.25% 305) | T(34.35% 687) | G(18.05% 361)

Note: The 2000 bp section upstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(31.5% 630) | C(13.55% 271) | T(39.35% 787) | G(15.6% 312)

Note: The 2000 bp section downstream of Exon 7 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
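The overall GC content follows directly from the base counts in the two summaries, and a windowed profile can be computed in the same way. The sketch below is illustrative only: the function names are hypothetical, and the one-base step between windows is an assumption, since the report states only the 300 bp window size.

```python
def gc_percent_from_counts(a, c, g, t):
    """Overall GC percentage from per-base counts."""
    total = a + c + g + t
    return 100.0 * (g + c) / total


def sliding_gc(seq, window=300):
    """GC percentage for every window of the given size, stepping one base
    at a time (the step size is an assumption, not stated in the report)."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Base counts taken from the two summaries above.
upstream_gc = gc_percent_from_counts(a=647, c=305, g=361, t=687)
downstream_gc = gc_percent_from_counts(a=630, c=271, g=312, t=787)
print(round(upstream_gc, 2), round(downstream_gc, 2))
```

With the reported counts this gives roughly 33.3% GC upstream and 29.15% GC downstream, consistent with the notes that neither flank contains a high-GC region overall.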


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chrY   -       1197327    1199326    2000
YourSeq  121    99     292   2000   82.1%     chr14  +       118818925  118819130  206
YourSeq  118    99     285   2000   83.3%     chrX   -       168414227  168414425  199
YourSeq  115    32     362   2000   80.2%     chr15  -       44605904   44606246   343
YourSeq  111    121    295   2000   84.7%     chr16  -       88030100   88030302   203
YourSeq  110    125    365   2000   87.3%     chr12  +       98892591   98892852   262
YourSeq  105    121    286   2000   83.3%     chr4   +       68377844   68378021   178
YourSeq  102    108    292   2000   79.6%     chr2   -       171944386  171944581  196
YourSeq  101    99     286   2000   87.0%     chr7   +       4031388    4031587    200
YourSeq  100    33     379   2000   81.3%     chr3   +       81476130   81476509   380
YourSeq  96     108    287   2000   84.2%     chr17  +       49843240   49843414   175
YourSeq  96     21     282   2000   87.7%     chr15  +       32416337   32416612   276
YourSeq  90     86     292   2000   91.0%     chr1   -       77752297   77752516   220
YourSeq  89     108    291   2000   87.4%     chr8   -       78962301   78962485   185
YourSeq  81     99     253   2000   77.2%     chr2   -       166547488  166547629  142
YourSeq  81     111    365   2000   88.6%     chr8   +       109971420  109971681  262
YourSeq  81     125    291   2000   88.6%     chr1   +       166777418  166777589  172
YourSeq  80     125    292   2000   76.2%     chrX   +       94994811   94994975   165
YourSeq  80     140    360   2000   90.9%     chr10  +       42413964   42414214   251
YourSeq  78     32     286   2000   81.0%     chr3   +       34158228   34158478   251

Note: The 2000 bp section upstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chrY   -       1187336    1189335    2000
YourSeq  158    618    865   2000   88.7%     chr10  -       78407548   78408102   555
YourSeq  149    617    845   2000   90.0%     chr9   +       30874428   30874661   234
YourSeq  144    226    845   2000   79.6%     chr5   -       134324774  134325193  420
YourSeq  136    232    790   2000   85.1%     chr12  +       85718291   85718847   557
YourSeq  135    662    867   2000   85.2%     chr7   +       130551356  130551544  189
YourSeq  133    660    873   2000   86.5%     chr14  +       89105346   89105569   224
YourSeq  127    665    845   2000   85.1%     chr1   -       105769144  105769310  167
YourSeq  126    660    870   2000   86.8%     chr11  -       98172867   98173082   216
YourSeq  125    661    845   2000   83.4%     chr9   +       54306076   54306243   168
YourSeq  125    235    790   2000   77.0%     chr4   +       133925957  133926255  299
YourSeq  124    614    785   2000   89.9%     chr14  +       94745519   94745691   173
YourSeq  124    660    845   2000   83.6%     chr11  +       118235320  118235489  170
YourSeq  123    660    845   2000   82.8%     chr8   +       119852539  119852707  169
YourSeq  122    660    845   2000   82.3%     chr11  +       101824445  101824609  165
YourSeq  121    663    1204  2000   79.3%     chr2   -       153589405  153589763  359
YourSeq  121    662    844   2000   82.5%     chr15  +       75930721   75930886   166
YourSeq  120    660    842   2000   82.4%     chr9   +       63878699   63878864   166
YourSeq  120    644    837   2000   83.2%     chr4   +       130535055  130535220  166
YourSeq  120    600    845   2000   77.7%     chr4   +       103649204  103649376  173

Note: The 2000 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
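To make the BLAT rows easier to inspect, each hit can be parsed into named fields and the fraction of the 2000 bp query it covers can be computed. This is a minimal sketch: the field names and the helpers `parse_blat_row` and `query_coverage` are hypothetical, the first START/END pair is assumed to refer to the query and the second to the genome, and the report does not state the threshold it uses for "significant" similarity. Rows may optionally carry the `browser details` link text from the BLAT web output.

```python
def parse_blat_row(row):
    """Parse one BLAT result row into a dict of typed fields."""
    fields = row.split()
    # Drop the leading "browser details" link text if present.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    (query, score, q_start, q_end, q_size,
     identity, chrom, strand, t_start, t_end, span) = fields
    return {
        "query": query,
        "score": int(score),
        "q_start": int(q_start),    # position in the query (assumed)
        "q_end": int(q_end),
        "q_size": int(q_size),
        "identity": float(identity.rstrip("%")),
        "chrom": chrom,
        "strand": strand,
        "t_start": int(t_start),    # position in the genome (assumed)
        "t_end": int(t_end),
        "span": int(span),
    }


def query_coverage(hit):
    """Fraction of the query covered by the aligned query interval."""
    return (hit["q_end"] - hit["q_start"] + 1) / hit["q_size"]


hit = parse_blat_row(
    "browser details YourSeq 121 99 292 2000 82.1% chr14 + 118818925 118819130 206"
)
print(hit["chrom"], hit["score"], query_coverage(hit))
```

For the upstream search, the best hit other than the chrY self-match spans query positions 99 to 292, which is under 10% of the 2000 bp query.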


Gene and information: Uty ubiquitously transcribed tetratricopeptide repeat gene, [ Mus musculus (house mouse) ] Gene ID: 22290, updated on 10-Oct-2019

Gene summary

Official Symbol: Uty (provided by MGI)
Official Full Name: ubiquitously transcribed tetratricopeptide repeat gene, Y chromosome (provided by MGI)
Primary source: MGI:MGI:894810
See related: Ensembl:ENSMUSG00000068457
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Hydb; mKIAA4057
Expression: Broad expression in CNS E18 (RPKM 1.8), CNS E14 (RPKM 1.7) and 18 other tissues

Genomic context

Location: Y; Y A1

Exon count: 33

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | Y | NC_000087.7 (1085187..1245773, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | Y | NC_000087.6 (433587..582181, complement)

[Figure: genomic context of Uty on Chromosome Y - NC_000087.7.]


Transcript information: This gene has 12 transcripts

Gene: Uty ENSMUSG00000068457

Description: ubiquitously transcribed tetratricopeptide repeat gene, Y chromosome [Source:MGI Symbol;Acc:MGI:894810]
Gene Synonyms: Hydb
Location: Chromosome Y: 1,096,861-1,245,759 reverse strand. GRCm38:CM001014.2
About this gene: This gene has 12 transcripts (splice variants), 139 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Uty-201 | ENSMUST00000069309.13 | 5208 | 1212aa | ENSMUSP00000070012.7 | Protein coding | CCDS41216 | G3X9F2 | TSL:1, GENCODE basic, APPRIS P3
Uty-204 | ENSMUST00000139365.7 | 4684 | 1211aa | ENSMUSP00000114752.1 | Protein coding | CCDS85834 | E9PWW8 | TSL:1, GENCODE basic, APPRIS ALT2
Uty-208 | ENSMUST00000154004.7 | 4267 | 1149aa | ENSMUSP00000114910.1 | Protein coding | CCDS85833 | Q3UQE8 | TSL:1, GENCODE basic, APPRIS ALT2
Uty-205 | ENSMUST00000143286.7 | 5083 | 1111aa | ENSMUSP00000115113.1 | Protein coding | - | E9Q504 | TSL:1, GENCODE basic, APPRIS ALT2
Uty-203 | ENSMUST00000137048.7 | 2012 | 646aa | ENSMUSP00000119406.1 | Protein coding | - | D3Z2A3 | CDS 3' incomplete, TSL:1
Uty-210 | ENSMUST00000154666.7 | 5185 | 144aa | ENSMUSP00000122818.1 | Nonsense mediated decay | - | D6RFT4 | TSL:1
Uty-206 | ENSMUST00000143958.7 | 2903 | 78aa | ENSMUSP00000120069.1 | Nonsense mediated decay | - | D6RJL0 | TSL:5
Uty-207 | ENSMUST00000150715.1 | 501 | 100aa | ENSMUSP00000116372.1 | Nonsense mediated decay | - | F7BBC1 | CDS 5' incomplete, TSL:3
Uty-212 | ENSMUST00000185837.1 | 2368 | No protein | - | Retained intron | - | - | TSL:NA
Uty-209 | ENSMUST00000154527.1 | 667 | No protein | - | Retained intron | - | - | TSL:3
Uty-202 | ENSMUST00000133976.1 | 587 | No protein | - | Retained intron | - | - | TSL:5
Uty-211 | ENSMUST00000157073.7 | 2148 | No protein | - | lncRNA | - | - | TSL:1


[Figure: Ensembl genome browser view of the Uty locus, 168.90 kb on chromosome Y (scale 1.10 Mb to 1.25 Mb), contigs AC148319.2 and AC145571.3. All transcripts lie on the reverse strand: Uty-201, Uty-203, Uty-204, Uty-205 and Uty-208 (protein coding); Uty-206, Uty-207 and Uty-210 (nonsense mediated decay); Uty-202, Uty-209 and Uty-212 (retained intron); Uty-211 (lncRNA). The Regulatory Build track is shown below the transcripts. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000069309

[Figure: protein domain view of Uty-201 (protein coding, reverse strand, 148.57 kb), with a scale bar from 0 to 1212 aa. Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily (Tetratricopeptide-like helical domain superfamily, SSF51197); SMART (Tetratricopeptide repeat, JmjC domain); Pfam (Tetratricopeptide repeat, JmjC domain); PROSITE profiles (Tetratricopeptide repeat-containing domain, JmjC domain, Tetratricopeptide repeat); PANTHER (PTHR14017, Histone demethylase UTY); Gene3D (Tetratricopeptide-like helical domain superfamily, 2.60.120.650, 2.10.110.20, 1.20.58.1370); sequence variants from dbSNP and all other sources. Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
