https://www.alphaknockout.com

Mouse Usp44 Knockout Project (CRISPR/Cas9)

Objective: To create a Usp44 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Usp44 gene (NCBI Reference Sequence: NM_001206851; Ensembl: ENSMUSG00000020020) is located on mouse chromosome 10. Six exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 6 (Transcript: ENSMUST00000216224). Exons 2~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit chromosomal instability, aneuploidy and increased tumor incidence.

Exon 2 starts from the coding region. Exons 2~3 cover 76.0% of the coding region. The effective KO region is ~3259 bp. The KO region contains no other known gene.

Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing exons 1, 2, 3 and 6 with a gRNA region on each side of the knockout region. Legend: exon of mouse Usp44; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence aligned with itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence aligned with itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
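The self-alignment described in the notes above can be sketched as follows. This is a minimal illustration only: it assumes exact matching of 15 bp windows, which may differ from the tool actually used to draw the dot plots, and the function names `reverse_complement`, `forward_dots` and `revcomp_dots` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def _index_windows(seq, window):
    """Map each window-length substring to the list of its start positions."""
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)
    return index


def forward_dots(seq, window=15):
    """Pairs (i, j), i < j, whose windows are identical (direct repeats)."""
    seq = seq.upper()
    dots = []
    for starts in _index_windows(seq, window).values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def revcomp_dots(seq, window=15):
    """Pairs (i, j) where window i equals the reverse complement of window j."""
    seq = seq.upper()
    index = _index_windows(seq, window)
    dots = []
    for j in range(len(seq) - window + 1):
        rc = reverse_complement(seq[j:j + window])
        for i in index.get(rc, []):
            dots.append((i, j))
    return sorted(dots)
```

Off-diagonal forward dots that line up into short parallel runs indicate tandem or direct repeats; a candidate gRNA site would then be checked against the positions covered by those runs.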

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(25.05% 501) | C(17.75% 355) | T(31.4% 628) | G(25.8% 516)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(25.3% 506) | C(21.75% 435) | T(28.2% 564) | G(24.75% 495)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
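The base counts in the two summary lines can be turned into overall GC percentages, and a sliding-window profile can be computed in the same way. The sketch below is illustrative only: the function names and the `step` parameter are assumptions, and the report does not state which threshold counts as a "significant high GC-content region".

```python
from collections import Counter


def base_composition(seq):
    """Return {base: (count, percent)} for A, C, G and T."""
    seq = seq.upper()
    counts = Counter(seq)
    n = len(seq)
    return {b: (counts[b], 100.0 * counts[b] / n) for b in "ACGT"}


def gc_windows(seq, window=300, step=1):
    """GC percentage of each window of length `window`, moved by `step`."""
    seq = seq.upper()
    profile = []
    for i in range(0, len(seq) - window + 1, step):
        w = seq[i:i + window]
        profile.append(100.0 * (w.count("G") + w.count("C")) / window)
    return profile


def gc_percent_from_counts(counts):
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


# Base counts copied from the two summary lines above.
upstream = {"A": 501, "C": 355, "T": 628, "G": 516}
downstream = {"A": 506, "C": 435, "T": 564, "G": 495}

gc_up = gc_percent_from_counts(upstream)      # (355 + 516) / 2000
gc_down = gc_percent_from_counts(downstream)  # (435 + 495) / 2000
```

With the counts above, the overall GC content works out to 43.55% upstream and 46.5% downstream, and both sets of counts sum to the stated 2000 bp.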

BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr10  +       93843691   93845690   2000
YourSeq  97     387     490   2000   97.2%     chr2   -       147292101  147292206  106
YourSeq  95     387     496   2000   94.4%     chr3   +       81823210   81823407   198
YourSeq  93     387     492   2000   95.3%     chr1   +       188793585  188793754  170
YourSeq  91     387     496   2000   94.3%     chr1   +       110573756  110573873  118
YourSeq  89     388     492   2000   93.3%     chr11  -       83555912   83556048   137
YourSeq  89     389     495   2000   94.0%     chr12  +       20310444   20310566   123
YourSeq  87     387     478   2000   99.0%     chr19  +       28135159   28135272   114
YourSeq  86     594     834   2000   85.0%     chr1   -       179272121  179272446  326
YourSeq  85     396     494   2000   95.8%     chr7   +       128199899  128200035  137
YourSeq  84     387     494   2000   95.8%     chr16  -       6068703    6068850    148
YourSeq  82     398     492   2000   96.7%     chr11  +       57119558   57119662   105
YourSeq  82     397     490   2000   96.7%     chr1   +       62690364   62690459   96
YourSeq  81     394     492   2000   87.0%     chr5   +       27724926   27725018   93
YourSeq  79     387     490   2000   86.6%     chr18  -       21428151   21428232   82
YourSeq  77     404     492   2000   96.5%     chr7   +       128199991  128200081  91
YourSeq  75     618     822   2000   79.1%     chr2   -       132599776  132599972  197
YourSeq  75     616     824   2000   73.8%     chr3   +       30956926   30957067   142
YourSeq  74     387     497   2000   82.8%     chr2   -       56287856   56287938   83
YourSeq  74     404     491   2000   95.1%     chr17  -       10726046   10726141   96

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr10  +       93850347   93852346   2000
YourSeq  247    1752    2000  2000   99.6%     chr10  +       93852303   93852551   249
YourSeq  112    266     1658  2000   89.6%     chr15  +       84919624   85277991   358368
YourSeq  84     468     1592  2000   92.8%     chr11  -       100937666  100953051  15386
YourSeq  80     207     523   2000   92.6%     chr2   -       154208287  154208738  452
YourSeq  74     1468    1598  2000   80.0%     chr1   -       74407859   74407988   130
YourSeq  72     1469    1657  2000   87.5%     chr5   +       134598815  134993450  394636
YourSeq  69     210     562   2000   93.7%     chr5   +       77236576   77237150   575
YourSeq  69     424     523   2000   90.7%     chr17  +       47614832   47615040   209
YourSeq  68     1468    1655  2000   82.0%     chr9   +       75627818   75628007   190
YourSeq  66     1465    1655  2000   91.3%     chr1   -       180245341  180627668  382328
YourSeq  66     431     523   2000   92.4%     chr8   +       75163484   75163580   97
YourSeq  65     1474    1654  2000   77.8%     chr14  -       48962322   48962502   181
YourSeq  65     424     520   2000   89.2%     chr12  -       44161753   44161852   100
YourSeq  65     208     304   2000   83.6%     chr11  +       79820806   79820902   97
YourSeq  64     1468    1653  2000   84.6%     chr16  -       16595815   16595998   184
YourSeq  64     1464    1592  2000   90.6%     chr1   +       39540708   39540835   128
YourSeq  63     208     288   2000   88.9%     chr17  -       47108326   47108406   81
YourSeq  63     1466    1641  2000   92.2%     chr1   -       156549905  156550311  407
YourSeq  63     431     524   2000   92.0%     chr11  +       101765544  101765901  358

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
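One way to read the BLAT results is to compare the best non-self hit against the full-length self hit. The sketch below parses a few rows copied from the upstream results; the `BlatHit` type and `parse_hit` helper are hypothetical, and no significance cutoff is implied because the report does not state one.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one whitespace-separated result row; text before 'YourSeq' is ignored."""
    fields = row.split()
    f = fields[fields.index("YourSeq") + 1:]
    return BlatHit(
        int(f[0]), int(f[1]), int(f[2]), int(f[3]),
        float(f[4].rstrip("%")), f[5], f[6],
        int(f[7]), int(f[8]), int(f[9]),
    )


# Three rows copied from the upstream BLAT results.
rows = [
    "browser details YourSeq 2000 1 2000 2000 100.0% chr10 + 93843691 93845690 2000",
    "browser details YourSeq 97 387 490 2000 97.2% chr2 - 147292101 147292206 106",
    "browser details YourSeq 95 387 496 2000 94.4% chr3 + 81823210 81823407 198",
]
hits = [parse_hit(r) for r in rows]

# The query matching its own genomic locus gives the top-scoring hit.
self_hit = max(hits, key=lambda h: h.score)
others = [h for h in hits if h is not self_hit]

# Best off-target score as a fraction of the query length.
best_other_fraction = max(h.score for h in others) / self_hit.q_size
```

For the upstream query, the best non-self hit scores 97 against a 2000 bp query, which is under 5% of the query length.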

Gene and protein information:
Usp44 ubiquitin specific peptidase 44 [Mus musculus (house mouse)]
Gene ID: 327799, updated on 24-Oct-2019

Gene summary

Official Symbol: Usp44 (provided by MGI)
Official Full Name: ubiquitin specific peptidase 44 (provided by MGI)
Primary source: MGI:MGI:3045318
See related: Ensembl:ENSMUSG00000020020
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: E430004F17Rik
Expression: Biased expression in testis adult (RPKM 24.5), CNS E11.5 (RPKM 1.5) and 1 other tissue

Genomic context

Location: 10; 10 C2
Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (93819956..93860210)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (93294300..93311283)

[Figure: genomic context of Usp44, Chromosome 10 - NC_000076.6]

Transcript information: This gene has 2 transcripts

Gene: Usp44 ENSMUSG00000020020

Description: ubiquitin specific peptidase 44 [Source:MGI Symbol;Acc:MGI:3045318]
Gene Synonyms: E430004F17Rik
Location: Chromosome 10: 93,831,555-93,858,088 forward strand. GRCm38:CM001003.2
About this gene: This gene has 2 transcripts (splice variants), 187 orthologues, 49 paralogues, is a member of 1 Ensembl protein family and is associated with 10 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype         CCDS  UniProt  Flags
Usp44-202  ENSMUST00000216224.1  2449  711aa       ENSMUSP00000149020.1  Protein coding  -     Q8C2S0   TSL:5 GENCODE basic APPRIS P1
Usp44-201  ENSMUST00000095333.5  2946  No protein  -                     lncRNA          -     -        TSL:1

[Figure: Ensembl region view, 46.53 kb, 93.83 Mb to 93.86 Mb.
Forward strand: Usp44-201 (lncRNA) and Usp44-202 (protein coding); contig AC124686.3.
Reverse strand: Metap2-203 (nonsense mediated decay); Metap2-201, Metap2-206, Metap2-210, Metap2-208 and Metap2-214 (protein coding); Metap2-216 and Metap2-207 (retained intron).
Regulatory Build legend: CTCF; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: Ensembl protein coding; merged Ensembl/Havana; processed transcript.]

Transcript: ENSMUST00000216224

[Figure: protein domain view of Usp44-202 (protein coding), 24.27 kb, forward strand; scale 0 to 711 aa. Annotation tracks:
MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
Superfamily: SSF57850; Papain-like cysteine peptidase superfamily
SMART: Zinc finger, UBP-type
Pfam: Zinc finger, UBP-type; Peptidase C19, ubiquitin carboxyl-terminal hydrolase
PROSITE profiles: Zinc finger, UBP-type; Ubiquitin specific protease domain
PROSITE patterns: Ubiquitin specific protease, conserved site (two sites)
PANTHER: PTHR21646:SF15; PTHR21646
Gene3D: Zinc finger, RING/FYVE/PHD-type; 3.90.70.10
Sequence variants (dbSNP and all other sources): inframe deletion; missense variant; synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
