https://www.alphaknockout.com

Mouse Ubac1 Knockout Project (CRISPR/Cas9)

Objective: To create a Ubac1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ubac1 gene (NCBI Reference Sequence: NM_133835; Ensembl: ENSMUSG00000036352) is located on mouse chromosome 2. Ten exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 10 (Transcript: ENSMUST00000036509). Exons 2 to 6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 11.33% of the coding region. Exons 2 to 6 cover 41.97% of the coding region. The size of the effective KO region is ~7556 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2, 3, 4, 5, 6 and 10 with two marked gRNA regions. Legend: exon of mouse Ubac1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot, labeled "Sequence 12". Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot, labeled "Sequence 12". Legend: Forward; Reverse Complement.]

Note: The 907 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
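The self-alignment described in the two dot plot notes can be sketched as a windowed comparison of a sequence against itself. This is only an illustration and not the tool used for this report: the function names, the exact-match criterion, and the toy sequence are assumptions. Only the 15 bp window size comes from the report.

```python
from collections import defaultdict

WINDOW = 15  # window size stated in the report


def reverse_complement(seq):
    """Reverse complement of a DNA string; unknown bases become N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp.get(base, "N") for base in reversed(seq))


def self_dot_plot(seq, window=WINDOW):
    """Return (forward, reverse) lists of (i, j) dots for seq against itself.

    forward: the window at i equals the window at j, with j > i
             (the trivial main diagonal i == j is skipped).
    reverse: the window at i equals the reverse complement of the window at j.
    Exact matching is an assumption; real tools may allow mismatches.
    """
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(len(seq) - window + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy input (hypothetical): a 20 bp unit repeated twice in tandem.
unit = "ACGGTCATTGCAGGATCCAG"
toy = "TTTTT" + unit * 2 + "CCCCC"
forward_dots, reverse_dots = self_dot_plot(toy)
print(forward_dots)
```

Off-diagonal forward dots lying on a line parallel to the main diagonal, as in the toy input, are the signature of a tandem repeat. An empty off-diagonal corresponds to the "no significant tandem repeat" case.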


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot, labeled "Sequence 12".]

Summary: Full Length(2000bp) | A(22.3% 446) | C(21.2% 424) | T(31.7% 634) | G(24.8% 496)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot, labeled "Sequence 12".]

Summary: Full Length(907bp) | A(20.18% 183) | C(19.96% 181) | T(34.73% 315) | G(25.14% 228)

Note: The 907 bp section downstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
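The overall GC content implied by the two base-composition summaries can be checked directly, and the windowed analysis can be sketched in the same way. The function names and the one-base step are assumptions. The report states only the 300 bp window size and the base counts.

```python
def gc_percent_from_counts(counts):
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each window; the step size is an assumption."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return values


# Base counts copied from the two Summary lines of the report.
upstream = {"A": 446, "C": 424, "T": 634, "G": 496}    # 2000 bp
downstream = {"A": 183, "C": 181, "T": 315, "G": 228}  # 907 bp

print(round(gc_percent_from_counts(upstream), 2))    # 46.0
print(round(gc_percent_from_counts(downstream), 2))  # 45.09
```

Both flanks come out at roughly 45 to 46% GC overall, which is consistent with the notes that no significant high GC-content region is found. The per-window profile itself would require the underlying sequences, which the report does not include.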


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr2   -        26016417   26018416  2000
YourSeq     37   1610  1651   2000     97.5%  chr19  -        15190611   15190653    43
YourSeq     31    273   304   2000    100.0%  chr14  -        26561902   26562056   155
YourSeq     27   1722  1752   2000     93.6%  chr4   -       143078301  143078331    31
YourSeq     24    526   549   2000    100.0%  chr14  -        13714115   13714138    24
YourSeq     24   1723  1755   2000     88.5%  chr7   +       135603645  135603676    32
YourSeq     24    526   551   2000     96.2%  chr1   +        16973269   16973294    26
YourSeq     23    874   896   2000    100.0%  chr1   +        87249808   87249830    23
YourSeq     22    548   569   2000    100.0%  chr16  +        57281165   57281186    22
YourSeq     22    526   549   2000     95.9%  chr1   +       101948359  101948382    24
YourSeq     21     39    59   2000    100.0%  chr7   -        14306306   14306326    21
YourSeq     21    927   953   2000     88.9%  chr1   +        21234410   21234436    27

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
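One way to read the BLAT rows is to separate the self-hit (the query aligning to its own locus) from the remaining hits and look at the best of those. The sketch below does this for the first three upstream rows. The `BlatHit` type and `parse_hit` helper are hypothetical names, and the report does not state what threshold it uses for "significant", so none is applied here.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one whitespace-separated result row (without the query name)."""
    score, qs, qe, qsize, ident, chrom, strand, ts, te, span = row.split()
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


# First three rows of the upstream BLAT table, copied from the report.
rows = [
    "2000 1 2000 2000 100.0% chr2 - 26016417 26018416 2000",
    "37 1610 1651 2000 97.5% chr19 - 15190611 15190653 43",
    "31 273 304 2000 100.0% chr14 - 26561902 26562056 155",
]
hits = [parse_hit(row) for row in rows]

self_hit = max(hits, key=lambda h: h.score)        # full-length match to chr2
others = [h for h in hits if h is not self_hit]
best_other = max(others, key=lambda h: h.score)

fraction = 100.0 * best_other.score / best_other.q_size
print(best_other.chrom, best_other.score, f"{fraction:.2f}% of query length")
```

For the upstream flank the best non-self hit scores 37 against a 2000 bp query, which is the kind of short, scattered match the note treats as not significant.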

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq    907      1   907    907    100.0%  chr2   -        26007954   26008860   907
YourSeq     60     72   207    907     75.3%  chr3   -        83701537   83701687   151
YourSeq     56    143   359    907     96.8%  chr1   +        87363344   87363574   231
YourSeq     55    130   210    907     88.8%  chr11  -        61762562   61762906   345
YourSeq     54     90   207    907     92.4%  chr2   +       112194959  112195080   122
YourSeq     53     80   223    907     82.3%  chr1   -        72109290   72109427   138
YourSeq     51    112   185    907     85.2%  chr2   -        32661295   32661369    75
YourSeq     51    102   344    907     93.3%  chr13  +        48184238   48184528   291
YourSeq     50    143   439    907     62.8%  chr7   -        12484589   12484782   194
YourSeq     50     90   166    907     88.3%  chr5   +        22728064   22728153    90
YourSeq     48     75   211    907     87.5%  chr11  +        79209585   79209777   193
YourSeq     45    334   379    907    100.0%  chr1   +        85967956   85968004    49
YourSeq     43     96   166    907     86.0%  chr4   -       109480278  109480346    69
YourSeq     42     78   168    907     91.9%  chr6   +        98903351   98903466   116
YourSeq     41     80   195    907     95.6%  chr11  -        20279715   20279867   153
YourSeq     41    148   218    907     79.8%  chr11  +        38576285   38576366    82
YourSeq     40     95   207    907     88.7%  chr5   -       131365971  131366172   202
YourSeq     40    115   174    907     88.9%  chr2   +        90944720   90944778    59
YourSeq     39    328   366    907    100.0%  chrX   -        74964223   74964261    39
YourSeq     39    124   211    907     77.8%  chr8   -        26606326   26606412    87

Note: The 907 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
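The self-hits of the two flanking sequences give their genomic positions on chr2, and the gap between them can be compared with the stated KO region size. The short check below assumes the displayed coordinates are inclusive, which is consistent with each SPAN value equalling END minus START plus one. The function name is hypothetical.

```python
def gap_between(first, second):
    """Bases strictly between two inclusive, non-overlapping intervals."""
    (a_start, a_end), (b_start, b_end) = sorted([first, second])
    if b_start <= a_end:
        raise ValueError("intervals overlap")
    return b_start - a_end - 1


# chr2 coordinates of the two self-hits, copied from the BLAT tables.
# Ubac1 is on the reverse strand, so the upstream flank has the higher values.
upstream_flank = (26016417, 26018416)    # 2000 bp upstream of exon 2
downstream_flank = (26007954, 26008860)  # 907 bp downstream of exon 6

print(gap_between(upstream_flank, downstream_flank))  # 7556
```

The result, 7556 bp, matches the effective KO region size given in the strategy summary.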


Gene and information: Ubac1 ubiquitin associated domain containing 1 [ Mus musculus (house mouse) ] Gene ID: 98766, updated on 12-Aug-2019

Gene summary

Official Symbol: Ubac1 (provided by MGI)
Official Full Name: ubiquitin associated domain containing 1 (provided by MGI)
Primary source: MGI:MGI:1920995
See related: Ensembl:ENSMUSG00000036352
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Kpc2; GBDR1; Gdbr1; Ubadc1; AA407978; 1110033G07Rik
Expression: Broad expression in liver E14.5 (RPKM 72.0), liver E14 (RPKM 70.3) and 27 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 A3
Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (25996958..26021781, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (25852478..25877280, complement)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 9 transcripts

Gene: Ubac1 ENSMUSG00000036352

Description: ubiquitin associated domain containing 1 [Source:MGI Symbol;Acc:MGI:1920995]
Gene Synonyms: 1110033G07Rik, Kpc2, Ubadc1
Location: Chromosome 2: 25,998,543-26,021,747 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 9 transcripts (splice variants), 193 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts:

Name       Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Ubac1-201  ENSMUST00000036509.13  1852  409aa       ENSMUSP00000040220.7  Protein coding  CCDS15795  Q8VDI7   TSL:1 GENCODE basic APPRIS P2
Ubac1-204  ENSMUST00000136750.2   778   260aa       ENSMUSP00000123115.1  Protein coding  -          F6UMH9   CDS 5' and 3' incomplete TSL:5 APPRIS ALT2
Ubac1-205  ENSMUST00000146363.7   551   184aa       ENSMUSP00000117683.1  Protein coding  -          F6Z1E7   CDS 5' and 3' incomplete TSL:5
Ubac1-202  ENSMUST00000123275.7   814   No protein  -                     lncRNA          -          -        TSL:1
Ubac1-209  ENSMUST00000154336.1   786   No protein  -                     lncRNA          -          -        TSL:3
Ubac1-208  ENSMUST00000150608.1   658   No protein  -                     lncRNA          -          -        TSL:2
Ubac1-207  ENSMUST00000148725.1   610   No protein  -                     lncRNA          -          -        TSL:3
Ubac1-206  ENSMUST00000146898.1   426   No protein  -                     lncRNA          -          -        TSL:5
Ubac1-203  ENSMUST00000134990.7   405   No protein  -                     lncRNA          -          -        TSL:2


[Figure: Ensembl genome browser view of a 43.20 kb region of chromosome 2, 25.99 Mb to 26.03 Mb. Contigs: AL731682.20, AL845455.7. Reverse-strand features (comprehensive set): Gm13542-201 (processed pseudogene); Ubac1-201, Ubac1-204 and Ubac1-205 (protein coding); Ubac1-202, Ubac1-203, Ubac1-206, Ubac1-207, Ubac1-208 and Ubac1-209 (lncRNA). Regulatory Build track legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: merged Ensembl/Havana and Ensembl protein coding; pseudogene; RNA gene.]


Transcript: ENSMUST00000036509

[Figure: protein domain and variant view of Ubac1-201 (protein coding, reverse strand, 23.20 kb), with a scale bar from 0 to 409 aa. Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily (Ubiquitin-like domain superfamily, UBA-like superfamily); SMART (Ubiquitin-associated domain, Heat shock chaperonin-binding); Pfam (Ubiquitin-associated domain); PROSITE profiles (Ubiquitin-associated domain); PANTHER (PTHR46738); Gene3D (1.10.8.10, 1.10.260.100); CDD (cd17066; UBAC1, UBA domain 1; UBAC1, UBA domain 2). Sequence variants from dbSNP and all other sources. Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
