https://www.alphaknockout.com

Mouse Ubac1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Ubac1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ubac1 gene (NCBI Reference Sequence: NM_133835; Ensembl: ENSMUSG00000036352) is located on mouse chromosome 2. Ten exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 10 (Transcript: ENSMUST00000036509). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Ubac1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-111B22 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 11.33% of the coding region. The knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 4980 bp, and the size of intron 2 for 3'-loxP site insertion is 1330 bp. The size of the effective cKO region is ~621 bp. The cKO region does not contain any other known gene.
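The arithmetic behind the note can be sketched as follows. This is a minimal illustration, not the vendor's pipeline: it assumes the 11.33% figure is measured against a coding sequence of 409 codons (the protein length listed for Ubac1-201, stop codon excluded), and the exon lengths passed to `deletion_shifts_frame` are hypothetical, since the report does not state the length of exon 2.

```python
def cds_offset_nt(fraction, cds_length_nt):
    """Approximate nucleotide offset into the CDS for a fractional position."""
    return round(fraction * cds_length_nt)


def deletion_shifts_frame(deleted_coding_nt):
    """A deletion shifts the reading frame unless its coding length is a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# Assumption: CDS length taken as 409 aa x 3 nt, stop codon excluded.
CDS_LENGTH_NT = 409 * 3

offset = cds_offset_nt(0.1133, CDS_LENGTH_NT)
first_affected_codon = offset // 3 + 1  # 1-based codon index

print(f"Exon 2 begins about {offset} nt into the CDS (codon ~{first_affected_codon})")

# Hypothetical exon lengths, for illustration only.
for hypothetical_len in (99, 100):
    print(hypothetical_len, "nt deleted -> frameshift:", deletion_shifts_frame(hypothetical_len))
```

Because the deletion begins early in the coding sequence, a frameshift at this position alters most of the downstream protein sequence, which is the rationale given for choosing exon 2.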


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy showing the wildtype allele (exons 1, 2, 3 ... 10) with the 5' and 3' gRNA regions, the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Ubac1, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

[Figure: dot plot of Sequence 12 against itself, forward and reverse complement. Window size: 10 bp.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
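A self dot plot of this kind can be sketched in a few lines. The code below is an illustrative approximation that assumes exact word matching; the tool that produced the plot in this report is not named, and the function names and toy sequence are hypothetical. It records a dot wherever a window-length word at position i of one sequence equals the word at position j of the other. Runs of off-diagonal dots close to the main diagonal indicate tandem repeats.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def word_matches(a, b, window=10):
    """Return sorted (i, j) pairs where a[i:i+window] == b[j:j+window]."""
    a, b = a.upper(), b.upper()
    index = {}
    for j in range(len(b) - window + 1):
        index.setdefault(b[j:j + window], []).append(j)
    dots = []
    for i in range(len(a) - window + 1):
        for j in index.get(a[i:i + window], []):
            dots.append((i, j))
    return sorted(dots)


# Toy sequence (hypothetical) containing a short CAT tandem repeat.
toy = "GATTACAGGC" + "CAT" * 6 + "TGCAGTCCAA"

# Forward plot: keep dots above the main diagonal (i < j).
forward = [(i, j) for i, j in word_matches(toy, toy, window=6) if i < j]

# Reverse-complement plot: compare the sequence with its reverse complement.
reverse = word_matches(toy, reverse_complement(toy), window=6)

print(len(forward), "off-diagonal forward dots;", len(reverse), "reverse-complement dots")
```

For the 7121 bp sequence analysed here, the same functions would be called with `window=10` to match the window size stated for the plot.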

Overview of the GC Content Distribution

[Figure: GC content distribution along Sequence 12. Window size: 300 bp.]

Summary: Full Length(7121bp) | A(23.66% 1685) | C(21.57% 1536) | T(30.96% 2205) | G(23.8% 1695)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
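The overall GC content implied by the summary line, and a windowed profile like the one plotted, can be reproduced with a short sketch. The base counts are taken from the summary above; the `gc_windows` helper and its non-overlapping default step are assumptions, since the report does not say how the 300 bp windows were advanced.

```python
# Base counts copied from the summary line of this report.
counts = {"A": 1685, "C": 1536, "T": 2205, "G": 1695}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, overall GC = {gc_percent:.2f}%")


def gc_windows(seq, window=300, step=None):
    """GC percentage for each full window of the sequence.

    step defaults to the window size (non-overlapping windows); this is
    an assumption, not a documented setting of the original analysis.
    """
    step = step or window
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append(100 * gc / window)
    return result


# Toy usage with a hypothetical 8 bp sequence and 4 bp windows.
print(gc_windows("GGGGAAAA", window=4))
```

The counts sum to the stated full length of 7121 bp, and the overall GC content works out to roughly 45%, consistent with the note that no high-GC region is present.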


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       26016667   26019666   3000
YourSeq  175    560    766   3000   92.4%     chr18  +       65005893   65006095   203
YourSeq  171    558    748   3000   95.3%     chr3   -       121405687  121406320  634
YourSeq  171    560    746   3000   96.8%     chr16  +       71104337   71104534   198
YourSeq  168    560    748   3000   94.7%     chr4   +       132560909  132561099  191
YourSeq  167    562    747   3000   94.0%     chr11  +       82796648   82796831   184
YourSeq  165    570    747   3000   96.7%     chr9   -       58508908   58509087   180
YourSeq  165    560    747   3000   94.2%     chr5   -       147150116  147150305  190
YourSeq  164    563    747   3000   95.2%     chr9   +       102868282  102868484  203
YourSeq  164    550    745   3000   92.0%     chr4   +       152320629  152320821  193
YourSeq  164    560    747   3000   94.2%     chr13  +       54476720   54476917   198
YourSeq  162    563    760   3000   90.0%     chr19  -       45333872   45334064   193
YourSeq  162    561    749   3000   93.7%     chr15  -       102048489  102048692  204
YourSeq  162    560    750   3000   91.4%     chr14  -       64241016   64241203   188
YourSeq  162    560    747   3000   95.1%     chr5   +       88771590   88771781   192
YourSeq  162    556    746   3000   94.6%     chr19  +       42065430   42066035   606
YourSeq  161    178    750   3000   81.7%     chr16  -       33807455   33807653   199
YourSeq  161    561    749   3000   95.0%     chr12  -       62571227   62571427   201
YourSeq  161    560    747   3000   93.1%     chr18  +       67998309   67998498   190
YourSeq  160    560    746   3000   93.1%     chr9   -       26501269   26501457   189

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       26013046   26016045   3000
YourSeq  247    479    2415  3000   91.9%     chr1   +       16088435   16303696   215262
YourSeq  221    493    2422  3000   91.2%     chr1   +       93703319   94066075   362757
YourSeq  133    480    636   3000   93.0%     chr1   +       97835235   97835398   164
YourSeq  131    491    1797  3000   77.3%     chr1   -       191174476  191174872  397
YourSeq  129    479    616   3000   97.8%     chr12  -       104879189  104879329  141
YourSeq  127    479    622   3000   95.8%     chr11  -       89113086   89113231   146
YourSeq  126    479    619   3000   96.4%     chr11  -       69403179   69403322   144
YourSeq  126    475    614   3000   96.4%     chr11  +       75660595   75660753   159
YourSeq  122    475    614   3000   94.9%     chr13  -       109713569  109713712  144
YourSeq  121    483    622   3000   94.9%     chr13  +       58502975   58503114   140
YourSeq  120    482    622   3000   94.2%     chr12  -       84922886   84923026   141
YourSeq  120    476    614   3000   93.5%     chr11  -       77589691   77589831   141
YourSeq  120    484    619   3000   92.2%     chr8   +       127797538  127797668  131
YourSeq  120    480    615   3000   95.5%     chr12  +       76157667   76157802   136
YourSeq  120    479    615   3000   95.6%     chr11  +       116316560  116316699  140
YourSeq  120    479    615   3000   94.9%     chr1   +       36008435   36008574   140
YourSeq  119    486    615   3000   96.9%     chr11  -       20040850   20040981   132
YourSeq  119    482    619   3000   93.5%     chr1   -       107533251  107533391  141
YourSeq  119    481    615   3000   95.5%     chr1   +       132686372  132686507  136

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
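BLAT result rows of the kind listed in this section can be checked programmatically. The sketch below is an assumption-laden illustration: the field names in `BlatHit` are chosen here for readability (the BLAT output itself reuses START and END for both query and target coordinates), and "self hit" is defined as a full-score, 100% identity match. It parses three rows copied from the upstream results and reports the best non-self score relative to the query size.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one whitespace-separated BLAT summary row into a BlatHit."""
    f = row.split()
    if len(f) != 11:
        raise ValueError(f"expected 11 fields, got {len(f)}: {row!r}")
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_hit(hit):
    """Assumed definition: the query matching its own locus end to end."""
    return hit.score == hit.q_size and hit.identity == 100.0


# Rows copied from the upstream BLAT results in this report.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr2 - 26016667 26019666 3000",
    "YourSeq 175 560 766 3000 92.4% chr18 + 65005893 65006095 203",
    "YourSeq 161 178 750 3000 81.7% chr16 - 33807455 33807653 199",
]

hits = [parse_hit(r) for r in rows]
others = [h for h in hits if not is_self_hit(h)]
best = max(others, key=lambda h: h.score)
best_fraction = 100 * best.score / best.q_size

print(f"best non-self hit: {best.chrom}, score {best.score} "
      f"({best_fraction:.1f}% of query size)")
```

A best secondary score that is a small fraction of the 3000 bp query is in line with the note that no significant similarity is found for the upstream arm.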


Gene and information: Ubac1 ubiquitin associated domain containing 1 [ Mus musculus (house mouse) ] Gene ID: 98766, updated on 12-Aug-2019

Gene summary

Official Symbol: Ubac1 (provided by MGI)
Official Full Name: ubiquitin associated domain containing 1 (provided by MGI)
Primary source: MGI:MGI:1920995
See related: Ensembl:ENSMUSG00000036352
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Kpc2; GBDR1; Gdbr1; Ubadc1; AA407978; 1110033G07Rik
Expression: Broad expression in liver E14.5 (RPKM 72.0), liver E14 (RPKM 70.3) and 27 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 A3

Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (25996958..26021781, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (25852478..25877280, complement)

[Figure: genomic context of Ubac1 on Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 9 transcripts

Gene: Ubac1 ENSMUSG00000036352

Description: ubiquitin associated domain containing 1 [Source:MGI Symbol;Acc:MGI:1920995]
Gene Synonyms: 1110033G07Rik, Kpc2, Ubadc1
Location: Chromosome 2: 25,998,543-26,021,747 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 9 transcripts (splice variants), 193 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Ubac1-201  ENSMUST00000036509.13  1852  409aa       ENSMUSP00000040220.7  Protein coding  CCDS15795  Q8VDI7   TSL:1 GENCODE basic APPRIS P2
Ubac1-204  ENSMUST00000136750.2   778   260aa       ENSMUSP00000123115.1  Protein coding  -          F6UMH9   CDS 5' and 3' incomplete TSL:5 APPRIS ALT2
Ubac1-205  ENSMUST00000146363.7   551   184aa       ENSMUSP00000117683.1  Protein coding  -          F6Z1E7   CDS 5' and 3' incomplete TSL:5
Ubac1-202  ENSMUST00000123275.7   814   No protein  -                     lncRNA          -          -        TSL:1
Ubac1-209  ENSMUST00000154336.1   786   No protein  -                     lncRNA          -          -        TSL:3
Ubac1-208  ENSMUST00000150608.1   658   No protein  -                     lncRNA          -          -        TSL:2
Ubac1-207  ENSMUST00000148725.1   610   No protein  -                     lncRNA          -          -        TSL:3
Ubac1-206  ENSMUST00000146898.1   426   No protein  -                     lncRNA          -          -        TSL:5
Ubac1-203  ENSMUST00000134990.7   405   No protein  -                     lncRNA          -          -        TSL:2


[Figure: Ensembl region view, 43.20 kb, spanning 25.99 Mb to 26.03 Mb. Tracks: contigs AL731682.20 and AL845455.7; Gm13542-201 (processed pseudogene); Ubac1-201, Ubac1-204 and Ubac1-205 (protein coding); Ubac1-202, Ubac1-203, Ubac1-206, Ubac1-207, Ubac1-208 and Ubac1-209 (lncRNA); Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (pseudogene, RNA gene).]


Transcript: ENSMUST00000036509

[Figure: protein domain view of Ubac1-201 (protein coding; reverse strand, 23.20 kb) for ENSMUSP00000040..., scale 0 to 409 aa. Annotation tracks: MobiDB lite; low complexity (Seg); Superfamily (Ubiquitin-like domain superfamily, UBA-like superfamily); SMART (Ubiquitin-associated domain, Heat shock chaperonin-binding); Pfam (Ubiquitin-associated domain); PROSITE profiles (Ubiquitin-associated domain); PANTHER (PTHR46738); Gene3D (1.10.8.10, 1.10.260.100); CDD (cd17066; UBAC1, UBA domain 1; UBAC1, UBA domain 2); sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
