https://www.alphaknockout.com

Mouse Uevld Knockout Project (CRISPR/Cas9)

Objective: To create a Uevld knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Uevld gene (NCBI Reference Sequence: NM_001040695; Ensembl: ENSMUSG00000043262) is located on mouse chromosome 7. Twelve exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000094398). Exons 2~5 are selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 3.04% of the coding region. Exons 2~5 cover 31.92% of the coding region. The size of the effective KO region is ~7790 bp. The KO region does not contain any other known gene.
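The percentages in the note can be converted into approximate base counts. The following is a minimal sketch, assuming the coding region equals the 471 aa protein listed for ENSMUST00000094398 plus one stop codon; the report does not state the coding length directly, so treat the result as an estimate.

```python
# Rough arithmetic on the coding-region percentages quoted in the note.
# Assumption (not stated in the report): CDS length = 471 codons for the
# protein of ENSMUST00000094398 plus 1 stop codon.
PROTEIN_AA = 471
cds_bp = (PROTEIN_AA + 1) * 3            # 1416 bp under this assumption

exon2_start_frac = 0.0304                # exon 2 starts at about 3.04%
ko_coding_frac = 0.3192                  # exons 2~5 cover 31.92%

coding_bp_before_exon2 = round(exon2_start_frac * cds_bp)
coding_bp_deleted = round(ko_coding_frac * cds_bp)

# A deleted coding length that is not a multiple of 3 shifts the reading frame.
causes_frameshift = coding_bp_deleted % 3 != 0

print(cds_bp, coding_bp_before_exon2, coding_bp_deleted, causes_frameshift)
```

Under that assumption the deletion removes roughly 452 coding bases, which is not divisible by 3, so the remaining transcript would be expected to be out of frame after exon 1.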


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele, 5' to 3', showing exons 1 to 12 with a gRNA region on each side of the knockout region (exons 2~5). Legend: exon of mouse Uevld; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
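The self-alignment described in the note can be sketched as follows. This is an illustrative implementation, not the tool used to produce the report: it records a dot wherever two 15 bp windows of the same sequence are identical, and it covers the forward strand only. The function name and the toy sequence are hypothetical.

```python
from collections import defaultdict

def self_dot_plot(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return (i, j) pairs with i < j where the windows starting at i and j match exactly."""
    seq = seq.upper()
    positions = defaultdict(list)
    # Index every window by its sequence so identical windows share a key.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)

# Toy input (hypothetical, not the Uevld flank): a 20 bp unit repeated in tandem.
unit = "ACGTTGCAAGCTTAGGCTAA"
toy = "TTTTT" + unit * 2 + "GGGGG"
dots = self_dot_plot(toy, window=15)

# Off-diagonal dots at a constant offset indicate a tandem repeat of that period.
offsets = {j - i for i, j in dots}
print(len(dots), offsets)
```

In a real dot plot the main diagonal (i equal to j) is always present; the sketch omits it and reports only the off-diagonal matches that reveal repeats.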

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: forward; reverse complement.]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(26.5% 530) | C(21.5% 430) | T(28.55% 571) | G(23.45% 469)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
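A windowed GC profile of this kind can be computed with a running count. The sketch below assumes a window of 300 bp sliding one base at a time; the step size, the function name, and the toy sequence are assumptions, since the report gives only the window size.

```python
def gc_windows(seq: str, window: int = 300) -> list[float]:
    """GC fraction of every window of length `window`, sliding by one base."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [1 if base in ("G", "C") else 0 for base in seq]
    count = sum(is_gc[:window])
    fractions = [count / window]
    for start in range(1, len(seq) - window + 1):
        # Update the running count: add the base entering, drop the base leaving.
        count += is_gc[start + window - 1] - is_gc[start - 1]
        fractions.append(count / window)
    return fractions

# Toy input (hypothetical): an AT-only half followed by a GC-only half.
toy = "AT" * 200 + "GC" * 200
profile = gc_windows(toy, window=300)
print(len(profile), profile[0], profile[-1])
```

A region would be flagged as GC-rich when its window values exceed a chosen cutoff; the report does not state the cutoff it uses.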

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(24.45% 489) | C(20.95% 419) | T(33.95% 679) | G(20.65% 413)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
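The overall GC content of each flank follows directly from the base counts in the two summary lines. A short check, using only numbers copied from the report (the helper name is hypothetical):

```python
# Base counts copied from the two "Summary" lines of the report.
upstream = {"A": 530, "C": 430, "T": 571, "G": 469}
downstream = {"A": 489, "C": 419, "T": 679, "G": 413}

def gc_percent(counts: dict[str, int]) -> float:
    """Percentage of G and C among all counted bases."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total

for name, counts in (("upstream", upstream), ("downstream", downstream)):
    print(name, sum(counts.values()), round(gc_percent(counts), 2))
```

Both flanks sum to 2000 bp, with overall GC content of about 44.95% upstream and 41.6% downstream.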


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START     END       SPAN
-------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   -       46955703  46957702  2000
YourSeq  200    102    930   2000   88.5%     chr15  -       98875047  99367991  492945
YourSeq  124    802    1016  2000   92.5%     chr16  -       41796982  41797560  579
YourSeq  122    802    977   2000   91.3%     chr13  -       51659166  51659632  467
YourSeq  118    775    956   2000   88.9%     chr12  -       67589969  67590166  198
YourSeq  116    802    956   2000   91.5%     chr14  -       20002338  20002506  169
YourSeq  112    802    954   2000   84.1%     chr2   -       72215338  72215477  140
YourSeq  112    798    956   2000   84.1%     chr15  +       54828539  54828684  146
YourSeq  111    802    952   2000   91.3%     chr14  -       29117953  29118116  164
YourSeq  111    802    955   2000   86.2%     chr11  +       97649822  97649960  139
YourSeq  111    802    955   2000   86.3%     chr1   +       4812210   4812348   139
YourSeq  110    802    955   2000   86.3%     chr4   +       90080399  90080542  144
YourSeq  109    803    952   2000   91.0%     chr3   -       68581072  68581225  154
YourSeq  109    802    952   2000   84.5%     chr16  -       33524399  33524537  139
YourSeq  109    789    953   2000   88.7%     chr16  +       27202249  27202435  187
YourSeq  108    802    948   2000   91.0%     chr11  -       87194864  87195010  147
YourSeq  107    806    950   2000   91.1%     chr19  -       45243656  45243808  153
YourSeq  107    802    955   2000   86.4%     chr15  -       85094434  85094572  139
YourSeq  106    802    957   2000   84.4%     chr10  -       60481091  60481227  137
YourSeq  106    789    948   2000   90.1%     chr14  +       28818019  28818194  176

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the target locus itself on chr7, no significant similarity is found.
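One way to read the table is to compare each secondary hit's score with the query size. The sketch below parses three rows copied from the upstream table and applies a cutoff of 50% of the query size; that cutoff, the `BlatHit` type, and the helper names are assumptions for illustration, since the report does not state how significance was judged.

```python
from typing import NamedTuple

class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

def parse_hit(row: str) -> BlatHit:
    """Parse one result row; only the last 11 whitespace-separated fields are used."""
    f = row.split()[-11:]            # f[0] is the query name (YourSeq)
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))

# First three rows of the upstream table, copied from the report.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr7 - 46955703 46957702 2000",
    "YourSeq 200 102 930 2000 88.5% chr15 - 98875047 99367991 492945",
    "YourSeq 124 802 1016 2000 92.5% chr16 - 41796982 41797560 579",
]
hits = [parse_hit(r) for r in rows]

def significant_secondary(hits: list[BlatHit], min_frac: float = 0.5) -> list[BlatHit]:
    """Hits after the first whose score reaches min_frac of the query size (assumed cutoff)."""
    return [h for h in hits[1:] if h.score / h.q_size >= min_frac]

print(hits[0].chrom, significant_secondary(hits))
```

With this cutoff the strongest secondary hit (score 200 of 2000) falls well below the threshold, which is consistent with the note.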

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   -       46945913   46947912   2000
YourSeq  183    751    1603  2000   78.2%     chr1   +       91408274   91408617   344
YourSeq  161    1272   1658  2000   89.7%     chr4   +       132150031  132561099  411069
YourSeq  160    1230   1653  2000   86.5%     chr18  -       77485942   77486286   345
YourSeq  156    748    920   2000   95.9%     chr15  -       103270736  103270924  189
YourSeq  150    733    917   2000   91.3%     chr2   -       32782004   32782191   188
YourSeq  148    748    917   2000   95.8%     chr1   +       157759476  157759648  173
YourSeq  145    761    917   2000   96.9%     chr5   +       120215116  120215308  193
YourSeq  144    747    917   2000   90.6%     chr13  +       42876881   42877042   162
YourSeq  143    1466   1653  2000   92.0%     chr1   -       80075289   80075483   195
YourSeq  141    1270   1973  2000   93.3%     chr8   +       122605989  122606735  747
YourSeq  140    764    920   2000   95.6%     chr17  -       67856619   67856779   161
YourSeq  140    750    920   2000   90.9%     chr16  +       35999708   35999877   170
YourSeq  139    766    920   2000   95.5%     chr1   -       74117131   74117290   160
YourSeq  139    749    924   2000   87.4%     chr15  +       17084979   17085145   167
YourSeq  138    763    917   2000   94.9%     chr3   -       96861280   96861442   163
YourSeq  138    780    958   2000   91.4%     chr14  -       57918468   57918637   170
YourSeq  138    1460   1649  2000   89.8%     chr5   +       145119983  145120178  196
YourSeq  137    1373   1654  2000   92.6%     chr15  -       71037391   71037842   452
YourSeq  136    1464   1658  2000   87.3%     chr10  -       21907538   21907727   190

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. Apart from the target locus itself on chr7, no significant similarity is found.
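The top hit in each BLAT table gives the genomic position of the corresponding 2000 bp flank, so the interval between the two flanks can be computed directly. A minimal sketch, assuming the flanks sit immediately next to the targeted interval (variable names are illustrative):

```python
# Genomic positions of the two 2000 bp flanks, taken from the top BLAT hit
# in each table (both on chr7, minus strand).
upstream_flank = (46955703, 46957702)     # 2000 bp upstream of exon 2
downstream_flank = (46945913, 46947912)   # 2000 bp downstream of exon 5

# The gene lies on the minus strand, so the upstream flank has the larger coordinates.
region_start = downstream_flank[1] + 1
region_end = upstream_flank[0] - 1
region_bp = region_end - region_start + 1

print(f"chr7:{region_start}-{region_end} ({region_bp} bp)")
```

The interval comes to 7790 bp, which agrees with the effective KO region size quoted in the strategy summary.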


Gene and information: Uevld UEV and lactate/malate dehyrogenase domains [ Mus musculus (house mouse) ] Gene ID: 54122, updated on 12-Aug-2019

Gene summary

Official Symbol: Uevld (provided by MGI)
Official Full Name: UEV and lactate/malate dehyrogenase domains (provided by MGI)
Primary source: MGI:MGI:1860490
See related: Ensembl:ENSMUSG00000043262
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Attp; 8430408E05Rik
Expression: Ubiquitous expression in placenta adult (RPKM 4.6), bladder adult (RPKM 3.5) and 28 other tissues
Orthologs: human; all

Genomic context

Location: 7; 7 B3
Exon count: 14

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 7 | NC_000073.6 (46923216..46958532, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 7 | NC_000073.5 (54178586..54213888, complement)

Chromosome 7 - NC_000073.6


Transcript information: This gene has 6 transcripts

Gene: Uevld ENSMUSG00000043262

Description: UEV and lactate/malate dehyrogenase domains [Source:MGI Symbol;Acc:MGI:1860490]
Gene Synonyms: 8430408E05Rik, Attp
Location: Chromosome 7: 46,923,216-46,958,527 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 6 transcripts (splice variants), 148 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 6 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Uevld-201 | ENSMUST00000094398.12 | 4616 | 471aa | ENSMUSP00000091964.4 | Protein coding | CCDS39965 | Q3U1V6 | TSL:1, GENCODE basic, APPRIS P1
Uevld-203 | ENSMUST00000207986.1 | 4425 | 250aa | ENSMUSP00000146930.1 | Nonsense mediated decay | - | Q3U1V6 | TSL:1
Uevld-204 | ENSMUST00000208308.1 | 532 | 46aa | ENSMUSP00000146385.1 | Nonsense mediated decay | - | A0A140LHE8 | TSL:5
Uevld-206 | ENSMUST00000210227.1 | 478 | 73aa | ENSMUSP00000147932.1 | Nonsense mediated decay | - | A0A1B0GSG9 | TSL:5
Uevld-202 | ENSMUST00000207738.1 | 3226 | No protein | - | Retained intron | - | - | TSL:1
Uevld-205 | ENSMUST00000208388.1 | 801 | No protein | - | Retained intron | - | - | TSL:1


[Figure: Ensembl genomic view of a 55.31 kb region of chromosome 7 (46.92 Mb to 46.96 Mb). Forward strand: Gm45628-201 and Gm19248-201 (processed pseudogenes). Reverse strand: contig AC090123.68; Tsg101-201, -203, -205 and -207 (protein coding), Tsg101-208 (nonsense mediated decay), Tsg101-209 (lncRNA); Gm45737-201 (TEC); Uevld-201 (protein coding), Uevld-203, -204 and -206 (nonsense mediated decay), Uevld-202 and -205 (retained intron); C86187-201, -202 and -203 (lncRNA). Regulatory Build legend: CTCF; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (pseudogene; processed transcript; RNA gene).]


Transcript: ENSMUST00000094398

[Figure: Uevld-201 (protein coding), reverse strand, 35.31 kb, with domain annotations for protein ENSMUSP00000091... on a scale of 0 to 471 aa.
Superfamily: Ubiquitin-conjugating enzyme/RWD-like; NAD(P)-binding domain superfamily; Lactate dehydrogenase/glycoside hydrolase, family 4, C-terminal.
Prints: L-lactate/malate dehydrogenase.
Pfam: Ubiquitin E2 variant, N-terminal; Lactate/malate dehydrogenase, N-terminal; Lactate/malate dehydrogenase, C-terminal.
PROSITE profiles: Ubiquitin E2 variant, N-terminal.
PANTHER: PTHR23306; PTHR23306:SF18.
Gene3D: Ubiquitin-conjugating enzyme/RWD-like; 3.40.50.720; Lactate dehydrogenase/glycoside hydrolase, family 4, C-terminal.
Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
