https://www.alphaknockout.com

Mouse Vars2 Knockout Project (CRISPR/Cas9)

Objective: To create a Vars2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Vars2 gene (NCBI Reference Sequence: NM_175137; Ensembl: ENSMUSG00000038838) is located on mouse chromosome 17. It has 29 exons, with the ATG start codon in exon 1 and the TGA stop codon in exon 29 (Transcript: ENSMUST00000043674). Exons 7 to 25 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 7 starts at about 21.13% of the coding region. Exons 7 to 25 cover 63.05% of the coding region. The size of the effective KO region is ~6721 bp. The KO region does not contain any other known gene.
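The percentages in the note can be turned into approximate base-pair figures. The sketch below is only an estimate: it assumes the coding sequence is 3 x (1060 + 1) bp long, i.e. the 1060 aa protein reported for ENSMUST00000043674 plus one stop codon. The report does not say whether its percentages count the stop codon, and the function name is hypothetical.

```python
# Rough conversion of the reported coding-region percentages into base pairs.
# Assumption (not stated in the report): CDS length = 3 * (protein length + 1 stop codon).

PROTEIN_LENGTH_AA = 1060  # protein length listed for ENSMUST00000043674


def cds_fraction_to_bp(percent: float, protein_length_aa: int = PROTEIN_LENGTH_AA) -> int:
    """Return the approximate number of coding bases for a given percentage."""
    cds_length_bp = 3 * (protein_length_aa + 1)
    return round(percent / 100 * cds_length_bp)


# Approximate coding bases that lie before exon 7 (21.13% of the coding region).
bases_before_exon7 = cds_fraction_to_bp(21.13)

# Approximate coding bases carried by exons 7 to 25 (63.05% of the coding region).
bases_in_target = cds_fraction_to_bp(63.05)

print("Approx. coding bases before exon 7:", bases_before_exon7)
print("Approx. coding bases in exons 7 to 25:", bases_in_target)
```

These figures describe coding bases only; the ~6721 bp KO region is larger because it also includes the introns between the targeted exons.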

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 7 to 25 and 29 of mouse Vars2, with a gRNA region on each side of the targeted exons. Legend: exon of mouse Vars2; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-comparison dot plot of the upstream sequence. Legend: Forward, Reverse Complement.]

Note: The 933 bp section upstream of exon 7 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
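A self-comparison of this kind can be sketched as follows. This is a minimal illustration that reports exact 15 bp window matches only; the tool used for the report is not named and may score windows differently. The function names are hypothetical.

```python
# Minimal self-comparison dot plot based on exact window (k-mer) matches.
# Assumption: a "dot" is an exact match between two windows of the given size.
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) lists of window start positions (i, j).

    forward: windows at i and j (i < j) are identical (off-diagonal dots).
    reverse: the window at i equals the reverse complement of the window at j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for start in range(len(seq) - window + 1):
        index[seq[start:start + window]].append(start)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Every pair of distinct positions sharing a window is a forward dot.
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                forward.append((positions[a], positions[b]))
        # Windows whose reverse complement also occurs give reverse dots.
        for i in positions:
            for j in index.get(reverse_complement(kmer), []):
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)
```

A sequence with no tandem repeat yields an empty forward list, which corresponds to a dot plot showing only the main diagonal.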

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-comparison dot plot of the downstream sequence. Legend: Forward, Reverse Complement.]

Note: The 813 bp section downstream of exon 25 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(933bp) | A(20.58% 192) | C(23.15% 216) | T(33.44% 312) | G(22.83% 213)

Note: The 933 bp section upstream of exon 7 is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
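The overall GC content follows directly from the base counts in the summary lines, and a windowed profile can be computed the same way. The sketch below uses the counts reported for both flanks; the sliding-window function is a generic illustration with a hypothetical name, since the report does not describe how its plot was produced.

```python
# GC content from reported base counts, plus a generic sliding-window profile.
from typing import Dict, List

UPSTREAM_COUNTS = {"A": 192, "C": 216, "T": 312, "G": 213}    # 933 bp flank
DOWNSTREAM_COUNTS = {"A": 189, "C": 191, "T": 217, "G": 216}  # 813 bp flank


def gc_percent_from_counts(counts: Dict[str, int]) -> float:
    """Return overall GC content (%) from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """Return GC content (%) for each window along the sequence."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    return [
        100.0 * sum(base in "GC" for base in seq[start:start + window]) / window
        for start in range(0, len(seq) - window + 1, step)
    ]


print(f"Upstream GC:   {gc_percent_from_counts(UPSTREAM_COUNTS):.2f}%")
print(f"Downstream GC: {gc_percent_from_counts(DOWNSTREAM_COUNTS):.2f}%")
```

With the reported counts, the upstream flank comes out at about 46% GC and the downstream flank at about 50% GC.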

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(813bp) | A(23.25% 189) | C(23.49% 191) | T(26.69% 217) | G(26.57% 216)

Note: The 813 bp section downstream of exon 25 is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  933    1       933   933    100.0%    chr17  -       35664853   35665785   933
YourSeq  54     384     440   933    100.0%    chr8   -       53087417   53087539   123
YourSeq  46     380     428   933    98.0%     chr1   +       98786274   98786330   57
YourSeq  45     370     427   933    81.5%     chr1   -       130816275  130816328  54
YourSeq  41     384     427   933    97.8%     chr1   -       11830016   11830071   56
YourSeq  39     400     440   933    97.6%     chr11  +       67493628   67493668   41
YourSeq  36     404     440   933    100.0%    chr3   -       122631481  122631533  53
YourSeq  31     380     414   933    94.3%     chr1   -       160331756  160331790  35
YourSeq  30     611     649   933    78.8%     chr5   +       129410623  129410656  34
YourSeq  30     628     698   933    94.2%     chr1   +       119512774  119512846  73
YourSeq  29     612     647   933    96.9%     chr15  -       29018841   29018890   50
YourSeq  27     621     650   933    85.8%     chr2   -       166537571  166537598  28
YourSeq  26     624     649   933    100.0%    chr2   -       168265787  168265812  26
YourSeq  26     624     649   933    100.0%    chr10  +       119387954  119387979  26
YourSeq  26     624     649   933    100.0%    chr10  +       116801005  116801030  26
YourSeq  26     623     649   933    100.0%    chr1   +       166931824  166931856  33
YourSeq  25     624     650   933    100.0%    chr7   -       141030894  141030928  35
YourSeq  25     624     649   933    100.0%    chr18  -       56096361   56096388   28
YourSeq  25     623     647   933    100.0%    chr14  +       37814348   37814372   25
YourSeq  25     623     648   933    100.0%    chr12  +       112138674  112138714  41

Note: The 933 bp section upstream of exon 7 is searched against the genome with BLAT. Apart from the full-length self-match on chr17, no significant similarity is found.
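One way to check a result like this programmatically is to parse each row and list any hit, other than the full-length self-match, that exceeds a score cutoff. The sketch below uses the first three rows of the upstream results. The cutoff of 100 is a hypothetical value chosen for illustration, since the report does not state what it treats as significant, and the type and function names are likewise hypothetical.

```python
# Parse BLAT result rows and flag secondary hits above a score cutoff.
# Assumption: min_score=100 is an illustrative threshold, not the report's criterion.
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT row (query name first)."""
    fields = row.split()
    # Drop the "browser details" link text if it was copied with the row.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    _query, score, q_start, q_end, q_size, identity, chrom, strand, t_start, t_end, span = fields
    return BlatHit(int(score), int(q_start), int(q_end), int(q_size),
                   float(identity.rstrip("%")), chrom, strand,
                   int(t_start), int(t_end), int(span))


def is_self_match(hit: BlatHit) -> bool:
    """True for a full-length, 100% identity match of the query to the genome."""
    return hit.q_start == 1 and hit.q_end == hit.q_size and hit.identity == 100.0


def secondary_hits(hits: List[BlatHit], min_score: int = 100) -> List[BlatHit]:
    """Return hits other than the self-match that reach the score cutoff."""
    return [h for h in hits if not is_self_match(h) and h.score >= min_score]


rows = [
    "YourSeq 933 1 933 933 100.0% chr17 - 35664853 35665785 933",
    "browser details YourSeq 54 384 440 933 100.0% chr8 - 53087417 53087539 123",
    "YourSeq 46 380 428 933 98.0% chr1 + 98786274 98786330 57",
]
hits = [parse_blat_row(r) for r in rows]
print(secondary_hits(hits))
```

With this cutoff the sample rows yield no secondary hits, since the best non-self score among them is 54 out of a possible 933.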

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  813    1       813   813    100.0%    chr17  -       35657319   35658131   813
YourSeq  66     200     371   813    88.1%     chr11  -       4156882    4157170    289
YourSeq  62     340     469   813    89.8%     chr2   +       91245301   91245490   190
YourSeq  59     159     238   813    92.8%     chr13  -       52638785   52638895   111
YourSeq  55     341     502   813    89.9%     chr14  -       65854586   65854942   357
YourSeq  52     189     357   813    94.9%     chr4   -       3515343    3515604    262
YourSeq  51     402     672   813    78.0%     chr13  +       65467351   65467604   254
YourSeq  50     169     355   813    93.2%     chr10  +       42541099   42541342   244
YourSeq  46     338     398   813    91.1%     chr2   +       154383525  154383588  64
YourSeq  46     338     500   813    92.6%     chr15  +       51468009   51468299   291
YourSeq  45     338     482   813    88.0%     chr2   +       73771155   73771345   191
YourSeq  43     335     397   813    89.6%     chr12  +       83946045   83946105   61
YourSeq  42     337     396   813    92.0%     chr10  -       91064428   91064494   67
YourSeq  42     303     431   813    89.2%     chr1   -       59794755   59794881   127
YourSeq  42     159     225   813    95.7%     chr8   +       20412319   20412386   68
YourSeq  40     167     238   813    97.7%     chr8   -       111181487  111181590  104
YourSeq  40     337     500   813    91.7%     chr10  -       62277016   62277220   205
YourSeq  40     337     390   813    93.5%     chr2   +       24767113   24767166   54
YourSeq  39     159     210   813    97.6%     chr5   -       143395176  143395446  271
YourSeq  39     303     383   813    90.7%     chr12  -       87394143   87394222   80

Note: The 813 bp section downstream of exon 25 is searched against the genome with BLAT. Apart from the full-length self-match on chr17, no significant similarity is found.
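The chr17 self-matches in the two BLAT result sets also give the genomic positions of both flanks, so the interval between them can be computed. The sketch below assumes that the two flanks directly abut the deleted interval, which the report does not state explicitly. Because Vars2 lies on the reverse strand, the upstream flank has the higher coordinates. The function names are hypothetical.

```python
# Size of the interval between the two BLAT-mapped flanks on chr17.
# Assumption: the flanks directly border the knockout region.

UPSTREAM_FLANK = (35664853, 35665785)    # chr17, 933 bp section upstream of exon 7
DOWNSTREAM_FLANK = (35657319, 35658131)  # chr17, 813 bp section downstream of exon 25


def interval_length(start: int, end: int) -> int:
    """Length of a closed, 1-based interval."""
    return end - start + 1


def gap_between(lower: tuple, upper: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


# On the genome, the downstream flank is the lower-coordinate interval.
ko_size = gap_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK)

print("Upstream flank length:  ", interval_length(*UPSTREAM_FLANK))
print("Downstream flank length:", interval_length(*DOWNSTREAM_FLANK))
print("Interval between flanks:", ko_size)
```

Under that assumption the interval between the flanks is 6721 bp, which agrees with the effective KO region size given in the strategy summary.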

Gene and information:

Vars2 valyl-tRNA synthetase 2, mitochondrial [Mus musculus (house mouse)]
Gene ID: 68915, updated on 12-Aug-2019

Gene summary

Official Symbol: Vars2 (provided by MGI)
Official Full Name: valyl-tRNA synthetase 2, mitochondrial (provided by MGI)
Primary source: MGI:MGI:1916165
See related: Ensembl:ENSMUSG00000038838
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Vars2l; mKIAA1885; 1190004I24Rik
Expression: Ubiquitous expression in thymus adult (RPKM 24.0), stomach adult (RPKM 15.8) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 17; 17 B1
Exon count: 29

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (35655634..35667639, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (35792579..35804537, complement)

[Figure: genomic context of Vars2 on Chromosome 17 - NC_000083.6]

Transcript information: This gene has 12 transcripts

Gene: Vars2 ENSMUSG00000038838

Description: valyl-tRNA synthetase 2, mitochondrial [Source:MGI Symbol;Acc:MGI:1916165]
Gene Synonyms: 1190004I24Rik, Vars2l
Location: Chromosome 17: 35,655,634-35,667,592 reverse strand. GRCm38:CM001010.2
About this gene: This gene has 12 transcripts (splice variants), 174 orthologues, 7 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Vars2-201  ENSMUST00000043674.14  4425  1060aa      ENSMUSP00000047917.8  Protein coding           CCDS37602  Q3U2A8   TSL:1, GENCODE basic, APPRIS P1
Vars2-208  ENSMUST00000168922.7   2507  581aa       ENSMUSP00000129196.1  Nonsense mediated decay  -          F6YJY2   CDS 5' incomplete, TSL:1
Vars2-203  ENSMUST00000164404.7   1972  215aa       ENSMUSP00000126084.1  Nonsense mediated decay  -          F7A8L1   CDS 5' incomplete, TSL:5
Vars2-209  ENSMUST00000169093.1   832   112aa       ENSMUSP00000126794.1  Nonsense mediated decay  -          F6Z939   CDS 5' incomplete, TSL:5
Vars2-205  ENSMUST00000165144.1   718   75aa        ENSMUSP00000132897.1  Nonsense mediated decay  -          E9PWQ5   TSL:3
Vars2-206  ENSMUST00000165787.7   3803  No protein  -                     Retained intron          -          -        TSL:1
Vars2-211  ENSMUST00000171536.7   786   No protein  -                     Retained intron          -          -        TSL:2
Vars2-202  ENSMUST00000164295.7   722   No protein  -                     Retained intron          -          -        TSL:2
Vars2-207  ENSMUST00000168885.1   708   No protein  -                     Retained intron          -          -        TSL:5
Vars2-204  ENSMUST00000164978.1   618   No protein  -                     Retained intron          -          -        TSL:2
Vars2-210  ENSMUST00000170701.1   603   No protein  -                     Retained intron          -          -        TSL:3
Vars2-212  ENSMUST00000174129.7   319   No protein  -                     Retained intron          -          -        TSL:2

[Figure: Ensembl genome browser view of a 31.96 kb region of chromosome 17 (35.65 Mb to 35.67 Mb).
Forward strand: Sfta2-201 and Sfta2-202 (protein coding), Sfta2-203 (lncRNA), Gm20483-201 (lncRNA).
Contigs: CR974473.23, CR974483.16.
Reverse strand: Vars2-201 to Vars2-212 and Gtf2h4-201 to Gtf2h4-212.
Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank.
Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana), Non-Protein Coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000043674

[Figure: protein domain and variant view of Vars2-201 (protein coding), reverse strand, 11.96 kb. Tracks for ENSMUSP00000047...:]

MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
TIGRFAM: Valine-tRNA ligase
Superfamily: SSF52374; Aminoacyl-tRNA synthetase, class Ia, anticodon-binding; Valyl/Leucyl/Isoleucyl-tRNA synthetase, editing domain
Prints: Valine-tRNA ligase
Pfam: Aminoacyl-tRNA synthetase, class Ia; Methionyl/Valyl/Leucyl/Isoleucyl-tRNA synthetase, anticodon-binding
PROSITE patterns: Aminoacyl-tRNA synthetase, class I, conserved site
PANTHER: PTHR11946:SF71; PTHR11946
Gene3D: Rossmann-like alpha/beta/alpha sandwich fold; 1.10.730.10; Valyl/Leucyl/Isoleucyl-tRNA synthetase, editing domain
CDD: cd00817; Valyl tRNA synthetase, anticodon-binding domain
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant Legend: frameshift variant; missense variant; synonymous variant
Scale bar: 0 to 1060 (amino acids)

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
