https://www.alphaknockout.com

Mouse Vps26a Knockout Project (CRISPR/Cas9)

Objective: To create a Vps26a knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Vps26a gene (NCBI Reference Sequence: NM_001113355; Ensembl: ENSMUSG00000020078) is located on mouse chromosome 10. Nine exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 9 (Transcript: ENSMUST00000092473). Exons 2~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygotes for a null mutation induced by transgene insertion exhibit retarded growth of the embryonic ectoderm beginning at embryonic day 7.5 and, often, defects of the amnion and chorion. Mutant embryos arrest at about day 9.5.

Exon 2 starts from about 9.29% of the coding region. Exons 2~3 cover 20.98% of the coding region. The size of the effective KO region is ~9648 bp. The KO region does not contain any other known gene.
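The percentages above can be converted into approximate base counts. The sketch below assumes they are relative to the coding sequence of ENSMUST00000092473 without its stop codon, i.e. 359 codons (the 359 aa listed for this transcript in the transcript table of this report). That denominator is an assumption made for illustration; the report does not state how the percentages were computed.

```python
# Assumption (not stated in the report): percentages refer to the coding
# sequence of ENSMUST00000092473 excluding the stop codon, 359 codons.
PROTEIN_LENGTH_AA = 359
CDS_LENGTH_BP = PROTEIN_LENGTH_AA * 3  # 1077 bp


def percent_to_bases(percent, total=CDS_LENGTH_BP):
    """Convert a percentage of the coding region to a rounded base count."""
    return round(percent / 100 * total)


# Coding bases that lie before exon 2 (exon 2 starts at about 9.29%).
coding_bases_before_exon2 = percent_to_bases(9.29)

# Coding bases carried by exons 2~3 (20.98% of the coding region).
coding_bases_in_exons_2_3 = percent_to_bases(20.98)

# A remainder other than 0 means the deletion is not a whole number of codons.
frame_remainder = coding_bases_in_exons_2_3 % 3

print(coding_bases_before_exon2, coding_bases_in_exons_2_3, frame_remainder)
```

Under this assumption, exons 2~3 carry about 226 coding bases, which is not a multiple of 3, so deleting them would be expected to shift the reading frame of the remaining exons. Because the inputs are rounded percentages, these figures should be checked against the annotated exon coordinates before being relied on.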


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 2, 3 and 9 labeled and two gRNA regions marked. Legend: exon of mouse Vps26a; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream section aligned with itself. Plot label: Sequence 12. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream section aligned with itself. Plot label: Sequence 12. Legend: forward; reverse complement.]

Note: The 1002 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
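The self-alignment described in the two dot plot notes can be sketched as follows. This is a minimal illustration, not the tool used to produce the plots: it records exact matches of fixed-length windows on the forward strand only (the reverse-complement comparison is omitted), and the function names `self_dot_plot` and `repeat_offsets` are hypothetical. The default window of 15 bp matches the window size stated for the plots.

```python
from collections import Counter, defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs with i < j where the window-length
    substrings starting at i and j are identical (off-diagonal dots)."""
    seq = seq.upper()
    positions = defaultdict(list)
    for start in range(len(seq) - window + 1):
        positions[seq[start:start + window]].append(start)

    dots = []
    for starts in positions.values():
        # Every pair of occurrences of the same window is one dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def repeat_offsets(dots):
    """Count dots per diagonal offset (j - i). A heavily populated offset
    close to the main diagonal suggests a tandem repeat with that period."""
    return Counter(j - i for i, j in dots)


# Toy example: three tandem copies of a 7 bp unit between unique flanks.
toy = "AAACCC" + "GATTACA" * 3 + "TTTGGG"
toy_dots = self_dot_plot(toy, window=7)
print(repeat_offsets(toy_dots).most_common(1))
```

A sequence without repeats yields an empty list, which corresponds to a dot plot showing only the main diagonal.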


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream section. Plot label: Sequence 12.]

Summary: Full length (2000 bp) | A (24.35%, 487) | C (18.3%, 366) | T (35.05%, 701) | G (22.3%, 446)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream section. Plot label: Sequence 12.]

Summary: Full length (1002 bp) | A (30.84%, 309) | C (15.77%, 158) | T (34.13%, 342) | G (19.26%, 193)

Note: The 1002 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
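The overall GC content of each flanking section follows directly from the base counts in the two summaries, and a windowed profile like the plotted distribution can be computed with a sliding window. The sketch below is illustrative only: the function names are hypothetical, and the step size of the original plots is not given in this report, so `step=1` is an assumption.

```python
def gc_percent_from_counts(a, c, g, t):
    """Overall GC percentage from the four base counts."""
    total = a + c + g + t
    return 100 * (g + c) / total


def windowed_gc(seq, window=300, step=1):
    """GC percentage of each window of the given size along seq."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return values


# Base counts copied from the two summaries in this report.
gc_upstream = gc_percent_from_counts(a=487, c=366, g=446, t=701)
gc_downstream = gc_percent_from_counts(a=309, c=158, g=193, t=342)

print(round(gc_upstream, 2), round(gc_downstream, 2))
```

With the counts above, this gives about 40.6% GC for the 2000 bp upstream section and about 35.03% GC for the 1002 bp downstream section, consistent with the notes that neither section has a high-GC region.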


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr10  -       62480708   62482707   2000
YourSeq  184    999    1966  2000   91.9%     chr10  +       62393103   62480791   87689
YourSeq  146    1001   1175  2000   92.6%     chr4   -       155782639  155782825  187
YourSeq  145    995    1173  2000   91.1%     chr8   -       25421152   25421332   181
YourSeq  145    995    1170  2000   91.5%     chr10  -       52592333   52592508   176
YourSeq  141    988    1184  2000   89.9%     chr5   +       121495803  121496019  217
YourSeq  140    992    1170  2000   89.8%     chr10  -       43894305   43894484   180
YourSeq  138    1003   1177  2000   89.7%     chrX   -       113172603  113172778  176
YourSeq  138    974    1167  2000   88.9%     chr10  -       62551407   62551624   218
YourSeq  138    998    1169  2000   90.7%     chr2   +       120633846  120634020  175
YourSeq  137    995    1170  2000   91.1%     chr1   -       178242183  178463304  221122
YourSeq  137    999    1173  2000   89.8%     chr18  +       67694401   67694577   177
YourSeq  136    992    1167  2000   87.8%     chr5   -       86816061   86816234   174
YourSeq  136    992    1173  2000   87.9%     chr13  -       99288581   99288763   183
YourSeq  136    992    1170  2000   87.1%     chr1   -       79696082   79696256   175
YourSeq  135    999    1177  2000   91.0%     chr2   +       155455208  155455403  196
YourSeq  134    994    1170  2000   88.2%     chr2   +       179921391  179921570  180
YourSeq  134    1007   1171  2000   91.0%     chr2   +       128461808  128461973  166
YourSeq  133    999    1169  2000   89.5%     chr5   -       149023289  149023471  183
YourSeq  133    1007   1170  2000   90.9%     chr5   -       118650500  118650664  165

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------------
YourSeq  1002   1      1002  1002   100.0%    chr10  -       62470058   62471059   1002
YourSeq  23     468    493   1002   96.0%     chr1   -       127096709  127096738  30
YourSeq  23     395    420   1002   96.2%     chr1   -       9808105    9808131    27
YourSeq  23     118    140   1002   100.0%    chr12  +       76951531   76951553   23
YourSeq  22     87     108   1002   100.0%    chr2   -       140563585  140563606  22
YourSeq  21     686    706   1002   100.0%    chr13  -       107901651  107901671  21
YourSeq  20     516    535   1002   100.0%    chr10  -       109143517  109143536  20
YourSeq  20     405    424   1002   100.0%    chr10  -       73987845   73987864   20

Note: The 1002 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
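The 100% identity self-hits in the two BLAT tables give the genomic coordinates of both flanking sections on chr10 (minus strand), which allows a simple consistency check against the stated KO region size. The sketch below is an observation drawn from the numbers in this report rather than a documented step of the design workflow; the constant and function names are hypothetical.

```python
# Self-hit coordinates copied from the BLAT tables (chr10, minus strand).
UPSTREAM_FLANK = (62480708, 62482707)    # 2000 bp section upstream of exon 2
DOWNSTREAM_FLANK = (62470058, 62471059)  # 1002 bp section downstream of exon 3


def span_length(interval):
    """Length in bp of an inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1


def interval_between(lower, upper):
    """Inclusive interval lying strictly between two non-overlapping
    intervals, where `lower` has the smaller genomic coordinates."""
    start = lower[1] + 1
    end = upper[0] - 1
    return start, end, end - start + 1


# On the minus strand the downstream flank has the smaller coordinates.
ko_start, ko_end, ko_length = interval_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK)
print(ko_start, ko_end, ko_length)
```

The interval between the two flanks is chr10:62471060-62480707, which is 9648 bp long and agrees with the effective KO region size of ~9648 bp given in the strategy summary.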


Gene information: Vps26a, VPS26 complex component A [Mus musculus (house mouse)]
Gene ID: 30930, updated on 12-Aug-2019

Gene summary

Official Symbol: Vps26a (provided by MGI)
Official Full Name: VPS26 retromer complex component A (provided by MGI)
Primary source: MGI:MGI:1353654
See related: Ensembl:ENSMUSG00000020078
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: HB58; Vps26; AA407240
Expression: Broad expression in testis adult (RPKM 37.1), placenta adult (RPKM 23.8) and 21 other tissues
Orthologs: human; all

Genomic context

Location: 10; 10 B4
Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (62454843..62486805, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (61917591..61949553, complement)

Chromosome 10 - NC_000076.6


Transcript information: This gene has 4 transcripts

Gene: Vps26a ENSMUSG00000020078

Description: VPS26 retromer complex component A [Source:MGI Symbol;Acc:MGI:1353654]
Gene Synonyms: H beta 58, HB58, Vps26
Location: 62,455,235-62,486,805 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 4 transcripts (splice variants), 218 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 8 phenotypes.

Transcripts

Name        Transcript ID          bp    Protein  Translation ID        Biotype                  CCDS       UniProt     Flags
Vps26a-201  ENSMUST00000092473.4   2712  359aa    ENSMUSP00000090130.3  Protein coding           CCDS48579  P40336      TSL:1, GENCODE basic
Vps26a-202  ENSMUST00000105447.10  2597  327aa    ENSMUSP00000101087.3  Protein coding           CCDS35921  P40336      TSL:1, GENCODE basic, APPRIS P1
Vps26a-203  ENSMUST00000217868.1   456   123aa    ENSMUSP00000151888.1  Protein coding           -          A0A1W2P7Z9  CDS 3' incomplete, TSL:2
Vps26a-204  ENSMUST00000219574.1   2288  63aa     ENSMUSP00000151774.1  Nonsense mediated decay  -          A0A1W2P7R7  TSL:1

[Figure: Ensembl region view, 62.45 Mb to 62.49 Mb (51.57 kb). Forward strand: 4930507D05Rik-201 and 4930507D05Rik-202 (lncRNA). Contig: AC126428.3. Reverse strand: Supv3l1-201 (protein coding), Supv3l1-205 (nonsense mediated decay), Vps26a-201, Vps26a-202 and Vps26a-203 (protein coding), Vps26a-204 (nonsense mediated decay), Srgn-201, Srgn-204 and Srgn-205 (protein coding). Additional track: Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000092473

[Figure: protein view of Vps26a-201 (protein coding; reverse strand, 31.36 kb). Tracks listed: ENSMUSP00000090..., MobiDB lite, low complexity (Seg), Pfam, PANTHER, Gene3D, All sequence SNPs/i... (sequence variants, dbSNP and all other sources). Annotations shown: "Vacuolar protein sorting protein 26 related" (twice), "PTHR12233:SF4", and a truncated ", C-terminal" label. Variant legend: missense variant; synonymous variant. Scale bar: 0 to 359 in steps of 40.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
