https://www.alphaknockout.com

Mouse Vipas39 Knockout Project (CRISPR/Cas9)

Objective: To create a Vipas39 knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Vipas39 gene (NCBI Reference Sequence: NM_001142580; Ensembl: ENSMUSG00000021038) is located on mouse chromosome 12. Twenty exons have been identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 20 (Transcript: ENSMUST00000072744). Exons 2~5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a conditional allele activated by an inducible Cre exhibit dry and scaly skin, hair loss, and defects in tail tendon collagen I structure.

Exon 2 contains the start of the coding region (the ATG start codon). Exons 2~5 cover 25.53% of the coding region. The effective KO region is ~6583 bp and contains no other known gene.
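Because genotyping relies on PCR across the deleted interval, the expected band sizes follow from the deletion size. The sketch below is only an illustration: the primer offsets (300 bp and 250 bp outside the deletion boundaries) and the function name are hypothetical and are not part of this project's design. Founder animals may also carry deletions whose endpoints differ slightly from the planned ~6583 bp.

```python
KO_REGION_BP = 6583  # effective KO region size stated in the strategy


def expected_amplicons(fwd_offset_bp, rev_offset_bp, ko_bp=KO_REGION_BP):
    """Return (wildtype_bp, knockout_bp) product sizes for flanking primers.

    fwd_offset_bp / rev_offset_bp: distance from each primer's 5' end to the
    nearest deletion boundary (hypothetical values, chosen by the user).
    """
    ko_product = fwd_offset_bp + rev_offset_bp  # deletion removes ko_bp
    wt_product = ko_product + ko_bp             # intact allele keeps it
    return wt_product, ko_product


wt, ko = expected_amplicons(300, 250)
print(f"wild-type allele: {wt} bp, knockout allele: {ko} bp")
```

With a wild-type product this long, a standard PCR may amplify only the knockout band, which is one reason sequencing is used for confirmation.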


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1, 2, 3, 4, 5 and 20 labelled and the two gRNA regions marked. Legend: exon of mouse Vipas39; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence against itself. Legend: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of exon 2 was aligned with itself to check for tandem repeats. No significant tandem repeat was found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
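The self-comparison described in the note can be sketched as follows. This is a minimal illustration that assumes exact word matching with the stated 15 bp window and covers the forward strand only; the tool actually used for the report, its matching rules, and the function names here are not given in the source.

```python
def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    seq = seq.upper()
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


def repeat_offsets(matches):
    """Count off-diagonal dots per offset (j - i).

    Many dots sharing one small offset form a line parallel to the main
    diagonal, which is the signature of a tandem repeat.
    """
    counts = {}
    for i, j in matches:
        counts[j - i] = counts.get(j - i, 0) + 1
    return counts


# Toy input: an 8 bp unit repeated four times (not project sequence).
toy = "ACGTTGCA" * 4
print(repeat_offsets(self_dot_plot(toy)))
```

A sequence without repeats yields an empty result, which corresponds to a dot plot showing only the main diagonal.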

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence against itself. Legend: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The 865 bp section downstream of exon 5 was aligned with itself to check for tandem repeats. No significant tandem repeat was found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence. Axis label: Sequence 12.]

Summary: Full Length (2000 bp) | A (26.55%, 531) | C (18.75%, 375) | T (34.8%, 696) | G (19.9%, 398)

Note: The 2000 bp section upstream of exon 2 was analyzed to determine its GC content. No significant high-GC region was found, so this region is suitable for PCR screening or sequencing analysis.
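The summary percentages can be checked directly from the base counts. The short sketch below does this for the upstream region and also derives the overall GC fraction, which the summary does not state explicitly; the function name is illustrative.

```python
def composition(counts):
    """Return (total_bases, per-base percentages, overall GC percentage)."""
    total = sum(counts.values())
    percents = {base: round(100 * n / total, 2) for base, n in counts.items()}
    gc = round(100 * (counts["G"] + counts["C"]) / total, 2)
    return total, percents, gc


# Base counts copied from the upstream summary line.
upstream = {"A": 531, "C": 375, "T": 696, "G": 398}
total, percents, gc = composition(upstream)
print(total)     # 2000
print(percents)  # A 26.55, C 18.75, T 34.8, G 19.9
print(gc)        # 38.65
```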

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence. Axis label: Sequence 12.]

Summary: Full Length (865 bp) | A (26.94%, 233) | C (21.85%, 189) | T (26.24%, 227) | G (24.97%, 216)

Note: The 865 bp section downstream of exon 5 was analyzed to determine its GC content. No significant high-GC region was found, so this region is suitable for PCR screening or sequencing analysis.
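A windowed GC profile of the kind plotted above can be sketched as below, using the stated 300 bp window. The step size of 1 bp, the 70% cutoff for calling a window "high GC", and the function names are assumptions made for illustration; the report does not state the criteria it applied.

```python
def windowed_gc(seq, window=300):
    """Return [(start, gc_percent), ...] for every window, stepping 1 bp."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [1 if base in ("G", "C") else 0 for base in seq]
    running = sum(is_gc[:window])
    profile = [(0, 100 * running / window)]
    for start in range(1, len(seq) - window + 1):
        # Slide by one base: add the entering base, drop the leaving one.
        running += is_gc[start + window - 1] - is_gc[start - 1]
        profile.append((start, 100 * running / window))
    return profile


def high_gc_windows(profile, threshold=70.0):
    """Window starts whose GC% meets the (assumed) threshold."""
    return [start for start, gc in profile if gc >= threshold]


# Toy input: 400 bp AT-rich followed by 400 bp GC-rich (not project sequence).
toy = "AT" * 200 + "GC" * 200
profile = windowed_gc(toy)
print(profile[0], profile[-1], len(high_gc_windows(profile)))
```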


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  -       87263998   87265997   2000
YourSeq  36     900    969   2000   91.0%     chr13  -       55367281   55367352   72
YourSeq  36     878    921   2000   95.2%     chr9   +       44797649   44797698   50
YourSeq  32     227    259   2000   100.0%    chr10  -       79999290   79999323   34
YourSeq  31     904    969   2000   74.3%     chr19  +       9947580    9947646    67
YourSeq  31     1874   1998  2000   91.0%     chr10  +       78704496   78704619   124
YourSeq  30     885    921   2000   91.9%     chr4   -       144885479  144885516  38
YourSeq  30     231    260   2000   100.0%    chr3   +       89600035   89600064   30
YourSeq  30     892    921   2000   100.0%    chr13  +       88807298   88807327   30
YourSeq  29     770    817   2000   94.0%     chr5   -       39679500   39679548   49
YourSeq  29     889    921   2000   94.0%     chr15  -       99589016   99589048   33
YourSeq  29     897    925   2000   100.0%    chr10  +       62773584   62773612   29
YourSeq  28     888    921   2000   91.2%     chrX   -       105708674  105708707  34
YourSeq  27     888    921   2000   86.3%     chr2   -       121639520  121639551  32
YourSeq  26     899    925   2000   100.0%    chr14  -       18314879   18314906   28
YourSeq  26     888    915   2000   96.5%     chr1   +       63380470   63380497   28
YourSeq  25     899    929   2000   96.3%     chr4   +       43516851   43516882   32
YourSeq  23     889    915   2000   96.0%     chr2   -       113969786  113969813  28
YourSeq  22     891    922   2000   84.4%     chr4   +       144433215  144433246  32
YourSeq  22     239    260   2000   100.0%    chr13  +       44962473   44962494   22

Note: The 2000 bp section upstream of exon 2 was searched against the genome with BLAT. No significant similarity was found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  865    1      865   865    100.0%    chr12  -       87256550   87257414   865
YourSeq  165    306    646   865    84.6%     chr15  +       100886703  100887078  376
YourSeq  153    306    606   865    83.8%     chr16  -       22402764   22403255   492
YourSeq  146    305    609   865    86.6%     chr3   +       143128596  143128887  292
YourSeq  139    306    639   865    85.4%     chr19  -       48256513   48256886   374
YourSeq  134    306    639   865    88.9%     chr2   +       153721431  153721809  379
YourSeq  134    305    609   865    85.0%     chr1   +       162561580  162561910  331
YourSeq  130    319    740   865    87.3%     chr13  -       43183704   43184165   462
YourSeq  129    300    606   865    89.1%     chr5   -       128496800  128497114  315
YourSeq  126    306    641   865    84.5%     chr7   +       144214277  144214634  358
YourSeq  124    305    643   865    85.5%     chr17  +       49074096   49074489   394
YourSeq  120    372    640   865    85.6%     chr13  +       47163292   47163591   300
YourSeq  118    305    643   865    87.3%     chr17  -       83053852   83054226   375
YourSeq  118    399    641   865    87.1%     chr4   +       142869541  142870101  561
YourSeq  117    238    642   865    85.9%     chr2   -       35570212   35570614   403
YourSeq  116    236    643   865    84.5%     chr10  -       43240715   43241203   489
YourSeq  112    309    642   865    87.3%     chr5   -       67734860   67735206   347
YourSeq  112    371    640   865    85.4%     chr1   +       40488671   40489127   457
YourSeq  110    358    609   865    89.9%     chr18  -       77063465   77063731   267
YourSeq  110    308    655   865    90.6%     chr1   +       131808103  131808451  349

Note: The 865 bp section downstream of exon 5 was searched against the genome with BLAT. No significant similarity was found.
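The top hit in each BLAT table is the flanking sequence itself, so the two sets of chr12 coordinates give a simple consistency check on the stated knockout size. The sketch below assumes the coordinates are 1-based and inclusive, as displayed in the tables; the constant and function names are illustrative.

```python
# Top (self) hits copied from the two BLAT tables: (chrom, strand, start, end).
# Coordinates are assumed to be 1-based and inclusive, as displayed.
UP_FLANK = ("chr12", "-", 87263998, 87265997)    # 2000 bp upstream of exon 2
DOWN_FLANK = ("chr12", "-", 87256550, 87257414)  # 865 bp downstream of exon 5


def span(interval):
    """Length of an inclusive interval."""
    _, _, start, end = interval
    return end - start + 1


def bases_between(a, b):
    """Count bases lying strictly between two non-overlapping intervals."""
    if a[0] != b[0]:
        raise ValueError("intervals are on different chromosomes")
    (lo_start, lo_end), (hi_start, hi_end) = sorted([a[2:], b[2:]])
    if hi_start <= lo_end:
        raise ValueError("intervals overlap")
    return hi_start - lo_end - 1


print(span(UP_FLANK), span(DOWN_FLANK))     # 2000 865
print(bases_between(UP_FLANK, DOWN_FLANK))  # 6583
```

The 6583 bases between the two flanks agree with the ~6583 bp effective KO region given in the strategy summary. Because the gene lies on the reverse strand, the upstream flank has the higher genomic coordinates.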


Gene information:

Vipas39: VPS33B interacting protein, apical-basolateral polarity regulator, spe-39 homolog [Mus musculus (house mouse)]
Gene ID: 104799, updated on 12-Aug-2019

Gene summary

Official Symbol: Vipas39 (provided by MGI)
Official Full Name: VPS33B interacting protein, apical-basolateral polarity regulator, spe-39 homolog (provided by MGI)
Primary source: MGI:MGI:2144805
See related: Ensembl:ENSMUSG00000021038
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Spe39; Vipar; SPE-39; hSPE-39; AI413782; 6720456H09Rik; 9330175H22Rik
Expression: Ubiquitous expression in CNS E14 (RPKM 16.2), CNS E18 (RPKM 15.9) and 28 other tissues

Genomic context

Location: 12; 12 D2
Exon count: 22

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (87238875..87266378, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (88579825..88607236, complement)

[Figure: genomic context, Chromosome 12 - NC_000078.6]


Transcript information: This gene has 6 transcripts

Gene: Vipas39 ENSMUSG00000021038

Description: VPS33B interacting protein, apical-basolateral polarity regulator, spe-39 homolog [Source:MGI Symbol;Acc:MGI:2144805]
Gene Synonyms: SPE-39, Vipar
Location: Chromosome 12: 87,238,868-87,266,256 reverse strand. GRCm38:CM001005.2
About this gene: This gene has 6 transcripts (splice variants), 203 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Vipas39-202  ENSMUST00000072744.14  2524  491aa       ENSMUSP00000072527.7  Protein coding   CCDS49119  Q8BGQ1   TSL:1 GENCODE basic APPRIS P1
Vipas39-201  ENSMUST00000021426.9   2509  472aa       ENSMUSP00000021426.9  Protein coding   CCDS26074  Q8BGQ1   TSL:1 GENCODE basic
Vipas39-203  ENSMUST00000179379.8   2418  472aa       ENSMUSP00000137190.1  Protein coding   CCDS26074  Q8BGQ1   TSL:1 GENCODE basic
Vipas39-206  ENSMUST00000222350.1   1790  No protein  -                     Retained intron  -          -        TSL:1
Vipas39-204  ENSMUST00000220858.1   1398  No protein  -                     Retained intron  -          -        TSL:2
Vipas39-205  ENSMUST00000221707.1   417   No protein  -                     Retained intron  -          -        TSL:3


[Figure: Ensembl genomic view of a 47.39 kb region of chromosome 12 (87.23Mb to 87.27Mb). Forward strand: Ahsa1-201 and Ahsa1-205 (protein coding), Ahsa1-202, Ahsa1-203 and Ahsa1-204 (retained intron), Gm47293-201 (miRNA). Contig: AC151967.6. Reverse strand: Noxred1-201, Noxred1-202 and Noxred1-203 (protein coding), Vipas39-201, Vipas39-202 and Vipas39-203 (protein coding), Vipas39-204 and Vipas39-206 (retained intron). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000072744

[Figure: protein view of Vipas39-202 (protein coding, reverse strand, 27.38 kb), translation ENSMUSP00000072..., scale 0 to 491 aa. Tracks: MobiDB lite; Low complexity (Seg); Pfam: Vps16, C-terminal; PANTHER: PTHR13364:SF6, Spermatogenesis-defective protein 39; Gene3D: Vps16, C-terminal domain superfamily; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
