https://www.alphaknockout.com

Mouse Gapvd1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Gapvd1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gapvd1 gene (NCBI Reference Sequence: NM_025709; Ensembl: ENSMUSG00000026867) is located on mouse chromosome 2. Twenty-seven exons are identified, with the ATG start codon in exon 3 and the TGA stop codon in exon 27 (Transcript: ENSMUST00000102800). Exon 4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Gapvd1 gene. To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP23-190C19 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 4 starts at about 4.31% of the coding region. Knockout of exon 4 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion is 1288 bp, and the size of intron 4 for 3'-loxP site insertion is 1028 bp. The size of the effective cKO region is ~1344 bp. The cKO region does not contain any other known gene.
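The frameshift reasoning in the note can be illustrated with a short Python sketch. It rests on the general rule that removing a number of coding bases not divisible by 3 shifts the reading frame. The coding length of exon 4 is not stated in this report (the ~1344 bp cKO region includes intronic sequence), so the exon length below is a hypothetical placeholder. The 1437 aa protein length is taken from the transcript table for ENSMUST00000102800, and the position estimate is only approximate.

```python
def causes_frameshift(coding_bases_removed: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return coding_bases_removed % 3 != 0


def approx_cds_offset(percent: float, protein_len_aa: int) -> int:
    """Approximate nucleotide offset into the CDS for a given percentage of the coding region."""
    cds_len = (protein_len_aa + 1) * 3  # amino-acid codons plus one stop codon
    return round(percent / 100 * cds_len)


# Hypothetical value for illustration only; the report does not give the exon 4 length.
HYPOTHETICAL_EXON_CODING_LENGTH = 100

print(causes_frameshift(HYPOTHETICAL_EXON_CODING_LENGTH))
print(approx_cds_offset(4.31, 1437))  # rough position where exon 4 begins in the CDS
```

An exon starting this early in the coding sequence leaves little of the protein upstream of the frameshift, which is consistent with the loss-of-function expectation stated in the strategy summary.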

Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1, 3, 4, 5 and 27 labeled, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Gapvd1; homology arm; cKO region; loxP site.]

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself (both axes labeled "Sequence 12"), with forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
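The self-alignment described in the note can be sketched in Python as follows. This is a minimal exact-match dot plot with a 10 bp window and is not necessarily the tool used to produce the figure; the function names and the toy sequence are illustrative assumptions.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window start positions that match.

    forward: seq[i:i+window] equals seq[j:j+window], with i < j.
    reverse: the reverse complement of the window at i equals the window at j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), ()))
    return forward, reverse


# Toy sequence (not from the report): a 10 bp word repeated after a 3 bp spacer.
toy = "ACGTTGCAAT" + "GGG" + "ACGTTGCAAT"
forward, reverse = self_dot_plot(toy, window=10)
print(forward)
print(reverse)
```

In a plot of these points, a tandem repeat appears as a run of forward matches parallel to the main diagonal; an empty or sparse off-diagonal is what the note describes as no significant tandem repeat.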

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (axis labeled "Sequence 12").]

Summary: Full Length(7844bp) | A(29.16% 2287) | C(18.03% 1414) | T(32.88% 2579) | G(19.94% 1564)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
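The GC analysis can be reproduced along these lines. The sketch below checks the base counts from the summary line and shows a sliding-window GC calculation with the 300 bp window named above. The function names and the step size are assumptions, and the report does not state what threshold counts as a high GC-content region.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each full window, sliding by `step` bases."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts copied from the summary line of this report.
counts = {"A": 2287, "C": 1414, "T": 2579, "G": 1564}
total = sum(counts.values())
overall_gc_pct = 100 * (counts["G"] + counts["C"]) / total

print(total)                     # should equal the stated full length, 7844 bp
print(round(overall_gc_pct, 2))  # overall GC percentage of the analyzed sequence
```

The counts sum to the stated 7844 bp and give an overall GC content of roughly 38%, in line with the note that no high-GC region is present.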

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       34729425   34732424   3000
YourSeq  464    451    2427  3000   93.2%     chr12  +       86837225   87248698   411474
YourSeq  246    773    1173  3000   89.6%     chr10  +       120021605  120021928  324
YourSeq  230    790    1181  3000   92.4%     chr9   +       65918607   65919118   512
YourSeq  190    769    1287  3000   94.9%     chr10  -       24818444   24819046   603
YourSeq  178    2274   2546  3000   91.7%     chr1   +       74556254   74556772   519
YourSeq  162    647    1011  3000   85.3%     chr3   +       100408067  100408251  185
YourSeq  160    808    1114  3000   93.5%     chr6   +       120448974  120449594  621
YourSeq  155    793    1515  3000   83.5%     chr9   +       64070270   64070490   221
YourSeq  151    2274   2488  3000   90.8%     chr19  -       7389310    7389666    357
YourSeq  149    773    1174  3000   86.8%     chr7   +       99493654   99493822   169
YourSeq  148    2283   2514  3000   92.6%     chr19  -       32751837   32752272   436
YourSeq  147    778    942   3000   95.2%     chr13  -       106203573  106203741  169
YourSeq  146    778    1174  3000   85.1%     chr17  -       8291592    8291757    166
YourSeq  145    792    1173  3000   85.9%     chr1   -       176811846  176812049  204
YourSeq  142    2232   2427  3000   92.9%     chr11  +       102150204  102150688  485
YourSeq  140    2272   2488  3000   93.8%     chr9   -       62803057   62803360   304
YourSeq  140    450    921   3000   88.2%     chr4   -       119243080  119243531  452
YourSeq  139    1051   1526  3000   83.1%     chr4   -       106549644  106549885  242
YourSeq  139    2271   2427  3000   95.0%     chr4   +       83628899   83629087   189

Note: The 3000 bp section upstream of exon 4 is searched against the genome with BLAT. Apart from the expected match to the Gapvd1 locus on chr2, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       34725081   34728080   3000
YourSeq  129    44     426   3000   87.8%     chr10  -       128685313  128916385  231073
YourSeq  96     57     199   3000   88.5%     chr2   -       49831106   49831256   151
YourSeq  95     74     199   3000   91.0%     chr11  -       79939536   79939668   133
YourSeq  94     74     200   3000   90.2%     chr11  +       96812787   96812930   144
YourSeq  90     72     199   3000   89.0%     chr2   -       67919675   67919807   133
YourSeq  90     74     193   3000   90.5%     chr16  -       56487434   56487558   125
YourSeq  90     74     198   3000   90.5%     chr9   +       50840702   50840833   132
YourSeq  90     74     200   3000   89.8%     chr8   +       29987737   29987882   146
YourSeq  89     74     204   3000   87.5%     chr6   -       23100349   23100483   135
YourSeq  89     75     199   3000   89.5%     chr2   +       71444683   71444812   130
YourSeq  87     74     198   3000   86.7%     chr17  +       88017777   88017908   132
YourSeq  86     74     199   3000   88.4%     chr14  -       52241825   52241955   131
YourSeq  85     74     198   3000   89.3%     chr16  +       32223952   32224083   132
YourSeq  85     86     198   3000   90.0%     chr1   +       58738563   58738690   128
YourSeq  84     74     198   3000   85.3%     chr15  -       81866879   81867011   133
YourSeq  84     74     192   3000   89.1%     chr9   +       48952050   48952176   127
YourSeq  84     74     199   3000   88.4%     chr4   +       147627377  147627509  133
YourSeq  84     77     199   3000   87.3%     chr4   +       50358187   50358315   129
YourSeq  83     71     193   3000   87.7%     chr17  -       30305891   30306018   128

Note: The 3000 bp section downstream of exon 4 is searched against the genome with BLAT. Apart from the expected match to the Gapvd1 locus on chr2, no significant similarity is found.
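To make the BLAT result rows easier to check, they can be parsed and ranked by score, as in the Python sketch below. The BlatHit type and parse_blat_row function are illustrative names, not part of any BLAT tool. The three sample rows are copied from the upstream table, and the report does not define a numeric cutoff for "significant" similarity, so none is applied here.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(line: str) -> BlatHit:
    """Parse one result row; any leading 'browser details' link text is ignored."""
    tokens = line.split()
    f = tokens[tokens.index("YourSeq") + 1:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


# Sample rows copied from the upstream (5') BLAT table in this report.
rows = [
    "browser details YourSeq 3000 1 3000 3000 100.0% chr2 - 34729425 34732424 3000",
    "browser details YourSeq 464 451 2427 3000 93.2% chr12 + 86837225 87248698 411474",
    "YourSeq 246 773 1173 3000 89.6% chr10 + 120021605 120021928 324",
]

hits = sorted((parse_blat_row(r) for r in rows), key=lambda h: h.score, reverse=True)
best, others = hits[0], hits[1:]
ratio = others[0].score / best.score  # strength of the best secondary hit vs. the on-target hit

print(best.chrom, best.score)
print(others[0].chrom, others[0].score, round(ratio, 3))
```

In both tables the only full-length, 100% identity hit is the query's own location on chr2, and the next-best hit scores far lower, which is the basis for the notes above.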

Gene and information:

Gapvd1 GTPase activating protein and VPS9 domains 1 [Mus musculus (house mouse)]
Gene ID: 66691, updated on 24-Oct-2019

Gene summary

Official Symbol: Gapvd1 (provided by MGI)
Official Full Name: GTPase activating protein and VPS9 domains 1 (provided by MGI)
Primary source: MGI:MGI:1913941
See related: Ensembl:ENSMUSG00000026867
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: RAP6; RME-6; Gapex-5; AW108497; mKIAA1521; 2010005B09Rik; 4432404J10Rik
Expression: Ubiquitous expression in CNS E14 (RPKM 8.3), liver E14 (RPKM 7.9) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 B

Exon count: 30

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (34676178..34755285, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (34532487..34610752, complement)

[Figure: genomic context of Gapvd1 on chromosome 2 (NC_000068.7).]

Transcript information: This gene has 15 transcripts

Gene: Gapvd1 ENSMUSG00000026867

Description: GTPase activating protein and VPS9 domains 1 [Source:MGI Symbol;Acc:MGI:1913941]
Gene Synonyms: 2010005B09Rik, 4432404J10Rik
Location: Chromosome 2: 34,674,594-34,755,232 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 15 transcripts (splice variants), 203 orthologues, 6 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Gapvd1-202  ENSMUST00000102800.7   8339  1437aa      ENSMUSP00000099864.1  Protein coding           CCDS15949  Q6PAR5   TSL:1 GENCODE basic APPRIS P2
Gapvd1-201  ENSMUST00000028224.14  5695  1437aa      ENSMUSP00000028224.8  Protein coding           CCDS15949  Q6PAR5   TSL:1 GENCODE basic APPRIS P2
Gapvd1-203  ENSMUST00000113099.9   5758  1458aa      ENSMUSP00000108723.3  Protein coding           -          Q6PAR5   TSL:5 GENCODE basic APPRIS ALT1
Gapvd1-206  ENSMUST00000113111.7   4543  896aa       ENSMUSP00000108735.1  Protein coding           -          F7ADQ2   CDS 5' incomplete TSL:5
Gapvd1-208  ENSMUST00000137528.7   2506  835aa       ENSMUSP00000120138.1  Protein coding           -          F6X819   CDS 5' and 3' incomplete TSL:5
Gapvd1-205  ENSMUST00000113103.8   2484  828aa       ENSMUSP00000108727.2  Protein coding           -          F7ADS7   CDS 5' and 3' incomplete TSL:1
Gapvd1-204  ENSMUST00000113101.7   2200  548aa       ENSMUSP00000108725.1  Protein coding           -          F7ADT6   CDS 5' incomplete TSL:5
Gapvd1-210  ENSMUST00000142436.1   675   146aa       ENSMUSP00000126225.1  Protein coding           -          E9Q0D1   CDS 3' incomplete TSL:3
Gapvd1-209  ENSMUST00000138203.1   705   62aa        ENSMUSP00000127268.1  Nonsense mediated decay  -          F6UFP6   CDS 5' incomplete TSL:3
Gapvd1-207  ENSMUST00000128855.2   589   97aa        ENSMUSP00000129138.1  Nonsense mediated decay  -          F6YRL2   CDS 5' incomplete TSL:5
Gapvd1-215  ENSMUST00000201772.1   4040  No protein  -                     Retained intron          -          -        TSL:NA
Gapvd1-214  ENSMUST00000169207.1   2207  No protein  -                     Retained intron          -          -        TSL:1
Gapvd1-212  ENSMUST00000156098.1   590   No protein  -                     Retained intron          -          -        TSL:2
Gapvd1-211  ENSMUST00000150859.1   559   No protein  -                     lncRNA                   -          -        TSL:5
Gapvd1-213  ENSMUST00000167251.1   551   No protein  -                     lncRNA                   -          -        TSL:3

[Figure: Ensembl genome browser view of the Gapvd1 locus (100.64 kb; 34.68 Mb to 34.76 Mb; contigs AL845262.3 and AL929106.5). All 15 transcripts, Gapvd1-201 through Gapvd1-215, are drawn on the reverse strand with their biotypes (protein coding, nonsense mediated decay, retained intron, lncRNA). A Regulatory Build track is shown below. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]

Transcript: ENSMUST00000102800

[Figure: Ensembl protein summary for Gapvd1-202 (protein coding; reverse strand, 80.64 kb; protein ENSMUSP00000099...; scale bar 0 to 1437). Annotation tracks, in the order shown:
MobiDB lite
Low complexity (Seg)
Coiled-coils (Ncoils)
Superfamily: Rho GTPase activation protein; VPS9 domain superfamily
SMART: VPS9 domain
Pfam: Ras GTPase-activating domain; VPS9 domain; RABX5, catalytic core helical domain
PROSITE profiles: Ras GTPase-activating domain; VPS9 domain
PANTHER: PTHR23101:SF111; PTHR23101
Gene3D: 1.10.506.10; VPS9 domain superfamily; 1.10.246.120
CDD: cd05129
Sequence variants (dbSNP and all other sources)
Variant legend: missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
