https://www.alphaknockout.com

Mouse Vwa2 Knockout Project (CRISPR/Cas9)

Objective: To create a Vwa2 knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Vwa2 gene (NCBI Reference Sequence: NM_172840; Ensembl: ENSMUSG00000025082) is located on mouse chromosome 19. Fourteen exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 14 (Transcript: ENSMUST00000026068). Exons 2~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: The coding region starts in exon 2. Exons 2~4 cover 11.0% of the coding region. The size of the effective KO region is ~5887 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2, 3, 4 and 14, with two gRNA regions flanking the knockout region. Legend: exon of mouse Vwa2; knockout region.]


Overview of the Dot Plot (up)

[Dot plot figure. Window size: 15 bp. Series: forward, reverse complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
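The self-alignment described in the note can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: the function name `dot_plot_matches` and the toy sequence are assumptions made for the example, only forward-strand matches are considered, and exact word matching is assumed.

```python
def dot_plot_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the window-length
    words starting at positions i and j of seq are identical."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    for starts in positions.values():
        # Every pair of start positions sharing the same word is one dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy sequence (made up for illustration): a 20-bp unit repeated in tandem.
unit = "ACGTTGCAAGCTTAGGCTCA"
toy = "TTTTT" + unit + unit + "GGGGG"

hits = dot_plot_matches(toy, window=15)
# Off-diagonal dots at a constant offset j - i indicate a tandem repeat
# with that period; here the offset equals the unit length.
offsets = {j - i for i, j in hits}
print(sorted(offsets))
```

In a real dot plot the main diagonal (i = j) is always present; off-diagonal runs parallel to it are what mark repeated segments, which is why the gRNA site is placed outside them.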

Overview of the Dot Plot (down)

[Dot plot figure. Window size: 15 bp. Series: forward, reverse complement. Axis label: Sequence 12.]

Note: The 2000 bp section downstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

[GC content plot. Window size: 300 bp. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(25.1% 502) | C(22.15% 443) | T(27.45% 549) | G(25.3% 506)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
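As a rough sketch of how such a profile can be computed, the snippet below derives the overall GC fraction from the base counts in the summary line and defines a sliding-window GC calculation with the 300 bp window named above. The function names and the synthetic test sequence are assumptions for illustration; the report does not state which tool or step size was used.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window of the given size, advancing by step."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts taken from the upstream summary line.
counts = {"A": 502, "C": 443, "T": 549, "G": 506}
total = sum(counts.values())                       # 2000
overall_gc = (counts["G"] + counts["C"]) / total   # 949 / 2000 = 0.4745

# Synthetic sequence (not from the report): 300 bp of GC, then 300 bp of AT.
profile = sliding_gc("GC" * 150 + "AT" * 150, window=300, step=300)
print(overall_gc, profile)
```

An overall GC fraction near 47% is consistent with the note's conclusion, although the windowed profile, not the overall figure, is what reveals local GC-rich stretches.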

Overview of the GC Content Distribution (down)

[GC content plot. Window size: 300 bp. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(27.5% 550) | C(29.8% 596) | T(22.05% 441) | G(20.65% 413)

Note: The 2000 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  2000   1       2000  2000   100.0%    chr19  +       56879136   56881135   2000
YourSeq  129    1214    1523  2000   85.2%     chr9   -       30884051   30884421   371
YourSeq  116    1214    1523  2000   88.2%     chr6   -       97844734   97845307   574
YourSeq  110    1197    1522  2000   92.4%     chr18  +       10491620   10491999   380
YourSeq  105    1335    1529  2000   92.0%     chr5   +       113466007  113466270  264
YourSeq  103    196     1519  2000   93.5%     chr1   -       10593684   10768387   174704
YourSeq  102    1242    1519  2000   90.5%     chr17  +       4160835    4161125    291
YourSeq  100    1214    1529  2000   92.4%     chr9   +       113692813  113693143  331
YourSeq  100    1242    1519  2000   93.3%     chr8   +       41409407   41409685   279
YourSeq  98     1224    1529  2000   90.9%     chr16  +       42495209   42495565   357
YourSeq  97     1386    1542  2000   88.9%     chr14  -       46731741   46731914   174
YourSeq  97     1224    1519  2000   88.3%     chr11  -       102450546  102450878  333
YourSeq  96     1244    1519  2000   92.3%     chr1   +       13089584   13198429   108846
YourSeq  94     1224    1523  2000   89.9%     chr5   -       147158843  147159204  362
YourSeq  94     1224    1498  2000   90.6%     chr15  -       100375955  100376249  295
YourSeq  93     1250    1519  2000   92.0%     chr1   -       18995773   19161317   165545
YourSeq  92     1217    1513  2000   84.6%     chr14  -       18452145   18452430   286
YourSeq  91     1329    1514  2000   91.7%     chr4   -       78761383   78761630   248
YourSeq  90     1330    1519  2000   91.7%     chr9   -       78002474   78002701   228
YourSeq  89     1215    1504  2000   87.4%     chr6   +       86702839   86703204   366

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
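For readers who want to inspect the hits programmatically, the following sketch parses rows of the BLAT summary into records and compares the best secondary hit against the self-hit. The `BlatHit` type, the `parse_hit` helper and the score-to-query-size ratio are hypothetical choices for this example; the report does not state which criterion it uses to judge similarity as significant.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated BLAT summary row.
    A leading 'browser details' link label, if present, is dropped."""
    f = line.replace("browser details", "").split()
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


# First three rows of the upstream table.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr19 + 56879136 56881135 2000",
    "YourSeq 129 1214 1523 2000 85.2% chr9 - 30884051 30884421 371",
    "YourSeq 116 1214 1523 2000 88.2% chr6 - 97844734 97845307 574",
]
hits = [parse_hit(r) for r in rows]

best = max(hits, key=lambda h: h.score)          # the self-hit on chr19
secondary = [h for h in hits if h is not best]
# Illustrative measure only: best secondary score relative to query size.
max_secondary_fraction = max(h.score for h in secondary) / best.q_size
print(best.chrom, max_secondary_fraction)
```

The top row is the query matching its own locus; the remaining rows cover only a few hundred bases of the 2000 bp query, which is the pattern the note summarizes as no significant similarity.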

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  2000   1       2000  2000   100.0%    chr19  +       56887013   56889012   2000
YourSeq  261    1260    1619  2000   95.9%     chr19  -       53867239   54480216   612978
YourSeq  256    1260    1614  2000   93.0%     chr19  -       54479860   54480301   442
YourSeq  256    1281    1619  2000   94.3%     chr17  +       4376982    4377610    629
YourSeq  251    1260    1636  2000   93.5%     chr15  -       71736774   71737522   749
YourSeq  228    1281    1619  2000   92.9%     chr19  -       54479927   54480335   409
YourSeq  219    1269    1619  2000   93.6%     chr19  -       54479915   54480284   370
YourSeq  200    1260    1614  2000   93.6%     chr4   -       136868669  136869257  589
YourSeq  198    1313    1619  2000   93.4%     chr5   -       25703861   25704247   387
YourSeq  195    1249    1553  2000   90.9%     chr5   -       25703858   25704228   371
YourSeq  195    1375    1614  2000   96.3%     chr19  -       53867211   53867641   431
YourSeq  189    1313    1619  2000   89.2%     chr5   +       25546956   25547248   293
YourSeq  186    1292    1611  2000   90.6%     chr9   -       106207495  106207930  436
YourSeq  186    1291    1566  2000   92.8%     chr5   +       25546952   25547264   313
YourSeq  174    1313    1623  2000   95.4%     chr15  -       71736997   71737478   482
YourSeq  172    1287    1614  2000   87.4%     chr13  -       105380641  105380925  285
YourSeq  172    1343    1617  2000   92.3%     chr12  -       109431999  109432379  381
YourSeq  167    1297    1544  2000   93.4%     chr10  -       16975574   16975837   264
YourSeq  166    1328    1613  2000   94.0%     chr5   -       139665756  139666055  300
YourSeq  163    1267    1553  2000   91.6%     chr9   -       106207318  106207670  353

Note: The 2000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.


Gene and information:

Vwa2 von Willebrand factor A domain containing 2 [Mus musculus (house mouse)]
Gene ID: 240675, updated on 12-Aug-2019

Gene summary

Official Symbol: Vwa2 (provided by MGI)
Official Full Name: von Willebrand factor A domain containing 2 (provided by MGI)
Primary source: MGI:MGI:2684334
See related: Ensembl:ENSMUSG00000025082
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Amaco
Expression: Biased expression in genital fat pad adult (RPKM 6.7), kidney adult (RPKM 4.4) and 6 other tissues
Orthologs: human

Genomic context

Location: 19; 19 D2
Exon count: 16

Annotation release  Status             Assembly                       Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)   19   NC_000085.6 (56874416..56912078)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     19   NC_000085.5 (56948906..56986568)

Chromosome 19 - NC_000085.6


Transcript information: This gene has 1 transcript

Gene: Vwa2 ENSMUSG00000025082

Description: von Willebrand factor A domain containing 2 [Source:MGI Symbol;Acc:MGI:2684334]
Gene Synonyms: Amaco
Location: Chromosome 19: 56,874,249-56,912,078 forward strand. GRCm38:CM001012.2
About this gene: This gene has 1 transcript (splice variant), 183 orthologues, 13 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags

Vwa2-201 ENSMUST00000026068.7 4049 791aa ENSMUSP00000026068.7 Protein coding CCDS29923 Q70UZ7 TSL:1 GENCODE basic APPRIS P1

[Figure: Ensembl genome browser view of chromosome 19, 56.87Mb to 56.92Mb (57.83 kb). Forward strand genes: Tdrd1-201, Tdrd1-202, Tdrd1-203, Tdrd1-204 and Vwa2-201 (all protein coding). Contig: AC125150.3. Reverse strand genes: Afap1l2-201, Afap1l2-202 and Afap1l2-203 (protein coding) and Afap1l2-207 (retained intron). Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000026068

[Figure: protein domain and variant view of Vwa2-201 (protein coding; ENSMUSP00000026...), 37.83 kb, forward strand, scale 0 to 791 aa. Annotation tracks: MobiDB lite; Low complexity (Seg); Cleavage site (Sign...); Superfamily (von Willebrand factor A-like domain superfamily; SSF57196); SMART (EGF-like calcium-binding domain; von Willebrand factor, type A; EGF-like domain); Prints (PR00453); Pfam (EGF-like domain; von Willebrand factor, type A); PROSITE profiles (EGF-like domain; von Willebrand factor, type A); PROSITE patterns (EGF-like, conserved site); PANTHER (von Willebrand factor A domain-containing protein 2; PTHR24020); Gene3D (2.10.25.10; von Willebrand factor A-like domain superfamily); CDD (cd01472, cd00053, cd00054). Sequence variants (dbSNP and all other sources). Variant legend: frameshift variant, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
