https://www.alphaknockout.com

Mouse Svil Knockout Project (CRISPR/Cas9)

Objective: To create a Svil knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Svil gene (NCBI Reference Sequence: NM_153153; Ensembl: ENSMUSG00000024236) is located on mouse chromosome 18. Thirty-five exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 35 (Transcript: ENSMUST00000025079). Exons 3~8 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit enhanced adhesion and thrombus formation.

Exon 3 starts at about 2.47% of the coding region. Exons 3~8 cover 28.0% of the coding region. The size of the effective KO region is ~9890 bp. The KO region does not contain any other known gene.
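The percentages above can be turned into approximate base-pair figures. The sketch below is only illustrative: the report does not state the coding length, so it is assumed here from the 2170 aa protein listed for ENSMUST00000025079 plus one stop codon, and the helper name fraction_to_bp is hypothetical.

```python
# Assumption: coding length is derived from the 2170 aa protein of
# ENSMUST00000025079 plus one stop codon; the report does not state it.
PROTEIN_AA = 2170
cds_bp = (PROTEIN_AA + 1) * 3  # 6513 bp including the stop codon


def fraction_to_bp(percent: float, total_bp: int) -> int:
    """Convert a percentage of the coding region to an approximate bp count."""
    return round(percent / 100 * total_bp)


# Approximate offset of exon 3 within the coding sequence (2.47%).
exon3_start_bp = fraction_to_bp(2.47, cds_bp)
# Approximate amount of coding sequence carried by exons 3~8 (28.0%).
target_cds_bp = fraction_to_bp(28.0, cds_bp)

print(cds_bp, exon3_start_bp, target_cds_bp)
```

These are rounded estimates of coding sequence only; the ~9890 bp KO region is a genomic span that also includes introns.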


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with two gRNA regions marked; exons shown: 1, 3, 4, 5, 6, 7, 8, 35. Legend: exon of mouse Svil; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 1969 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
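The self-alignment described in the note can be sketched as an exact-match dot plot. This is a minimal illustration, not the tool used for the report (which is not named): it assumes exact matching of fixed-length windows, with the 15 bp window size taken from the heading above, and the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 15):
    """Compare a sequence with itself using exact window matches.

    Returns (forward, reverse):
      forward: (i, j) with i < j where seq[i:i+window] == seq[j:j+window]
      reverse: (i, j) where seq[j:j+window] is the reverse complement
               of seq[i:i+window]
    The main diagonal (i == j) is omitted from the forward list.
    """
    seq = seq.upper()
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Off-diagonal forward matches indicate direct repeats.
        forward.extend((a, b) for a in positions for b in positions if a < b)
        # Matches to the reverse complement indicate inverted repeats.
        for j in index.get(reverse_complement(kmer), []):
            reverse.extend((i, j) for i in positions)
    return sorted(forward), sorted(reverse)


# Toy example with a 7 bp direct repeat separated by "GGG".
fwd, rev = self_dot_plot("TTGACCAGGGTTGACCA", window=7)
print(fwd, rev)
```

An empty forward list apart from the main diagonal would correspond to the "no significant tandem repeat" reading in the note; a real dot plot tool may also tolerate mismatches within the window.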

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 455 bp section downstream of Exon 8 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up) Window size: 300 bp


Summary: Full Length(1969bp) | A(26.76% 527) | C(23.31% 459) | T(26.36% 519) | G(23.57% 464)

Note: The 1969 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp


Summary: Full Length(455bp) | A(23.96% 109) | C(27.47% 125) | T(29.01% 132) | G(19.56% 89)

Note: The 455 bp section downstream of Exon 8 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
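The overall GC content of each flank follows directly from the base counts in the two summaries. The sketch below computes it and also shows a sliding-window GC profile using the 300 bp window from the headings; the function names are hypothetical and the report does not say which tool produced the plots.

```python
def gc_percent_from_counts(a: int, c: int, t: int, g: int) -> float:
    """Overall GC percentage from base counts."""
    return 100 * (g + c) / (a + c + t + g)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each window of the given size along the sequence."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the two summary lines in the report.
gc_up = gc_percent_from_counts(a=527, c=459, t=519, g=464)    # 1969 bp upstream
gc_down = gc_percent_from_counts(a=109, c=125, t=132, g=89)   # 455 bp downstream

print(f"upstream GC: {gc_up:.2f}%  downstream GC: {gc_down:.2f}%")
```

Both flanks come out just under 47% GC overall, which is consistent with the notes that no high GC-content region was found; the windowed profile would be needed to confirm that for local stretches.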


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  1969   1       1969  1969   100.0%    chr18  +       5046917    5048885    1969
YourSeq  60     1108    1268  1969   86.6%     chr1   +       84859233   84859389   157
YourSeq  49     1117    1267  1969   94.6%     chr4   -       86890116   86890686   571
YourSeq  48     1121    1271  1969   72.3%     chr14  -       12103346   12103448   103
YourSeq  42     1161    1268  1969   95.7%     chr1   +       84859136   84859287   152
YourSeq  38     1117    1268  1969   64.3%     chr4   -       86889960   86890035   76
YourSeq  37     238     282   1969   95.3%     chr2   +       99661475   99661527   53
YourSeq  34     248     334   1969   85.5%     chr9   -       60497686   60497773   88
YourSeq  34     1108    1176  1969   73.0%     chr1   +       84859131   84859185   55
YourSeq  29     1086    1221  1969   51.7%     chr14  -       55088706   55088751   46
YourSeq  28     1088    1177  1969   59.4%     chr4   +       119778097  119778141  45
YourSeq  28     1208    1268  1969   70.0%     chr1   +       84859137   84859185   49
YourSeq  27     1088    1131  1969   96.6%     chr17  +       29725810   29725854   45
YourSeq  27     1088    1131  1969   89.7%     chr16  +       3745261    3745303    43
YourSeq  26     1121    1179  1969   60.8%     chr14  -       12103346   12103376   31

Note: The 1969 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
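One way to read the table is to compare the best off-target score with the score of the self hit. The sketch below parses a few rows copied from the upstream results and reports that ratio. The report does not define what counts as "significant", so no cutoff is applied here; the BlatHit type and parse_hit helper are hypothetical names.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated result row (query name in field 0)."""
    f = row.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


# First three rows of the upstream BLAT results, copied from the report.
rows = [
    "YourSeq 1969 1 1969 1969 100.0% chr18 + 5046917 5048885 1969",
    "YourSeq 60 1108 1268 1969 86.6% chr1 + 84859233 84859389 157",
    "YourSeq 49 1117 1267 1969 94.6% chr4 - 86890116 86890686 571",
]
hits = [parse_hit(r) for r in rows]
best = max(hits, key=lambda h: h.score)            # the self hit on chr18
secondary = [h for h in hits if h is not best]
top_secondary_ratio = max(h.score for h in secondary) / best.score

print(best.chrom, f"{100 * top_secondary_ratio:.1f}% of the self-hit score")
```

For the upstream flank the strongest secondary hit scores 60 against 1969 for the self hit, roughly 3% of the full-length score.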

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  455    1       455   455    100.0%    chr18  +       5058776    5059230    455
YourSeq  48     143     446   455    96.2%     chr10  -       15765370   16139797   374428
YourSeq  32     161     201   455    94.6%     chr13  -       85331781   85331836   56
YourSeq  26     149     174   455    100.0%    chr6   -       32033448   32033473   26
YourSeq  26     152     189   455    84.3%     chr12  -       57797751   57797788   38
YourSeq  25     358     391   455    77.8%     chr11  +       97191511   97191540   30
YourSeq  25     182     214   455    96.3%     chr11  +       86102538   86102579   42
YourSeq  24     388     420   455    84.7%     chr12  +       27966938   27966968   31
YourSeq  21     149     169   455    100.0%    chr13  +       29113781   29113801   21
YourSeq  20     58      79    455    95.5%     chr11  -       91279066   91279087   22
YourSeq  20     152     171   455    100.0%    chr11  -       8643235    8643254    20
YourSeq  20     151     170   455    100.0%    chr1   +       183936582  183936601  20
YourSeq  20     152     171   455    100.0%    chr1   +       180294997  180295016  20

Note: The 455 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
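The two 100% self hits on chr18 also give a quick cross-check of the stated KO region size. The sketch below assumes the BLAT coordinates are 1-based and inclusive (the spans of 1969 and 455 bp are consistent with that) and that the KO region is the interval between the upstream and downstream flanks; the function names are hypothetical.

```python
# Coordinates of the 100% self hits, copied from the BLAT tables.
# Assumption: positions are 1-based and inclusive.
UP_FLANK = ("chr18", 5046917, 5048885)     # 1969 bp upstream of exon 3
DOWN_FLANK = ("chr18", 5058776, 5059230)   # 455 bp downstream of exon 8


def region_length(region) -> int:
    """Length in bp of a 1-based inclusive region."""
    _, start, end = region
    return end - start + 1


def gap_between(upstream, downstream) -> int:
    """Number of bases strictly between two regions on the same chromosome."""
    if upstream[0] != downstream[0]:
        raise ValueError("regions are on different chromosomes")
    return downstream[1] - upstream[2] - 1


ko_region_bp = gap_between(UP_FLANK, DOWN_FLANK)
print(region_length(UP_FLANK), region_length(DOWN_FLANK), ko_region_bp)
```

Under these assumptions the interval between the flanks is 9890 bp, matching the effective KO region size given in the strategy summary.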


Gene and information: Svil supervillin [ Mus musculus (house mouse) ] Gene ID: 225115, updated on 10-Sep-2019

Gene summary

Official Symbol: Svil (provided by MGI)
Official Full Name: supervillin (provided by MGI)
Primary source: MGI:MGI:2147319
See related: Ensembl:ENSMUSG00000024236
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AU024053; B430302E16Rik
Expression: Broad expression in bladder adult (RPKM 33.3), heart adult (RPKM 16.5) and 20 other tissues
Orthologs: human

Genomic context

Location: 18; 18 A1
Exon count: 44

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (4920467..5119293)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (5046587..5119291)

[Figure: genomic context on Chromosome 18 - NC_000084.6.]


Transcript information: This gene has 15 transcripts

Gene: Svil ENSMUSG00000024236

Description: supervillin [Source:MGI Symbol;Acc:MGI:2147319]
Gene Synonyms: B430302E16Rik
Location: Chromosome 18: 4,920,540-5,119,299 forward strand. GRCm38:CM001011.2
About this gene: This gene has 15 transcripts (splice variants), 223 orthologues, 7 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Svil-203  ENSMUST00000126977.7   7907  2170aa      ENSMUSP00000115078.1  Protein coding           CCDS37720  Q8K4L3      TSL:5, GENCODE basic, APPRIS P3
Svil-201  ENSMUST00000025079.15  7433  2170aa      ENSMUSP00000025079.9  Protein coding           CCDS37720  Q8K4L3      TSL:1, GENCODE basic, APPRIS P3
Svil-210  ENSMUST00000140448.7   7423  2170aa      ENSMUSP00000119803.1  Protein coding           CCDS37720  Q8K4L3      TSL:5, GENCODE basic, APPRIS P3
Svil-211  ENSMUST00000143254.7   6594  1766aa      ENSMUSP00000119287.1  Protein coding           CCDS84352  Q8K4L3      TSL:5, GENCODE basic, APPRIS ALT2
Svil-215  ENSMUST00000210707.1   7633  2257aa      ENSMUSP00000147843.1  Protein coding           -          A0A1B0GS91  TSL:5, GENCODE basic, APPRIS ALT2
Svil-204  ENSMUST00000127297.7   6243  2056aa      ENSMUSP00000115223.1  Protein coding           -          E9Q3Z5      TSL:5, GENCODE basic, APPRIS ALT2
Svil-212  ENSMUST00000146723.1   507   169aa       ENSMUSP00000115591.1  Protein coding           -          F6TBK9      CDS 5' and 3' incomplete, TSL:3
Svil-214  ENSMUST00000153016.7   497   40aa        ENSMUSP00000121497.1  Protein coding           -          D3Z2X9      CDS 3' incomplete, TSL:2
Svil-207  ENSMUST00000131609.7   6420  2031aa      ENSMUSP00000122242.1  Nonsense mediated decay  -          Q8K4L2      TSL:5
Svil-202  ENSMUST00000125512.7   4060  749aa       ENSMUSP00000121972.1  Nonsense mediated decay  -          F6R6A4      CDS 5' incomplete, TSL:5
Svil-205  ENSMUST00000129543.1   1732  No protein  -                     Retained intron          -          -           TSL:2
Svil-206  ENSMUST00000131210.7   1560  No protein  -                     Retained intron          -          -           TSL:1
Svil-208  ENSMUST00000138258.7   1430  No protein  -                     Retained intron          -          -           TSL:5
Svil-209  ENSMUST00000139761.1   523   No protein  -                     Retained intron          -          -           TSL:2
Svil-213  ENSMUST00000148564.1   1218  No protein  -                     lncRNA                   -          -           TSL:1


[Figure: transcript map of the Svil locus, 218.76 kb, forward strand, positions 4.95Mb to 5.10Mb (comprehensive set). Protein coding: Svil-201, Svil-203, Svil-204, Svil-210, Svil-211, Svil-212, Svil-214, Svil-215. Nonsense mediated decay: Svil-202, Svil-207. Retained intron: Svil-205, Svil-206, Svil-208, Svil-209. lncRNA: Svil-213. Contigs: AC124770.4, AC115928.10. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: Protein Coding (Ensembl protein coding); Non-Protein Coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000025079

[Figure: protein domain and variant view of Svil-201 (protein coding), 72.71 kb, forward strand, for ENSMUSP00000025...; scale bar 0 to 2170. Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily (SSF55753, headpiece domain superfamily); SMART (Villin/, Villin headpiece); Prints (Villin/Gelsolin); Pfam (Gelsolin-like domain, Villin headpiece); PROSITE profiles (Villin headpiece); PANTHER (Villin/Gelsolin, Supervillin); Gene3D (ADF-H/Gelsolin-like domain superfamily, Villin headpiece domain superfamily); CDD (cd11289, cd11293, cd11280, cd11288); sequence variants (dbSNP and all other sources). Variant legend: inframe deletion, missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
