https://www.alphaknockout.com

Mouse Parvb Knockout Project (CRISPR/Cas9)

Objective: To create a Parvb knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Parvb gene (NCBI Reference Sequence: NM_133167; Ensembl: ENSMUSG00000022438) is located on mouse chromosome 15. Thirteen exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 13 (Transcript: ENSMUST00000023072). Exons 2~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Disruption of this marker has no apparent adverse consequences.

Exon 2 starts at about 10.59% of the coding region. Exons 2~3 cover 14.7% of the coding region. The size of the effective KO region is ~2339 bp. The KO region does not contain any other known gene.
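The percentages above can be converted back to approximate base counts. The sketch below assumes the coding sequence is 365 codons (the protein length listed for ENSMUST00000023072 in this report) times 3 bp, with the stop codon excluded. That convention is an assumption, not something the report states, so the resulting base counts are estimates only.

```python
# Back-of-envelope check of the coding-region percentages quoted above.
# ASSUMPTION: CDS length = 365 codons x 3 bp, stop codon excluded.
PROTEIN_LENGTH_AA = 365
CDS_LENGTH_BP = PROTEIN_LENGTH_AA * 3


def percent_to_bp(percent: float, cds_length: int = CDS_LENGTH_BP) -> int:
    """Convert a percentage of the coding region to the nearest whole bp."""
    return round(percent / 100 * cds_length)


# Estimated coding bases that precede exon 2 (i.e. contributed by exon 1).
bp_before_exon2 = percent_to_bp(10.59)

# Estimated coding bases contained in exons 2~3 (the targeted exons).
bp_in_exons_2_3 = percent_to_bp(14.7)

# A deleted coding length that is not a multiple of 3 shifts the reading frame.
causes_frameshift = bp_in_exons_2_3 % 3 != 0

print(f"Assumed CDS length: {CDS_LENGTH_BP} bp")
print(f"Estimated coding bp before exon 2: {bp_before_exon2}")
print(f"Estimated coding bp in exons 2~3: {bp_in_exons_2_3}")
print(f"Deletion would shift the frame: {causes_frameshift}")
```

Under this assumption the estimated coding length removed is not a multiple of three, which would be consistent with a frameshift downstream of exon 1. Exact exon boundaries should be taken from the Ensembl annotation rather than from these rounded percentages.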


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 2, 3 and 13 labelled and two gRNA regions marked. Legend: exon of mouse Parvb; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self dot plot of Sequence 12. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
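The self-alignment behind a dot plot can be sketched as follows. This is a minimal illustration that uses exact 15 bp window matches; the report does not say which tool or match criterion produced its plots, so the function name and the exact-match rule are assumptions.

```python
# Minimal self dot plot: report window-sized exact matches of a sequence
# against itself (forward) and against its reverse complement.
# ASSUMPTION: exact matching only; real dot plot tools may tolerate mismatches.
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) lists of 0-based window start pairs (i, j).

    forward: seq[i:i+window] == seq[j:j+window], with i < j
    reverse: seq[i:i+window] == reverse complement of seq[j:j+window]
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Every pair of occurrences of the same window is a forward dot.
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                forward.append((positions[a], positions[b]))
        # Occurrences of the reverse-complemented window are reverse dots.
        for j in index.get(reverse_complement(kmer), []):
            for i in positions:
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy sequences with a shorter window, for illustration only.
direct_repeat = "ACGGT" + "TTT" + "ACGGT"
inverted_repeat = "AACCG" + "TT" + "CGGTT"
print(self_dot_plot(direct_repeat, window=5))
print(self_dot_plot(inverted_repeat, window=5))
```

Runs of forward dots parallel to the main diagonal indicate direct (tandem) repeats, which is the pattern the note above refers to when placing the gRNA site outside repeated sequence.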

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self dot plot of Sequence 12. Legend: forward; reverse complement.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(22.45% 449) | C(23.75% 475) | T(26.0% 520) | G(27.8% 556)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(19.45% 389) | C(29.25% 585) | T(31.1% 622) | G(20.2% 404)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
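The overall GC content of each flank follows directly from the base counts in the two summaries. The sketch below computes it, and also shows a sliding-window GC profile of the kind plotted with a 300 bp window. The step size and function names are assumptions for illustration; the report only states the window size.

```python
# GC content from the base counts reported in the two summaries above.
UPSTREAM_COUNTS = {"A": 449, "C": 475, "T": 520, "G": 556}
DOWNSTREAM_COUNTS = {"A": 389, "C": 585, "T": 622, "G": 404}


def gc_percent(counts: dict) -> float:
    """Percentage of G + C among all counted bases."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each window along seq.

    ASSUMPTION: step size is illustrative; the report gives only the window.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append(100 * gc / window)
    return profile


print(f"Upstream flank GC:   {gc_percent(UPSTREAM_COUNTS):.2f}%")
print(f"Downstream flank GC: {gc_percent(DOWNSTREAM_COUNTS):.2f}%")

# Toy profile on a short sequence, using a small window for illustration.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

Both flanks come out close to 50% GC overall, in line with the notes that no significant high GC-content region is found.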


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr15  +       84269232   84271231   2000
YourSeq  40     707     746   2000   100.0%    chr9   +       47119885   47119924   40
YourSeq  30     767     838   2000   96.9%     chr16  -       93815591   93815662   72
YourSeq  24     1176    1199  2000   100.0%    chr3   -       102168838  102168861  24
YourSeq  23     817     839   2000   100.0%    chr10  -       73124454   73124476   23
YourSeq  23     813     835   2000   100.0%    chr13  +       80932166   80932188   23
YourSeq  23     816     838   2000   100.0%    chr12  +       36380286   36380308   23
YourSeq  22     815     838   2000   87.0%     chr6   -       140626799  140626821  23
YourSeq  22     815     838   2000   87.0%     chr13  -       117309330  117309352  23
YourSeq  22     817     838   2000   100.0%    chr2   +       156810081  156810102  22
YourSeq  21     821     843   2000   95.7%     chr15  -       51487562   51487584   23
YourSeq  21     821     841   2000   100.0%    chr7   +       64149744   64149764   21
YourSeq  21     819     839   2000   100.0%    chr6   +       23614489   23614509   21
YourSeq  20     819     838   2000   100.0%    chr4   -       153889202  153889221  20

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr15  +       84273571   84275570   2000
YourSeq  424    147     1912  2000   95.6%     chr19  -       45985012   46218311   233300
YourSeq  213    146     1637  2000   95.4%     chr19  -       45985103   46051610   66508
YourSeq  211    1603    1919  2000   90.8%     chr11  +       91915495   91915774   280
YourSeq  198    141     376   2000   91.8%     chr1   -       135111534  135111753  220
YourSeq  198    147     359   2000   97.2%     chr13  +       97390049   97390280   232
YourSeq  196    147     365   2000   93.7%     chr6   +       99989420   99989625   206
YourSeq  194    144     364   2000   95.0%     chr10  -       39206416   39206886   471
YourSeq  191    147     353   2000   97.1%     chr5   +       124539623  124540037  415
YourSeq  189    130     356   2000   93.6%     chr10  -       86737588   86737815   228
YourSeq  188    147     347   2000   97.1%     chr5   -       106796918  106797128  211
YourSeq  188    147     346   2000   95.5%     chr11  +       3826951    3827147    197
YourSeq  187    147     355   2000   96.1%     chr8   -       121557383  121557597  215
YourSeq  187    147     355   2000   96.1%     chr11  +       72311997   72312236   240
YourSeq  186    147     332   2000   100.0%    chr8   -       105576539  105576724  186
YourSeq  186    147     351   2000   96.6%     chr8   -       37967464   37967673   210
YourSeq  186    129     355   2000   91.9%     chr6   -       146556816  146557046  231
YourSeq  185    147     676   2000   94.3%     chr7   -       105691684  105692329  646
YourSeq  185    146     354   2000   94.0%     chr2   +       65319938   65320143   206
YourSeq  185    145     359   2000   91.7%     chr19  +       22118619   22118822   204

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
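The top row of each BLAT table is the query matching its own genomic location, so the two flank positions can be used to cross-check the stated KO region size. The sketch below assumes the 2000 bp flanks directly abut the KO region; the report does not say so explicitly, and the variable and function names are illustrative.

```python
# Cross-check of the KO region size from the self-hit rows of the BLAT tables.
# Coordinates are 1-based inclusive, as reported (chrom, start, end).
UPSTREAM_FLANK = ("chr15", 84269232, 84271231)
DOWNSTREAM_FLANK = ("chr15", 84273571, 84275570)


def span(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def region_between(upstream: tuple, downstream: tuple) -> tuple:
    """Interval lying strictly between two flanks on the same chromosome.

    ASSUMPTION: the flanks directly abut the region of interest.
    """
    if upstream[0] != downstream[0]:
        raise ValueError("flanks are on different chromosomes")
    start = upstream[2] + 1
    end = downstream[1] - 1
    return upstream[0], start, end


chrom, ko_start, ko_end = region_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK)
ko_size = span(ko_start, ko_end)
print(f"Region between flanks: {chrom}:{ko_start}-{ko_end} ({ko_size} bp)")
```

Under that assumption the interval between the flanks is 2339 bp, which agrees with the effective KO region size given in the strategy summary.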


Gene and information:

Parvb parvin, beta [Mus musculus (house mouse)]
Gene ID: 170736, updated on 12-Aug-2019

Gene summary

Official Symbol: Parvb (provided by MGI)
Official Full Name: parvin, beta (provided by MGI)
Primary source: MGI:MGI:2153063
See related: Ensembl:ENSMUSG00000022438
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: D15Gsk1; affixin; AI595373; AW742462
Expression: Ubiquitous expression in heart adult (RPKM 18.6), lung adult (RPKM 17.3) and 26 other tissues
Orthologs: human; all

Genomic context

Location: 15; 15 E2
Exon count: 17

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (84232043..84315689)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (84062473..84146039)

Chromosome 15 - NC_000081.6


Transcript information: This gene has 4 transcripts

Gene: Parvb ENSMUSG00000022438

Description: parvin, beta [Source:MGI Symbol;Acc:MGI:2153063]
Gene Synonyms: D15Gsk1, affixin
Location: Chromosome 15: 84,232,043-84,315,688 forward strand. GRCm38:CM001008.2
About this gene: This gene has 4 transcripts (splice variants), 194 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 25 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt         Flags
Parvb-201  ENSMUST00000023072.6  3784  365aa       ENSMUSP00000023072.6  Protein coding           CCDS37167  Q3UGT9, Q9ES46  TSL:1, GENCODE basic, APPRIS P1
Parvb-202  ENSMUST00000122818.1  438   93aa        ENSMUSP00000117594.1  Nonsense mediated decay  -          F2Z417          TSL:3
Parvb-203  ENSMUST00000131158.1  438   No protein  -                     Retained intron          -          -               TSL:5
Parvb-204  ENSMUST00000146020.1  378   No protein  -                     lncRNA                   -          -               TSL:3

[Figure: Ensembl region view, 103.65 kb, chromosome 15 from 84.24 Mb to 84.32 Mb, forward and reverse strands.
Genes (Comprehensive set..., forward strand): Parvb-201 (protein coding), Parvb-202 (nonsense mediated decay), Parvb-203 (retained intron), Parvb-204 (lncRNA); Parvg-201 and Parvg-205 (protein coding); Parvg-203, Parvg-204 and Parvg-206 (lncRNA).
Contigs: AL626769.18.
Genes (Comprehensive set..., reverse strand): Gm22890-201 (snoRNA).
Regulatory Build legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000023072

[Figure: Parvb-201 (protein coding), 83.65 kb, forward strand, with protein feature tracks for ENSMUSP00000023...
Protein features: MobiDB lite; Low complexity (Seg); Superfamily: CH domain superfamily; SMART: Calponin homology domain; Pfam: Calponin homology domain; PROSITE profiles: Calponin homology domain; PIRSF: Parvin; PANTHER: Parvin, PTHR12114:SF7; Gene3D: CH domain superfamily.
Sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant.
Scale bar: 0 to 365.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
