https://www.alphaknockout.com

Mouse Zfyve26 Knockout Project (CRISPR/Cas9)

Objective: To create a Zfyve26 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Zfyve26 gene (NCBI Reference Sequence: NM_001008550; Ensembl: ENSMUSG00000066440) is located on mouse chromosome 12. 42 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 42 (Transcript: ENSMUST00000021547). Exons 5~13 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a null allele display a late-onset spastic gait disorder with cerebellar ataxia, axon degeneration, and progressive loss of cortical motoneurons and Purkinje cells, preceded by accumulation of autofluorescent, electron-dense, membrane-enclosed material in lysosomal structures.

Exon 5 starts at about 4.8% of the coding region. Exons 5~13 cover 26.51% of the coding region. The size of the effective KO region is ~8790 bp. The KO region does not contain any other known gene.
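As a rough check on these percentages, the sketch below converts them into approximate base-pair values. It assumes a coding sequence of 2529 codons plus one stop codon (2529 aa is the protein length listed for ENSMUST00000021547 in the transcript table of this report). Whether the percentages were computed with or without the stop codon is not stated, so the results are approximate, and the function name is illustrative only.

```python
# Sketch: convert coding-region percentages into approximate base-pair values.
# Assumption: CDS length = 2529 codons + 1 stop codon (protein length taken
# from the transcript table for ENSMUST00000021547).

PROTEIN_LENGTH_AA = 2529
CDS_LENGTH_BP = (PROTEIN_LENGTH_AA + 1) * 3  # includes the stop codon


def percent_to_bp(percent: float, total_bp: int = CDS_LENGTH_BP) -> float:
    """Return the number of bases corresponding to a percentage of total_bp."""
    return total_bp * percent / 100.0


if __name__ == "__main__":
    print(f"CDS length (assumed): {CDS_LENGTH_BP} bp")
    print(f"Exon 5 starts near coding base {percent_to_bp(4.8):.0f}")
    print(f"Exons 5~13 cover roughly {percent_to_bp(26.51):.0f} coding bases")
```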

Overview of the Targeting Strategy

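[Figure: schematic of the wildtype Zfyve26 allele, drawn 5' to 3', showing exon 1, exons 5 to 13, and exon 42. The two gRNA regions flank exons 5~13, which form the knockout region. Legend: exon of mouse Zfyve26; knockout region.]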

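Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence against itself (both axes labelled "Sequence 12"), showing forward and reverse-complement matches.]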

Note: The 2000 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

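Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence against itself (both axes labelled "Sequence 12"), showing forward and reverse-complement matches.]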

Note: The 530 bp section downstream of Exon 13 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
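The self-alignment described in these notes can be illustrated with a minimal sketch. It marks every pair of positions whose 15 bp windows are identical, so runs of matches off the main diagonal point to direct repeats. This is not the tool used to produce the report: it only considers exact forward-strand matches (the reverse-complement comparison is omitted), and the function name and toy sequence are hypothetical.

```python
# Sketch of a self dot plot: report every pair of positions (i, j), i < j,
# whose windows of length `window` are identical. Only exact forward-strand
# matches are considered here (an assumption made for brevity).
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return sorted (i, j) pairs where seq[i:i+window] == seq[j:j+window]."""
    seq = seq.upper()
    positions = defaultdict(list)
    # Index every window by its sequence so identical windows share a key.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


if __name__ == "__main__":
    unit = "ACGTTGCAGGCTAAT"               # hypothetical 15 bp repeat unit
    toy = "GGGG" + unit + unit + "CCCC"    # two tandem copies of the unit
    for i, j in self_dot_plot(toy, window=15):
        print(f"window at {i} repeats at {j} (offset {j - i})")
```

For a real flank, the 2000 bp or 530 bp sequence would be passed in place of the toy string; an empty result means no exact 15 bp window occurs twice.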

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence (axis labelled "Sequence 12").]

Summary: Full Length(2000bp) | A(25.3% 506) | C(20.05% 401) | T(34.0% 680) | G(20.65% 413)

Note: The 2000 bp section upstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence (axis labelled "Sequence 12").]

Summary: Full Length(530bp) | A(29.06% 154) | C(21.32% 113) | T(26.79% 142) | G(22.83% 121)

Note: The 530 bp section downstream of Exon 13 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
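The overall GC content of each flank follows directly from the base counts in the two summaries, and the windowed profile can be sketched the same way. The code below is an illustration rather than the tool used for the report: the function names are hypothetical, the 300 bp window comes from the report, and no "high GC" threshold is applied because the report does not state one.

```python
# Sketch: GC content from per-base counts, plus a sliding-window GC profile.

def gc_percent_from_counts(a: int, c: int, t: int, g: int) -> float:
    """GC content (%) computed from per-base counts."""
    total = a + c + t + g
    if total == 0:
        raise ValueError("no bases counted")
    return 100.0 * (g + c) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list[tuple[int, float]]:
    """GC content (%) of each window; returns (0-based start, percent) pairs."""
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, 100.0 * gc / window))
    return results


if __name__ == "__main__":
    # Base counts copied from the two summary lines of this report.
    print(f"upstream GC:   {gc_percent_from_counts(506, 401, 680, 413):.2f}%")
    print(f"downstream GC: {gc_percent_from_counts(154, 113, 142, 121):.2f}%")
```

With the reported counts this gives about 40.7% GC for the 2000 bp upstream section and about 44.15% for the 530 bp downstream section.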

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  -       79287880   79289879   2000
YourSeq  361    954    1546  2000   91.8%     chr11  +       50167134   50167724   591
YourSeq  332    935    1397  2000   95.4%     chr16  -       11157956   11158528   573
YourSeq  322    930    1397  2000   90.9%     chr4   -       151925581  151925967  387
YourSeq  308    953    1411  2000   95.1%     chr17  +       46185460   46667452   481993
YourSeq  295    1149   1550  2000   93.0%     chr7   -       80691749   80692343   595
YourSeq  287    1197   1546  2000   91.5%     chr16  +       15659108   15659450   343
YourSeq  284    1199   1545  2000   92.0%     chr7   +       127828746  127829089  344
YourSeq  278    1066   1544  2000   91.6%     chr4   +       123761335  123761867  533
YourSeq  263    1197   1537  2000   94.3%     chr11  +       116666934  116667620  687
YourSeq  262    954    1418  2000   92.8%     chr15  -       80122205   80122844   640
YourSeq  261    1033   1547  2000   88.9%     chr12  +       110881041  110881445  405
YourSeq  255    1197   1546  2000   95.7%     chr3   -       89466133   89466562   430
YourSeq  246    1201   1549  2000   95.2%     chr6   +       86603477   86604018   542
YourSeq  243    1033   1395  2000   95.5%     chr11  -       6536579    6537185    607
YourSeq  241    930    1398  2000   90.5%     chr4   -       129179184  129179494  311
YourSeq  241    1197   1545  2000   95.1%     chr17  +       56083619   56301531   217913
YourSeq  240    693    1398  2000   89.0%     chr11  -       120735960  120736300  341
YourSeq  237    1203   1533  2000   96.2%     chr9   +       21413603   21776212   362610
YourSeq  233    1197   1516  2000   97.2%     chr11  +       98813101   98813652   552

Note: The 2000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
--------------------------------------------------------------------------------------
YourSeq  530    1      530  530    100.0%    chr12  -       79278560   79279089   530
YourSeq  78     318    425  530    86.8%     chr2   -       167111088  167111193  106
YourSeq  72     312    414  530    84.3%     chr13  +       31749449   31749549   101
YourSeq  72     191    410  530    93.1%     chr11  +       50196062   50196584   523
YourSeq  71     318    413  530    84.1%     chr7   +       139317825  139317918  94
YourSeq  70     314    434  530    91.7%     chr5   +       110703080  110703231  152
YourSeq  69     250    338  530    88.8%     chr12  -       79271953   79272041   89
YourSeq  69     311    428  530    93.7%     chr11  -       40548776   40548921   146
YourSeq  67     311    409  530    83.9%     chr11  +       119527182  119527280  99
YourSeq  66     311    428  530    92.3%     chr2   -       30569078   30569223   146
YourSeq  66     312    411  530    83.0%     chr11  +       96401147   96401246   100
YourSeq  65     292    413  530    85.8%     chr9   -       111624641  111624758  118
YourSeq  65     318    410  530    82.7%     chr2   +       101587152  101587243  92
YourSeq  64     311    415  530    86.3%     chr9   -       60291470   60291572   103
YourSeq  64     314    410  530    90.3%     chrX   +       159068703  159068797  95
YourSeq  64     311    413  530    87.2%     chr9   +       108101661  108101761  101
YourSeq  64     311    413  530    87.2%     chr4   +       134046356  134046456  101
YourSeq  62     295    410  530    91.9%     chr3   +       4846109    4846380    272
YourSeq  62     314    410  530    88.9%     chr17  +       64844905   64844999   95
YourSeq  61     314    414  530    82.8%     chr8   -       72576369   72576468   100

Note: The 530 bp section downstream of Exon 13 is BLAT searched against the genome. No significant similarity is found.
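The 100% identity hit for each flank gives its position on chr12, so the interval between the two flanks can be compared with the stated size of the effective KO region. The sketch below assumes that the BLAT coordinates are 1-based and inclusive (consistent with the SPAN column) and that both flanks directly abut the deleted region; the helper names are illustrative. Because Zfyve26 lies on the reverse strand, the upstream flank has the higher coordinates.

```python
# Sketch: size of the region lying between the two flanking segments.
# Coordinates are the top (100% identity) BLAT hits listed in this report.
# Assumption: coordinates are 1-based and inclusive.

Interval = tuple[str, int, int]  # (chromosome, start, end)

UPSTREAM_FLANK: Interval = ("chr12", 79287880, 79289879)    # upstream of exon 5
DOWNSTREAM_FLANK: Interval = ("chr12", 79278560, 79279089)  # downstream of exon 13


def span(interval: Interval) -> int:
    """Length of an inclusive interval."""
    _, start, end = interval
    return end - start + 1


def bases_between(a: Interval, b: Interval) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    if a[0] != b[0]:
        raise ValueError("intervals are on different chromosomes")
    first, second = sorted((a, b), key=lambda iv: iv[1])
    gap = second[1] - first[2] - 1
    if gap < 0:
        raise ValueError("intervals overlap")
    return gap


if __name__ == "__main__":
    print(f"upstream flank:   {span(UPSTREAM_FLANK)} bp")
    print(f"downstream flank: {span(DOWNSTREAM_FLANK)} bp")
    print(f"between flanks:   {bases_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK)} bp")
```

Under these assumptions the gap between the flanks is 8790 bp, in line with the ~8790 bp effective KO region quoted in the strategy summary.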

Gene and information:

Zfyve26 zinc finger, FYVE domain containing 26 [Mus musculus (house mouse)]
Gene ID: 211978, updated on 12-Aug-2019

Gene summary

Official Symbol: Zfyve26 (provided by MGI)
Official Full Name: zinc finger, FYVE domain containing 26 (provided by MGI)
Primary source: MGI:MGI:1924767
See related: Ensembl:ENSMUSG00000066440
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm893; mKIAA0321; 4930465A13; 9330197E15Rik; A630028O16Rik
Expression: Ubiquitous expression in thymus adult (RPKM 5.9), placenta adult (RPKM 5.2) and 28 other tissues
Orthologs: human

Genomic context

Location: 12; 12 C3
Exon count: 42

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 12 | NC_000078.6 (79232346..79296323, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 12 | NC_000078.5 (80333334..80397269, complement)

Chromosome 12 - NC_000078.6

Transcript information: This gene has 14 transcripts

Gene: Zfyve26 ENSMUSG00000066440

Description: zinc finger, FYVE domain containing 26 [Source:MGI Symbol;Acc:MGI:1924767]
Gene Synonyms: 9330197E15Rik, A630028O16Rik, LOC380767
Location: Chromosome 12: 79,232,346-79,296,304 reverse strand. GRCm38:CM001005.2
About this gene: This gene has 14 transcripts (splice variants), 196 orthologues, 12 paralogues, is a member of 1 Ensembl protein family and is associated with 19 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Zfyve26-201 | ENSMUST00000021547.7 | 9391 | 2529aa | ENSMUSP00000021547.6 | Protein coding | CCDS36481 | Q5DU37 | TSL:1, GENCODE basic, APPRIS P1
Zfyve26-203 | ENSMUST00000218377.1 | 2726 | 567aa | ENSMUSP00000151280.1 | Protein coding | - | A0A1W2P6J0 | TSL:5, GENCODE basic
Zfyve26-210 | ENSMUST00000219912.1 | 2062 | 487aa | ENSMUSP00000151290.1 | Protein coding | - | A0A1W2P6K6 | CDS 5' incomplete, TSL:1
Zfyve26-209 | ENSMUST00000219842.1 | 4831 | No protein | - | Retained intron | - | - | TSL:1
Zfyve26-206 | ENSMUST00000218510.1 | 4255 | No protein | - | Retained intron | - | - | TSL:NA
Zfyve26-212 | ENSMUST00000220092.1 | 3782 | No protein | - | Retained intron | - | - | TSL:1
Zfyve26-208 | ENSMUST00000219571.1 | 824 | No protein | - | Retained intron | - | - | TSL:2
Zfyve26-211 | ENSMUST00000220003.1 | 737 | No protein | - | Retained intron | - | - | TSL:2
Zfyve26-204 | ENSMUST00000218433.1 | 655 | No protein | - | Retained intron | - | - | TSL:2
Zfyve26-213 | ENSMUST00000220262.1 | 655 | No protein | - | Retained intron | - | - | TSL:NA
Zfyve26-202 | ENSMUST00000217723.1 | 626 | No protein | - | Retained intron | - | - | TSL:NA
Zfyve26-207 | ENSMUST00000219105.1 | 584 | No protein | - | Retained intron | - | - | TSL:2
Zfyve26-205 | ENSMUST00000218447.1 | 499 | No protein | - | Retained intron | - | - | TSL:NA
Zfyve26-214 | ENSMUST00000220390.1 | 552 | No protein | - | lncRNA | - | - | TSL:3

[Figure: Ensembl genome browser view of an 83.96 kb region of chromosome 12 (79.24 Mb to 79.30 Mb). Forward strand: Rdh12 transcripts (Rdh12-201 and Rdh12-202, protein coding; Rdh12-204, retained intron) and Rad51b transcripts (Rad51b-201, Rad51b-202 and Rad51b-203, protein coding). Contigs: AC154585.2, AC158527.2. Reverse strand: the 14 Zfyve26 transcripts (Zfyve26-201 to Zfyve26-214), followed by the Regulatory Build track. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]

Transcript: ENSMUST00000021547

[Figure: Zfyve26-201 protein coding transcript (reverse strand, 63.96 kb) with the domain annotation of the encoded protein ENSMUSP00000021... on a scale bar from 0 to 2529. Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily: Zinc finger, FYVE/PHD-type; SMART: FYVE zinc finger; Pfam: FYVE zinc finger; PROSITE profiles: Zinc finger, FYVE-related; PANTHER: Zinc finger FYVE domain-containing protein 26; Gene3D: Zinc finger, RING/FYVE/PHD-type; CDD: cd15724; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
