https://www.alphaknockout.com

Mouse Wdfy3 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Wdfy3 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Wdfy3 gene (NCBI Reference Sequence: NM_172882; Ensembl: ENSMUSG00000043940) is located on mouse chromosome 5. 67 exons are identified, with the ATG start codon in exon 4 and the TGA stop codon in exon 67 (Transcript: ENSMUST00000053177). Exon 8 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Wdfy3 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-447M5 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for hypomorphic mutations of this gene exhibit perinatal lethality, altered neural progenitor divisions and neuronal migration, a regionally enlarged cerebral cortex, and focal cortical dysplasias.

Exon 8 starts about 5.48% of the way into the coding region. Knocking out exon 8 will cause a frameshift in the gene. The size of intron 7 for 5'-loxP site insertion is 4210 bp, and the size of intron 8 for 3'-loxP site insertion is 1569 bp. The size of the effective cKO region is ~693 bp. The cKO region does not contain any other known gene.
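As a rough check on these figures, the following Python sketch converts the 5.48% offset into an approximate nucleotide position and tests whether a deletion shifts the reading frame. It assumes the coding sequence length can be derived from the 3508 aa protein listed for ENSMUST00000053177 (3508 codons plus one stop codon). The exon lengths passed to `causes_frameshift` are hypothetical values, since the report does not state the length of exon 8.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


def cds_offset(fraction: float, cds_length: int) -> int:
    """Approximate nucleotide offset into the coding sequence for a fractional position."""
    return round(fraction * cds_length)


# Assumption: CDS length = 3508 codons + 1 stop codon (from the 3508 aa protein).
CDS_LENGTH = (3508 + 1) * 3

# Exon 8 starts about 5.48% of the way into the coding region.
approx_exon8_start = cds_offset(0.0548, CDS_LENGTH)

# Hypothetical exon lengths, for illustration only.
frameshift_for_100_bp = causes_frameshift(100)  # not a multiple of 3
frameshift_for_99_bp = causes_frameshift(99)    # multiple of 3, frame preserved
```

Under these assumptions, exon 8 begins roughly 577 nucleotides into the coding sequence, so a frameshift there would disrupt nearly all of the protein.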


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Exons 1, 8, 9 and 67 are labeled. Legend: exon of mouse Wdfy3, homology arm, cKO region, loxP site.]
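The conversion from the targeted allele to the constitutive KO allele can be illustrated with a toy Python sketch. It assumes the two loxP sites are in the same orientation and uses the commonly cited 34 bp wild-type loxP sequence. The report does not give the actual loxP sequence or flanking sequences used, so the example allele below is made up.

```python
# Assumption: canonical wild-type loxP (13 bp repeat, 8 bp spacer, 13 bp repeat).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Simulate Cre excision between two same-orientation loxP sites.

    The floxed segment and one loxP copy are removed; one loxP remains.
    Alleles with fewer than two sites are returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[: first + len(site)] + allele[second + len(site):]


# Made-up toy allele: 5' flank, loxP, floxed "exon", loxP, 3' flank.
targeted = "AAAA" + LOXP + "GGGG" + LOXP + "TTTT"
constitutive_ko = cre_excise(targeted)
wildtype_unchanged = cre_excise("AAAAGGGGTTTT")
```

In the toy allele, the floxed `GGGG` segment stands in for the cKO region, and a single loxP site remains after excision.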


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
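A self-comparison of this kind can be sketched in Python as follows. This is a minimal illustration that assumes exact window matches with a 10 bp window. The tool used to produce the report's dot plot is not named and may score matches differently, and the test sequences are made up.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot_matches(seq: str, window: int = 10):
    """List (i, j, strand) for exact window-length matches of seq against itself.

    "+" hits are forward matches off the main diagonal; "-" hits are matches
    between the window at i and the reverse complement found at j.
    """
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    hits = []
    for i in range(len(seq) - window + 1):
        kmer = seq[i:i + window]
        for j in index[kmer]:
            if j != i:  # skip the trivial main diagonal
                hits.append((i, j, "+"))
        for j in index.get(reverse_complement(kmer), []):
            hits.append((i, j, "-"))
    return hits


# Made-up sequences: a direct repeat and an inverted repeat.
direct_repeat = "ACGTTGCAAG" + "TTT" + "ACGTTGCAAG"
inverted_repeat = "AAAAACCCCC" + "GGGGGTTTTT"
direct_hits = dot_plot_matches(direct_repeat)
inverted_hits = dot_plot_matches(inverted_repeat)
```

Long runs of off-diagonal hits would indicate repeats. An empty or sparse hit list corresponds to the clean dot plot described in the note.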

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7193bp) | A(27.99% 2013) | C(20.8% 1496) | T(28.81% 2072) | G(22.41% 1612)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
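The overall GC content implied by the summary counts, and a sliding-window profile like the one plotted, can be computed with a short Python sketch. The base counts are taken from the summary above. The `gc_windows` helper and its `step` parameter are illustrative assumptions, not the report's actual tool.

```python
# Base counts from the summary line (full length 7193 bp).
counts = {"A": 2013, "C": 1496, "T": 2072, "G": 1612}
total = sum(counts.values())

# Overall GC content as a percentage of the full-length sequence.
overall_gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)


def gc_windows(seq: str, window: int = 300, step: int = 1):
    """Return the GC fraction of each window of the given size along seq."""
    seq = seq.upper()
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence with a small window, for illustration only.
toy_profile = gc_windows("GGCCAATT", window=4, step=4)
```

The counts sum to the stated 7193 bp and give an overall GC content of about 43.21%.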


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       101953424  101956423  3000
YourSeq  172    955    2560  3000   86.2%     chr2   +       168319466  168451442  131977
YourSeq  160    663    1042  3000   88.6%     chr9   -       69790273   69790623   351
YourSeq  157    897    2561  3000   92.0%     chr10  -       93629571   93692640   63070
YourSeq  150    2469   2935  3000   82.5%     chr9   +       103317367  103317821  455
YourSeq  148    882    1063  3000   92.1%     chr16  -       61216703   61216919   217
YourSeq  139    891    1060  3000   89.5%     chr13  +       100668126  100668277  152
YourSeq  136    892    1071  3000   85.6%     chr8   +       66232498   66232664   167
YourSeq  136    891    1054  3000   93.2%     chr19  +       5189290    5189471    182
YourSeq  135    897    2489  3000   92.5%     chr11  +       59949049   60011541   62493
YourSeq  132    891    1042  3000   94.1%     chr12  +       78870293   78870447   155
YourSeq  132    891    1060  3000   88.0%     chr1   +       134472871  134473033  163
YourSeq  131    893    1050  3000   90.9%     chr2   -       170569426  170569581  156
YourSeq  130    891    1060  3000   88.6%     chr14  -       62781481   62781647   167
YourSeq  130    893    1038  3000   94.6%     chr11  +       86602604   86602749   146
YourSeq  129    884    1060  3000   84.0%     chr12  +       12231528   12231696   169
YourSeq  128    880    1039  3000   89.8%     chr1   -       6194789    6194941    153
YourSeq  128    881    1039  3000   88.6%     chr5   +       136999391  136999542  152
YourSeq  128    883    1039  3000   88.9%     chr10  +       57194229   57194382   154
YourSeq  127    891    1039  3000   93.3%     chr11  +       98634687   98634848   162

Note: The 3000 bp section upstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       101949731  101952730  3000
YourSeq  221    279    693   3000   91.1%     chr5   -       103937663  103938125  463
YourSeq  219    279    693   3000   89.3%     chr5   -       106728336  106728775  440
YourSeq  213    258    712   3000   88.3%     chr1   +       10115070   10115578   509
YourSeq  209    290    704   3000   87.8%     chr1   -       74705775   74706233   459
YourSeq  202    273    707   3000   85.4%     chr7   -       65165525   65166017   493
YourSeq  200    279    669   3000   85.3%     chr14  -       60428893   60429332   440
YourSeq  199    289    946   3000   84.5%     chr13  -       17843691   17844310   620
YourSeq  199    279    693   3000   91.4%     chr11  +       105359258  105359716  459
YourSeq  189    285    703   3000   88.0%     chr6   -       97607811   97608286   476
YourSeq  189    279    707   3000   87.6%     chr13  -       73751455   73751900   446
YourSeq  186    279    693   3000   86.1%     chr1   -       164957159  164957626  468
YourSeq  185    279    588   3000   88.4%     chr5   +       38857366   38857745   380
YourSeq  182    286    691   3000   82.7%     chr11  -       76731232   76731663   432
YourSeq  181    267    693   3000   88.2%     chr18  -       82843675   82844127   453
YourSeq  181    288    707   3000   93.0%     chr12  -       55472794   55473257   464
YourSeq  181    279    693   3000   88.4%     chr10  -       89449788   89450279   492
YourSeq  179    286    699   3000   89.5%     chr9   +       102389155  102389612  458
YourSeq  179    279    588   3000   88.3%     chr16  +       4743049    4743391    343
YourSeq  179    285    705   3000   89.8%     chr10  +       117932641  117933105  465

Note: The 3000 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
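One way to read "no significant similarity" is that every hit other than the self-match scores far below the 3000 bp query size. The Python sketch below applies that reading to the top four downstream hits. The `significant_off_targets` helper and its 50% score cutoff are hypothetical choices for illustration; the report does not state the criterion actually used.

```python
# (score, chromosome) for the top four downstream BLAT hits listed above.
downstream_hits = [(3000, "chr5"), (221, "chr5"), (219, "chr5"), (213, "chr1")]
QSIZE = 3000


def significant_off_targets(hits, qsize, min_score_fraction=0.5):
    """Return hits, excluding the top-scoring self-match, whose score is at
    least min_score_fraction of the query size (hypothetical cutoff)."""
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)
    return [hit for hit in ranked[1:] if hit[0] >= min_score_fraction * qsize]


off_targets = significant_off_targets(downstream_hits, QSIZE)

# Best non-self score as a fraction of the query size.
best_secondary_fraction = round(downstream_hits[1][0] / QSIZE, 3)
```

With these numbers the best secondary hit covers only about 7% of the query by score, which is consistent with the note's conclusion.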


Gene and information:

Wdfy3 WD repeat and FYVE domain containing 3 [Mus musculus (house mouse)]
Gene ID: 72145, updated on 19-Oct-2019

Gene summary

Official Symbol: Wdfy3 (provided by MGI)
Official Full Name: WD repeat and FYVE domain containing 3 (provided by MGI)
Primary source: MGI:MGI:1096875
See related: Ensembl:ENSMUSG00000043940
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: ALFY; BWF1; Bchs; Ggtb3; ZFYVE25; AW319683; D5Ertd66e; mKIAA0993; B930017C24; 2610509D04Rik
Expression: Ubiquitous expression in cerebellum adult (RPKM 13.9), whole brain E14.5 (RPKM 11.6) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 5 E4-E5; 5 48.95 cM

Exon count: 70

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (101832953..102072215, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (102261972..102498940, complement)

[Figure: genomic context of Wdfy3 on Chromosome 5 - NC_000071.6.]


Transcript information: This gene has 7 transcripts

Gene: Wdfy3 ENSMUSG00000043940

Description: WD repeat and FYVE domain containing 3 [Source:MGI Symbol;Acc:MGI:1096875]
Gene Synonyms: 2610509D04Rik, Alfy, Bchs, Bwf1, D5Ertd66e, Ggtb3
Location: Chromosome 5: 101,832,956-102,069,921 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 7 transcripts (splice variants), 227 orthologues, 7 paralogues, is a member of 1 Ensembl protein family and is associated with 27 phenotypes.

Transcripts

Name       Transcript ID          bp     Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Wdfy3-201  ENSMUST00000053177.13  14274  3508aa      ENSMUSP00000052607.7  Protein coding           CCDS19473  Q6VNB8      TSL:1, GENCODE basic, APPRIS P2
Wdfy3-205  ENSMUST00000174598.7   10581  3526aa      ENSMUSP00000134244.1  Protein coding           -          G3UYW1      TSL:5, GENCODE basic, APPRIS ALT1
Wdfy3-207  ENSMUST00000212024.1   10539  3512aa      ENSMUSP00000148521.1  Protein coding           -          A0A1D5RLV7  TSL:5, GENCODE basic
Wdfy3-206  ENSMUST00000174698.1   3858   913aa       ENSMUSP00000134541.1  Protein coding           -          Q6VNB8      TSL:1, GENCODE basic
Wdfy3-203  ENSMUST00000172927.1   641    124aa       ENSMUSP00000133979.1  Nonsense mediated decay  -          G3UY81      CDS 5' incomplete, TSL:5
Wdfy3-204  ENSMUST00000173955.1   2713   No protein  -                     Retained intron          -          -           TSL:1
Wdfy3-202  ENSMUST00000172512.1   704    No protein  -                     lncRNA                   -          -           TSL:3


[Figure: Ensembl genome browser view of the Wdfy3 locus (256.97 kb; chromosome 5, 101.85 Mb to 102.05 Mb). Forward-strand genes: Cds1-201 (protein coding), Gm42934-201 (TEC), Gm29707-201 and Gm29707-202 (lncRNA), Gm20548-201 to Gm20548-205 (lncRNA). Contigs: AC171109.2, AC131915.7, AC158143.6. Reverse-strand genes: Wdfy3-201, Wdfy3-205, Wdfy3-206 and Wdfy3-207 (protein coding), Wdfy3-204 (retained intron), Wdfy3-203 (nonsense mediated decay), Wdfy3-202 (lncRNA), Gm43787-201 (TEC). Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000053177

[Figure: protein domain view of Wdfy3-201 (protein coding; reverse strand, 236.97 kb; protein ENSMUSP00000052..., scale bar 0 to 3508 aa). Annotation tracks: MobiDB lite; low complexity (Seg); Superfamily (Armadillo-type fold; Concanavalin A-like lectin/glucanase domain superfamily; BEACH domain superfamily; Zinc finger, FYVE/PHD-type; SSF50729; WD40-repeat-containing domain superfamily); SMART (BEACH domain; FYVE zinc finger; WD40 repeat); Pfam (PH-BEACH domain; FYVE zinc finger; BEACH domain; WD40 repeat); PROSITE profiles (BEACH domain; Zinc finger, FYVE-related; PH-BEACH domain; WD40 repeat; WD40-repeat-containing domain); PROSITE patterns (WD40 repeat, conserved site); PANTHER (PTHR46108:SF1; PTHR46108); Gene3D (BEACH domain superfamily; Zinc finger, RING/FYVE/PHD-type; 2.30.29.40; WD40/YVTN repeat-like-containing domain superfamily); CDD (BEACH domain; cd15719; PH-BEACH domain). Sequence variants (dbSNP and all other sources), variant legend: frameshift variant, missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
