https://www.alphaknockout.com

Mouse Wfs1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Wfs1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Wfs1 gene (NCBI Reference Sequence: NM_011716.2; Ensembl: ENSMUSG00000039474) is located on mouse chromosome 5. Eight exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000043964). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Wfs1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-46B6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a null allele exhibit decreased pancreatic beta cells and impaired glucose tolerance. Mice homozygous for a knock-out allele exhibit impaired glucose tolerance, decreased body weight, and abnormal behavior associated with increased sensitivity to stress.

Exon 3 starts at about 8.84% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion: 5091 bp, and the size of intron 3 for 3'-loxP site insertion: 1268 bp. The size of the effective cKO region: ~1199 bp. This strategy is designed based on genetic information in existing databases. Due to the complexity of biological processes, not all risks of loxP insertion to gene transcription, RNA splicing and translation can be predicted at the existing technological level.
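As a rough illustration of the arithmetic behind these figures, the sketch below converts the 8.84% offset into an approximate coding position. It assumes the 890 aa protein length reported for ENSMUST00000043964, giving a coding sequence of 890 x 3 = 2670 nt without the stop codon. The helper names and the exon lengths passed to `causes_frameshift` are hypothetical; this report does not state the coding length of exon 3.

```python
def cds_offset(fraction, protein_aa):
    """Return (nt before the exon, 1-based codon where the exon starts).

    Assumes the fraction is measured along the coding sequence
    without the stop codon (protein_aa * 3 nucleotides).
    """
    cds_nt = protein_aa * 3
    nt_before = round(fraction * cds_nt)
    first_codon = nt_before // 3 + 1
    return nt_before, first_codon


def causes_frameshift(deleted_coding_nt):
    """A deletion shifts the reading frame when its length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# 8.84% and 890 aa come from this report; the result is only an estimate.
nt_before, first_codon = cds_offset(0.0884, 890)
print(f"Exon 3 begins after roughly {nt_before} coding nt (codon ~{first_codon})")

# Hypothetical exon lengths, only to show how the check behaves.
print(causes_frameshift(100))  # not divisible by 3
print(causes_frameshift(99))   # divisible by 3
```

Under these assumptions, exon 3 begins within roughly the first 80 codons, so a frameshift at that point would disrupt most of the protein.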


Overview of the Targeting Strategy

Figure: schematic of the wildtype allele (exons 1, 3, 4 and 8 labelled, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination).

Legend: exon of mouse Wfs1; homology arm; cKO region; loxP site.
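The relationship between the targeted allele and the constitutive KO allele can be sketched as a simple list operation. This is a toy model, not the project's pipeline: it assumes the two loxP sites are in the same orientation, so that Cre excises the intervening segment and leaves one loxP site behind. The function name `cre_recombine` and the segment labels are illustrative.

```python
def cre_recombine(allele):
    """Excise the segment between the first two loxP sites.

    Assumes directly repeated (same-orientation) loxP sites, in which
    case Cre deletes the flanked region and one loxP site remains.
    """
    sites = [i for i, segment in enumerate(allele) if segment == "loxP"]
    if len(sites) < 2:
        return list(allele)  # nothing to recombine
    first, second = sites[0], sites[1]
    return allele[: first + 1] + allele[second + 1 :]


# Targeted allele: exon 3 flanked by loxP sites in introns 2 and 3.
targeted = ["exon1", "exon2", "loxP", "exon3", "loxP",
            "exon4", "exon5", "exon6", "exon7", "exon8"]

ko_allele = cre_recombine(targeted)
print(ko_allele)
```

In this model the KO allele keeps exons 1, 2 and 4 to 8 with a single residual loxP site where exon 3 used to be.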


Overview of the Dot Plot

Window size: 10 bp

Figure: dot plot of Sequence 12 against itself (forward and reverse complement).

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution

Window size: 300 bp

Figure: GC content distribution along Sequence 12.

Summary: Full length (7083 bp) | A (22.39%, 1586) | C (25.81%, 1828) | T (23.58%, 1670) | G (28.22%, 1999)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
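The base counts in the summary can be checked with a few lines of arithmetic. The sketch below uses only the counts reported above; the overall GC percentage it prints is derived from them and is not stated in the report.

```python
# Base counts taken from the summary line of this report.
counts = {"A": 1586, "C": 1828, "T": 1670, "G": 1999}

total = sum(counts.values())

# Per-base percentages, rounded to two decimals as in the report.
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}

# Overall GC content of the full sequence.
gc_percent = 100 * (counts["G"] + counts["C"]) / total

print(f"Total length: {total} bp")
print(f"Per-base percentages: {percent}")
print(f"Overall GC content: {gc_percent:.2f}%")
```

The counts sum to the stated full length of 7083 bp, and the overall GC content comes out at about 54%.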


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr5   -       36977217   36980216   3000
YourSeq  116    2       148   3000   89.8%     chr11  +       67933569   67933719   151
YourSeq  105    1       148   3000   85.9%     chr16  -       55945223   55945374   152
YourSeq  103    1       146   3000   85.6%     chr1   -       183452407  183452556  150
YourSeq  103    2       148   3000   88.9%     chr11  +       96020025   96020352   328
YourSeq  102    2       199   3000   87.6%     chr1   -       143050549  143050779  231
YourSeq  102    2       148   3000   94.1%     chr9   +       64142683   64143004   322
YourSeq  102    22      146   3000   91.2%     chr2   +       101813485  101813610  126
YourSeq  101    27      194   3000   88.5%     chr10  +       82918831   82918995   165
YourSeq  100    2       146   3000   91.7%     chr8   +       106091550  106091700  151
YourSeq  99     1       148   3000   91.6%     chr4   -       106300626  106300776  151
YourSeq  99     2       148   3000   91.0%     chr1   -       178312945  178313095  151
YourSeq  97     2       148   3000   90.2%     chr11  +       70697840   70698001   162
YourSeq  95     2       148   3000   91.6%     chr9   -       75446053   75446429   377
YourSeq  95     17      146   3000   89.3%     chr10  -       42754795   42755328   534
YourSeq  94     3       148   3000   91.3%     chr11  -       35823897   35824052   156
YourSeq  92     8       148   3000   91.1%     chr17  -       27993437   27993583   147
YourSeq  92     1       148   3000   81.4%     chr12  -       76045413   76045543   131
YourSeq  91     8       148   3000   90.3%     chr4   +       11621136   11621282   147
YourSeq  91     2       143   3000   85.8%     chr11  +       77524624   77524763   140

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr5   -       36973634   36976633   3000
YourSeq  144    1554    1729  3000   91.4%     chr3   -       131080713  131080890  178
YourSeq  144    1542    1724  3000   92.4%     chr2   -       59701651   59701836   186
YourSeq  144    1560    1738  3000   90.6%     chr2   +       152168628  152168808  181
YourSeq  143    1537    1729  3000   88.0%     chr7   +       65846291   65846482   192
YourSeq  142    1547    1723  3000   90.4%     chr12  +       108299986  108300163  178
YourSeq  141    1554    1735  3000   90.3%     chr1   -       181253325  181253512  188
YourSeq  141    1540    1723  3000   87.0%     chr4   +       149816204  149816382  179
YourSeq  140    1537    1724  3000   86.9%     chr8   -       48149235   48149417   183
YourSeq  140    1547    1725  3000   89.5%     chr13  -       104502256  104502433  178
YourSeq  138    1549    1723  3000   90.7%     chr18  +       49853990   49854169   180
YourSeq  135    1536    1753  3000   84.8%     chr7   -       126738730  126738932  203
YourSeq  135    1549    1722  3000   89.9%     chr3   -       58443983   58444157   175
YourSeq  135    1547    1723  3000   86.8%     chr7   +       44363265   44363439   175
YourSeq  135    1554    1723  3000   91.5%     chr7   +       27464758   27464929   172
YourSeq  134    1549    1724  3000   91.6%     chr10  -       117196345  117196529  185
YourSeq  134    1547    1725  3000   89.4%     chr4   +       124814208  124814391  184
YourSeq  133    1547    1727  3000   87.6%     chr16  -       28923285   28923467   183
YourSeq  133    1540    1725  3000   85.7%     chr1   +       193206571  193206747  177
YourSeq  132    1549    1723  3000   87.9%     chr1   +       165989318  165989492  175

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
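One way to read the BLAT tables is to separate the self hit from the rest and ask how much of the 3000 bp query the best remaining hit covers. The sketch below parses a few rows copied from the upstream table. The parser, the `BlatHit` type and the idea of using query coverage as the measure are illustrative assumptions; the report does not say what threshold it uses for "significant similarity".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one BLAT result row; any link text before 'YourSeq' is skipped."""
    fields = line.split()
    f = fields[fields.index("YourSeq") + 1:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


# Three rows copied from the upstream BLAT table in this report.
rows = [
    "browser details YourSeq 3000 1 3000 3000 100.0% chr5 - 36977217 36980216 3000",
    "YourSeq 116 2 148 3000 89.8% chr11 + 67933569 67933719 151",
    "YourSeq 105 1 148 3000 85.9% chr16 - 55945223 55945374 152",
]

hits = [parse_hit(row) for row in rows]
self_hit = max(hits, key=lambda h: h.score)           # the query matching itself
others = [h for h in hits if h is not self_hit]
best_other = max(others, key=lambda h: h.score)

# Fraction of the query covered by the best non-self hit.
coverage = (best_other.q_end - best_other.q_start + 1) / best_other.q_size
print(f"Self hit: {self_hit.chrom}:{self_hit.t_start}-{self_hit.t_end}")
print(f"Best other hit: score {best_other.score}, covers {coverage:.1%} of the query")
```

For these rows the best non-self hit covers only about 5% of the query, which is consistent with the notes that no significant similarity is found.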


Gene and protein information:

Wfs1 wolframin ER transmembrane glycoprotein [Mus musculus (house mouse)]
Gene ID: 22393, updated on 2-Feb-2021

Gene summary

Official Symbol: Wfs1 (provided by MGI)
Official Full Name: wolframin ER transmembrane glycoprotein (provided by MGI)
Primary source: MGI:MGI:1328355
See related: Ensembl:ENSMUSG00000039474
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: wol; AI481085; wolframin
Expression: Broad expression in adrenal adult (RPKM 31.3), heart adult (RPKM 27.5) and 24 other tissues
Orthologs: human

Genomic context

Location: 5; 5 B3

Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
109                 current            GRCm39 (GCF_000001635.27)     5    NC_000071.7 (37123448..37146326, complement)
108.20200622        previous assembly  GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (36966104..36988982, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (37357343..37380221, complement)

Figure: genomic context of Wfs1 on Chromosome 5 - NC_000071.7.


Transcript information: This gene has 3 transcripts

Gene: Wfs1 ENSMUSG00000039474

Description: wolframin ER transmembrane glycoprotein [Source:MGI Symbol;Acc:MGI:1328355]
Gene Synonyms: Wolfram syndrome 1 homolog (human), wolframin
Location: Chromosome 5: 37,123,448-37,146,549 reverse strand. GRCm39:CM000998.3
About this gene: This gene has 3 transcripts (splice variants), 264 orthologues and is associated with 42 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein  Translation ID        Biotype                  CCDS       UniProt Match   Flags
Wfs1-201  ENSMUST00000043964.13  3787  890aa    ENSMUSP00000048053.7  Protein coding           CCDS19245  P56695, Q3TDI2  TSL:1, GENCODE basic, APPRIS P2
Wfs1-202  ENSMUST00000166339.8   3284  814aa    ENSMUSP00000132404.2  Protein coding           -          Q3UN10          TSL:1, GENCODE basic, APPRIS ALT2
Wfs1-203  ENSMUST00000167937.2   714   157aa    ENSMUSP00000125779.2  Nonsense mediated decay  -          F7D926          CDS 5' incomplete, TSL:5

Figure: Ensembl genome browser view of a 43.10 kb region of chromosome 5 (37.12 Mb to 37.15 Mb), showing contig AC115722.12, the reverse-strand transcripts Wfs1-201 (protein coding), Wfs1-202 (protein coding) and Wfs1-203 (nonsense mediated decay), and the Regulatory Build track.

Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript).

Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank.


Transcript: ENSMUST00000043964

Figure: protein domain view of Wfs1-201 (protein coding; reverse strand, 23.10 kb) for ENSMUSP00000048..., on a scale of 0 to 890 aa. Annotation tracks include Transmembrane heli..., MobiDB lite, Low complexity (Seg), Prints (Wolframin family), Wolframin, PANTHER (PTHR13098:SF5), Wolframin family, Gene3D, and Tetratricopeptide-like helical domain superfamily, together with sequence variants (dbSNP and all other sources).

Variant legend: missense variant; splice region variant; synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
