https://www.alphaknockout.com

Mouse Krt32 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Krt32 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:

The Krt32 gene (NCBI Reference Sequence: NM_001159374.2; Ensembl: ENSMUSG00000046095) is located on mouse chromosome 11. Seven exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000107419).

Exons 4-6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Krt32 gene.

To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-309N18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 4 starts at about 53.27% of the coding region. The knockout of exons 4-6 will result in a frameshift of the gene.

The size of intron 3 for 5'-loxP site insertion is 1263 bp, and the size of intron 6 for 3'-loxP site insertion is 2599 bp. The size of the effective cKO region is ~1699 bp.

The cKO region does not contain any other known gene.
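The frameshift statement above rests on simple arithmetic: a deletion shifts the reading frame when the number of removed coding bases is not a multiple of 3. The following is a minimal sketch of that check. The function name and the exon lengths are hypothetical placeholders for illustration; the report does not list the individual exon sizes of Krt32.

```python
def deletion_causes_frameshift(coding_lengths):
    """Return True if deleting these coding bases shifts the reading frame.

    coding_lengths: coding bases (bp) contributed by each deleted exon.
    """
    deleted = sum(coding_lengths)
    # The downstream frame is preserved only if a whole number of codons is removed.
    return deleted % 3 != 0


# Placeholder lengths for illustration only; NOT the real Krt32 exon sizes.
hypothetical_exons_4_to_6 = [160, 125, 220]
print(deletion_causes_frameshift(hypothetical_exons_4_to_6))
```

With the real coding lengths of exons 4-6 taken from the transcript annotation, the same check would be expected to agree with the frameshift stated in the note.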


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy, showing the wildtype allele (exons 1-7, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Krt32; homology arm; cKO region; loxP site.]
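The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model: Cre recombination between two loxP sites in the same orientation removes the flanked segment and leaves one loxP site behind. This is a sketch only. The function name and the bracketed exon labels are hypothetical stand-ins for real sequence, and the 34 bp site shown is the commonly cited canonical loxP sequence, not a sequence taken from this report.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele, loxp=LOXP):
    """Model Cre recombination between two directly repeated loxP sites.

    The segment between the sites and one loxP copy are removed;
    a single loxP site remains. Alleles with fewer than two sites
    are returned unchanged.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    return allele[:first + len(loxp)] + allele[second + len(loxp):]


# Toy targeted allele: bracketed labels stand in for real exon sequence.
targeted = "[exon3]" + LOXP + "[exon4][exon5][exon6]" + LOXP + "[exon7]"
print(cre_excise(targeted))
```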


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself; forward and reverse-complement matches are marked.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
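A self-comparison of this kind can be sketched as follows. This is a minimal illustration that records exact matches of 10 bp words on the forward strand only; the tool used for the report is not stated, and it may score matches differently or include the reverse complement. The function name and the toy sequence are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq, window=10):
    """Return sorted (i, j) pairs, i < j, where the same word of length
    `window` starts at both positions of `seq` (forward strand only)."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Toy sequence: one 10 bp unit repeated in tandem between unique flanks.
toy = "TTTTT" + "GATTACAGGC" * 2 + "CCCCC"
print(self_dot_plot(toy, window=10))
```

Off-diagonal dots close to the main diagonal, such as the pair reported for the toy sequence, are the signature of a tandem repeat.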

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full length (8199 bp) | A (24.94%, 2045) | C (22.14%, 1815) | T (28.44%, 2332) | G (24.48%, 2007)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
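The overall GC content implied by the base counts in the summary, and a windowed profile of the kind plotted, can be sketched as below. The function names and the step size are assumptions for illustration; only the base counts and the 300 bp window come from the report.

```python
def gc_fraction_from_counts(a, c, g, t):
    """Overall GC fraction from per-base counts."""
    return (g + c) / (a + c + g + t)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window of length `window`, moving by `step` bases."""
    seq = seq.upper()
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the summary line of the report.
overall = gc_fraction_from_counts(a=2045, c=1815, g=2007, t=2332)
print(f"Overall GC content: {overall:.2%}")

# Toy sequence; the real input would be the 8199 bp sequence with window=300.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

The counts give an overall GC content of about 46.62%, which matches the sum of the C and G percentages in the summary (22.14% + 24.48%).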


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr11  -       100085337  100088336  3000
YourSeq  238    383     1980  3000   83.3%     chr11  -       100039994  100105348  65355
YourSeq  204    383     1972  3000   83.2%     chr11  -       100014106  100016010  1905
YourSeq  166    383     1989  3000   76.1%     chr11  -       100049438  100050345  908
YourSeq  142    383     598   3000   82.9%     chr11  -       100041144  100041359  216
YourSeq  129    1627    1994  3000   85.7%     chr11  -       100104039  100142463  38425
YourSeq  104    383     514   3000   89.4%     chr11  -       100029557  100029688  132
YourSeq  95     1440    1536  3000   99.0%     chr7   +       107809675  107809771  97
YourSeq  95     1440    1535  3000   100.0%    chr13  +       116084511  116084809  299
YourSeq  70     1181    1426  3000   80.0%     chr1   -       87713725   87713944   220
YourSeq  66     1128    1413  3000   76.2%     chr11  -       118391460  118391715  256
YourSeq  59     1140    1315  3000   78.0%     chr17  -       29803221   29803352   132
YourSeq  55     1624    1902  3000   74.4%     chr11  -       100267020  100267267  248
YourSeq  53     1156    1429  3000   90.8%     chr19  +       53435929   53436203   275
YourSeq  53     1127    1412  3000   75.4%     chr17  +       45255818   45256072   255
YourSeq  51     1490    1558  3000   81.5%     chr10  -       120996490  120996544  55
YourSeq  46     1161    1244  3000   75.0%     chr15  +       71997807   71997871   65
YourSeq  45     1129    1199  3000   82.6%     chr10  -       10090451   10090520   70
YourSeq  45     1140    1220  3000   75.0%     chr10  +       118652389  118652462  74
YourSeq  44     1283    1419  3000   92.4%     chr12  -       77188710   77188858   149

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr11  -       100080638  100083637  3000
YourSeq  75     750     870   3000   83.2%     chr13  -       49211068   49211172   105
YourSeq  69     747     861   3000   91.6%     chr14  +       32144802   32145036   235
YourSeq  68     650     972   3000   74.7%     chr10  +       42087770   42088002   233
YourSeq  65     759     868   3000   97.3%     chr1   +       43219957   43220086   130
YourSeq  64     749     848   3000   94.5%     chr9   -       78639716   78640087   372
YourSeq  60     666     855   3000   75.4%     chr12  +       9372130    9372213    84
YourSeq  53     747     836   3000   93.7%     chr15  -       77353629   77353812   184
YourSeq  51     2914    2992  3000   82.2%     chr3   +       96085446   96085523   78
YourSeq  50     753     845   3000   81.5%     chr1   +       187971197  187971267  71
YourSeq  49     759     865   3000   96.4%     chr14  -       121789023  121789195  173
YourSeq  49     2910    2992  3000   96.3%     chrX   +       151157655  151157739  85
YourSeq  49     2892    2994  3000   69.1%     chr8   +       111183574  111183662  89
YourSeq  46     760     848   3000   94.4%     chr16  -       30862923   30863120   198
YourSeq  45     2895    2992  3000   94.4%     chr12  -       81515951   81516049   99
YourSeq  45     2906    2992  3000   86.0%     chr14  +       64844452   64844537   86
YourSeq  44     818     867   3000   96.0%     chr13  -       71143663   71143737   75
YourSeq  44     2910    2996  3000   78.9%     chr1   -       178119631  178119708  78
YourSeq  44     2910    2988  3000   74.7%     chr16  +       4890177    4890248   72
YourSeq  43     2869    2987  3000   75.0%     chr12  -       13637489   13637595   107

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
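As a consistency check, the top hit of each BLAT search gives the genomic position of the corresponding 3000 bp flank on chr11, and the distance between the two flanks can be compared with the stated size of the effective cKO region. The sketch below assumes 1-based inclusive coordinates and assumes that the interval between the two flanks corresponds to the effective cKO region; the variable names are illustrative.

```python
# Top BLAT hits (100% identity) from the two result tables, chr11, minus strand.
UPSTREAM_FLANK = (100085337, 100088336)    # 3000 bp upstream of exon 4
DOWNSTREAM_FLANK = (100080638, 100083637)  # 3000 bp downstream of exon 6


def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


# The gene is on the reverse strand, so the upstream flank has the higher coordinates.
gap = UPSTREAM_FLANK[0] - DOWNSTREAM_FLANK[1] - 1

print("upstream flank span:", span(*UPSTREAM_FLANK))
print("downstream flank span:", span(*DOWNSTREAM_FLANK))
print("bases between the flanks:", gap)
```

Under these assumptions the interval between the flanks is 1699 bp, which agrees with the ~1699 bp effective cKO region given in the strategy notes.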


Gene information: Krt32 keratin 32 [Mus musculus (house mouse)] Gene ID: 16670, updated on 26-Jun-2020

Gene summary

Official Symbol: Krt32 (provided by MGI)
Official Full Name: keratin 32 (provided by MGI)
Primary source: MGI:MGI:1309995
See related: Ensembl:ENSMUSG00000046095
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ha3; K32; Hka2; Krt1-2; Krtha2
Expression: Biased expression in lung adult (RPKM 3.1), stomach adult (RPKM 1.0) and 2 other tissues

Genomic context

Location: 11 D; 11 63.4 cM
Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108.20200622        current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (100080848..100088269, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (99942377..99949429, complement)

[Figure: genomic context of Krt32 on Chromosome 11 - NC_000077.6.]


Transcript information: This gene has 2 transcripts

Gene: Krt32 ENSMUSG00000046095

Description: keratin 32 [Source:MGI Symbol;Acc:MGI:1309995]
Gene Synonyms: Krt1-2, mHa2
Location: Chromosome 11: 100,080,848-100,088,226 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 2 transcripts (splice variants), 99 orthologues, 68 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype               CCDS       UniProt  Flags
Krt32-201  ENSMUST00000107419.1  1664  453aa       ENSMUSP00000103042.1  Protein coding        CCDS56805  B1ATJ5   TSL:1, GENCODE basic, APPRIS P1
Krt32-202  ENSMUST00000139873.1  998   No protein  -                     Processed transcript  -          -        TSL:1

[Figure: Ensembl region view, 27.38 kb, approximately 100.075 Mb to 100.095 Mb. Forward strand: Gm11571-201 (antisense). Contig: AL662808.12. Reverse strand: Krt32-201 (protein coding), Krt32-202 (processed transcript), Krt35-201 (protein coding), Krt35-202 (retained intron). Regulatory Build legend: open chromatin; promoter flank; transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000107419

[Figure: protein feature view of Krt32-201 (protein coding; reverse strand, 7.37 kb) for ENSMUSP00000103..., on a scale of 0 to 453 amino acids. Annotation tracks shown: Low complexity (Seg); Coiled-coils (Ncoils); Superfamily SSF64593; SMART, Pfam and PROSITE profiles (Intermediate filament, rod domain); Prints (Keratin, type I); PROSITE patterns (Intermediate filament protein, conserved site); PANTHER (Keratin, type I; PTHR23239:SF155); Gene3D (1.20.5.500, 1.20.5.170); Intermediate filament, rod domain, coil 1B; sequence variants (dbSNP and all other sources). Variant legend: synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
