https://www.alphaknockout.com

Mouse Krt31 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Krt31 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Krt31 gene (NCBI Reference Sequence: NM_010659; Ensembl: ENSMUSG00000048981) is located on mouse chromosome 11. Seven exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 7 (Transcript: ENSMUST00000007318). Exons 4~6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Krt31 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-351F1 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 4 starts from about 47.2% of the coding region.
The knockout of exons 4~6 will result in a frameshift of the gene.
The size of intron 3 for 5'-loxP site insertion: 938 bp, and the size of intron 6 for 3'-loxP site insertion: 608 bp.
The size of the effective cKO region: ~1333 bp.
The cKO region does not contain any other known gene.
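The 47.2% figure can be translated into an approximate coding position. The following is a minimal sketch, assuming the coding sequence length equals the 416-aa protein reported for ENSMUST00000007318 plus one stop codon. The variable and function names are illustrative, and exact exon boundaries should be taken from the Ensembl annotation rather than from this estimate.

```python
# Approximate position in the coding sequence (CDS) where exon 4 begins.
# Assumption: CDS length = 416 codons (protein length) + 1 stop codon.
PROTEIN_LENGTH_AA = 416
EXON4_START_FRACTION = 0.472  # "about 47.2% of the coding region"

cds_length_nt = (PROTEIN_LENGTH_AA + 1) * 3
exon4_start_nt = round(cds_length_nt * EXON4_START_FRACTION)
exon4_start_codon = exon4_start_nt // 3 + 1


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


print(cds_length_nt, exon4_start_nt, exon4_start_codon)
```

The helper `causes_frameshift` only states the general rule behind the frameshift note; the coding length of exons 4~6 is not given in this report, so it is not applied to a specific number here.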


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels: wildtype allele (exons 1-7, with the 5' and 3' gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Krt31; cKO region; loxP site.]
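The step from the targeted allele to the constitutive KO allele is Cre-mediated excision of the sequence between two loxP sites in the same orientation, which leaves a single loxP site behind. The toy model below sketches that outcome on plain strings. It assumes the commonly cited 34-bp loxP sequence, and the flanking and internal sequences are made-up placeholders, not Krt31 sequence.

```python
# Toy model of Cre-mediated excision between two loxP sites in the same orientation.
# LOXP is the commonly cited 34-bp site (13-bp arm, 8-bp spacer, 13-bp arm).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Delete the sequence between the first two direct-repeat sites, keeping one site."""
    first = allele.find(site)
    if first == -1:
        return allele  # no site present: nothing to recombine
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele  # only one site present: nothing to recombine
    return allele[:first] + allele[second:]


# Hypothetical floxed allele: placeholder flanks around a placeholder "cKO region".
floxed = "AAAA" + LOXP + "CCCCGGGG" + LOXP + "TTTT"
print(cre_excise(floxed))
```

This ignores site orientation and everything else about the real reaction; it only illustrates why the recombined allele loses the floxed exons but retains one loxP site.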


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
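The idea behind the self-alignment can be sketched with exact word matches. The report does not state which tool or scoring was used, so the function below is only an assumption-laden illustration: it records every pair of positions that share the same 10-bp word, and a tandem repeat shows up as many pairs with the same offset. The test sequence is a hypothetical toy, not the Krt31 region.

```python
# Minimal self-comparison with a fixed word size, in the spirit of a dot plot.
# Only exact k-mer matches on the forward strand are considered.
from collections import defaultdict


def self_matches(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, where the same `window`-mer starts at i and j."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    pairs = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                pairs.append((starts[a], starts[b]))
    return sorted(pairs)


# Hypothetical toy sequence: a 12-bp unit repeated three times between unique flanks.
toy = "GATTACAGGCTA" + "ACGTTGCAAGTC" * 3 + "TTGACCATGGAC"
hits = self_matches(toy, window=10)
offsets = sorted({j - i for i, j in hits})
print(offsets)
```

In a plotted matrix these shared offsets appear as short diagonals parallel to the main diagonal, which is the pattern the note interprets as tandem repeats.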

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7833bp) | A(24.77% 1940) | C(23.4% 1833) | T(29.59% 2318) | G(22.24% 1742)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
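The overall GC content follows directly from the base counts in the summary line. The sketch below computes it and adds a generic sliding-window helper using the 300 bp window size stated above. The step size and the function name `windowed_gc` are assumptions, since the report does not describe how the plot was generated.

```python
# Overall GC content from the base counts reported in the summary line.
counts = {"A": 1940, "C": 1833, "T": 2318, "G": 1742}
total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1):
    """Yield (start, GC fraction) for each full window along `seq`."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, (chunk.count("G") + chunk.count("C")) / window


print(total, round(gc_percent, 2))
```

The counts sum to the stated full length of 7833 bp, and the resulting overall GC content is about 45.6%.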


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr11  -       100048752  100051751  3000
YourSeq  611    290     2287  3000   94.0%     chr11  -       100014116  100270638  256523
YourSeq  512    1256    2314  3000   93.6%     chr11  -       100026435  100029839  3405
YourSeq  432    1285    2305  3000   96.0%     chr11  -       100039994  100041481  1488
YourSeq  197    1407    2314  3000   89.9%     chr11  -       100086348  100087954  1607
YourSeq  167    1428    2317  3000   88.9%     chr11  -       100104041  100135574  31534
YourSeq  131    1432    1894  3000   89.3%     chr11  -       100094846  100095901  1056
YourSeq  125    1432    2254  3000   79.2%     chr11  -       99387556   99542897   155342
YourSeq  101    2695    2845  3000   91.8%     chr12  -       43964523   43964669   147
YourSeq  101    2695    2836  3000   96.4%     chr13  +       108431365  108431637  273
YourSeq  96     2705    2857  3000   89.2%     chr12  -       46130688   46130820   133
YourSeq  77     2702    2792  3000   94.4%     chr2   -       18646884   18646978   95
YourSeq  77     2703    2792  3000   89.3%     chr1   +       67879754   67879838   85
YourSeq  76     2705    2800  3000   87.1%     chr12  +       11515879   11515968   90
YourSeq  73     2696    2785  3000   91.1%     chr11  +       63251233   63251322   90
YourSeq  71     2694    2812  3000   96.2%     chr3   +       34193712   34193887   176
YourSeq  59     2753    2840  3000   81.9%     chr8   -       94131361   94131432   72
YourSeq  58     2196    2285  3000   82.3%     chr11  -       100142079  100142168  90
YourSeq  58     2694    2762  3000   94.0%     chr12  +       20454415   20454503   89
YourSeq  56     2652    2729  3000   98.4%     chr15  -       23372414   23372717   304

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr11  -       100044419  100047418  3000
YourSeq  75     1090    2450  3000   87.2%     chr1   -       41961140   42038600   77461
YourSeq  72     2281    2507  3000   79.4%     chr6   +       115710121  115710317  197
YourSeq  67     2397    2520  3000   82.6%     chr11  -       117023612  117023736  125
YourSeq  60     2402    2489  3000   84.1%     chr14  +       31450281   31450368   88
YourSeq  59     2412    2525  3000   85.6%     chr11  -       116823087  116823202  116
YourSeq  55     2410    2492  3000   83.2%     chr14  +       118329975  118330057  83
YourSeq  54     2394    2493  3000   80.8%     chr12  +       3712580    3712677    98
YourSeq  53     2413    2492  3000   87.4%     chr11  +       62115088   62115171   84
YourSeq  51     2307    2481  3000   78.7%     chr14  +       99845439   99845691   253
YourSeq  50     2403    2488  3000   79.1%     chrX   +       134319001  134319086  86
YourSeq  50     2410    2471  3000   90.4%     chr10  +       111903220  111903281  62
YourSeq  49     2349    2487  3000   77.6%     chr9   +       112517005  112517143  139
YourSeq  49     2478    2816  3000   87.7%     chr1   +       84906494   84911779   5286
YourSeq  47     1080    1131  3000   98.1%     chr11  -       113191251  113191309  59
YourSeq  46     2410    2491  3000   85.8%     chr1   -       72124964   72125043   80
YourSeq  46     2390    2454  3000   86.0%     chr5   +       143121641  143121706  66
YourSeq  45     2401    2491  3000   74.8%     chr12  -       84357206   84357296   91
YourSeq  44     2769    2835  3000   92.5%     chr7   -       45130440   45130508   69
YourSeq  44     2406    2495  3000   74.5%     chr6   -       112977208  112977297  90

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
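The top hit in each BLAT table is the query matching its own genomic location, so the two coordinate pairs can be used as a cross-check on the stated cKO region size. The sketch below assumes that the coordinates are closed, 1-based intervals and that both 3000 bp flanks directly abut the cKO region; neither assumption is stated in the report.

```python
# Coordinates of the 100%-identity self hits from the two BLAT tables (chr11, minus strand).
upstream_flank = (100048752, 100051751)    # 3000 bp upstream of exon 4
downstream_flank = (100044419, 100047418)  # 3000 bp downstream of exon 6


def span(interval):
    """Length of a closed, 1-based interval."""
    start, end = interval
    return end - start + 1


# The gene is on the reverse strand, so the "upstream" flank has the higher coordinates.
cko_region = (downstream_flank[1] + 1, upstream_flank[0] - 1)
print(span(upstream_flank), span(downstream_flank), span(cko_region))
```

Under these assumptions the gap between the two flanks is 1333 bp, which agrees with the effective cKO region size given in the strategy notes.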


Gene and information: Krt31 keratin 31 [Mus musculus (house mouse)] Gene ID: 16660, updated on 12-Aug-2019

Gene summary

Official Symbol: Krt31 (provided by MGI)
Official Full Name: keratin 31 (provided by MGI)
Primary source: MGI:MGI:1309993
See related: Ensembl:ENSMUSG00000048981
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ha1; Kha1; Krt1-1; MKHA-1
Expression: Low expression observed in reference dataset

Genomic context

Location: 11 D; 11 63.39 cM
Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (100046646..100050551, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (99907960..99911865, complement)

Chromosome 11 - NC_000077.6


Transcript information: This gene has 1 transcript

Gene: Krt31 ENSMUSG00000048981

Description: keratin 31 [Source:MGI Symbol;Acc:MGI:1309993]
Gene Synonyms: Ha1, Kha1, Krt1-1, MKHA-1
Location: Chromosome 11: 100,046,646-100,050,551 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 1 transcript (splice variant), 177 orthologues, 68 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Krt31-201  ENSMUST00000007318.1  1581  416aa    ENSMUSP00000007318.1  Protein coding  CCDS25405  Q61765   TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl genome browser view of a 23.91 kb region of chromosome 11 (100.04 Mb to 100.06 Mb). Forward strand: Gm11571-201 (lncRNA). Contig: AL592545.10. Reverse strand: Krt34-201 (protein coding) and Krt31-201 (protein coding). Regulatory Build track legend: CTCF; open chromatin; promoter flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (RNA gene).]


Transcript: ENSMUST00000007318

[Figure: Ensembl protein summary for Krt31-201 (protein coding, reverse strand, 3.91 kb), drawn on a scale of 0 to 416 amino acids. Tracks shown for ENSMUSP00000007...: Low complexity (Seg); Coiled-coils (Ncoils); Superfamily (SSF64593, SSF46579); SMART (rod domain); Prints (Keratin, type I); Pfam (Intermediate filament, rod domain); PROSITE profiles (Intermediate filament, rod domain); PROSITE patterns (Intermediate filament protein, conserved site); PANTHER (Keratin, type I; PTHR23239:SF224); Gene3D (1.20.5.500, 1.20.5.170; Intermediate filament, rod domain, coil 1B); sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
