https://www.alphaknockout.com

Mouse Krt26 Knockout Project (CRISPR/Cas9)

Objective: To create a Krt26 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Krt26 gene (NCBI Reference Sequence: NM_001033397; Ensembl: ENSMUSG00000075570) is located on mouse chromosome 11. Eight exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 8 (Transcript: ENSMUST00000100482). Exons 1~8 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 starts from about 0.07% of the coding region. Exons 1~8 cover 100.0% of the coding region. The size of the effective KO region is ~8266 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing exons 1 to 8 of mouse Krt26 with a gRNA region on each side of the knockout region. Legend: exon of mouse Krt26; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence aligned with itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
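The self-alignment described in the note above can be illustrated with a small sketch. It assumes exact matches between 15 bp words on the forward strand only; the tool actually used for this report is not stated, and real dot plot programs usually also tolerate mismatches and plot reverse-complement matches. The function names and the `max_offset` cutoff are hypothetical and chosen only for illustration.

```python
def dot_plot_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the words of length
    `window` starting at positions i and j are identical.
    Each pair is one off-diagonal dot in a self dot plot."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


def has_nearby_repeat(seq, window=15, max_offset=100):
    """Hypothetical heuristic: flag a sequence when two identical
    words lie within `max_offset` bp of each other."""
    return any(j - i <= max_offset for i, j in dot_plot_matches(seq, window))


# Toy example: a 15 bp unit repeated twice in tandem.
unit = "ACGTACGGTCAATGC"
print(dot_plot_matches(unit * 2))
print(has_nearby_repeat(unit * 2))
```

A gRNA site would then be chosen outside any interval covered by such matches, as the note states.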

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence aligned with itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(29.3% 586) | C(23.3% 466) | T(25.0% 500) | G(22.4% 448)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
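A windowed GC analysis of this kind can be sketched as follows. This is a minimal illustration, assuming GC content is computed as the fraction of G and C bases in each 300 bp window; the function names and the `step` parameter are hypothetical, and the report does not say which tool or step size produced its plot.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of every full window of length `window`,
    moving `step` bases at a time."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy example with a short sequence and a 4 bp window.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

Windows with an unusually high GC fraction would be the ones to avoid when placing PCR primers.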

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(28.6% 572) | C(22.05% 441) | T(26.1% 522) | G(23.25% 465)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
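The overall GC content of each 2000 bp flank follows directly from the base counts in the two summaries. The short check below uses only those counts; the variable and function names are illustrative.

```python
# Base counts copied from the two summary lines (2000 bp each).
upstream = {"A": 586, "C": 466, "T": 500, "G": 448}
downstream = {"A": 572, "C": 441, "T": 522, "G": 465}


def gc_percent(counts):
    """Overall GC content in percent from a base-count dictionary."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


print(f"upstream GC:   {gc_percent(upstream):.1f}%")    # 45.7%
print(f"downstream GC: {gc_percent(downstream):.1f}%")  # 45.3%
```

Both flanks come out close to 45% GC overall, in line with the notes that no significant high GC-content region was found.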


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  -       99337905   99339904   2000
YourSeq  176    500    1359  2000   91.6%     chr10  -       76819035   76944851   125817
YourSeq  165    899    1233  2000   94.6%     chr11  +       58180573   58181104   532
YourSeq  164    898    1233  2000   94.1%     chr2   +       25553631   25554063   433
YourSeq  162    902    1233  2000   96.1%     chr7   -       28968158   28968491   334
YourSeq  158    890    1233  2000   92.1%     chr13  +       56552352   56552797   446
YourSeq  156    890    1064  2000   95.4%     chr7   -       99498572   99498747   176
YourSeq  155    899    1073  2000   92.5%     chr8   +       107111610  107111781  172
YourSeq  155    898    1070  2000   94.8%     chr17  +       46682503   46682675   173
YourSeq  154    899    1064  2000   96.4%     chr18  -       56929068   56929233   166
YourSeq  154    901    1064  2000   97.0%     chr11  -       29672732   29672895   164
YourSeq  154    899    1064  2000   96.4%     chr6   +       71493309   71493474   166
YourSeq  154    899    1064  2000   96.4%     chr17  +       3089181    3089346    166
YourSeq  154    899    1072  2000   92.4%     chr15  +       9176481    9176651    171
YourSeq  153    884    1072  2000   89.7%     chr4   -       25263542   25263721   180
YourSeq  153    902    1072  2000   92.9%     chr3   -       153870768  153870935  168
YourSeq  153    899    1070  2000   94.8%     chr13  -       23987097   23987274   178
YourSeq  153    901    1064  2000   97.0%     chr1   -       136571384  136571548  165
YourSeq  153    900    1064  2000   96.4%     chr7   +       34290181   34290345   165
YourSeq  153    882    1063  2000   97.0%     chr5   +       137325072  137325254  183

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  -       99327637   99329636   2000
YourSeq  70     1350   1552  2000   78.6%     chr4   -       109619736  109619927  192
YourSeq  67     1449   1833  2000   81.4%     chr4   +       123640482  123640968  487
YourSeq  51     1534   1958  2000   65.3%     chr7   +       43900235   43900490   256
YourSeq  48     1372   1674  2000   88.8%     chr12  -       66140420   66140753   334
YourSeq  47     1508   1578  2000   76.4%     chr11  +       62075690   62075745   56
YourSeq  46     1506   1797  2000   91.1%     chr4   -       85284795   85285086   292
YourSeq  46     1523   1794  2000   89.7%     chr1   -       62107679   62107951   273
YourSeq  45     178    240   2000   83.4%     chr1   +       176379131  176379191  61
YourSeq  43     183    245   2000   88.0%     chr2   +       118843181  118843242  62
YourSeq  41     1649   1700  2000   87.8%     chr7   +       144383357  144383407  51
YourSeq  40     1508   1555  2000   91.7%     chr2   +       70801951   70801998   48
YourSeq  38     1455   1561  2000   89.8%     chr13  +       84040994   84041102   109
YourSeq  37     1516   1617  2000   89.4%     chr3   -       67673016   67673119   104
YourSeq  37     1522   1581  2000   95.0%     chr3   +       27871510   27871570   61
YourSeq  36     1816   1878  2000   82.5%     chr17  -       90080448   90080506   59
YourSeq  34     1648   1693  2000   97.4%     chr10  -       117502659  117502707  49
YourSeq  33     1539   1581  2000   88.4%     chr1   -       43320278   43320320   43
YourSeq  31     1503   1537  2000   94.3%     chr15  +       60756264   60756298   35
YourSeq  30     1648   1694  2000   79.6%     chr6   +       50786520   50786565   46

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
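For readers who want to screen BLAT results like these programmatically, the sketch below parses rows in the column order shown (QUERY, SCORE, START, END, QSIZE, IDENTITY, CHROM, STRAND, START, END, SPAN) and separates the full-length self match from the remaining hits. The `min_score` cutoff and all names are hypothetical; the report does not state what threshold it used to judge a hit significant.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row):
    """Parse one whitespace-separated BLAT result row."""
    f = row.split()
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_match(hit):
    """True when the hit covers the whole query at 100% identity."""
    return hit.q_start == 1 and hit.q_end == hit.q_size and hit.identity == 100.0


def secondary_hits(hits, min_score):
    """Hits other than the self match with score >= min_score
    (min_score is an illustrative cutoff, not taken from the report)."""
    return [h for h in hits if not is_self_match(h) and h.score >= min_score]


# Three rows taken from the downstream results.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr11 - 99327637 99329636 2000",
    "YourSeq 70 1350 1552 2000 78.6% chr4 - 109619736 109619927 192",
    "YourSeq 30 1648 1694 2000 79.6% chr6 + 50786520 50786565 46",
]
hits = [parse_blat_row(r) for r in rows]
for h in secondary_hits(hits, min_score=50):
    print(h.chrom, h.strand, h.t_start, h.t_end, h.score)
```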


Gene and information: Krt26 keratin 26 [Mus musculus (house mouse)] Gene ID: 320864, updated on 12-Aug-2019

Gene summary

Official Symbol: Krt26 (provided by MGI)
Official Full Name: keratin 26 (provided by MGI)
Primary source: MGI:MGI:2444913
See related: Ensembl:ENSMUSG00000075570
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Krt25b; 4732407F15Rik
Expression: Low expression observed in reference dataset
Orthologs: human

Genomic context

Location: 11; 11 D
Exon count: 8

Annotation release  Status             Assembly                    Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (99328484..99337965, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (99189798..99199279, complement)

[Figure: genomic context of Krt26 on chromosome 11 - NC_000077.6]


Transcript information: This gene has 2 transcripts

Gene: Krt26 ENSMUSG00000075570

Description: keratin 26 [Source:MGI Symbol;Acc:MGI:2444913]
Gene Synonyms: 4732407F15Rik
Location: Chromosome 11: 99,328,550-99,337,966 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 2 transcripts (splice variants), 91 orthologues, 68 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Krt26-201  ENSMUST00000100482.2  2471  462aa       ENSMUSP00000098051.2  Protein coding  CCDS25378  Q3TRJ4   TSL:1, GENCODE basic, APPRIS P1
Krt26-202  ENSMUST00000148770.1  3667  No protein  -                     lncRNA          -          -        TSL:2

[Figure: Ensembl genomic view of chromosome 11, 99.32 Mb to 99.34 Mb (29.42 kb), contig AL590991.14. Genes (Comprehensive set) shown on the reverse strand: Krt25-201 (protein coding), Krt26-202 (lncRNA), Krt26-201 (protein coding), Krt27-201 (protein coding). Regulatory Build legend: CTCF, Promoter Flank. Gene legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (RNA gene).]


Transcript: ENSMUST00000100482

[Figure: Ensembl protein view of Krt26-201 (protein coding; reverse strand, 9.35 kb), protein ENSMUSP00000098..., scale bar 0 to 462. Annotation tracks as labelled: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily SSF64593; SMART (rod domain); Prints (Keratin, type I); Pfam (Intermediate filament, rod domain); PROSITE profiles (Intermediate filament, rod domain); PANTHER PTHR23239:SF162 (Keratin, type I); Gene3D 1.20.5.500, 1.20.5.170 (Intermediate filament, rod domain, coil 1B); Sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
