https://www.alphaknockout.com

Mouse Krt33b Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Krt33b conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Krt33b gene (NCBI Reference Sequence: NM_013570; Ensembl: ENSMUSG00000057723) is located on mouse chromosome 11. Seven exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 7 (Transcript: ENSMUST00000073890). Exon 3 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Krt33b gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-351F1 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 35.64% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2, for 5'-loxP site insertion, is 879 bp, and the size of intron 3, for 3'-loxP site insertion, is 795 bp. The size of the effective cKO region is ~657 bp. The cKO region does not contain any other known gene.
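The frameshift reasoning in the note can be illustrated with a minimal Python sketch. It assumes the 404 aa protein length listed for transcript Krt33b-201 later in this report, and it uses a hypothetical exon 3 length (157 bp), because the report does not state the actual exon length. The function names are illustrative only.

```python
# Sketch: locate exon 3 in the coding sequence and test a deletion for frameshift.
PROTEIN_LEN_AA = 404               # from the transcript table (Krt33b-201)
CDS_LEN_BP = PROTEIN_LEN_AA * 3    # coding length, stop codon excluded


def cds_offset(fraction: float, cds_len_bp: int = CDS_LEN_BP) -> int:
    """Approximate coding-sequence position (bp) at a fraction of the coding region."""
    return round(fraction * cds_len_bp)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# 35.64% of the coding region, as stated in the note.
exon3_start = cds_offset(0.3564)

# ASSUMPTION: 157 bp is a placeholder exon length for illustration only.
HYPOTHETICAL_EXON3_LEN_BP = 157

print(exon3_start, causes_frameshift(HYPOTHETICAL_EXON3_LEN_BP))
```

Any exon length that is not divisible by 3 gives the same conclusion; the real exon 3 length should be taken from the Ensembl transcript record.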


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels: wildtype allele (exons 1-7, with the 5' and 3' gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Krt33b, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so this targeting vector may be difficult to construct.
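A self-alignment dot plot of this kind can be sketched in a few lines of Python. The sketch below uses exact word matching with the 10 bp window stated above; whether the tool used for this report matches exactly or with a score threshold is not stated, so exact matching is an assumption. The toy sequence is invented for illustration and is not the Krt33b sequence.

```python
# Sketch: exact-match dot plot of a sequence against itself (window = 10 bp).

def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def dot_matches(seq_x: str, seq_y: str, window: int = 10):
    """Return (i, j) pairs where the window-length words at seq_x[i:] and seq_y[j:] are identical."""
    index = {}
    for j in range(len(seq_y) - window + 1):
        index.setdefault(seq_y[j:j + window], []).append(j)
    hits = []
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], []):
            hits.append((i, j))
    return hits


def off_diagonal(hits):
    """Drop the trivial main diagonal (each position matching itself)."""
    return [(i, j) for i, j in hits if i != j]


# ASSUMPTION: invented toy sequence with a 10 bp unit repeated three times.
toy = "ACGTTGCAGT" * 3 + "GGATCCATAG"

forward_repeats = off_diagonal(dot_matches(toy, toy))
reverse_matches = dot_matches(toy, revcomp(toy))
print(len(forward_repeats) > 0)
```

Off-diagonal forward hits that line up parallel to the main diagonal are the signature of tandem repeats; hits against the reverse complement indicate inverted repeats.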

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length: 7157 bp | A: 28.13% (2013) | C: 22.79% (1631) | T: 25.23% (1806) | G: 23.85% (1707)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
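The GC figures can be checked with a short Python sketch. The base counts are taken from the summary line above; the sliding-window function mirrors the 300 bp window of the plot, although the step size used by the original tool is not stated and the default of 1 here is an assumption.

```python
# Sketch: overall GC content from the reported base counts, plus a sliding-window helper.

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """GC fraction of each full window along the sequence."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts from the summary line of this report.
counts = {"A": 2013, "C": 1631, "T": 1806, "G": 1707}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total

print(total, f"{overall_gc:.2%}")   # 7157 46.64%
```

The counts sum to the stated full length of 7157 bp, and the overall GC content of about 46.64% is consistent with the note that no high-GC region is present.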


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  -       100026844  100029843  3000
YourSeq  442    5      2373  3000   93.5%     chr11  -       100014749  100016161  1413
YourSeq  419    5      2374  3000   96.3%     chr11  -       100040958  100050496  9539
YourSeq  328    34     371   3000   98.6%     chr11  -       100041144  100041481  338
YourSeq  191    156    2373  3000   84.8%     chr11  -       100049863  100087954  38092
YourSeq  180    820    1039  3000   92.7%     chr14  +       4646647    4998634    351988
YourSeq  178    795    1037  3000   95.7%     chr1   +       190152000  190152706  707
YourSeq  170    804    1037  3000   94.4%     chr12  +       9443651    9444124    474
YourSeq  165    802    1037  3000   91.7%     chr14  +       87892567   87892800   234
YourSeq  158    804    1037  3000   93.6%     chr11  +       105522548  105522785  238
YourSeq  153    802    1036  3000   89.6%     chr18  +       67972484   67972684   201
YourSeq  152    804    1037  3000   88.0%     chr12  -       12370593   12370802   210
YourSeq  149    832    1002  3000   97.7%     chr1   -       171024835  171025205  371
YourSeq  141    828    1032  3000   92.1%     chr10  +       57300586   57300965   380
YourSeq  138    808    1007  3000   93.9%     chr13  -       41120338   41120708   371
YourSeq  137    802    1004  3000   92.1%     chr1   -       132343885  132344139  255
YourSeq  136    840    1039  3000   92.1%     chr14  +       3046009    3046334    326
YourSeq  135    839    1034  3000   94.2%     chr14  +       87892460   87892695   236
YourSeq  126    802    1009  3000   86.6%     chr12  -       12370585   12370756   172
YourSeq  126    830    984   3000   94.7%     chr1   -       171024805  171025076  272

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  -       100023187  100026186  3000
YourSeq  778    544    2169  3000   94.4%     chr11  -       100047004  100048503  1500
YourSeq  762    536    2145  3000   93.8%     chr11  -       100037768  100039090  1323
YourSeq  716    530    1381  3000   94.5%     chr11  -       100011886  100012731  846
YourSeq  284    800    1381  3000   91.3%     chr11  -       100092939  100103298  10360
YourSeq  233    799    1371  3000   88.0%     chr11  -       100083882  100093797  9916
YourSeq  206    799    1371  3000   79.3%     chr11  -       100102797  100204552  101756
YourSeq  116    1209   1367  3000   86.8%     chr11  -       99538545   99538707   163
YourSeq  81     645    845   3000   77.4%     chr11  -       100084799  100084987  189
YourSeq  42     999    1051  3000   97.9%     chr16  -       26377238   26377294   57
YourSeq  35     998    1051  3000   74.4%     chr15  +       19480474   19480514   41
YourSeq  28     999    1026  3000   100.0%    chr1   +       178603940  178603967  28
YourSeq  25     1021   1046  3000   100.0%    chr16  -       42273921   42273951   31
YourSeq  24     2917   2945  3000   88.5%     chr13  -       96307390   96307417   28
YourSeq  23     2917   2944  3000   88.0%     chr12  +       108722231  108722257  27

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
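As a consistency check, the interval between the two flanking queries can be computed from the top (100% identity) hit of each BLAT table. This is a minimal sketch that assumes those two hits are the queries mapping back to their own locations and that the table coordinates are 1-based and inclusive, which agrees with the reported SPAN of 3000 for each.

```python
# Sketch: size of the interval between the upstream and downstream BLAT self-hits.
# Coordinates are the 100.0% identity rows of the two tables (chr11, minus strand).
UP_FLANK = (100026844, 100029843)    # "up" table, first row
DOWN_FLANK = (100023187, 100026186)  # "down" table, first row


def span(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def gap_between(lower: tuple, upper: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


# On the minus strand the upstream flank has the higher genomic coordinates.
print(span(*UP_FLANK), span(*DOWN_FLANK), gap_between(DOWN_FLANK, UP_FLANK))
```

Under these assumptions the gap is 657 bp, which appears to match the effective cKO region size of ~657 bp given in the strategy notes.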


Gene information: Krt33b, keratin 33B [Mus musculus (house mouse)]
Gene ID: 16671, updated on 13-Aug-2019

Gene summary

Official Symbol: Krt33b (provided by MGI)
Official Full Name: keratin 33B (provided by MGI)
Primary source: MGI:MGI:1309991
See related: Ensembl:ENSMUSG00000057723
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ha3; Ha4; mHa3; Krt1-3
Expression: Low expression observed in reference dataset

Genomic context

Location: 11 D; 11 63.37 cM
Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (100023634..100029868, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (99884948..99891182, complement)

[Figure: genomic context of Krt33b on Chromosome 11 - NC_000077.6.]


Transcript information: This gene has 1 transcript

Gene: Krt33b ENSMUSG00000057723

Description: keratin 33B [Source:MGI Symbol;Acc:MGI:1309991]
Gene Synonyms: Ha3, Ha4, Krt1-3, mHa3
Location: Chromosome 11: 100,023,634-100,029,868 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 1 transcript (splice variant), 177 orthologues, 68 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Krt33b-201  ENSMUST00000073890.3  1581  404aa    ENSMUSP00000073552.3  Protein coding  CCDS48928  Q61897   TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl region view, 26.23 kb, covering 100.015-100.035 Mb on chromosome 11 (contig AL592545.10). Genes shown on the reverse strand: Krt33a-201 (protein coding), Krt33a-202 (lncRNA), Krt33b-201 (protein coding), Krt34-201 (protein coding). Regulatory Build legend: CTCF; transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (RNA gene).]


Transcript: ENSMUST00000073890

[Figure: Ensembl protein view of Krt33b-201 (protein coding; reverse strand, 6.24 kb; protein ENSMUSP00000073..., scale 0-404 aa). Tracks and annotations shown: low complexity (Seg); coiled-coils (Ncoils); Superfamily (SSF64593, SSF46579); SMART (rod domain); Prints (keratin, type I); Pfam (intermediate filament, rod domain); PROSITE profiles (intermediate filament, rod domain); PROSITE patterns (intermediate filament protein, conserved site); PANTHER (PTHR23239:SF224; keratin, type I); Gene3D (intermediate filament, rod domain, coil 1B; 1.20.5.170; 1.20.5.500); sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
