https://www.alphaknockout.com

Mouse Krt72 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Krt72 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Krt72 gene (NCBI Reference Sequence: NM_213728; Ensembl: ENSMUSG00000056605) is located on mouse chromosome 15. Nine exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 9 (Transcript: ENSMUST00000071104). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Krt72 gene. To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP23-437J8 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 29.1% of the coding region. Knocking out exon 2 will cause a frameshift in the gene. The size of intron 1 for 5'-loxP site insertion is 981 bp, and the size of intron 2 for 3'-loxP site insertion is 2662 bp. The size of the effective cKO region is ~715 bp. The cKO region does not contain any other known gene.
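The position and frameshift statements in the note can be checked with simple arithmetic. The sketch below is illustrative only: it takes the 520 aa protein length and the 29.1% figure from this report, and it assumes that the coding region consists of those 520 codons plus one stop codon. The exon lengths passed to causes_frameshift are hypothetical, since the report does not state the coding length of exon 2.

```python
# Values taken from the report.
PROTEIN_LENGTH_AA = 520        # Krt72-201 translation length
EXON2_START_FRACTION = 0.291   # "about 29.1% of the coding region"

# Assumption: coding region = 520 codons plus one stop codon.
CDS_LENGTH_NT = (PROTEIN_LENGTH_AA + 1) * 3


def approx_cds_position(fraction: float, cds_length: int = CDS_LENGTH_NT) -> int:
    """Approximate 1-based CDS nucleotide position for a fractional offset."""
    return round(fraction * cds_length)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


exon2_start_nt = approx_cds_position(EXON2_START_FRACTION)
exon2_start_codon = (exon2_start_nt - 1) // 3 + 1

if __name__ == "__main__":
    print(f"CDS length (assumed): {CDS_LENGTH_NT} nt")
    print(f"Exon 2 starts near CDS nt {exon2_start_nt} (codon {exon2_start_codon})")
    # Hypothetical exon lengths, for illustration only.
    for length in (200, 201):
        print(length, "nt deleted -> frameshift:", causes_frameshift(length))
```

Under these assumptions, exon 2 begins roughly 455 nt into the coding sequence, so a frameshift there would disrupt about 70% of the downstream coding region.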


Overview of the Targeting Strategy

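[Figure: schematic of the targeting strategy. Panels: wildtype allele (exons 1, 2 and 9 labelled, with the 5' and 3' gRNA regions flanking exon 2); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Krt72; cKO region; loxP site.]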


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself. Legend: Forward, Reverse Complement.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
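A self dot plot of this kind can be approximated by indexing every 10 bp window of a sequence and reporting positions where the same window occurs more than once. The following is a minimal sketch, not the tool used to produce the plot above. It covers the forward strand only, and the function names and toy sequence are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 10) -> list[tuple[int, int]]:
    """Return sorted (i, j) pairs, i < j, where seq[i:i+window] == seq[j:j+window]."""
    positions = defaultdict(list)
    seq = seq.upper()
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def repeat_offsets(dots: list[tuple[int, int]]) -> dict[int, int]:
    """Count off-diagonal dots per offset (j - i); a dominant offset suggests a repeat unit."""
    counts = defaultdict(int)
    for i, j in dots:
        counts[j - i] += 1
    return dict(counts)


if __name__ == "__main__":
    toy = "GATTACAGATTACA"  # hypothetical sequence: one 7 bp unit repeated twice
    dots = self_dot_plot(toy, window=7)
    print(dots)
    print(repeat_offsets(dots))
```

Many dots sharing one small offset form a line parallel to the main diagonal, which is the signature of a tandem repeat in a plot like the one described in the note.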

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length: 7215 bp | A: 27.69% (1998) | C: 22.20% (1602) | T: 24.46% (1765) | G: 25.64% (1850)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
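The overall GC content follows directly from the base counts in the summary, and a windowed profile can be computed by sliding a 300 bp window along the sequence. This is a sketch under stated assumptions: the base counts come from this report, while the function names, the step parameter and the toy sequence are illustrative and not taken from the original analysis.

```python
# Base counts from the report summary (full length 7215 bp).
BASE_COUNTS = {"A": 1998, "C": 1602, "T": 1765, "G": 1850}


def gc_fraction_from_counts(counts: dict[str, int]) -> float:
    """Overall GC fraction, (G + C) / total bases."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC fraction of each window of the given size, moving by `step` bases."""
    seq = seq.upper()
    profile = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        profile.append((chunk.count("G") + chunk.count("C")) / window)
    return profile


if __name__ == "__main__":
    print(f"Total bases: {sum(BASE_COUNTS.values())}")
    print(f"Overall GC: {gc_fraction_from_counts(BASE_COUNTS) * 100:.2f}%")
    # Toy sequence, for illustration only.
    print(windowed_gc("GGCCAATT", window=4, step=2))
```

The counts sum to 7215 bp and give an overall GC content of about 47.84%, so the difficulty flagged in the note comes from local peaks in the windowed profile rather than from the average.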


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr15  -       101785275  101788274  3000
YourSeq  214    19     328   3000   97.8%     chr12  -       45894657   45895096   440
YourSeq  204    130    337   3000   99.1%     chr1   +       105945801  105946008  208
YourSeq  202    133    349   3000   96.8%     chr5   -       53973408   53973657   250
YourSeq  201    133    418   3000   97.2%     chr1   +       26946258   26946612   355
YourSeq  199    133    336   3000   99.1%     chr14  -       104019284  104019502  219
YourSeq  198    133    341   3000   98.6%     chr8   -       44232229   44232527   299
YourSeq  198    133    336   3000   98.6%     chr11  +       110342493  110342696  204
YourSeq  198    133    332   3000   99.5%     chr1   +       102460093  102460292  200
YourSeq  196    133    332   3000   99.0%     chr6   -       125838878  125839077  200
YourSeq  196    133    333   3000   99.1%     chr4   -       57872940   57873162   223
YourSeq  196    133    333   3000   99.1%     chr3   -       61555097   61555302   206
YourSeq  196    133    333   3000   99.1%     chr16  -       86297550   86297752   203
YourSeq  196    133    333   3000   99.1%     chr13  -       20668316   20668517   202
YourSeq  196    133    338   3000   97.6%     chr12  -       38509122   38509327   206
YourSeq  196    133    338   3000   96.6%     chr4   +       101061053  101061255  203
YourSeq  196    133    334   3000   97.6%     chr2   +       87185775   87185975   201
YourSeq  196    133    333   3000   99.1%     chr14  +       53900329   53900541   213
YourSeq  196    133    332   3000   99.0%     chr12  +       101619553  101619752  200
YourSeq  196    133    337   3000   96.1%     chr10  +       17143278   17143479   202

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM           STRAND  START      END        SPAN
------------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr15           -       101781560  101784559  3000
YourSeq  63     897    999   3000   86.3%     chr2            -       174541875  174541997  123
YourSeq  62     892    973   3000   84.5%     chr15           +       38572235   38572313   79
YourSeq  60     904    997   3000   83.9%     chr4            -       126902155  126902238  84
YourSeq  57     892    997   3000   93.9%     chr5            +       36737081   36737191   111
YourSeq  55     892    973   3000   95.1%     chr11           -       97382314   97382412   99
YourSeq  55     894    1034  3000   95.1%     chr10           -       71658268   71658671   404
YourSeq  53     888    1001  3000   95.0%     chr4            +       40891884   40892257   374
YourSeq  52     894    997   3000   92.0%     chr15           -       99282698   99282823   126
YourSeq  51     894    975   3000   94.8%     chr7            -       126651329  126651616  288
YourSeq  51     904    993   3000   90.5%     chr1            -       59047376   59047589   214
YourSeq  50     892    975   3000   81.5%     chr10           -       85156857   85156928   72
YourSeq  49     873    961   3000   83.6%     chr4            +       115838651  115838758  108
YourSeq  48     915    997   3000   87.5%     chr7            -       78780195   78780301   107
YourSeq  48     2899   2958  3000   88.0%     chr15           -       101558347  101558405  59
YourSeq  48     894    973   3000   78.6%     chr1            -       108927582  108927649  68
YourSeq  48     890    997   3000   86.3%     chr1            -       4848440    4848545    106
YourSeq  46     895    973   3000   78.2%     chrUn_JH584304  -       12809      12876      68
YourSeq  46     894    973   3000   77.0%     chr6            -       120352230  120352291  62
YourSeq  46     874    933   3000   98.0%     chr16           -       53953901   53954442   542

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
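For readers who want to work with the BLAT rows programmatically, the sketch below parses one result row into a small record and derives the fraction of the 3000 bp query that each hit covers. It is an illustration only: the BlatHit class and parse_blat_row function are hypothetical names, the parser simply reads the last ten whitespace-separated fields of a row, and no significance threshold is implied.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query sequence covered by this hit."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_blat_row(row: str) -> BlatHit:
    """Parse one result row; any leading label fields (e.g. 'YourSeq') are ignored."""
    f = row.split()[-10:]
    return BlatHit(
        score=int(f[0]), q_start=int(f[1]), q_end=int(f[2]), q_size=int(f[3]),
        identity=float(f[4].rstrip("%")), chrom=f[5], strand=f[6],
        t_start=int(f[7]), t_end=int(f[8]), span=int(f[9]),
    )


if __name__ == "__main__":
    rows = [
        "YourSeq 3000 1 3000 3000 100.0% chr15 - 101785275 101788274 3000",
        "YourSeq 214 19 328 3000 97.8% chr12 - 45894657 45895096 440",
    ]
    for hit in map(parse_blat_row, rows):
        print(hit.chrom, hit.score, f"{hit.query_coverage:.1%}")
```

In both tables the top row is the query matching its own locus on chr15 over the full 3000 bp, while the remaining hits each cover only a short stretch of the query.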


Gene and information: Krt72 keratin 72 [Mus musculus (house mouse)] Gene ID: 105866, updated on 13-Aug-2019

Gene summary

Official Symbol: Krt72 (provided by MGI)
Official Full Name: keratin 72 (provided by MGI)
Primary source: MGI:MGI:2146034
See related: Ensembl:ENSMUSG00000056605
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: K72; CK-72; K6irs2; Krt2-35; AI507495; Krt72-ps
Expression: Low expression observed in reference dataset
Orthologs: human, all

Genomic context

Location: 15; 15 F2

Exon count: 9

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (101776172..101786520, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (101606991..101616889, complement)

[Figure: Chromosome 15 - NC_000081.6]


Transcript information: This gene has 2 transcripts

Gene: Krt72 ENSMUSG00000056605

Description: keratin 72 [Source:MGI Symbol;Acc:MGI:2146034]
Gene Synonyms: K6irs2, Krt72-ps
Location: Chromosome 15: 101,776,172-101,786,460 reverse strand. GRCm38:CM001008.2
About this gene: This gene has 2 transcripts (splice variants), 80 orthologues, 68 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Krt72-201  ENSMUST00000071104.5  1953  520aa       ENSMUSP00000065922.4  Protein coding  CCDS27863  Q6IME9   TSL:2 GENCODE basic APPRIS P1
Krt72-202  ENSMUST00000127671.2  715   No protein  -                     lncRNA          -          -        TSL:3

[Figure: Ensembl region view, 30.29 kb, chromosome 15 from 101.77 Mb to 101.79 Mb. Tracks: contigs (AC104862.15, forward strand); genes (comprehensive set) on the reverse strand: Krt72-201 (protein coding), Krt73-201 (protein coding), Krt72-202 (lncRNA); Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter Flank. Gene legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (RNA gene).]


Transcript: ENSMUST00000071104

[Figure: Ensembl protein view of Krt72-201 (protein coding; reverse strand, 10.29 kb), protein ENSMUSP00000065... Tracks, in source order: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily (SSF64593); SMART (rod domain); Prints (Keratin, type II); Pfam (Keratin type II head; Intermediate filament, rod domain); PROSITE profiles (Intermediate filament, rod domain); PROSITE patterns (Intermediate filament protein, conserved site); PANTHER (PTHR45616; PTHR45616:SF2); Gene3D (1.20.5.500; 1.20.5.170); Intermediate filament, rod domain, coil 1B; All sequence SNPs/i... (Sequence variants, dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 520.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
