https://www.alphaknockout.com

Mouse Tgm3 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tgm3 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tgm3 gene (NCBI Reference Sequence: NM_009374; Ensembl: ENSMUSG00000027401) is located on mouse chromosome 2. Thirteen exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 13 (Transcript: ENSMUST00000110299). Exons 2~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tgm3 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-111F6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for an ENU or null mutation exhibit rough-looking, curly hair. Null mutants display delayed skin barrier formation, loss of vibrissae, and brittle hairs.

Exon 2 starts at about 0.38% of the coding region. Knockout of exons 2~4 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 11279 bp, and the size of intron 4 for 3'-loxP site insertion is 1308 bp. The size of the effective cKO region is ~2188 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', exons 1 to 5 and 13 labeled, two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Tgm3, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot for Sequence 12. Legend: Forward, Reverse Complement.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
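The self-alignment described in the note can be illustrated with a minimal Python sketch. It is not the tool used to produce the dot plot in this report; the function names are hypothetical, and the only parameter taken from the report is the 10 bp window size. The sketch records every pair of positions where a window matches another window exactly, either on the same strand (forward) or as its reverse complement.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (assumes only A, C, G, T).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=10):
    """Compare a sequence with itself using exact window matches.

    Returns two sorted lists of (i, j) start positions (0-based):
    forward -- windows at i < j are identical
    reverse -- window at i is the reverse complement of the window at j (i <= j)
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Same word at two different positions: a forward dot off the diagonal.
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                forward.append((positions[a], positions[b]))
        # Word whose reverse complement also occurs: a reverse-complement dot.
        for j in index.get(reverse_complement(kmer), []):
            for i in positions:
                if i <= j:
                    reverse.append((i, j))
    return sorted(forward), sorted(reverse)


if __name__ == "__main__":
    # Toy sequence with one 10 bp direct repeat separated by 3 bp.
    fwd, rev = self_dot_plot("ACGTTGCAAG" + "TTT" + "ACGTTGCAAG", window=10)
    print("forward matches:", fwd)
    print("reverse-complement matches:", rev)
```

In such a plot, a run of forward matches lying close to the main diagonal indicates a tandem repeat; an empty or sparse result outside the diagonal is consistent with the conclusion stated in the note.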

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution for Sequence 12.]

Summary: Full length 8688 bp | A 26.07% (2265) | C 23.58% (2049) | T 28.63% (2487) | G 21.72% (1887)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
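The base composition in the summary can be checked by hand, and the windowed analysis can be sketched in a few lines of Python. The base counts and the 300 bp window are taken from this report; the function names and the step size are illustrative assumptions, and this is not the software used to generate the figure.

```python
def base_percentages(counts):
    """Convert base counts to percentages of the total, rounded to 2 decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_percent(counts):
    """Overall GC content (%) from base counts."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


def windowed_gc(seq, window=300, step=1):
    """GC content (%) of each full window along the sequence.

    step is an assumption; the report only states the window size.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return values


if __name__ == "__main__":
    # Base counts as reported in the summary line.
    counts = {"A": 2265, "C": 2049, "T": 2487, "G": 1887}
    print("total length:", sum(counts.values()))
    print("percentages:", base_percentages(counts))
    print("overall GC %:", gc_percent(counts))
```

The four counts sum to the reported full length of 8688 bp, and C plus G (2049 + 1887 = 3936) gives an overall GC content of about 45.3%.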


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       130020440  130023439  3000
YourSeq  84     1618   1882  3000   83.8%     chr8   -       105975689  106149057  173369
YourSeq  79     1767   1898  3000   93.5%     chr13  +       80879242   80879373   132
YourSeq  77     1526   1882  3000   77.4%     chr1   -       155510652  155510960  309
YourSeq  73     1733   1877  3000   90.3%     chr8   +       80797525   80797671   147
YourSeq  72     1768   1903  3000   87.0%     chr5   +       34247713   34247847   135
YourSeq  71     1769   1877  3000   95.1%     chr10  -       21127356   21127470   115
YourSeq  68     1619   1882  3000   78.7%     chr11  +       95554323   95554506   184
YourSeq  66     1765   1883  3000   78.0%     chr11  -       5050581    5050699    119
YourSeq  64     1768   1877  3000   94.5%     chr1   +       167360926  167361035  110
YourSeq  63     1713   1873  3000   81.7%     chr11  -       98406236   98406379   144
YourSeq  63     1767   1877  3000   94.4%     chr8   +       123488777  123489082  306
YourSeq  63     1768   1881  3000   90.0%     chr10  +       52076989   52077105   117
YourSeq  62     1768   1877  3000   91.4%     chr17  -       15434524   15434631   108
YourSeq  62     1766   1877  3000   93.3%     chr10  -       13410518   13410631   114
YourSeq  62     1766   1884  3000   83.1%     chr10  +       7208170    7208277    108
YourSeq  60     1768   1874  3000   93.1%     chr8   +       92857688   92857798   111
YourSeq  60     1766   1874  3000   90.7%     chr17  +       46291466   46291576   111
YourSeq  60     1733   1880  3000   92.9%     chr10  +       85437739   85437887   149
YourSeq  59     1766   1876  3000   95.5%     chr1   -       184500574  184500692  119

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       130025628  130028627  3000
YourSeq  49     1736   1785  3000   100.0%    chr1   +       186444946  186444996  51

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
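The top hits of the two BLAT searches can be used to cross-check the stated cKO region size. The short Python sketch below assumes that the reported coordinates are 1-based and inclusive (consistent with each 3000 bp hit having a SPAN of 3000) and that the two 3000 bp sections directly abut the cKO region; the function names are hypothetical.

```python
def span(start, end):
    """Length of a 1-based, inclusive genomic interval."""
    return end - start + 1


def gap_between(upstream_end, downstream_start):
    """Number of bases lying strictly between two flanking intervals."""
    return downstream_start - upstream_end - 1


if __name__ == "__main__":
    # chr2 coordinates of the 100% identity hits reported above.
    up_start, up_end = 130020440, 130023439
    down_start, down_end = 130025628, 130028627

    print("upstream section:", span(up_start, up_end), "bp")
    print("downstream section:", span(down_start, down_end), "bp")
    print("region between the sections:", gap_between(up_end, down_start), "bp")
```

Under these assumptions the interval between the two sections is 2188 bp, which agrees with the effective cKO region size of ~2188 bp given in the strategy summary.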


Gene and protein information: Tgm3 transglutaminase 3, E polypeptide [ Mus musculus (house mouse) ] Gene ID: 21818, updated on 21-Aug-2019

Gene summary

Official Symbol: Tgm3 (provided by MGI)
Official Full Name: transglutaminase 3, E polypeptide (provided by MGI)
Primary source: MGI:MGI:98732
See related: Ensembl:ENSMUSG00000027401
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: we; TGE; TG E; TG(E); TGase E; TGase-3; AI893889
Expression: Biased expression in colon adult (RPKM 142.1), lung adult (RPKM 18.2) and 1 other tissue
Orthologs: human; all

Genomic context

Location: 2; 2 F1

Exon count: 15

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (129987794..130050399)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (129838110..129876135)

[Figure: genomic context, Chromosome 2 - NC_000068.7]


Transcript information: This gene has 1 transcript

Gene: Tgm3 ENSMUSG00000027401

Description: transglutaminase 3, E polypeptide [Source:MGI Symbol;Acc:MGI:98732]
Gene Synonyms: TG E, we
Location: Chromosome 2: 130,012,349-130,050,399 forward strand. GRCm38:CM000995.2
About this gene: This gene has 1 transcript (splice variant), 140 orthologues, 8 paralogues, is a member of 1 Ensembl protein family and is associated with 20 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Tgm3-201  ENSMUST00000110299.2  4135  693aa    ENSMUSP00000105928.2  Protein coding  CCDS38240  Q08189   TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl genomic view, 58.05 kb, 130.01 Mb to 130.06 Mb. Forward strand: Tgm3-201 (protein coding, Comprehensive set). Contig: AL808127.4. Reverse strand: Mir6339-201 (miRNA, Comprehensive set). Regulatory Build track. Regulation legend: CTCF, Enhancer, Open Chromatin, Transcription Factor Binding Site. Gene legend: Protein Coding (merged Ensembl/Havana), Non-Protein Coding (RNA gene).]


Transcript: ENSMUST00000110299

[Figure: transcript and protein domain view of Tgm3-201 (protein coding), 38.05 kb, forward strand; protein scale bar 0 to 693 aa. Variant legend: missense variant, synonymous variant.]

Annotation tracks shown for the protein (ENSMUSP00000105...):

MobiDB lite
Low complexity (Seg)
Superfamily: Immunoglobulin E-set; Papain-like cysteine peptidase superfamily; Transglutaminase, C-terminal domain superfamily
SMART: Transglutaminase-like
Pfam: Transglutaminase, N-terminal; Transglutaminase-like; Transglutaminase, C-terminal
PROSITE patterns: Transglutaminase, active site
PIRSF: Protein-glutamine gamma-glutamyltransferase, animal
PANTHER: PTHR11590:SF36; PTHR11590
Gene3D: Immunoglobulin-like fold; Transglutaminase-like superfamily
All sequence SNPs: Sequence variants (dbSNP and all other sources)

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
