https://www.alphaknockout.com

Mouse Cldn4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cldn4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cldn4 gene (NCBI Reference Sequence: NM_009903; Ensembl: ENSMUSG00000047501) is located on mouse chromosome 5. One exon is identified, containing both the ATG start codon and the TAA stop codon (Transcript: ENSMUST00000051401). Exon 1 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cldn4 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-61L4 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit hydronephrosis due to kidney pelvis and ureteral urothelium proliferation that leads to impaired calcium and chloride ion reabsorption and premature death.

Exon 1 covers 100.0% of the coding region; both the start codon and the stop codon are in exon 1. The size of the effective cKO region is ~2137 bp. The cKO region does not contain any other known gene.
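The conditional design relies on Cre recombinase deleting the segment between two loxP sites that flank the cKO region. The following is a minimal sketch of that excision on a toy string, not the actual targeting vector: the arm and exon sequences are hypothetical placeholders, `cre_excise` is an illustrative helper, and the model assumes two directly repeated loxP sites (it does not model site orientation or recombination efficiency).

```python
# Canonical 34 bp loxP site: 13 bp repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele, site=LOXP):
    """Delete the segment between the first two sites, leaving a single site.

    Returns the allele unchanged if fewer than two sites are present.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything up to and including the first site,
    # then resume immediately after the second site.
    return allele[: first + len(site)] + allele[second + len(site):]


# Toy sequences (hypothetical, NOT the real Cldn4 locus).
five_arm = "GGGGCCCC"
exon = "ATGAAATAA"  # stand-in for the floxed exon: ATG ... TAA
three_arm = "TTTTAAAA"

floxed = five_arm + LOXP + exon + LOXP + three_arm
ko = cre_excise(floxed)
print(len(floxed), len(ko))
```

In this toy model the product keeps both homology arms and one loxP site, while the floxed exon is lost, which corresponds to the constitutive KO allele obtained after Cre recombination.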


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Cldn4; cKO region; exon of mouse Mettl27; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot of Sequence 12, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
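As an illustration of the self-alignment idea, the sketch below lists the off-diagonal points of a dot plot, i.e. pairs of positions where identical 10 bp windows start. It is a simplified stand-in under stated assumptions, not the tool used for this report: `dot_plot_matches` is a hypothetical helper, it only reports exact forward matches (no reverse-complement strand, no mismatches), and the test sequences are toy strings.

```python
def dot_plot_matches(seq, window=10):
    """Return sorted (i, j) pairs with i < j where identical windows start.

    Each pair is one off-diagonal dot in a self-comparison dot plot.
    """
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy inputs (hypothetical): one sequence without repeats, one tandem duplication.
unique_seq = "ACGTTGCAAGGCTTAACCGG"
tandem_seq = "ACGTTGCAAG" * 2

print(dot_plot_matches(unique_seq))
print(dot_plot_matches(tandem_seq))
```

An empty result corresponds to a dot plot with only the main diagonal, while a tandem duplication shows up as a match offset from the diagonal by the repeat length.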

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(6630bp) | A(22.91% 1519) | C(25.82% 1712) | T(24.27% 1609) | G(27.0% 1790)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
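The overall GC content follows directly from the base counts in the summary line, and a windowed profile can be computed with a simple sliding count. The sketch below is illustrative only: `windowed_gc` is a hypothetical helper, the one-base step size is an assumption (the report states only the 300 bp window size), and the short test string is a toy input.

```python
# Base counts taken from the summary line of this report.
counts = {"A": 1519, "C": 1712, "T": 1609, "G": 1790}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300):
    """Return the GC fraction of every full window, sliding one base at a time."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    gc = sum(base in "GC" for base in seq[:window])
    fractions = [gc / window]
    for i in range(window, len(seq)):
        # Add the base entering the window, drop the base leaving it.
        gc += (seq[i] in "GC") - (seq[i - window] in "GC")
        fractions.append(gc / window)
    return fractions


print(total, round(gc_percent, 2))
print(windowed_gc("GGCCAATT", window=4))
```

The counts sum to the stated full length of 6630 bp and give an overall GC content of about 52.8%.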


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       134946746  134949745  3000
YourSeq  147    517    1155  3000   83.0%     chr13  -       42212552   42212781   230
YourSeq  131    172    1036  3000   78.8%     chr5   -       143628263  143628567  305
YourSeq  125    570    725   3000   93.2%     chr7   -       16708849   16709011   163
YourSeq  122    570    724   3000   90.7%     chr2   +       158558000  158558158  159
YourSeq  116    580    724   3000   91.1%     chr17  -       12366775   12366930   156
YourSeq  116    584    724   3000   92.8%     chr7   +       128079825  128079976  152
YourSeq  113    584    730   3000   92.0%     chr4   -       144028513  144028679  167
YourSeq  112    584    724   3000   91.2%     chr13  -       58541084   58541233   150
YourSeq  112    592    734   3000   92.5%     chr17  +       35228438   35228597   160
YourSeq  110    604    744   3000   87.5%     chr4   -       135827148  135827280  133
YourSeq  109    592    733   3000   89.9%     chr1   +       109320675  109320848  174
YourSeq  108    570    704   3000   91.0%     chrX   -       169946226  169946365  140
YourSeq  108    584    712   3000   93.0%     chr10  -       64047208   64047347   140
YourSeq  108    580    704   3000   93.6%     chrX   +       137047109  137047241  133
YourSeq  108    580    713   3000   91.0%     chr5   +       58948579   58948720   142
YourSeq  108    581    724   3000   88.5%     chr1   +       103653256  103653405  150
YourSeq  107    593    714   3000   95.8%     chr8   -       105839673  105839840  168
YourSeq  107    626    1089  3000   81.6%     chr3   -       93658449   93658688   240
YourSeq  107    605    730   3000   93.5%     chr7   +       69190419   69190546   128

Note: The 3000 bp section upstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       134943116  134946115  3000
YourSeq  85     2475   2813  3000   89.9%     chr17  -       24778054   24778566   513
YourSeq  85     2475   2813  3000   90.4%     chr1   +       165423735  165424091  357
YourSeq  79     2475   2813  3000   79.2%     chr7   -       84264288   84264537   250
YourSeq  73     2458   2811  3000   82.1%     chr6   +       52040508   52040848   341
YourSeq  71     2478   2813  3000   78.1%     chr9   -       109628427  109628716  290
YourSeq  71     1478   1942  3000   74.8%     chr8   -       86193192   86193545   354
YourSeq  65     2475   2813  3000   70.4%     chr1   -       22715808   22715905   98
YourSeq  63     2474   2786  3000   72.5%     chr19  -       42274768   42274838   71
YourSeq  63     2753   2829  3000   92.3%     chr14  -       31447538   31447977   440
YourSeq  62     2750   2862  3000   88.8%     chr5   -       147685273  147685570  298
YourSeq  62     2735   2860  3000   94.3%     chr12  -       103088127  103088257  131
YourSeq  61     2476   2813  3000   69.9%     chr1   +       36931202   36931296   95
YourSeq  60     2478   2813  3000   66.7%     chr14  +       61254996   61255089   94
YourSeq  59     2735   2813  3000   87.4%     chr3   -       144607420  144607496  77
YourSeq  59     2735   2821  3000   82.8%     chr17  -       28337230   28337314   85
YourSeq  58     2735   2813  3000   87.2%     chr6   -       85327701   85327777   77
YourSeq  58     2735   2813  3000   86.2%     chr15  -       102455257  102455333  77
YourSeq  58     2735   2813  3000   89.1%     chr15  +       103186864  103186939  76
YourSeq  57     2735   2813  3000   85.1%     chrX   -       150963724  150963798  75

Note: The 3000 bp section downstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.
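The top hit of each BLAT search gives the genomic position of the corresponding 3000 bp flank, so the interval between the two flanks can be checked with simple arithmetic. The sketch below assumes the reported chr5 coordinates are 1-based and inclusive (consistent with the SPAN column); `span` is a hypothetical helper, and the interpretation in the closing remark is an inference, not something the report states.

```python
# Top BLAT hits on chr5 (reverse strand), copied from the two result tables.
up_start, up_end = 134946746, 134949745      # 3000 bp upstream flank
down_start, down_end = 134943116, 134946115  # 3000 bp downstream flank


def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


# Interval lying between the downstream and upstream flanks.
gap = span(down_end + 1, up_start - 1)

# Whole region covered by both flanks plus the interval between them.
total = span(down_start, up_end)

print(span(up_start, up_end), span(down_start, down_end), gap, total)
```

Under these assumptions the interval between the flanks is 630 bp, which equals 210 codons, and the combined region is 6630 bp, matching the full length given in the GC content summary. This suggests that the flanks were taken on either side of the coding sequence rather than the full 1816 bp transcript.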


Gene information: Cldn4, claudin 4 [Mus musculus (house mouse)]

Gene ID: 12740, updated on 10-Oct-2019

Gene summary

Official Symbol: Cldn4 (provided by MGI)
Official Full Name: claudin 4 (provided by MGI)
Primary source: MGI:MGI:1313314
See related: Ensembl:ENSMUSG00000047501
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cep-r; Cpetr; Cpetr1
Orthologs: human, all

Summary: This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets, and also play critical roles in maintaining cell polarity and signal transductions. The protein encoded by this gene is a high-affinity receptor for Clostridium perfringens enterotoxin (CPE) produced by the bacterium Clostridium perfringens, and the interaction with CPE results in increased membrane permeability by forming small pores in the plasma membrane. This protein augments alveolar epithelial barrier function and is induced in acute lung injury. It is highly expressed in pancreatic and ovarian cancers. [provided by RefSeq, Aug 2010]

Genomic context

Location: 5 G2; 5 74.9 cM

Exon count: 1

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (134945123..134946934, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (135420993..135422804, complement)

[Figure: genomic context of Cldn4 on Chromosome 5 - NC_000071.6.]


Transcript information: This gene has 1 transcript

Gene: Cldn4 ENSMUSG00000047501

Description: claudin 4 [Source:MGI Symbol;Acc:MGI:1313314]
Gene Synonyms: Cpetr, Cpetr1
Location: Chromosome 5: 134,945,119-134,946,934 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 1 transcript (splice variant), 231 orthologues, 40 paralogues, is a member of 1 Ensembl protein family and is associated with 13 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt         Flags
Cldn4-201  ENSMUST00000051401.3  1816  210aa    ENSMUSP00000053420.2  Protein coding  CCDS19728  O35054, Q3UM35  TSL:NA, GENCODE basic, APPRIS P1

[Figure: Ensembl genome browser view of a 21.82 kb region of chromosome 5 (134.940 Mb to 134.955 Mb). Forward strand: Mettl27 transcripts Mettl27-201, -202, -203, -204, -205, -208 and -209 (protein coding) and Mettl27-207 (lncRNA). Reverse strand: Cldn4-201 (protein coding). Contig: AC079938.3. Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene).]


Transcript: ENSMUST00000051401

[Figure: Ensembl protein summary for Cldn4-201 (protein coding, reverse strand, 1.82 kb), with a scale bar from 0 to 210 amino acids. Annotated features: transmembrane helices; low complexity (Seg); Prints Claudin-4 (PR01077); Pfam PMP-22/EMP/MP20/Claudin superfamily; PROSITE patterns Claudin, conserved site; PANTHER Claudin (PTHR12002:SF89); Gene3D 1.20.140.150; sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
