https://www.alphaknockout.com

Mouse Cldn20 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cldn20 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Cldn20 gene (NCBI Reference Sequence: NM_001101560; Ensembl: ENSMUSG00000091530) is located on mouse chromosome 17. One exon is identified, and it contains both the ATG start codon and the TAA stop codon (Transcript: ENSMUST00000168560). Exon 1 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cldn20 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-415H22 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 covers 100.0% of the coding region. The start codon and the stop codon are both in exon 1. The size of the effective cKO region is ~1170 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of four alleles. Wildtype allele (5' to 3', with a gRNA region on each side of exon 1); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: homology arm, cKO region, exon of mouse Cldn20, loxP site.]
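The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model. This is a minimal sketch, not the design used in this project: the arm and exon strings are hypothetical placeholders, and the function name `cre_excise` is an assumption made for illustration. Only the canonical 34 bp loxP sequence is taken as known. Cre recombination between two loxP sites in the same orientation removes the intervening segment and leaves a single loxP site behind.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Return the allele after excision between two same-orientation sites.

    Toy model: find the first two occurrences of `site`, drop the sequence
    between them, and keep one copy of the site.
    """
    first = allele.find(site)
    second = allele.find(site, first + len(site)) if first != -1 else -1
    if first == -1 or second == -1:
        return allele  # fewer than two sites: nothing to excise
    return allele[: first + len(site)] + allele[second + len(site):]


# Placeholder sequences (hypothetical, for illustration only).
five_prime_arm = "AAAACCCC"
floxed_region = "GGGGTTTTGGGG"  # stands in for the cKO region (exon 1)
three_prime_arm = "CCCCAAAA"

targeted = five_prime_arm + LOXP + floxed_region + LOXP + three_prime_arm
knockout = cre_excise(targeted)
print(knockout == five_prime_arm + LOXP + three_prime_arm)
```

In this model the wildtype allele, which carries no loxP site, is returned unchanged, which mirrors why the floxed allele stays functional until Cre is expressed.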


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself. Legend: forward, reverse complement.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
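A self dot plot of this kind can be approximated by indexing every window-sized substring and reporting pairs of positions that share the same substring (forward) or its reverse complement. The sketch below assumes exact matching and an upper-case A/C/G/T sequence; the tool actually used for the report is not stated, and the function names are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of off-diagonal dots (i, j) with i < j.

    forward: seq[i:i+window] == seq[j:j+window]
    reverse: seq[j:j+window] is the reverse complement of seq[i:i+window]
    """
    index = defaultdict(list)
    for pos in range(len(seq) - window + 1):
        index[seq[pos:pos + window]].append(pos)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Every pair of positions sharing this k-mer is a forward dot.
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                forward.append((positions[a], positions[b]))
        # Positions of the reverse-complement k-mer give reverse dots.
        for j in index.get(reverse_complement(kmer), []):
            for i in positions:
                if i < j:
                    reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy input: a 7 bp unit repeated in tandem gives one forward dot.
forward, reverse = self_dot_plot("GATTACAGATTACA", window=7)
print(forward, reverse)
```

A tandem repeat shows up as a run of forward dots parallel to the main diagonal, so an empty or sparse forward list is what "no significant tandem repeat" would look like in this simplified model.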

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (6657 bp) | A (30.06%, 2001) | C (21.00%, 1398) | T (27.99%, 1863) | G (20.96%, 1395)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
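The figures in the summary line can be cross-checked, and the sliding-window calculation sketched, with a few lines of Python. The base counts are taken from the summary; the helper names `base_composition` and `gc_windows` are hypothetical, and the windowing shown here (full windows only, fixed step) is an assumption about how a 300 bp window might be applied rather than a description of the tool used.

```python
def base_composition(counts: dict) -> dict:
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_windows(seq: str, window: int = 300, step: int = 1):
    """Yield (start, GC fraction) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, (chunk.count("G") + chunk.count("C")) / window


# Base counts from the report's summary line.
counts = {"A": 2001, "C": 1398, "T": 1863, "G": 1395}
total = sum(counts.values())
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)
print(total, base_composition(counts), gc_percent)
```

The counts sum to the stated full length of 6657 bp and reproduce the listed percentages, giving an overall GC content of about 42%.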


BLAT Search Results (upstream)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr17  +         3529555    3532554  3000
YourSeq    170   1313  1522   3000     93.2%  chr10  -        13866898   13867136   239
YourSeq    158   1316  1513   3000     88.8%  chr4   +       154824389  154824565   177
YourSeq    157   1313  1511   3000     88.5%  chr6   -        56227071   56227261   191
YourSeq    157   1300  1510   3000     92.9%  chr1   +        58419039   58419413   375
YourSeq    155   1316  1512   3000     92.3%  chr6   -        21766577   21766908   332
YourSeq    154   1313  1488   3000     94.3%  chr7   +        76027816   76028010   195
YourSeq    154   1320  1513   3000     91.5%  chr10  +       117988814  117989027   214
YourSeq    153   1316  1511   3000     88.0%  chr4   -       138159556  138159743   188
YourSeq    151   1316  1499   3000     92.7%  chr18  -        40236139   40236332   194
YourSeq    150   1325  1515   3000     89.4%  chr5   -        80354197   80354383   187
YourSeq    150   1316  1490   3000     94.2%  chr17  +        65431198   65431375   178
YourSeq    149   1316  1507   3000     91.4%  chrX   +         8293483    8293674   192
YourSeq    149   1320  1512   3000     88.3%  chr9   +        20936843   20937027   185
YourSeq    149   1325  1503   3000     95.2%  chr4   +        77723266   77723446   181
YourSeq    149   1313  1504   3000     91.8%  chr13  +        44702396   44702591   196
YourSeq    148   1316  1492   3000     90.3%  chr17  -        87716255   87716429   175
YourSeq    147   1313  1502   3000     90.3%  chr11  +        30225187   30225380   194
YourSeq    146   1313  1515   3000     87.2%  chr4   -       119186683  119186846   164
YourSeq    146   1316  1510   3000     94.0%  chr1   -       131162394  131162774   381

Note: The 3000 bp sequence upstream of exon 1 is searched against the genome with BLAT. No significant similarity is found.

BLAT Search Results (downstream)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr17  +         3533212    3536211  3000
YourSeq     47   2024  2081   3000     82.0%  chr18  -        77571862   77571911    50
YourSeq     44   2035  2084   3000     96.0%  chr18  +        76620350   76620650   301
YourSeq     41   2027  2070   3000     97.8%  chr5   -       105083935  105083982    48
YourSeq     40   2034  2078   3000     95.6%  chr18  -        53725225   53725271    47
YourSeq     40   2042  2084   3000     97.7%  chr13  -        71184324   71184371    48
YourSeq     33   2024  2062   3000     97.3%  chr13  -        57237801   57237852    52
YourSeq     24    828   854   3000     96.2%  chr2   -        20215978   20216006    29
YourSeq     23    849   883   3000     82.9%  chr11  +        87032091   87032125    35

Note: The 3000 bp sequence downstream of exon 1 is searched against the genome with BLAT. No significant similarity is found.
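One way to read the BLAT tables is to ask how much of the 3000 bp query each secondary hit covers. The sketch below parses three rows copied from the upstream search results, keeping only the query name and the numeric columns, and computes that fraction. The `BlatHit` type and the helper functions are hypothetical names introduced here, and the interpretation of the first START/END pair as query coordinates and the second as target coordinates follows the usual BLAT column order.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated row (the leading query name is dropped)."""
    f = line.split()[1:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the hit's query interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


rows = """\
YourSeq 3000 1 3000 3000 100.0% chr17 + 3529555 3532554 3000
YourSeq 170 1313 1522 3000 93.2% chr10 - 13866898 13867136 239
YourSeq 158 1316 1513 3000 88.8% chr4 + 154824389 154824565 177
""".splitlines()

hits = [parse_hit(r) for r in rows]
for h in hits[1:]:
    print(h.chrom, h.score, f"{query_coverage(h):.1%}")
```

The first row is the query matching its own locus on chr17 at full length; the secondary hits each cover only a small fraction of the query.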


Gene and information:

Cldn20 claudin 20 [Mus musculus (house mouse)]
Gene ID: 621628, updated on 13-Aug-2019

Gene summary

Official Symbol: Cldn20 (provided by MGI)
Official Full Name: claudin 20 (provided by MGI)
Primary source: MGI:MGI:3646757
See related: Ensembl:ENSMUSG00000091530
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: EG621628
Summary: This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets, and also play critical roles in maintaining cell polarity and signal transductions. The protein encoded by this gene is identified in retinal pigment epithelium (RPE) and analysis of the RPE transcriptome reveals that this appears late during development of chick embryo. [provided by RefSeq, Aug 2010]
Orthologs: human, all

Genomic context

Location: 17; 17 A1

Exon count: 1

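Annotation release  Status             Assembly                      Chr  Location
-----------------------------------------------------------------------------------------------------
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (3532554..3533213)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (3532554..3533213)

[Figure: genomic context of Cldn20 on chromosome 17 (NC_000083.6).]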


Transcript information: This gene has 2 transcripts

Gene: Cldn20 ENSMUSG00000091530

Description: claudin 20 [Source:MGI Symbol;Acc:MGI:3646757]
Gene Synonyms: EG621628
Location: Chromosome 17: 3,532,554-3,533,213 forward strand. GRCm38:CM001010.2
About this gene: This gene has 2 transcripts (splice variants), 228 orthologues, 40 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp   Protein     Translation ID        Biotype               CCDS       UniProt  Flags
---------------------------------------------------------------------------------------------------------------------------------------------
Cldn20-201  ENSMUST00000168560.1  660  219aa       ENSMUSP00000126169.1  Protein coding        CCDS49928  G5E8X0   TSL:NA GENCODE basic APPRIS P1
Cldn20-202  ENSMUST00000232647.1  656  No protein  -                     Processed pseudogene  -          -        -
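The numbers reported for Cldn20-201 can be checked against each other with simple arithmetic. The sketch below assumes the genomic coordinates are 1-based and inclusive, as NCBI and Ensembl display them, and uses two hypothetical helper functions. A 219 aa protein needs 219 codons plus one stop codon, which is 660 nucleotides, the same as the exon span and the transcript length. This agrees with the statement that exon 1 covers 100% of the coding region.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def cds_length(protein_aa: int) -> int:
    """Nucleotides needed to encode protein_aa residues plus one stop codon."""
    return 3 * (protein_aa + 1)


exon_bp = span_length(3_532_554, 3_533_213)  # coordinates from the report
print(exon_bp, cds_length(219))
```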

[Figure: Ensembl genome browser view of a 20.66 kb region of chromosome 17 (3.525 Mb to 3.540 Mb). Forward strand: Tiam2-208 (nonsense mediated decay), Cldn20-201 (protein coding), Cldn20-202 (processed pseudogene). Contig: AC122350.4. Reverse strand: Tfb1m-201 (protein coding). A Regulatory Build track is also shown. Regulation legend: CTCF, enhancer, open chromatin, promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana), non-protein coding (pseudogene; processed transcript).]


Transcript: ENSMUST00000168560

[Figure: protein feature view of Cldn20-201 (protein coding; 660 bp, forward strand) for ENSMUSP00000126..., with a scale bar from 0 to 219. Tracks: Transmembrane heli..., Cleavage site (Sign..., Prints PR01077, Pfam PMP-22/EMP/MP20/Claudin superfamily, PROSITE patterns (Claudin, conserved site), PANTHER PTHR12002:SF17 (Claudin), Gene3D 1.20.140.150, sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
