http://www.alphaknockout.com/ Mouse Clcn7 Conditional Knockout Project (CRISPR/Cas9)*

Objective: To create a Clcn7 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Clcn7 gene (NCBI Reference Sequence: NM_011930; Ensembl: ENSMUSG00000036636) is located on mouse chromosome 17. 25 exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 25 (Transcript: ENSMUST00000040729). Exons 2~5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Clcn7 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-99I16 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit postnatal lethality, abnormal bone formation, and retinal degeneration. Mice homozygous for a conditional allele exhibit lysosomal defects with neuronal degeneration and accumulation of giant lysosomes in renal tubule cells.

The knockout of exons 2~5 will result in a frameshift of the gene and covers 14.11% of the coding region. The size of intron 1 for 5'-loxP site insertion is 10815 bp, and the size of intron 5 for 3'-loxP site insertion is 320 bp. The size of the effective cKO region is ~2525 bp. This strategy is designed based on genetic information in existing databases. Due to the complexity of biological processes, the risks of loxP insertion to gene transcription, RNA splicing and protein translation cannot all be predicted with existing technology.
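The frameshift claim can be sanity-checked with simple arithmetic. The sketch below is an estimate only: the report does not list the individual exon lengths, so the deleted coding length is back-calculated from the stated 14.11% and the 803 aa protein length of transcript ENSMUST00000040729. The function names are illustrative and not part of any pipeline described in this report.

```python
# Rough consistency check, assuming the coding length is 803 codons
# (protein length reported for ENSMUST00000040729). Whether or not the
# stop codon is counted, the estimate rounds to the same value.

PROTEIN_AA = 803          # from the transcript table in this report
CODING_FRACTION = 0.1411  # "covers 14.11% of the coding region"


def estimate_deleted_coding_bp(protein_aa: int, fraction: float) -> int:
    """Estimate the number of coding bases removed by the deletion."""
    cds_bp = protein_aa * 3  # assumption: stop codon excluded
    return round(cds_bp * fraction)


def is_frameshift(deleted_bp: int) -> bool:
    """A deletion shifts the reading frame if it is not a multiple of 3."""
    return deleted_bp % 3 != 0


deleted = estimate_deleted_coding_bp(PROTEIN_AA, CODING_FRACTION)
print(deleted, deleted % 3, is_frameshift(deleted))
```

Under these assumptions the estimate is about 340 coding bases, which is not a multiple of three and is therefore consistent with the stated frameshift. The exact figure should be confirmed against the annotated exon coordinates.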


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', exons 1 to 25, with the two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Clcn7; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
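A self-alignment dot plot of this kind can be approximated by exact word matching. The following is a minimal sketch, assuming exact matches of fixed-length words (10 bp in the report); it is not the tool used to produce the figure, and the function names are illustrative.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_index(seq: str, k: int) -> dict:
    """Map every word of length k to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def self_dot_plot(seq: str, k: int = 10):
    """Return (forward, reverse) lists of (i, j) dots for seq against itself.

    Forward dots mark identical words at two different positions (the
    trivial main diagonal i == j is skipped); reverse dots mark a word
    whose reverse complement occurs at position j.
    """
    index = kmer_index(seq, k)
    forward, reverse = [], []
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        for j in index.get(word, []):
            if j != i:
                forward.append((i, j))
        for j in index.get(reverse_complement(word), []):
            reverse.append((i, j))
    return forward, reverse


# Toy example: a 7 bp unit repeated in tandem gives off-diagonal forward dots.
fwd, rev = self_dot_plot("GATTACAGATTACA", k=7)
print(fwd, rev)
```

Runs of forward dots parallel to the main diagonal indicate direct (including tandem) repeats, which is the pattern the note above refers to.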

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(8840bp) | A(22.31% 1972) | C(24.79% 2191) | G(26.57% 2349) | T(26.33% 2328)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
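The overall GC content follows directly from the base counts in the summary line, and a windowed profile can be computed the same way. This is a minimal sketch: the 300 bp window matches the report, while the step size and function names are assumptions made for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each full window of the given size along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the summary line of this report.
BASE_COUNTS = {"A": 1972, "C": 2191, "G": 2349, "T": 2328}
total = sum(BASE_COUNTS.values())
overall_gc = (BASE_COUNTS["G"] + BASE_COUNTS["C"]) / total
print(total, round(overall_gc * 100, 2))
```

The counts sum to the stated full length of 8840 bp and give an overall GC content of roughly 51%.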


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  +       25141181   25144180   3000
YourSeq  219    369    2450  3000   91.7%     chr4   +       38381161   38713313   332153
YourSeq  171    211    493   3000   92.6%     chr7   +       92754465   92873137   118673
YourSeq  137    339    504   3000   88.6%     chr11  -       107297467  107297624  158
YourSeq  132    246    498   3000   89.5%     chr8   +       110861821  110862071  251
YourSeq  128    246    466   3000   89.6%     chr6   +       61289215   61289666   452
YourSeq  127    359    503   3000   92.2%     chr17  -       74098930   74099071   142
YourSeq  125    2299   2454  3000   92.1%     chr6   -       18118015   18118167   153
YourSeq  125    361    531   3000   86.5%     chr16  -       94303870   94304024   155
YourSeq  125    2295   2453  3000   90.9%     chr13  +       111365957  111366116  160
YourSeq  123    2295   2454  3000   90.8%     chr19  -       14100550   14100706   157
YourSeq  121    2299   2456  3000   90.6%     chr4   -       129042179  129042333  155
YourSeq  120    360    525   3000   87.4%     chr7   -       100664352  100664766  415
YourSeq  120    360    530   3000   84.5%     chr15  -       38645227   38645380   154
YourSeq  119    367    504   3000   91.0%     chr9   +       44731276   44731409   134
YourSeq  119    2320   2450  3000   97.6%     chr7   +       29891457   29891591   135
YourSeq  119    363    493   3000   95.5%     chr16  +       3487910    3488040    131
YourSeq  118    368    504   3000   93.5%     chr7   -       126783605  126783744  140
YourSeq  118    2320   2454  3000   96.2%     chr1   -       4866726    4866862    137
YourSeq  118    358    505   3000   87.2%     chr7   +       35516763   35516903   141

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
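One way to read such a table is to compare each hit's score with the query size. The sketch below parses a few rows copied from the upstream results and reports the strongest hit other than the self-match. The report does not state the criterion used to judge significance, so the score-to-query-size ratio here is only an illustrative measure, and the type and function names are hypothetical.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated row.

    Columns: QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN
    """
    f = line.split()
    return BlatHit(
        score=int(f[1]), q_start=int(f[2]), q_end=int(f[3]), q_size=int(f[4]),
        identity=float(f[5].rstrip("%")), chrom=f[6], strand=f[7],
        t_start=int(f[8]), t_end=int(f[9]),
    )


def score_fraction(hit: BlatHit) -> float:
    """Score relative to query size; 1.0 corresponds to a full perfect match."""
    return hit.score / hit.q_size


# First three rows of the upstream results, copied from this report.
ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr17 + 25141181 25144180 3000",
    "YourSeq 219 369 2450 3000 91.7% chr4 + 38381161 38713313 332153",
    "YourSeq 171 211 493 3000 92.6% chr7 + 92754465 92873137 118673",
]
hits = [parse_hit(row) for row in ROWS]
self_hit, off_target = hits[0], hits[1:]
best_off = max(off_target, key=lambda h: h.score)
print(best_off.chrom, round(score_fraction(best_off), 3))
```

In these rows the self-match on chr17 scores the full 3000, while the best secondary hit scores 219, under a tenth of the query size.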

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  +       25146611   25149610   3000
YourSeq  28     2664   2697  3000   96.7%     chr1   +       158922142  158922191  50
YourSeq  24     2812   2843  3000   87.5%     chr5   +       68135315   68135346   32
YourSeq  21     1456   1476  3000   100.0%    chr4   +       36265811   36265831   21
YourSeq  20     1238   1259  3000   95.5%     chr1   -       31709920   31709941   22

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information: Clcn7 chloride channel, voltage-sensitive 7 [Mus musculus (house mouse)]

Gene ID: 26373, updated on 12-Feb-2021

Gene summary

Official Symbol: Clcn7 (provided by MGI)
Official Full Name: chloride channel, voltage-sensitive 7 (provided by MGI)
Primary source: MGI:MGI:1347048
See related: Ensembl:ENSMUSG00000036636
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: ClC-; ClC-7; D17Wsu51e
Expression: Ubiquitous expression in genital fat pad adult (RPKM 32.4), kidney adult (RPKM 17.7) and 28 other tissues
Orthologs: human; all

Genomic context

Location: 17 A3.3; 17 12.53 cM

Exon count: 26

Annotation release  Status             Assembly                      Chr  Location
109                 current            GRCm39 (GCF_000001635.27)     17   NC_000083.7 (25352353..25381077)
108.20200622        previous assembly  GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (25133378..25162103)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (25270339..25299044)

[Figure: genomic context of Clcn7 on Chromosome 17 - NC_000083.7]


Transcript information: This gene has 7 transcripts

Gene: Clcn7 ENSMUSG00000036636

Description: chloride channel, voltage-sensitive 7 [Source:MGI Symbol;Acc:MGI:1347048]
Gene Synonyms: ClC-7
Location: Chromosome 17: 25,352,365-25,381,078 forward strand. GRCm39:CM001010.3
About this gene: This gene has 7 transcripts (splice variants), 200 orthologues, 8 paralogues and is associated with 68 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt Match   Flags
Clcn7-201  ENSMUST00000040729.9  4071  803aa       ENSMUSP00000035964.3  Protein coding           CCDS28509  O70496, Q6RUT9  TSL:1, GENCODE basic, APPRIS P3
Clcn7-204  ENSMUST00000160961.8  3983  783aa       ENSMUSP00000124194.2  Protein coding           CCDS84282  E9PYL4          TSL:1, GENCODE basic, APPRIS ALT2
Clcn7-206  ENSMUST00000162862.3  4237  860aa       ENSMUSP00000124527.3  Protein coding           -          F6SUM2          TSL:5, GENCODE basic, APPRIS ALT2
Clcn7-203  ENSMUST00000159773.2  605   202aa       ENSMUSP00000125546.2  Protein coding           -          F7BK14          CDS 5' and 3' incomplete, TSL:5
Clcn7-207  ENSMUST00000233633.2  4067  391aa       ENSMUSP00000156968.2  Nonsense mediated decay  -          A0A3B2W4I8      -
Clcn7-202  ENSMUST00000159426.2  1006  No protein  -                     Retained intron          -          -               TSL:5
Clcn7-205  ENSMUST00000162722.2  584   No protein  -                     Retained intron          -          -               TSL:2


[Figure: Ensembl genome browser view of a 48.71 kb region of chromosome 17, 25.35 Mb to 25.39 Mb, forward and reverse strands.
Genes (Comprehensive set) track:
  Ptx4-201, Ptx4-202, Ptx4-203 (protein coding); Ptx4-204 (nonsense mediated decay)
  Clcn7-201, Clcn7-203, Clcn7-204, Clcn7-206 (protein coding); Clcn7-202, Clcn7-205 (retained intron); Clcn7-207 (nonsense mediated decay)
  Ccdc154-201, Ccdc154-202, Ccdc154-203 (protein coding); Ccdc154-204 (nonsense mediated decay)
  mmu-mir-12188.1-201 (miRNA)
Contigs track: AC130711.3
Regulatory Build track. Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank.
Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000040729

[Figure: Ensembl protein summary for Clcn7-201 (protein coding), 28.71 kb, forward strand; scale bar 0 to 803 amino acids.
Protein: ENSMUSP00000035... (identifier truncated in the source)
Annotation tracks:
  Transmembrane helices
  MobiDB lite
  Low complexity (Seg)
  Superfamily: Chloride channel, core; SSF54631
  SMART: CBS domain
  Prints: Chloride channel ClC-7; Chloride channel, voltage gated
  Further domain annotations: Chloride channel, voltage gated; CBS domain
  PROSITE profiles: CBS domain
  PANTHER: PTHR11689; PTHR11689:SF92
  Gene3D: Chloride channel, core; 3.10.580.10
  CDD: cd03685; cd04591
Sequence variants (dbSNP and all other sources). Variant legend: stop gained; missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
