https://www.alphaknockout.com

Mouse Ldhc Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Ldhc conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ldhc gene (NCBI Reference Sequence: NM_013580; Ensembl: ENSMUSG00000030851) is located on mouse chromosome 7. Eight exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 8 (Transcript: ENSMUST00000014545). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Ldhc gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP24-169L3 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygous male mice are infertile. Spermatogenesis appears normal, but sperm motility decreases rapidly after release from the epididymis. In vitro fertilization is blocked unless the zona pellucida is removed; even then, the rate of sperm penetration is lower than for wild-type sperm.
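The conditional design in the strategy summary relies on Cre recombinase deleting whatever lies between two loxP sites in the same orientation, leaving a single loxP site behind. The following is a minimal string-level sketch of that step, not part of the original project plan. The intron and exon strings are hypothetical placeholders rather than real Ldhc sequence, and `cre_excise` is an illustrative helper name.

```python
# Commonly cited 34 bp loxP sequence: 13 bp inverted repeat, 8 bp spacer,
# 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Delete the segment between the first two same-orientation loxP sites.

    One loxP site is left behind. Alleles with fewer than two sites are
    returned unchanged.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything up to the end of the first site, then resume after
    # the second site.
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Hypothetical placeholder sequences, NOT the real Ldhc sequence.
intron2 = "ccggttaacc"
exon3 = "ATGGCCAAGGTT"
intron3 = "ttaaggccaa"

# Targeted (floxed) allele: loxP sites flank the cKO region.
targeted = intron2 + LOXP + exon3 + LOXP + intron3
# Constitutive KO allele after Cre recombination.
constitutive_ko = cre_excise(targeted)
print(constitutive_ko == intron2 + LOXP + intron3)
```

The sketch ignores site orientation and strand; it only illustrates why the targeted allele still carries exon 3 while the recombined allele does not.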

Exon 3 starts at about 12.75% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion: 4513 bp, and the size of intron 3 for 3'-loxP site insertion: 3072 bp. The size of the effective cKO region: ~618 bp. The cKO region does not contain any other known gene.
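The two claims above (where exon 3 sits in the coding region, and that its removal shifts the reading frame) rest on simple arithmetic. The sketch below is only an illustration: it assumes the coding length can be taken as 3 x 332 codons from the 332 aa protein listed for ENSMUST00000014545, and the exon lengths passed to `causes_frameshift` are hypothetical, since the exact exon 3 length is not stated in this report.

```python
def approx_cds_position(fraction: float, protein_len_aa: int) -> int:
    """Approximate coding-sequence position (in nt) for a fractional offset.

    Assumes the coding length is 3 nt per amino acid (stop codon excluded).
    """
    cds_len = protein_len_aa * 3
    return round(fraction * cds_len)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# 12.75% of the coding region of the 332 aa protein.
print(approx_cds_position(0.1275, 332))

# Hypothetical exon lengths, for illustration only.
for length in (118, 120):
    print(length, causes_frameshift(length))
```

Under these assumptions exon 3 begins roughly 127 nt into the coding sequence, so a frameshift there would disrupt most of the protein.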


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked and exons 1, 3 and 8 labeled), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Ldhc, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
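A self dot plot of the kind described in the note can be approximated by recording every pair of positions whose 10 bp windows match, either directly or as reverse complements. This is a minimal sketch using exact word matches; the tool actually used for the report is not stated, and the function names and the repeated test sequence are illustrative assumptions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (i, j, strand) for every exact window match of seq against itself.

    The trivial main diagonal (i == j on the forward strand) is skipped.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    # Index every window by its start positions.
    positions = {}
    for i in range(n_windows):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for i in range(n_windows):
        word = seq[i:i + window]
        for j in positions[word]:
            if j != i:
                hits.append((i, j, "+"))
        for j in positions.get(reverse_complement(word), []):
            hits.append((i, j, "-"))
    return hits


# Hypothetical 10 bp unit repeated twice, i.e. a tandem repeat.
unit = "ACGTTGCAAT"
hits = self_dot_plot(unit * 2, window=10)
print((0, 10, "+") in hits)
```

Off-diagonal runs of forward hits would indicate tandem or dispersed repeats; an empty or sparse hit list corresponds to the clean dot plot described in the note.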

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (7118 bp) | A (29.56%, 2104) | C (21.96%, 1563) | T (26.66%, 1898) | G (21.82%, 1553)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
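The base composition in the summary and the windowed GC profile can both be reproduced with a few lines of code. The sketch below recomputes the percentages from the base counts reported above and shows one way to scan a sequence in 300 bp windows. The non-overlapping step size and the synthetic test sequence are assumptions, as the report does not state how the windows were advanced.

```python
def base_percentages(counts: dict) -> dict:
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_windows(seq: str, window: int = 300, step: int = 300) -> list:
    """GC fraction of each full window along seq."""
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


# Base counts taken from the summary line of this report.
counts = {"A": 2104, "C": 1563, "T": 1898, "G": 1553}
percentages = base_percentages(counts)
gc_percent = round(100 * (counts["G"] + counts["C"]) / sum(counts.values()), 2)
print(sum(counts.values()), percentages, gc_percent)

# Synthetic sequence (not Ldhc): one all-GC window followed by one all-AT window.
print(gc_windows("GC" * 150 + "AT" * 150))
```

The counts sum to the stated full length of 7118 bp and give an overall GC content of about 43.78%, consistent with the absence of a high-GC region.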


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr7   +        46863160   46866159    3000
YourSeq    322    738  1723   3000     95.8%  chr7   -        46863897   46864882     986
YourSeq    269    334   924   3000     89.9%  chr5   -       114186795  114187422     628
YourSeq    267    489   924   3000     93.3%  chr14  +        56772160   56772638     479
YourSeq    263    488   919   3000     86.4%  chr16  -        93915863   93916198     336
YourSeq    254    514   924   3000     89.0%  chr16  -        18520913   18521220     308
YourSeq    251    497   906   3000     87.5%  chr4   -       152057355  152057662     308
YourSeq    250    496   915   3000     88.0%  chr1   +       167338352  167338670     319
YourSeq    248    513   914   3000     87.3%  chr18  +        61924410   61924700     291
YourSeq    244    489   906   3000     90.1%  chr11  +       103757065  103877621  120557
YourSeq    240    601  1349   3000     91.3%  chr3   -       105920742  105921431     690
YourSeq    239    513   924   3000     94.8%  chr17  +        32261634   32727458  465825
YourSeq    235    493   917   3000     86.1%  chr8   +        26181555   26181881     327
YourSeq    224    519   925   3000     89.9%  chr12  -        65056935   65057237     303
YourSeq    224   1284  1717   3000     87.9%  chr11  -       107499223  107499558     336
YourSeq    223    551   920   3000     90.6%  chr6   +        39431181   39431481     301
YourSeq    219    630   924   3000     91.4%  chr4   +       108244306  108352809  108504
YourSeq    217    630   924   3000     90.7%  chr19  -        37183079   37183668     590
YourSeq    217    539   924   3000     90.4%  chr1   -       181135880  181136228     349
YourSeq    214    630   920   3000     93.2%  chr2   -        84684190   84701316   17127

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr7   +        46866778   46869777    3000
YourSeq    378    219  1822   3000     92.7%  chr11  +        68928551   69137720  209170
YourSeq    263    569  1215   3000     92.6%  chr15  +        78931946   78932632     687
YourSeq    250    620  1215   3000     84.5%  chr1   -        36008085   36008582     498
YourSeq    231    619  1189   3000     93.0%  chr12  +       109467398  109468041     644
YourSeq    229    621  1215   3000     84.4%  chr4   -        99025654   99025977     324
YourSeq    220    624  1482   3000     93.0%  chr18  -        24465234   24559339   94106
YourSeq    218    643  1215   3000     85.0%  chr14  -       121392688  121393095     408
YourSeq    212    619  1195   3000     86.6%  chr3   +        69015436   69015867     432
YourSeq    167    348   776   3000     85.2%  chr5   +       149573486  149573755     270
YourSeq    162    624  1048   3000     95.1%  chr19  -         6033547    6034074     528
YourSeq    162    620   795   3000     94.8%  chr13  +        33345131   33345304     174
YourSeq    161    620   797   3000     97.1%  chr18  -        21606675   21606853     179
YourSeq    158    623   802   3000     94.4%  chr9   +        65008503   65008692     190
YourSeq    158    619   797   3000     94.3%  chr13  +        58552644   58552821     178
YourSeq    156    620   807   3000     90.6%  chr13  -        75705968   75706137     170
YourSeq    154    643  1180   3000     83.4%  chr9   -        88535657   88535922     266
YourSeq    154    623   790   3000     95.9%  chr2   +        26328372   26328539     168
YourSeq    153    620   800   3000     93.3%  chr1   +       105913815  105913998     184
YourSeq    152    617   803   3000     90.4%  chr10  -        18398291   18398456     166

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
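The top hit in each BLAT table is the query matching its own genomic location, so the two self-hits bracket the cKO region. The sketch below parses those two rows and derives the interval between them. It assumes the chromosome coordinates are inclusive, which is consistent with each 3000 bp query spanning exactly 3000 positions; `BlatHit`, `parse_hit` and `region_between` are illustrative names, not part of any BLAT tooling.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    f = line.split()
    # Drop the leading "browser details" link text if it is present.
    if f[:2] == ["browser", "details"]:
        f = f[2:]
    # f[0] is the query name (YourSeq); the remaining fields follow the header.
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def region_between(up: BlatHit, down: BlatHit):
    """Inclusive interval lying between the upstream and downstream hits."""
    start = up.t_end + 1
    end = down.t_start - 1
    return start, end, end - start + 1


# Self-hits copied from the two BLAT tables in this report.
up = parse_hit("browser details YourSeq 3000 1 3000 3000 100.0% chr7 + 46863160 46866159 3000")
down = parse_hit("browser details YourSeq 3000 1 3000 3000 100.0% chr7 + 46866778 46869777 3000")
print(region_between(up, down))
```

Under the inclusive-coordinate assumption the interval is chr7:46866160-46866777, i.e. 618 bp, which agrees with the stated effective cKO region size of ~618 bp.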


Gene and protein information:
Ldhc lactate dehydrogenase C [Mus musculus (house mouse)]
Gene ID: 16833, updated on 24-Oct-2019

Gene summary

Official Symbol: Ldhc (provided by MGI)
Official Full Name: lactate dehydrogenase C (provided by MGI)
Primary source: MGI:MGI:96764
See related: Ensembl:ENSMUSG00000030851
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ldh3; Ldh-3; Ldh-x; Ldhc4; LDH-C4
Expression: Restricted expression toward testis adult (RPKM 1111.1)

Genomic context

Location: 7 B3; 7 30.6 cM

Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (46861263..46878147)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (54116633..54133512)

[Figure: genomic context of Ldhc on chromosome 7 (NC_000073.6).]


Transcript information: This gene has 5 transcripts

Gene: Ldhc ENSMUSG00000030851

Description: lactate dehydrogenase C [Source:MGI Symbol;Acc:MGI:96764]
Gene Synonyms: Ldh-3, Ldh3, Ldhc4
Location: Chromosome 7: 46,861,203-46,878,142 forward strand. GRCm38:CM001000.2
About this gene: This gene has 5 transcripts (splice variants), 99 orthologues, 4 paralogues, is a member of 1 Ensembl protein family and is associated with 6 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt        Flags
Ldhc-201  ENSMUST00000014545.10  1370  332aa    ENSMUSP00000014545.4  Protein coding  CCDS21290  P00342 Q548Z6  TSL:1, GENCODE basic, APPRIS P1
Ldhc-204  ENSMUST00000210585.1   1071  274aa    ENSMUSP00000147492.1  Protein coding  -          A0A1B0GRE9     TSL:1, GENCODE basic
Ldhc-202  ENSMUST00000126004.2   872   236aa    ENSMUSP00000115652.2  Protein coding  -          D3YZE4         CDS 3' incomplete, TSL:5
Ldhc-205  ENSMUST00000211784.1   829   171aa    ENSMUSP00000148038.1  Protein coding  -          A0A1B0GSR2     TSL:1, GENCODE basic
Ldhc-203  ENSMUST00000148565.7   767   235aa    ENSMUSP00000114206.1  Protein coding  -          D3YVR7         CDS 3' incomplete, TSL:3


[Figure: Ensembl genome browser view of a 36.94 kb region of chromosome 7 (46.86 to 46.88 Mb). Forward-strand gene track (comprehensive set): Ldha transcripts Ldha-201, -202, -203, -208, -210, -213, -214 and -215 (protein coding), Ldha-209 (nonsense mediated decay) and Ldha-211 (retained intron); Ldhc transcripts Ldhc-201, -202, -203, -204 and -205 (protein coding). Other tracks: contig AC090123.68 and Regulatory Build. Regulation legend: CTCF, enhancer, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana), non-protein coding (processed transcript).]


Transcript: ENSMUST00000014545

[Figure: Ensembl protein summary for Ldhc-201 (protein coding; 16.94 kb, forward strand), plotted on a scale of 0 to 332 amino acids. Annotation tracks: PDB-ENSP mappings; TIGRFAM (L-lactate dehydrogenase); Superfamily (NAD(P)-binding domain superfamily; Lactate dehydrogenase/glycoside hydrolase, family 4, C-terminal); Prints; Pfam (Lactate/malate dehydrogenase, N-terminal; Lactate/malate dehydrogenase, C-terminal); PROSITE profiles (PS51257); PROSITE patterns; PIRSF (L-lactate/malate dehydrogenase); PANTHER (PTHR43128, PTHR43128:SF5); HAMAP (L-lactate dehydrogenase); Gene3D (3.40.50.720; Lactate dehydrogenase/glycoside hydrolase, family 4, C-terminal); CDD (cd05293); sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
