https://www.alphaknockout.com

Mouse Clcf1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Clcf1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Clcf1 gene (NCBI Reference Sequence: NM_019952; Ensembl: ENSMUSG00000040663) is located on mouse chromosome 19. Three exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 3 (Transcript: ENSMUST00000046506). Exon 3 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Clcf1 gene. To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP23-41B18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit postnatal lethality associated with a failure to suckle and a decrease in facial and spinal motor neurons.

Exon 3 covers 72.89% of the coding region. The start codon is in exon 1, and the stop codon is in exon 3. The size of intron 2 for 5'-loxP site insertion: 1729 bp. The size of the effective cKO region: ~2167 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of four alleles drawn 5' to 3'. Wildtype allele (exons 1, 2 and 3, with two gRNA regions marked); Targeting vector; Targeted allele; Constitutive KO allele (after Cre recombination). Legend: exon of mouse Clcf1, homology arm, cKO region, loxP site.]
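The step from the targeted allele to the constitutive KO allele can be illustrated with a minimal in-silico sketch. It assumes two loxP sites in the same orientation and the commonly cited 34 bp wild-type loxP sequence; the report does not list the loxP sequence or the vector sequence, so the constant `LOXP`, the function names and the toy flanking bases are illustrative assumptions only.

```python
# Commonly cited wild-type loxP site (13 bp repeat + 8 bp spacer + 13 bp repeat).
# Assumption: the report itself does not give this sequence.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def find_direct_loxp_sites(seq):
    """Return start positions of non-overlapping loxP sites in forward orientation."""
    positions = []
    start = seq.find(LOXP)
    while start != -1:
        positions.append(start)
        start = seq.find(LOXP, start + len(LOXP))
    return positions


def cre_excise(seq):
    """Model Cre recombination between two direct-repeat loxP sites.

    The segment between the sites is removed together with one loxP copy,
    so a single loxP site remains in the product.
    """
    seq = seq.upper()
    sites = find_direct_loxp_sites(seq)
    if len(sites) != 2:
        raise ValueError("expected exactly two loxP sites in direct orientation")
    first, second = sites
    return seq[:first] + seq[second:]


# Toy "targeted allele": hypothetical flanks around a floxed stand-in segment.
floxed = "AAAA" + LOXP + "CCCCGGGG" + LOXP + "TTTT"
ko_allele = cre_excise(floxed)
```

In this toy case `ko_allele` retains both flanks and one loxP site, mirroring the constitutive KO allele in the schematic.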


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself. Axis label: Sequence 12. Series: Forward, Reverse Complement.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
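The self-alignment behind such a dot plot can be sketched as follows. This is a minimal illustration that assumes exact matching of 10 bp windows; the report does not name the tool or scoring used, and the function names and toy sequences are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def dot_plot_matches(seq, window=10):
    """Find exact window matches of a sequence against itself.

    Returns (forward, reverse): lists of (i, j) window start positions.
    Forward hits are reported once, with i < j (the trivial diagonal i == j
    is skipped). Reverse hits pair window i with a window j whose sequence
    equals the reverse complement of window i.
    """
    seq = seq.upper()
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy input: a 10 bp unit repeated after a 3 bp spacer gives one forward hit.
repeat_demo = "ACGTTGCAAG" + "TTT" + "ACGTTGCAAG"
forward_hits, _ = dot_plot_matches(repeat_demo, window=10)
```

A long run of forward hits parallel to the main diagonal would indicate a tandem repeat; isolated hits, as in the toy example, would not.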

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence. Axis label: Sequence 12.]

Summary: Full length 6992 bp | A 21.47% (1501) | C 28.92% (2022) | T 27.33% (1911) | G 22.28% (1558)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so this targeting vector may be difficult to construct.
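The base counts in the summary imply an overall GC content of about 51.2%, and a windowed profile like the one plotted can be computed with a simple sliding window. The sketch below is illustrative only: the function names are hypothetical, and a step of 1 bp between windows is an assumption, since the report states only the 300 bp window size.

```python
def gc_fraction(counts):
    """Overall GC fraction from a dict of base counts."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300):
    """GC fraction of every full window of the given size, stepping 1 bp."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    gc = sum(1 for base in seq[:window] if base in "GC")
    profile = [gc / window]
    for start in range(1, len(seq) - window + 1):
        gc -= seq[start - 1] in "GC"            # base leaving the window
        gc += seq[start + window - 1] in "GC"   # base entering the window
        profile.append(gc / window)
    return profile


# Base counts copied from the summary line of this report.
REPORT_COUNTS = {"A": 1501, "C": 2022, "T": 1911, "G": 1558}
overall_gc = gc_fraction(REPORT_COUNTS)
```

Windows whose GC fraction sits well above the overall value are the ones most likely to complicate PCR amplification and vector assembly; the cutoff for "high" is not given in the report.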


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  +       4218825    4221824    3000
YourSeq  60     535    1079  3000   67.7%     chr11  +       50217435   50217622   188
YourSeq  37     532    596   3000   78.5%     chr15  -       78136795   78136859   65
YourSeq  36     572    663   3000   87.5%     chr18  -       12061443   12061537   95
YourSeq  33     536    577   3000   89.2%     chr2   -       120608191  120608231  41
YourSeq  31     536    598   3000   94.3%     chr14  +       34835719   34835781   63
YourSeq  28     638    670   3000   96.7%     chr1   -       127825030  127825063  34
YourSeq  28     527    568   3000   90.0%     chr7   +       11230526   11230566   41
YourSeq  25     2431   2475  3000   77.8%     chr5   -       10258501   10258545   45
YourSeq  21     1207   1228  3000   100.0%    chr11  -       69433005   69433027   23
YourSeq  20     548    567   3000   100.0%    chr1   -       134072894  134072913  20

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM                 STRAND  START      END        SPAN
--------------------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19                 +       4222567    4225566    3000
YourSeq  497    830    1506  3000   87.2%     chr6                  +       142325831  142326481  651
YourSeq  226    1966   2711  3000   88.4%     chr11                 -       106411151  106894217  483067
YourSeq  219    1979   2713  3000   89.0%     chr12                 -       21337658   21381385   43728
YourSeq  155    1899   2110  3000   85.2%     chr1_GL456221_random  +       90978      91167      190
YourSeq  151    1893   2106  3000   90.1%     chr10                 +       31303898   31304134   237
YourSeq  145    2549   2713  3000   94.6%     chr2                  -       92138093   92138285   193
YourSeq  145    2549   2713  3000   94.6%     chr6                  +       115684084  115684276  193
YourSeq  145    1913   2110  3000   85.4%     chr1                  +       85125845   85126022   178
YourSeq  141    1908   2111  3000   83.9%     chr13                 -       120236641  120236831  191
YourSeq  140    1908   2109  3000   84.4%     chr8                  +       3010531    3010718    188
YourSeq  140    1908   2109  3000   84.4%     chr4                  +       3275160    3275347    188
YourSeq  139    2541   2713  3000   92.2%     chr5                  -       115087490  115087684  195
YourSeq  138    2551   2713  3000   93.2%     chr4                  +       41051659   41051855   197
YourSeq  136    1901   2110  3000   84.3%     chr7                  +       126394378  126394544  167
YourSeq  135    1915   2098  3000   85.3%     chr11                 +       22670266   22670432   167
YourSeq  132    2072   2691  3000   81.3%     chrX                  -       93896550   93896777   228
YourSeq  130    1940   2110  3000   88.3%     chrX                  -       100596104  100596266  163
YourSeq  130    2540   2713  3000   90.8%     chr18                 -       74975122   74975333   212
YourSeq  130    1942   2102  3000   90.7%     chr6                  +       128633554  128633736  183

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
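Rows of the BLAT tables above can be parsed and ranked with a short script, for example to list secondary hits above a chosen score. This is a minimal sketch: the `BlatHit` type, the function names and the `min_score` cutoff are hypothetical, and the report does not state which criterion it uses to call a hit "significant".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row):
    """Parse one whitespace-separated row of the BLAT result table."""
    fields = row.split()
    # Tolerate the optional "browser details" link text before the query name.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    return BlatHit(
        score=int(fields[1]),
        q_start=int(fields[2]),
        q_end=int(fields[3]),
        q_size=int(fields[4]),
        identity=float(fields[5].rstrip("%")),
        chrom=fields[6],
        strand=fields[7],
        t_start=int(fields[8]),
        t_end=int(fields[9]),
        span=int(fields[10]),
    )


def secondary_hits(hits, min_score):
    """Hits other than the top-scoring (self) match with score >= min_score."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    return [h for h in ranked[1:] if h.score >= min_score]


# First four rows of the downstream search, copied from the table above.
ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr19 + 4222567 4225566 3000",
    "YourSeq 497 830 1506 3000 87.2% chr6 + 142325831 142326481 651",
    "YourSeq 226 1966 2711 3000 88.4% chr11 - 106411151 106894217 483067",
    "YourSeq 219 1979 2713 3000 89.0% chr12 - 21337658 21381385 43728",
]
hits = [parse_blat_row(r) for r in ROWS]
flagged = secondary_hits(hits, min_score=300)  # 300 is an arbitrary example cutoff
```

Comparing each hit's span with its aligned query length is also useful: a span far larger than the query segment, as in the chr11 and chr12 rows, indicates a gapped alignment spread over a large genomic interval.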


Gene and information: Clcf1 cardiotrophin-like cytokine factor 1 [Mus musculus (house mouse)]
Gene ID: 56708, updated on 12-Aug-2019

Gene summary

Official Symbol: Clcf1 (provided by MGI)
Official Full Name: cardiotrophin-like cytokine factor 1 (provided by MGI)
Primary source: MGI:MGI:1930088
See related: Ensembl:ENSMUSG00000040663
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CLC; Bsf3; BSF-3; NNT-1
Expression: Broad expression in spleen adult (RPKM 32.4), mammary gland adult (RPKM 16.1) and 15 other tissues
Orthologs: human, all

Genomic context

Location: 19; 19 A

Exon count: 4

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  19   NC_000085.6 (4214238..4223505)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    19   NC_000085.5 (4214392..4222615)

Chromosome 19 - NC_000085.6


Transcript information: This gene has 5 transcripts

Gene: Clcf1 ENSMUSG00000040663

Description: cardiotrophin-like cytokine factor 1 [Source:MGI Symbol;Acc:MGI:1930088]
Gene Synonyms: Bsf3, CLC, NNT-1/BSF-3
Location: Chromosome 19: 4,214,238-4,223,490 forward strand. GRCm38:CM001012.2
About this gene: This gene has 5 transcripts (splice variants), 1 gene allele, 178 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 10 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Clcf1-201  ENSMUST00000046506.6  1847  225aa       ENSMUSP00000045562.6  Protein coding           CCDS29423  Q9QZM3      TSL:1, GENCODE basic, APPRIS P1
Clcf1-205  ENSMUST00000235355.1  380   40aa        ENSMUSP00000157886.1  Protein coding           -          A0A494BA42  CDS 3' incomplete
Clcf1-204  ENSMUST00000138090.1  1878  44aa        ENSMUSP00000118157.1  Nonsense mediated decay  -          D6RIL9      TSL:1
Clcf1-203  ENSMUST00000132305.1  3550  No protein  -                     Retained intron          -          -           TSL:2
Clcf1-202  ENSMUST00000126457.1  646   No protein  -                     lncRNA                   -          -           TSL:3


[Figure: Ensembl region view, 29.25 kb, coordinates 4.21 Mb to 4.23 Mb, forward and reverse strands. Gene tracks (Comprehensive set): Clcf1-201 (protein coding), Clcf1-203 (retained intron), Clcf1-204 (nonsense mediated decay), Clcf1-205 (protein coding), Clcf1-202 (lncRNA), Pold4-201, Pold4-202 and Pold4-203 (protein coding), Gm45928-201 (nonsense mediated decay), Gm45928-202 (protein coding), Gm26115-201 (miRNA). Contig: AC109138.10. Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000046506

[Figure: Ensembl transcript and protein view, 9.25 kb, forward strand. Transcript: Clcf1-201 (protein coding). Protein feature tracks: Cleavage site; Superfamily: Four-helical cytokine-like, core; Pfam: Plethodontid receptivity factor PRF/cardiotrophin-like; PANTHER: PTHR21353:SF7, Plethodontid receptivity factor PRF/cardiotrophin-like; Gene3D: 1.20.1250.10. Variant tracks: sequence variants (dbSNP and all other sources), with legend missense variant and synonymous variant. Scale bar: 0 to 225 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
