https://www.alphaknockout.com

Mouse Cfdp1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cfdp1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cfdp1 gene (NCBI Reference Sequence: NM_011801; Ensembl: ENSMUSG00000031954) is located on mouse chromosome 8. Seven exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000034432). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cfdp1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP24-91B18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 7.34% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 8932 bp, and the size of intron 2 for 3'-loxP site insertion is 3103 bp. The size of the effective cKO region is ~618 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of four alleles. (1) Wildtype allele, exons 1 to 7, with the 5' and 3' gRNA regions marked; (2) targeting vector; (3) targeted allele; (4) constitutive KO allele (after Cre recombination). Legend: exon of mouse Cfdp1, homology arm, cKO region, loxP site.]
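The step from the targeted allele to the constitutive KO allele can be illustrated with a minimal string-based sketch. It assumes two loxP sites in the same orientation flanking the cKO region, so that Cre recombination excises the intervening sequence and leaves a single loxP site behind. The flanking and cKO sequences below are hypothetical placeholders, not Cfdp1 sequence; only the 34 bp loxP sequence is the standard one, and the function name cre_excise is introduced here purely for illustration.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Model Cre excision between the first two direct-repeat loxP sites.

    Returns the allele unchanged if fewer than two loxP sites are present.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything before the first site, then resume at the second site,
    # so exactly one loxP copy remains in the product.
    return allele[:first] + allele[second:]


# Hypothetical placeholder sequences (NOT taken from the Cfdp1 locus).
upstream = "GGGTTTAAACCC"
cko_region = "ATGGCCGATTAG"
downstream = "CCCAAATTTGGG"

targeted_allele = upstream + LOXP + cko_region + LOXP + downstream
ko_allele = cre_excise(targeted_allele)

print(len(targeted_allele), len(ko_allele))
```

In this toy model the KO allele is shorter than the targeted allele by the length of the cKO region plus one loxP site, which is the size difference a genotyping PCR across the region would be expected to detect.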


Overview of the Dot Plot
Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned with itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.

Overview of the GC Content Distribution
Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7118bp) | A(25.72% 1831) | C(20.13% 1433) | T(31.86% 2268) | G(22.28% 1586)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
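The percentages in the summary line can be rechecked from the base counts with a short calculation. The sketch below uses only the four counts given in the summary; the overall GC percentage it derives is not stated in the report and is shown here only as a consequence of those counts.

```python
# Base counts copied from the summary line of the report.
counts = {"A": 1831, "C": 1433, "T": 2268, "G": 1586}

total = sum(counts.values())

# Per-base percentage, rounded to two decimals as in the report.
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}

# Overall GC content (derived value, not stated in the report).
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)

print(total, percent, gc_percent)
```

The counts sum to the stated full length of 7118 bp, and the derived overall GC content is about 42.41%, consistent with the note that no significant high-GC region is present.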


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr8   -       111845462  111848461  3000
YourSeq  302    2490   3000  3000   87.2%     chr9   -       43136070   43136540   471
YourSeq  295    2486   2960  3000   85.4%     chr5   -       92046906   92047354   449
YourSeq  291    2483   2981  3000   87.3%     chr14  +       27292876   27293314   439
YourSeq  279    2485   2993  3000   85.4%     chr17  -       46761858   46762358   501
YourSeq  275    2490   2977  3000   89.9%     chrX   +       20294442   20295075   634
YourSeq  273    2513   2954  3000   88.3%     chr7   +       25828977   25829397   421
YourSeq  271    2216   2981  3000   87.2%     chr11  +       100450040  100450739  700
YourSeq  264    2212   2993  3000   85.1%     chr18  +       34227716   34228144   429
YourSeq  263    2487   2993  3000   89.8%     chr4   +       149292243  149292843  601
YourSeq  255    2687   3000  3000   89.7%     chr1   -       86649152   86649461   310
YourSeq  253    2521   2979  3000   86.1%     chr15  +       99046029   99046460   432
YourSeq  252    2515   2970  3000   89.6%     chr11  -       69530348   69531131   784
YourSeq  251    2676   3000  3000   90.1%     chr2   -       154945752  154946084  333
YourSeq  251    2687   3000  3000   89.4%     chr2   -       32114741   32115052   312
YourSeq  250    2697   3000  3000   91.8%     chr2   -       144286629  144287127  499
YourSeq  248    2550   2993  3000   91.4%     chr7   -       141347058  141347803  746
YourSeq  247    2483   2993  3000   85.0%     chr2   -       180229582  180229915  334
YourSeq  247    2674   3000  3000   88.2%     chr11  +       74937694   74938009   316
YourSeq  246    2677   2993  3000   92.1%     chr8   +       117289750  117290090  341

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
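To inspect the BLAT results programmatically, each row can be split into typed fields. The following is a minimal sketch, assuming whitespace-separated rows whose last ten fields are score, query start, query end, query size, identity, chromosome, strand, target start, target end and span. The field names and the BlatHit type are labels chosen here for illustration; the three rows are copied from the upstream table.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one BLAT result row, using only its last ten fields."""
    f = row.split()[-10:]
    return BlatHit(
        int(f[0]), int(f[1]), int(f[2]), int(f[3]),
        float(f[4].rstrip("%")), f[5], f[6],
        int(f[7]), int(f[8]), int(f[9]),
    )


# First three rows of the upstream BLAT table.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr8 - 111845462 111848461 3000",
    "YourSeq 302 2490 3000 3000 87.2% chr9 - 43136070 43136540 471",
    "YourSeq 295 2486 2960 3000 85.4% chr5 - 92046906 92047354 449",
]
hits = [parse_hit(r) for r in rows]

# The full-length, full-score hit is the query matching its own locus.
self_hit = next(h for h in hits if h.score == h.q_size)
secondary = [h for h in hits if h is not self_hit]
best_secondary = max(secondary, key=lambda h: h.score)

# Fraction of the query covered by the best secondary hit.
coverage = (best_secondary.q_end - best_secondary.q_start + 1) / best_secondary.q_size
print(self_hit.chrom, best_secondary.score, round(coverage, 3))
```

In this subset the best secondary hit scores 302 against 3000 for the self hit and covers roughly 17% of the query, which is in line with the note that no significant similarity is found. The report does not state the threshold it uses for significance.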

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr8   -       111841844  111844843  3000
YourSeq  206    676    1095  3000   86.2%     chr15  -       102391467  102391817  351
YourSeq  158    676    866   3000   92.0%     chr3   +       90585123   90585316   194
YourSeq  155    674    1111  3000   91.9%     chr11  +       55813191   55813687   497
YourSeq  153    679    862   3000   92.3%     chr13  -       55806651   55806834   184
YourSeq  153    675    867   3000   90.0%     chr1   +       39820321   39820514   194
YourSeq  152    675    864   3000   90.5%     chr13  +       66728536   66728729   194
YourSeq  149    676    856   3000   90.4%     chr3   +       154628757  154628935  179
YourSeq  149    675    858   3000   90.8%     chr13  +       66303668   66303851   184
YourSeq  148    678    1060  3000   83.0%     chr7   -       127125047  127125369  323
YourSeq  148    675    857   3000   90.7%     chr13  -       65493297   65493479   183
YourSeq  148    675    857   3000   90.7%     chr13  +       66449719   66449901   183
YourSeq  147    675    858   3000   90.2%     chr13  +       65829412   65829595   184
YourSeq  146    675    864   3000   88.9%     chr13  +       65603906   65604099   194
YourSeq  145    675    843   3000   91.1%     chr11  -       69184261   69184427   167
YourSeq  144    674    1246  3000   83.0%     chr3   -       103950644  103950873  230
YourSeq  144    672    843   3000   89.9%     chr15  -       68442664   68442831   168
YourSeq  142    676    843   3000   90.4%     chr19  -       53728265   53728430   166
YourSeq  140    676    843   3000   89.8%     chr10  -       116933741  116933906  166
YourSeq  140    674    843   3000   89.3%     chr10  +       45492433   45492600   168

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
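The chr8 self hits of the two BLAT queries also allow a quick consistency check on the stated cKO region size. The sketch below assumes that the upstream and downstream 3000 bp queries directly flank the effective cKO region, which the report does not state explicitly. Because Cfdp1 lies on the reverse strand, the upstream query has the higher genomic coordinates.

```python
# chr8 self-hit coordinates (1-based, inclusive) from the two BLAT tables.
upstream = (111845462, 111848461)    # 3000 bp section upstream of exon 2
downstream = (111841844, 111844843)  # 3000 bp section downstream of exon 2


def length(interval: tuple[int, int]) -> int:
    """Length of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


# Bases lying strictly between the two flanking sections.
# ASSUMPTION: the two queries directly flank the effective cKO region.
gap = upstream[0] - downstream[1] - 1

print(length(upstream), length(downstream), gap)
```

Under that assumption the interval between the two sections is 618 bp, matching the effective cKO region size of ~618 bp given in the strategy note.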


Gene information: Cfdp1, craniofacial development protein 1 [Mus musculus (house mouse)]
Gene ID: 23837, updated on 12-Aug-2019

Gene summary

Official Symbol: Cfdp1 (provided by MGI)
Official Full Name: craniofacial development protein 1 (provided by MGI)
Primary source: MGI:MGI:1344403
See related: Ensembl:ENSMUSG00000031954
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Bcnt; Cfdp; cp27; AA408409; Bucentaur
Expression: Broad expression in CNS E11.5 (RPKM 77.7), placenta adult (RPKM 58.1) and 16 other tissues
Orthologs: human, all

Genomic context

Location: 8; 8 E1

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (111766682..111854337, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (114292373..114378210, complement)

[Figure: genomic context of Cfdp1 on Chromosome 8 - NC_000074.6]


Transcript information: This gene has 1 transcript

Gene: Cfdp1 ENSMUSG00000031954

Description: craniofacial development protein 1 [Source:MGI Symbol;Acc:MGI:1344403]
Gene Synonyms: Bcnt, Bucentaur, cp27
Location: Chromosome 8: 111,768,491-111,854,291 reverse strand. GRCm38:CM001001.2
About this gene: This gene has 1 transcript (splice variant), 288 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Cfdp1-201  ENSMUST00000034432.6  1178  295aa    ENSMUSP00000034432.5  Protein coding  CCDS22680  O88271   TSL:1 GENCODE basic APPRIS P1

[Figure: Ensembl region view, 105.80 kb, chromosome 8 from 111.76 Mb to 111.86 Mb. Tracks: contigs (AC127315.4); genes, comprehensive set (Cfdp1-201 and Tmem170-201, both protein coding, reverse strand); Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding, merged Ensembl/Havana.]


Transcript: ENSMUST00000034432

[Figure: transcript and protein view of Cfdp1-201 (protein coding, reverse strand, 85.80 kb). Protein ENSMUSP00000034... with tracks: MobiDB lite; low complexity (Seg); Pfam BCNT-C domain; PROSITE profiles BCNT-C domain; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 295.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
