Mouse Cep104 Conditional Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective: To create a Cep104 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Cep104 gene (NCBI Reference Sequence: NM_177673; Ensembl: ENSMUSG00000039523) is located on mouse chromosome 4. Twenty-two exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 22 (Transcript: ENSMUST00000047497).
Exons 5 and 6 are selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cep104 gene.
To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP23-101G23 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 5 starts at about 15.41% of the coding region. Knockout of exons 5 and 6 will result in a frameshift of the gene.
Intron 4, used for 5'-loxP site insertion, is 1291 bp; intron 6, used for 3'-loxP site insertion, is 1224 bp.
The effective cKO region is ~1111 bp.
The cKO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: schematic of the wildtype allele (exon labels 1, 3, 4, 5, 6, 7, 8 and 22; 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Cep104; homology arm; cKO region; loxP site.]

Overview of the Dot Plot
[Figure: dot plot of Sequence 12 against itself. Window size: 10 bp. Forward and reverse-complement matches are shown.]
Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
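The self-alignment behind the dot plot can be illustrated with a minimal Python sketch. It assumes exact matching of 10 bp windows, which is one common way to build a dot plot; the report does not say which tool or scoring scheme was actually used, and the function names `reverse_complement` and `self_dot_plot` are hypothetical.

```python
from collections import defaultdict

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Find exact window-sized matches of a sequence against itself.

    Returns two lists of (i, j) start positions (0-based):
    forward matches, where seq[i:i+window] == seq[j:j+window] and j > i,
    and reverse-complement matches, where the reverse complement of the
    window at i equals the window at j (j >= i).
    The trivial main diagonal (i == j, forward) is excluded.
    """
    seq = seq.upper()
    last_start = len(seq) - window
    index = defaultdict(list)
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        kmer = seq[i:i + window]
        forward.extend((i, j) for j in index[kmer] if j > i)
        rc_positions = index.get(reverse_complement(kmer), [])
        reverse.extend((i, j) for j in rc_positions if j >= i)
    return forward, reverse


if __name__ == "__main__":
    # Toy input (not the project sequence): a 10 bp unit repeated after a spacer.
    toy = "ACGTTGCAAG" + "TTTTT" + "ACGTTGCAAG"
    fwd, rev = self_dot_plot(toy, window=10)
    print("forward matches:", fwd)
    print("reverse-complement matches:", rev)
```

In a plot of these points, tandem repeats appear as short diagonals running parallel to the main diagonal; a sequence without such off-diagonal runs matches the report's conclusion that the region is suitable for PCR screening.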
Overview of the GC Content Distribution
[Figure: GC content distribution of Sequence 12. Window size: 300 bp.]
Summary: Full length 7611 bp | A 22.84% (1738) | C 24.96% (1900) | T 24.16% (1839) | G 28.04% (2134)
Note: The sequence of the homology arms and cKO region is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr4   +       153979815  153982814  3000
YourSeq     81    620    735   3000     90.1%  chr2   -       173769902  173770179   278
YourSeq     81    620    720   3000     95.6%  chr18  -        58501729   58501895   167
YourSeq     79    620    720   3000     94.5%  chr10  -        68225208   68225606   399
YourSeq     78    620    717   3000     93.2%  chr15  -        81309012   81309108    97
YourSeq     78    620    720   3000     97.6%  chr8   +        38872152   38872258   107
YourSeq     78    620    720   3000     96.5%  chr1   +       190870871  190871026   156
YourSeq     77    620    720   3000     95.3%  chr16  -        93125829   93125929   101
YourSeq     77    620    720   3000     86.8%  chr12  -        12324512   12324596    85
YourSeq     77    620    717   3000     96.5%  chr13  +        24411276   24411375   100
YourSeq     73    620    717   3000     96.3%  chr6   +       136654452  136654551   100
YourSeq     73    620    720   3000     84.4%  chr17  +        48841764   48841848    85
YourSeq     72    635    720   3000     94.7%  chr8   -       110515122  110515205    84
YourSeq     72    635    720   3000     98.7%  chr17  -         4041431    4041574   144
YourSeq     72    620    720   3000     94.0%  chrY   +         2644984    2645092   109
YourSeq     71    647    720   3000     98.7%  chr14  +        63702233   63702310    78
YourSeq     70    620    715   3000     83.2%  chr19  -        10410069   10410146    78
YourSeq     70    620    720   3000     92.6%  chr7   +         3388079    3388179   101
YourSeq     70    620    720   3000     94.8%  chr2   +         5483503    5483605   103
YourSeq     69    635    720   3000     86.7%  chr10  -        62065876   62065951    76
Note: The 3000 bp section upstream of exon 5 was searched against the genome with BLAT. No significant similarity is found.

BLAT Search Results (down)
QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr4   +       153983926  153986925  3000
YourSeq    152   2204   2381   3000     92.7%  chr4   +       153985809  153985986   178
YourSeq    128   2235   2405   3000     88.5%  chr4   +       153985808  153985946   139
YourSeq    126   1858   2026   3000     97.1%  chr4   +       153986007  153986303   297
YourSeq     97   2092   2198   3000     95.4%  chr4   +       153985953  153986059   107
YourSeq     91   2274   2405   3000     85.0%  chr4   +       153985783  153985882   100
YourSeq     89   1858   1962   3000     92.4%  chr4   +       153986199  153986303   105
YourSeq     63   1863   2129   3000     72.3%  chr1   -       147294014  147294108    95
YourSeq     56   1857   2384   3000     67.2%  chr1   +       185056174  185056365   192
YourSeq     55   2339   2405   3000     92.4%  chr4   +       153985912  153985978    67
YourSeq     54   2342   2405   3000     92.2%  chr4   +       153985787  153985850    64
YourSeq     48   1868   2038   3000     69.7%  chr4   -       137905563  137905661    99
YourSeq     48   2249   2357   3000     87.1%  chr5   +       107769501  107769606   106
YourSeq     45   1868   2039   3000     76.5%  chr4   -       137905665  137905815   151
YourSeq     36   2339   2378   3000     95.0%  chr4   +       153985848  153985887    40
YourSeq     33   1987   2097   3000     59.5%  chr1   -       147294038  147294082    45
YourSeq     21   2383   2405   3000     95.7%  chr4   +       153985796  153985818    23
Note: The 3000 bp section downstream of exon 6 was searched against the genome with BLAT. No significant similarity is found.
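One way to read the BLAT tables above is to set aside the expected self-match (the full-length, 100% identity hit at the source locus on chr4) and look at how strong the best remaining hit is. The Python sketch below does this for the first three rows of the upstream table. The names `BlatHit`, `parse_blat_rows` and `off_target_hits` are hypothetical, and the rule used to recognise the self-match is an assumption made for illustration; the report does not state what criterion was used to judge significance.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_rows(text: str) -> List[BlatHit]:
    """Parse whitespace-separated BLAT summary rows into BlatHit records."""
    hits = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Tolerate the "browser details" link text found in raw web output.
        if fields[:2] == ["browser", "details"]:
            fields = fields[2:]
        fields = fields[1:]  # drop the query name (e.g. YourSeq)
        hits.append(BlatHit(
            score=int(fields[0]),
            q_start=int(fields[1]),
            q_end=int(fields[2]),
            q_size=int(fields[3]),
            identity=float(fields[4].rstrip("%")),
            chrom=fields[5],
            strand=fields[6],
            t_start=int(fields[7]),
            t_end=int(fields[8]),
            span=int(fields[9]),
        ))
    return hits


def off_target_hits(hits: List[BlatHit]) -> List[BlatHit]:
    """Drop the self-match and return the remaining hits, best score first.

    Assumption: the self-match is the hit covering the whole query
    at 100% identity.
    """
    others = [
        h for h in hits
        if not (h.q_start == 1 and h.q_end == h.q_size and h.identity == 100.0)
    ]
    return sorted(others, key=lambda h: -h.score)


# First three rows of the upstream table, copied from the report.
SAMPLE = """
YourSeq 3000 1 3000 3000 100.0% chr4 + 153979815 153982814 3000
YourSeq 81 620 735 3000 90.1% chr2 - 173769902 173770179 278
YourSeq 81 620 720 3000 95.6% chr18 - 58501729 58501895 167
"""

if __name__ == "__main__":
    best = off_target_hits(parse_blat_rows(SAMPLE))[0]
    covered = (best.q_end - best.q_start + 1) / best.q_size
    print(best.chrom, best.score, f"{covered:.1%} of query")
```

In the upstream table, the non-self hits all fall within roughly positions 620 to 735 of the query, so each covers only a small fraction of the 3000 bp arm.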
Gene and protein information:
Cep104 centrosomal protein 104 [Mus musculus (house mouse)]
Gene ID: 230967, updated on 12-Aug-2019

Gene summary
Official Symbol: Cep104 (provided by MGI)
Official Full Name: centrosomal protein 104 (provided by MGI)
Primary source: MGI:MGI:2687282
See related: Ensembl:ENSMUSG00000039523
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI115523; Kiaa0562; mKIAA0562; A930027E11
Expression: Ubiquitous expression in testis adult (RPKM 18.4), adrenal adult (RPKM 10.5) and 28 other tissues
Orthologs: human; all

Genomic context
Location: 4; 4 E2
Exon count: 22

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (153975126..154008732)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (153349670..153381334)

[Figure: genomic context, Chromosome 4 - NC_000070.6]

Transcript information: This gene has 3 transcripts
Gene: Cep104 ENSMUSG00000039523
Description: centrosomal protein 104 [Source:MGI Symbol;Acc:MGI:2687282]
Gene Synonyms: BC046331
Location: Chromosome 4: 153,975,194-154,008,732 forward strand. GRCm38:CM000997.2
About this gene: This gene has 3 transcripts (splice variants), 194 orthologues and is a member of 1 Ensembl protein family.
Transcripts
Name        Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Cep104-201  ENSMUST00000047497.14  5151  926aa       ENSMUSP00000040762.8  Protein coding           CCDS19005  Q80V31   TSL:1, GENCODE basic, APPRIS P1
Cep104-203  ENSMUST00000183790.1   1787  524aa       ENSMUSP00000139349.1  Nonsense mediated decay  -          V9GXW2   CDS 5' incomplete, TSL:5
Cep104-202  ENSMUST00000155414.1   388   No protein  -                     Retained intron          -          -        TSL:3

[Figure: Ensembl genome browser view, 53.54 kb, chromosome 4 from 153.97 Mb to 154.01 Mb. Forward strand genes (comprehensive set): Cep104-201 (protein coding), Cep104-203 (nonsense mediated decay), Cep104-202 (retained intron), Lrrc47-201 (protein coding), Lrrc47-205 (protein coding), Lrrc47-202 (retained intron). Contig: AL806525.23. Reverse strand genes (comprehensive set): Dffb-201 (protein coding), Dffb-202 (nonsense mediated decay), Dffb-205 (lncRNA), Dffb-203 (retained intron), Dffb-204 (retained intron). Regulatory Build track. Regulation legend: CTCF; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000047497
[Figure: protein feature view of Cep104-201 (protein coding; ENSMUSP00000040..., truncated in the source), 33.54 kb, forward strand. Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily (Galactose-binding-like domain superfamily; Armadillo-type fold); SMART (TOG domain); PANTHER (PTHR13371; PTHR13371:SF0); Gene3D (Armadillo-like helical); All sequence SNPs/i... (truncated in the source): sequence variants (dbSNP and all other sources). Variant legend: inframe insertion; missense variant; synonymous variant. Scale bar: 0 to 926.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.