https://www.alphaknockout.com

Mouse Cetn1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cetn1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Cetn1 gene (NCBI Reference Sequence: NM_007593; Ensembl: ENSMUSG00000050996) is located on mouse chromosome 18. One exon is identified, with both the ATG start codon and the TAA stop codon in exon 1 (Transcript: ENSMUST00000234003). Exon 1 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cetn1 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP24-359C24 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele show male infertility associated with oligozoospermia, teratozoospermia, immotile sperm, and altered centriole rearrangement at a late stage of spermiogenesis.

Exon 1 covers 100.0% of the coding region and contains both the start codon and the stop codon. The size of the effective cKO region is ~1511 bp. The cKO region does not contain any other known gene.
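The conversion of a loxP-flanked (targeted) allele into a constitutive KO allele can be sketched as a string operation. This is a toy model only: the flank and exon strings below are hypothetical placeholders, not Cetn1 sequence, and the function name `cre_recombine` is an illustrative assumption. It assumes two loxP sites in the same orientation, in which case Cre excises the intervening segment and leaves a single loxP site behind.

```python
# Canonical 34 bp loxP site (13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_recombine(allele, site=LOXP):
    """Excise the segment between the first two same-orientation loxP sites.

    Returns the allele unchanged if fewer than two sites are found.
    One loxP site remains in the product.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything up to and including the first site, then resume after the second site.
    return allele[:first + len(site)] + allele[second + len(site):]


# Hypothetical placeholder sequences (NOT the real homology arms or exon 1).
five_prime_flank = "GATTACA"
toy_exon = "ATGGCCTAA"  # stands in for the ~1511 bp cKO region
three_prime_flank = "CCGGTT"

targeted_allele = five_prime_flank + LOXP + toy_exon + LOXP + three_prime_flank
ko_allele = cre_recombine(targeted_allele)
```

In this model the KO allele is shorter than the targeted allele by the length of the floxed segment plus one loxP site, which is the kind of size difference a genotyping PCR across the region would be expected to detect.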


Overview of the Targeting Strategy

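[Figure: schematic of the targeting strategy, showing four alleles from 5' to 3': the wildtype allele (exon 1 with the ATG start codon, flanked by two gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Cetn1; cKO region; loxP site.]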


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, with forward and reverse-complement matches. Axis label: Sequence 12.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
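The self-alignment described above can be approximated with a word-match dot plot. The sketch below is an assumption about how such a plot might be computed, not the tool used for this report: it considers forward matches only (no reverse complement), uses exact matching of 10 bp words, and runs on a hypothetical toy sequence rather than the Cetn1 locus. Off-diagonal matches that pile up on a single offset suggest a tandem repeat with that period.

```python
def dot_plot_matches(seq, window=10):
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


def repeat_offsets(matches):
    """Count off-diagonal matches per offset (j - i)."""
    counts = {}
    for i, j in matches:
        counts[j - i] = counts.get(j - i, 0) + 1
    return counts


# Hypothetical toy sequence: an 8 bp unit repeated three times between unique flanks.
toy = "GGATCCTA" + "ACGTTGCA" * 3 + "TTAGGCAT"
matches = dot_plot_matches(toy, window=10)
offsets = repeat_offsets(matches)
```

For the toy sequence every off-diagonal match falls on offset 8, the length of the repeated unit. A real analysis would also compare the sequence against its reverse complement, as the dot plot in this report does.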

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence. Axis label: Sequence 12.]

Summary: Full Length(6516bp) | A(29.36% 1913) | C(20.04% 1306) | T(30.77% 2005) | G(19.83% 1292)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high-GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
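The base counts in the summary imply an overall GC content of about 39.87% ((1306 + 1292) / 6516). A minimal sketch of the underlying calculations follows; the function names and the `step` parameter are assumptions for illustration, and the report does not state which tool or what threshold for "high GC" was used.

```python
from collections import Counter


def base_composition(seq):
    """Return {base: (count, percent)} for A, C, G, T."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: (counts[b], 100.0 * counts[b] / total) for b in "ACGT"}


def gc_windows(seq, window=300, step=1):
    """Return (start, GC percent) for each full window along the sequence."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100.0 * gc / window))
    return result


# Counts as reported in the summary line of this report.
reported = {"A": 1913, "C": 1306, "T": 2005, "G": 1292}
total = sum(reported.values())
gc_percent = 100.0 * (reported["G"] + reported["C"]) / total
```

Given the actual 6516 bp sequence as a string, `gc_windows(sequence, window=300)` would yield the values plotted in the GC content distribution.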


BLAT Search Results (up)

QUERY    SCORE  Q_START  Q_END  QSIZE  IDENTITY  CHROM  STRAND  T_START    T_END      SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1        3000   3000   100.0%    chr18  -       9619420    9622419    3000
YourSeq  38     2879     2936   3000   73.9%     chr2   -       72402505   72402546   42
YourSeq  38     1379     1426   3000   89.6%     chr16  -       27728171   27728218   48
YourSeq  31     9        41     3000   97.0%     chr18  -       81059019   81059051   33
YourSeq  31     21       113    3000   97.0%     chr2   +       128263020  128263114  95
YourSeq  30     4        39     3000   81.9%     chr17  +       15784226   15784258   33
YourSeq  29     136      190    3000   78.6%     chr7   +       78785286   78785338   53
YourSeq  22     95       118    3000   95.9%     chr12  -       77153333   77153356   24
YourSeq  22     99       168    3000   65.8%     chr7   +       101088199  101088268  70

Note: The 3000 bp section upstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  Q_START  Q_END  QSIZE  IDENTITY  CHROM  STRAND  T_START    T_END      SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1        3000   3000   100.0%    chr18  -       9615904    9618903    3000
YourSeq  74     1557     1712   3000   85.9%     chr5   -       139814832  139814984  153
YourSeq  71     1560     1712   3000   87.9%     chr5   -       25741978   25742126   149
YourSeq  64     1568     1708   3000   81.3%     chr2   +       164550081  164550212  132
YourSeq  62     1568     1712   3000   83.8%     chr9   -       64383090   64383232   143
YourSeq  62     1557     1699   3000   85.0%     chr2   -       127399418  127399555  138
YourSeq  62     1209     1610   3000   79.5%     chr11  +       94914621   94914990   370
YourSeq  60     1487     1703   3000   92.9%     chr12  -       69519204   69519477   274
YourSeq  60     1568     1712   3000   85.9%     chr2   +       146479968  146480111  144
YourSeq  60     1566     1706   3000   93.0%     chr1   +       118627455  118710414  82960
YourSeq  58     1575     1703   3000   84.1%     chr9   +       113582548  113582671  124
YourSeq  54     1564     1705   3000   91.0%     chr15  -       83235441   83235584   144
YourSeq  54     1535     1707   3000   81.6%     chr15  -       78254230   78254392   163
YourSeq  54     1559     1702   3000   78.7%     chr17  +       28776780   28776903   124
YourSeq  53     1570     1706   3000   84.5%     chr15  -       38354360   38354496   137
YourSeq  53     1567     1712   3000   84.1%     chr12  +       82664005   82664148   144
YourSeq  51     1568     1702   3000   93.3%     chr18  -       47346236   47346371   136
YourSeq  51     1592     1716   3000   88.2%     chr8   +       106950225  106950347  123
YourSeq  51     1567     1716   3000   84.7%     chr13  +       113347056  113347203  148
YourSeq  51     1563     1711   3000   91.4%     chr1   +       73767638   73767785   148

Note: The 3000 bp section downstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.
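BLAT result rows like those above can be parsed to separate the expected self-match from any other hits. The sketch below is illustrative only: the `BlatHit` class, the field names, and the definition of a self-hit (100% identity over the full query length) are assumptions, and the three example rows are copied from the downstream search results in this report.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated result row; text before 'YourSeq' is ignored."""
    fields = line.split()
    fields = fields[fields.index("YourSeq") + 1:]
    score, q_start, q_end, q_size = (int(x) for x in fields[:4])
    identity = float(fields[4].rstrip("%"))
    chrom, strand = fields[5], fields[6]
    t_start, t_end, span = (int(x) for x in fields[7:10])
    return BlatHit(score, q_start, q_end, q_size, identity,
                   chrom, strand, t_start, t_end, span)


def is_self_hit(hit):
    """Assumed definition: full-length query match at 100% identity."""
    return hit.identity == 100.0 and hit.q_start == 1 and hit.q_end == hit.q_size


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr18 - 9615904 9618903 3000",
    "YourSeq 74 1557 1712 3000 85.9% chr5 - 139814832 139814984 153",
    "YourSeq 71 1560 1712 3000 87.9% chr5 - 25741978 25742126 149",
]
hits = [parse_hit(r) for r in rows]
other_hits = [h for h in hits if not is_self_hit(h)]
best_other = max(other_hits, key=lambda h: h.score)
```

Here the best remaining hit scores 74 against a query of 3000 bp, which is consistent with the note that no significant similarity is found elsewhere in the genome.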


Gene information: Cetn1, centrin 1 [Mus musculus (house mouse)]
Gene ID: 26369, updated on 12-Aug-2019

Gene summary

Official Symbol: Cetn1 (provided by MGI)
Official Full Name: centrin 1 (provided by MGI)
Primary source: MGI:MGI:1347086
See related: Ensembl:ENSMUSG00000050996
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: caltractin
Orthologs: human; all

Genomic context

Location: 18; 18 A1
Exon count: 1

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (9618419..9619469, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (9618417..9619467, complement)

[Figure: genomic context of Cetn1 on chromosome 18 (NC_000084.6).]


Transcript information: This gene has 3 transcripts

Gene: Cetn1 ENSMUSG00000050996

Description: centrin 1 [Source:MGI Symbol;Acc:MGI:1347086]
Location: 9,615,524-9,619,478 reverse strand. GRCm38:CM001011.2
About this gene: This gene has 3 transcripts (splice variants), 219 orthologues, 22 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype                  CCDS       UniProt  Flags
Cetn1-201  ENSMUST00000062769.6  1247  172aa    ENSMUSP00000057392.6  Protein coding           CCDS37731  P41209   TSL:1 GENCODE basic APPRIS P1
Cetn1-202  ENSMUST00000234003.1  1051  172aa    ENSMUSP00000157153.1  Protein coding           CCDS37731  P41209   GENCODE basic APPRIS P1
Cetn1-203  ENSMUST00000234590.1  1247  172aa    ENSMUSP00000157126.1  Nonsense mediated decay  -          P41209   -

[Figure: Ensembl genomic view of a 23.95 kb region (9.610 Mb to 9.625 Mb). Forward strand: Gm22765-201 (snoRNA) and Gm4834-201 (processed pseudogene); contig AC102492.8. Reverse strand: Cetn1-201 (protein coding), Cetn1-203 (nonsense mediated decay) and Cetn1-202 (protein coding). Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (pseudogene; processed transcript; RNA gene).]


Transcript: ENSMUST00000234003

[Figure: Ensembl protein view of Cetn1-202 (protein coding, reverse strand, 1.05 kb), protein ENSMUSP00000157..., scale 0 to 172 aa. Annotation tracks: MobiDB lite; Superfamily (EF-hand domain pair); SMART (EF-hand domain); Pfam (EF-hand domain); PROSITE profiles (EF-hand domain); PROSITE patterns (ATP-dependent RNA helicase DEAD-box, conserved site; EF-Hand 1, calcium-binding site); PANTHER (PTHR23050:SF218; PTHR23050); Gene3D (1.10.238.10); CDD (EF-hand domain); sequence variants (dbSNP and all other sources). Variant legend: synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
