https://www.alphaknockout.com
Mouse Padi1 Conditional Knockout Project (CRISPR/Cas9)
Objective: To create a Padi1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.
Strategy summary: The Padi1 gene (NCBI Reference Sequence: NM_011059; Ensembl: ENSMUSG00000025329) is located on mouse chromosome 4. Sixteen exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 16 (Transcript: ENSMUST00000026378). Exon 2 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Padi1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-131M6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.
Note: Exon 2 starts at about 4.68% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 13087 bp, and the size of intron 2 for 3'-loxP site insertion is 774 bp. The size of the effective cKO region is ~681 bp. The cKO region does not contain any other known gene.
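The frameshift statement in the note rests on a general rule: deleting a coding segment whose length is not a multiple of 3 shifts the reading frame of everything downstream. The following Python sketch illustrates that rule and the "percent of coding region" figure. The function names and the example lengths are hypothetical, since the report does not state the coding length of exon 2.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True if removing this many coding bases shifts the downstream frame."""
    if deleted_coding_bp < 0:
        raise ValueError("length must be non-negative")
    # Codons are 3 bases long, so only multiples of 3 preserve the frame.
    return deleted_coding_bp % 3 != 0


def coding_start_percent(coding_bp_before_exon: int, cds_length_bp: int) -> float:
    """Position of an exon's first coding base as a percentage of the CDS length."""
    if cds_length_bp <= 0:
        raise ValueError("CDS length must be positive")
    return 100.0 * coding_bp_before_exon / cds_length_bp


# Hypothetical values for illustration only (not taken from the report).
print(causes_frameshift(100))          # 100 is not a multiple of 3
print(causes_frameshift(99))           # 99 is a multiple of 3
print(coding_start_percent(50, 1000))  # exon begins 5% into a 1000 bp CDS
```

An exon that starts early in the coding region and whose removal shifts the frame is generally expected to truncate most of the protein, which is the rationale the note gives for choosing exon 2.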
Overview of the Targeting Strategy
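[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exons 1, 2, 3, 4 and 16 labeled, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Padi1, homology arm, cKO region, loxP site.]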
Overview of the Dot Plot
Window size: 10 bp
[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]
Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
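A self-comparison of this kind can be sketched in Python as below. This is a minimal illustration assuming exact matching of 10 bp windows. The tool actually used for the report and its scoring are not specified, and the function names here are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def _index_windows(seq: str, window: int) -> dict:
    """Map every window-length substring to the list of its start positions."""
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)
    return index


def forward_matches(seq: str, window: int = 10) -> list:
    """Off-diagonal dots: pairs (i, j), i < j, whose windows are identical."""
    hits = []
    for starts in _index_windows(seq.upper(), window).values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


def reverse_complement_matches(seq: str, window: int = 10) -> list:
    """Pairs (i, j) where window i equals the reverse complement of window j."""
    seq = seq.upper()
    index = _index_windows(seq, window)
    hits = []
    for j in range(len(seq) - window + 1):
        rc = reverse_complement(seq[j:j + window])
        hits.extend((i, j) for i in index.get(rc, []))
    return sorted(hits)


# Toy sequence with one repeated 10-mer separated by a spacer.
toy = "ACGTACGTAC" + "TTTTT" + "ACGTACGTAC"
print(forward_matches(toy, window=10))
```

Runs of forward matches close to the main diagonal indicate tandem repeats, while reverse-complement matches indicate inverted repeats. An empty or sparse result corresponds to the "no significant tandem repeat" conclusion in the note.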
Overview of the GC Content Distribution
Window size: 300 bp
[Figure: GC content distribution along Sequence 12.]
Summary: Full Length(7181bp) | A(24.04% 1726) | C(25.37% 1822) | T(21.19% 1522) | G(29.4% 2111)
Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
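The percentages in the summary line follow directly from the base counts, and the windowed profile can be computed the same way. The Python sketch below is illustrative only. It uses the counts reported in the summary, but the function names and the toy sequence are assumptions, since the 7181 bp sequence itself is not reproduced in this report.

```python
def base_percentages(counts: dict) -> dict:
    """Convert base counts into percentages of the total length, rounded to 2 places."""
    total = sum(counts.values())
    return {base: round(100.0 * n / total, 2) for base, n in counts.items()}


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each window of the given size, sliding by `step` bases."""
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


# Base counts as given in the report's summary line.
counts = {"A": 1726, "C": 1822, "T": 1522, "G": 2111}
total_length = sum(counts.values())
overall_gc = 100.0 * (counts["G"] + counts["C"]) / total_length

print(total_length)
print(base_percentages(counts))
print(round(overall_gc, 2))

# Toy sequence (hypothetical) to show the windowed profile.
print(gc_windows("GGCCAATT", window=4, step=2))
```

From these counts the overall G+C share is about 54.8%. The report does not state a threshold for calling a window "high GC", so any cutoff applied to the windowed values would be a separate assumption.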
BLAT Search Results (up)
QUERY    SCORE  QSTART  QEND   QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  3000   1       3000   3000   100.0%    chr4   -       140832723  140835722  3000
YourSeq  65     12      82     3000   95.8%     chr5   -       71722535   71722605   71
YourSeq  43     28      76     3000   95.8%     chr1   +       87962883   87962938   56
YourSeq  42     31      76     3000   95.7%     chr1   +       78594397   78594442   46
YourSeq  41     2674    2930   3000   57.2%     chr7   -       28064655   28064771   117
YourSeq  37     38      76     3000   97.5%     chr8   +       122788701  122788739  39
YourSeq  37     43      81     3000   97.5%     chr13  +       34101590   34101628   39
YourSeq  33     43      75     3000   100.0%    chr5   -       122485881  122485913  33
YourSeq  31     2861    2895   3000   94.3%     chr6   -       83781032   83781066   35
YourSeq  29     2706    2734   3000   100.0%    chr6   +       8733989    8734017    29
YourSeq  29     46      76     3000   96.8%     chr1   +       69140438   69140468   31
YourSeq  26     2571    2598   3000   96.5%     chr1   +       35914674   35914701   28
YourSeq  23     2546    2571   3000   83.4%     chr2   +       73608488   73608511   24
YourSeq  23     54      76     3000   100.0%    chr1   +       60921209   60921231   23
YourSeq  22     54      75     3000   100.0%    chr1   +       89044320   89044341   22
YourSeq  20     111     136    3000   88.5%     chr3   -       90846351   90846376   26
Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
BLAT Search Results (down)
QUERY    SCORE  QSTART  QEND   QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  3000   1       3000   3000   100.0%    chr4   -       140829042  140832041  3000
YourSeq  118    1436    1673   3000   96.2%     chrX   -       74259155   74259562   408
YourSeq  114    1549    1716   3000   94.6%     chr9   -       78517083   78517642   560
YourSeq  113    1549    1674   3000   96.0%     chr2   +       151992459  151992644  186
YourSeq  112    1550    1675   3000   96.0%     chr1   +       87335959   87336255   297
YourSeq  111    1549    1673   3000   95.9%     chr5   -       114167574  114167762  189
YourSeq  110    1549    1674   3000   94.4%     chr17  +       23627120   23627283   164
YourSeq  109    1549    1673   3000   95.1%     chr11  -       98191544   98191726   183
YourSeq  109    1549    1675   3000   95.2%     chr11  +       82767933   82768135   203
YourSeq  109    1550    1673   3000   95.1%     chr1   +       112576081  112576262  182
YourSeq  108    1549    1672   3000   95.1%     chr19  -       3595477    3595660    184
YourSeq  107    1549    1674   3000   93.6%     chr13  -       91440614   91440897   284
YourSeq  107    1549    1675   3000   94.4%     chr1   -       77033103   77314218   281116
YourSeq  107    1556    1675   3000   95.8%     chr1   -       74408099   74408288   190
YourSeq  107    1549    1675   3000   95.0%     chr5   +       123724921  123725109  189
YourSeq  107    1549    1676   3000   94.3%     chr17  +       45620859   45621050   192
YourSeq  107    1549    1678   3000   92.2%     chr16  +       48341805   48341988   184
YourSeq  107    1556    1672   3000   96.6%     chr15  +       103210082  103210256  175
YourSeq  107    1553    1685   3000   94.3%     chr12  +       102755946  102756131  186
YourSeq  106    1549    1673   3000   95.0%     chr11  -       84927727   84927884   158
Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
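The BLAT rows above can be screened programmatically. The Python sketch below parses rows in the eleven-column layout shown in the tables and lists secondary hits that cover a large share of the query. The class, the function names and the 50% coverage cutoff are assumptions for illustration, since the report does not define what it counts as "significant" similarity.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by this hit (1-based inclusive coordinates)."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_blat_row(row: str) -> BlatHit:
    """Parse one result row; a leading 'browser details' link label is tolerated."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    q, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(q, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def secondary_hits(hits: list, min_coverage: float = 0.5) -> list:
    """Hits other than the top-scoring one that cover >= min_coverage of the query.

    The 0.5 cutoff is a hypothetical threshold, not one taken from the report.
    """
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    return [h for h in ranked[1:] if h.query_coverage >= min_coverage]


# First two rows of the downstream search, copied from the table above.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr4 - 140829042 140832041 3000",
    "YourSeq 118 1436 1673 3000 96.2% chrX - 74259155 74259562 408",
]
hits = [parse_blat_row(r) for r in rows]
print(secondary_hits(hits))
```

Under this assumed cutoff, only the intended chr4 locus covers the full query, and the short secondary matches are filtered out.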
Gene and protein information:
Padi1 peptidyl arginine deiminase, type I [ Mus musculus (house mouse) ]
Gene ID: 18599, updated on 12-Aug-2019
Gene summary
Official Symbol: Padi1 (provided by MGI)
Official Full Name: peptidyl arginine deiminase, type I (provided by MGI)
Primary source: MGI:MGI:1338893
See related: Ensembl:ENSMUSG00000025329
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Pdi1; AV236283
Expression: Biased expression in ovary adult (RPKM 15.4), subcutaneous fat pad adult (RPKM 3.0) and 1 other tissue
Orthologs: human, all
Genomic context
Location: 4 D3; 4 72.62 cM
Exon count: 16
Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (140812981..140845778, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (140368896..140401693, complement)
Chromosome 4 - NC_000070.6
Transcript information: This gene has 1 transcript
Gene: Padi1 ENSMUSG00000025329
Description: peptidyl arginine deiminase, type I [Source:MGI Symbol;Acc:MGI:1338893]
Gene Synonyms: Pad type 1, Pdi1
Location: Chromosome 4: 140,812,983-140,845,778 reverse strand. GRCm38:CM000997.2
About this gene: This gene has 1 transcript (splice variant), 93 orthologues, 4 paralogues and is a member of 1 Ensembl protein family.
Transcripts
Name       Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt        Flags
Padi1-201  ENSMUST00000026378.3  3754  662aa    ENSMUSP00000026378.3  Protein coding  CCDS18856  Q544I4 Q9Z185  TSL:1 GENCODE basic APPRIS P1
[Figure: Ensembl genomic region view, 52.80 kb, chromosome 4 from 140.81 Mb to 140.85 Mb. Forward strand: Gm13032-201 (lncRNA); contig AL807805.7. Reverse strand: Padi3-201, Padi3-202 and Padi1-201 (protein coding). Tracks: Genes (Comprehensive set), Contigs, Regulatory Build. Regulation legend: CTCF, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding), non-protein coding (RNA gene).]
Transcript: ENSMUST00000026378
[Figure: Ensembl protein summary for Padi1-201 (protein coding; reverse strand, 32.80 kb), with a protein scale bar from 0 to 662 aa. Annotation tracks shown: low complexity (Seg); Superfamily (Cupredoxin; Protein-arginine deiminase, central domain superfamily; SSF55909); Pfam (Protein-arginine deiminase (PAD) N-terminal; Protein-arginine deiminase (PAD), central domain; Protein-arginine deiminase, C-terminal); PIRSF (Protein-arginine deiminase); PANTHER (Protein-arginine deiminase; PTHR10837:SF11); Gene3D (PAD, N-terminal domain superfamily; 3.75.10.10; Protein-arginine deiminase, central domain superfamily); sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]
We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.