Mouse Insm1 Conditional Knockout Project (CRISPR/Cas9)
http://www.alphaknockout.com/

Objective:
To create an Insm1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Insm1 gene (NCBI Reference Sequence: NM_016889; Ensembl: ENSMUSG00000068154) is located on mouse chromosome 2. One exon is identified, with both the ATG start codon and the TAG stop codon in exon 1 (Transcript: ENSMUST00000089257).
Exon 1 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Insm1 gene.
To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-114P5 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Mice homozygous for a null allele display perinatal and neonatal lethality, respiratory failure, and impaired pancreatic and intestinal endocrine cell development.
Exon 1 starts from the start codon.
The knockout of exon 1 covers 100% of the coding region.
The size of the effective cKO region: ~3232 bp.
This strategy is designed based on genetic information in existing databases. Because of the complexity of biological processes, not all risks that loxP insertion poses to gene transcription, RNA splicing and protein translation can be predicted at the current technological level.
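The conditional design described in the strategy summary can be illustrated with a small symbolic model. The sketch below is not the vendor's method; it only assumes the standard behaviour of Cre recombinase on two loxP sites in the same orientation, which excises the intervening segment and leaves a single loxP site behind. The segment labels and the function name cre_recombine are hypothetical and chosen for illustration.

```python
from typing import List

LOXP = "loxP"  # symbolic marker for a loxP site (not the actual 34 bp sequence)


def cre_recombine(allele: List[str]) -> List[str]:
    """Model Cre-mediated excision between the first two loxP sites.

    Assumption: both sites are in the same orientation, so the segment
    between them is deleted and one loxP site remains.
    """
    sites = [i for i, segment in enumerate(allele) if segment == LOXP]
    if len(sites) < 2:
        # Fewer than two sites: nothing to excise.
        return list(allele)
    first, second = sites[0], sites[1]
    # Keep everything up to and including the first loxP,
    # then everything after the second loxP.
    return allele[: first + 1] + allele[second + 1:]


# Targeted (floxed) allele: exon 1 flanked by loxP sites.
targeted_allele = [
    "5' homology arm",
    LOXP,
    "Insm1 exon 1 (cKO region)",
    LOXP,
    "3' homology arm",
]

constitutive_ko_allele = cre_recombine(targeted_allele)
print(constitutive_ko_allele)
```

In this model the targeted allele still carries exon 1, so the gene is expected to remain functional until Cre is expressed; only the recombined allele lacks the cKO region.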
Overview of the Targeting Strategy
[Figure: schematic of the wildtype allele (5' to 3', exon 1 containing the ATG, with the gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legends: homology arm; exon of mouse Insm1; cKO region; loxP site.]

Overview of the Dot Plot
[Figure: dot plot of the sequence against itself, window size 10 bp; forward and reverse-complement matches.]
Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution
[Figure: GC content distribution along the sequence, window size 300 bp.]
Summary: Full Length(9100bp) | A(22.49% 2047) | C(27.49% 2502) | G(26.26% 2390) | T(23.75% 2161)
Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Significant high GC-content regions are found. It may be difficult to construct this targeting vector.

BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       146218922  146221921  3000
YourSeq  32     2942   2976  3000   97.1%     chr5   -       114796937  114796971  35
YourSeq  31     2936   2968  3000   97.0%     chr10  +       42018384   42018416   33
YourSeq  30     2954   2998  3000   89.2%     chr11  +       94328216   94328265   50
YourSeq  30     2941   2976  3000   90.7%     chr10  +       126895714  126895748  35
YourSeq  29     1411   1442  3000   96.9%     chr2   +       114926731  114926764  34
YourSeq  29     2935   2967  3000   83.9%     chr10  +       94450628   94450658   31
YourSeq  24     2945   2968  3000   100.0%    chr1   +       157554870  157554893  24
Note: The 3000 bp section upstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.
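The base composition reported in the GC content summary can be cross-checked with a few lines of arithmetic. The sketch below recomputes the percentages from the reported counts and adds a sliding-window GC function of the kind that could produce a 300 bp distribution plot. The window function and its name windowed_gc are assumptions for illustration; the report does not say how its plot was generated, and the toy sequence is not the real 9100 bp sequence.

```python
from typing import Dict, List

# Base counts as reported in the GC content summary.
counts: Dict[str, int] = {"A": 2047, "C": 2502, "G": 2390, "T": 2161}

total = sum(counts.values())
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)

print(total)       # full length implied by the counts
print(percent)     # per-base percentages
print(gc_percent)  # overall GC content in percent


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """Return the GC fraction of each window along seq.

    Hypothetical helper: a plain sliding window, one value per start position.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append((chunk.count("G") + chunk.count("C")) / window)
    return values


# Toy example with a short sequence and a 4 bp window.
print(windowed_gc("GGCCAATT", window=4, step=2))
```

The counts sum to the stated full length of 9100 bp, and the overall GC content works out to roughly 53.76%, so the "high GC-content regions" in the note refer to local windows rather than the sequence-wide average.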
BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       146225022  146228021  3000
YourSeq  30     1625   1661  3000   94.0%     chrX   -       116507847  116507891  45
YourSeq  24     893    916   3000   100.0%    chr10  -       60368341   60368364   24
YourSeq  23     1915   1937  3000   100.0%    chr14  -       64924035   64924057   23
YourSeq  21     2584   2604  3000   100.0%    chr6   -       7755906    7755926    21
YourSeq  21     2259   2279  3000   100.0%    chr13  -       119753326  119753346  21
Note: The 3000 bp section downstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:
Insm1 insulinoma-associated 1 [ Mus musculus (house mouse) ]
Gene ID: 53626, updated on 17-Nov-2020

Gene summary
Official Symbol: Insm1 (provided by MGI)
Official Full Name: insulinoma-associated 1 (provided by MGI)
Primary source: MGI:MGI:1859980
See related: Ensembl:ENSMUSG00000068154
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: IA; IA-1
Orthologs: human, all

Genomic context
Location: 2; 2 G1
Exon count: 1

Annotation release  Status             Assembly                     Chr  Location
109                 current            GRCm39 (GCF_000001635.27)    2    NC_000068.8 (146063917..146066940)
108.20200622        previous assembly  GRCm38.p6 (GCF_000001635.26) 2    NC_000068.7 (146221997..146225020)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)   2    NC_000068.6 (146047733..146050756)

[Figure: genomic context, Chromosome 2 - NC_000068.8.]

Transcript information:
This gene has 1 transcript
Gene: Insm1 ENSMUSG00000068154
Description: insulinoma-associated 1 [Source:MGI Symbol;Acc:MGI:1859980]
Gene Synonyms: IA-1
Location: Chromosome 2: 146,063,841-146,066,940 forward strand. GRCm39:CM000995.3
About this gene: This gene has 1 transcript (splice variant), 208 orthologues, 1 paralogue and is associated with 36 phenotypes.

Transcripts
Name        Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt Match   Flags
Insm1-201   ENSMUST00000089257.6  3100  521aa    ENSMUSP00000092048.4  Protein coding  CCDS16830  Q05BD7 Q63ZV0   TSL:NA, GENCODE basic, APPRIS P1

[Figure: Ensembl region view, 23.10 kb, forward and reverse strands, 146.055Mb to 146.075Mb. Tracks: Genes (Comprehensive set...): Cfap61-206 (protein coding), Insm1-201 (protein coding), Cfap61-209, Cfap61-208 and Cfap61-210 (processed transcripts), Cfap61-214 (protein coding); Contigs: AL935056.18; Regulatory Build. Regulation legend: CTCF; Promoter; Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript).]

Transcript: ENSMUST00000089257
[Figure: protein view of Insm1-201 (3.10 kb, forward strand, protein coding; ENSMUSP00000092...), scale 0 to 521 aa. Tracks: MobiDB lite; Low complexity (Seg); Superfamily: Zinc finger C2H2 superfamily; SMART: Zinc finger C2H2-type; Pfam: Zinc finger C2H2-type; PROSITE profiles: Zinc finger C2H2-type; PROSITE patterns: Zinc finger C2H2-type; PANTHER: Insulinoma-associated protein 1/2, PTHR15065:SF5; Gene3D: 3.30.160.60; sequence variants (dbSNP and all other sources). Variant legend: frameshift variant; missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
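The coordinates quoted in the BLAT results, the NCBI genomic context table and the Ensembl location can be checked against each other with simple interval arithmetic. This is a minimal sketch that assumes all coordinates are 1-based and inclusive, which is consistent with the SPAN column of the BLAT tables; the variable and function names are hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]


def span(interval: Interval) -> int:
    """Length of a 1-based, inclusive interval (assumed convention)."""
    start, end = interval
    return end - start + 1


# NCBI exon locations on three assemblies, as listed in the genomic context table.
ncbi_grcm39: Interval = (146063917, 146066940)
ncbi_grcm38: Interval = (146221997, 146225020)
ncbi_mgscv37: Interval = (146047733, 146050756)

# Ensembl gene location on GRCm39.
ensembl_grcm39: Interval = (146063841, 146066940)

# Full-length BLAT hits for the upstream and downstream 3000 bp flanks.
blat_up: Interval = (146218922, 146221921)
blat_down: Interval = (146225022, 146228021)

print(span(ncbi_grcm39), span(ncbi_grcm38), span(ncbi_mgscv37))
print(span(ensembl_grcm39))
print(span(blat_up), span(blat_down))

# Number of bases lying strictly between the two flanks.
between_flanks = blat_down[0] - blat_up[1] - 1
print(between_flanks)
```

Under this convention the three NCBI intervals all have the same length (3024 bp), the Ensembl interval matches the 3100 bp listed for Insm1-201, and each BLAT flank spans 3000 bp as reported.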