Mouse Hpn Conditional Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective:
To create a Hpn conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Hpn gene (NCBI Reference Sequence: NM_001110252; Ensembl: ENSMUSG00000001249) is located on mouse chromosome 7.
14 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 14 (Transcript: ENSMUST00000108102).
Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Hpn gene.
To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-29F19 as template.
Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production.
The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Mice homozygous for a null mutation are hypothyroid and develop profound hearing loss associated with structural changes in the tectorial membrane and a myelination defect affecting the compaction of spiral ganglion neurons.
Exon 2 starts from about 100% of the coding region.
The knockout of exon 2 will result in a frameshift of the gene.
The size of intron 1 for 5'-loxP site insertion: 391 bp, and the size of intron 2 for 3'-loxP site insertion: 3904 bp.
The size of the effective cKO region: ~458 bp.
The cKO region does not contain any other known gene.
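The conditional design described in the strategy summary (loxP sites in intron 1 and intron 2, with the floxed exon removed by Cre) can be illustrated with a minimal sketch. This is not the vendor's pipeline: the function name `cre_excise` and the toy intron and exon strings are hypothetical placeholders, and the only real sequence is the canonical 34 bp loxP site, which the report itself does not spell out.

```python
# Minimal sketch of Cre-mediated excision between two same-orientation loxP sites.
# Assumption: the canonical 34 bp loxP site; the report does not list the site sequence.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(seq: str, site: str = LOXP) -> str:
    """Remove the segment between the first two direct-repeat sites.

    One copy of the site remains in the product, as expected for
    recombination between two sites in the same orientation.
    """
    first = seq.find(site)
    if first == -1:
        return seq  # no site present: nothing to recombine
    second = seq.find(site, first + len(site))
    if second == -1:
        return seq  # only one site present: nothing to excise
    # Keep everything before the first site, then resume at the second site.
    return seq[:first] + seq[second:]


# Hypothetical toy sequences standing in for intron 1, exon 2 and intron 2.
INTRON1 = "GGTAAGTCCTTGACC"
EXON2 = "ATGGCTAAAGGTCAG"
INTRON2 = "GTGAGTTTCCAGGAA"

targeted_allele = INTRON1 + LOXP + EXON2 + LOXP + INTRON2
ko_allele = cre_excise(targeted_allele)

print(len(targeted_allele), len(ko_allele))
print(EXON2 in ko_allele)
```

In the toy example the product keeps both intron fragments and a single loxP site, while the floxed exon placeholder is gone. Real alleles would also need orientation checks and reverse-complement matching, which this sketch omits.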

Overview of the Targeting Strategy
[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Exon labels shown: 5, 1, 2, 14.]
Legends: exon of mouse Scn1b; homology arm; exon of mouse Hpn; cKO region; loxP site.

Overview of the Dot Plot
Window size: 10 bp
[Figure: dot plot of Sequence 12 against itself, forward and reverse complement.]
Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution
Window size: 300 bp
[Figure: GC content distribution along Sequence 12.]
Summary: Full Length (6903 bp) | A (21.4%, 1477) | C (26.81%, 1851) | T (22.45%, 1550) | G (29.34%, 2025)
Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Significant high GC-content regions are found. It may be difficult to construct this targeting vector.

BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   -       31114886   31117885   3000
YourSeq  31     1715   1757  3000   86.1%     chr1   -       6531240    6531282    43
YourSeq  24     1859   1882  3000   100.0%    chr2   +       33066576   33066599   24
YourSeq  22     1667   1691  3000   96.0%     chr12  -       71577866   71577892   27
YourSeq  22     521    542   3000   100.0%    chr10  -       123092421  123092442  22
YourSeq  22     991    1016  3000   92.4%     chr13  +       42711886   42711911   26
YourSeq  21     2555   2575  3000   100.0%    chr6   -       94700416   94700436   21
YourSeq  21     2562   2582  3000   100.0%    chr5   +       141079461  141079481  21
Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
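
Returning to the base composition reported in the GC content summary: the figures can be cross-checked with a few lines of arithmetic, and a windowed GC profile like the one plotted there can be sketched in the same way. The counts and the 300 bp window come from the report; the function names, the non-overlapping step and the toy test sequence are assumptions made for illustration, since the report does not state how its plot was computed.

```python
from typing import Dict, Iterator, Tuple

# Base counts as reported in the GC content summary (full length 6903 bp).
COUNTS: Dict[str, int] = {"A": 1477, "C": 1851, "T": 1550, "G": 2025}


def composition(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw base counts to percentages of the total length."""
    total = sum(counts.values())
    return {base: 100.0 * n / total for base, n in counts.items()}


def gc_windows(seq: str, window: int = 300, step: int = 300) -> Iterator[Tuple[int, float]]:
    """Yield (start, GC fraction) for each full window along seq.

    window=300 matches the report; step=300 (non-overlapping) is an assumption.
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, (chunk.count("G") + chunk.count("C")) / window


TOTAL = sum(COUNTS.values())
PERCENT = composition(COUNTS)
GC_PERCENT = PERCENT["G"] + PERCENT["C"]

print("total bp:", TOTAL)
print("per-base %:", {b: round(p, 2) for b, p in PERCENT.items()})
print("overall GC %:", round(GC_PERCENT, 2))

# Hypothetical toy sequence: one GC-only window followed by one AT-only window.
TOY = "GC" * 150 + "AT" * 150
print(list(gc_windows(TOY)))
```

The counts sum to the stated full length, and the overall GC fraction comes out at roughly 56%, which is consistent with the note that high GC-content regions are present.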

BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   -       31111428   31114427   3000
YourSeq  207    1437   1716  3000   94.5%     chr7   -       16181058   16264640   83583
YourSeq  200    1490   1716  3000   94.8%     chr7   -       30304283   30304993   711
YourSeq  200    1490   1848  3000   89.8%     chr5   +       140011957  140012417  461
YourSeq  200    1490   1743  3000   92.4%     chr5   +       130480767  130481210  444
YourSeq  197    1490   1717  3000   94.3%     chr7   +       24375294   24375583   290
YourSeq  197    1491   1763  3000   93.5%     chr5   +       136649813  136650393  581
YourSeq  196    1490   1716  3000   93.9%     chr7   +       28553753   28638775   85023
YourSeq  195    1487   1716  3000   93.4%     chr5   +       144034262  144034497  236
YourSeq  193    1490   1716  3000   95.0%     chr7   -       19763075   19867624   104550
YourSeq  193    1490   1716  3000   94.6%     chr7   -       6201815    6202050    236
YourSeq  192    1490   1716  3000   95.0%     chr5   +       134001482  134375302  373821
YourSeq  191    1490   1716  3000   92.9%     chr7   -       35610942   35611330   389
YourSeq  191    1490   1716  3000   94.5%     chr7   -       19351329   19351564   236
YourSeq  191    1490   1716  3000   94.5%     chr7   +       29003994   29004229   236
YourSeq  191    1491   1716  3000   94.5%     chr7   +       27297003   27390425   93423
YourSeq  191    1490   1716  3000   92.5%     chr7   +       25111469   25111701   233
YourSeq  191    1490   1718  3000   94.1%     chr5   +       142037560  142330923  293364
YourSeq  191    1490   1716  3000   94.5%     chr5   +       140484087  140484322  236
YourSeq  190    1490   1716  3000   92.6%     chr7   -       35631982   35632217   236
Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
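
The BLAT result rows can also be read programmatically, for example to separate the query's own locus from secondary hits and to see which part of the query those hits cover. The sketch below is only an illustration: the `BlatHit` field names and the self-hit rule (score equal to query size at 100% identity) are assumptions, and the four rows are copied from the downstream table in its column order (QUERY, SCORE, START, END, QSIZE, IDENTITY, CHROM, STRAND, START, END, SPAN). It tabulates hits and makes no claim about what counts as significant.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated row; field 0 is the query name."""
    f = line.split()
    return BlatHit(
        score=int(f[1]), q_start=int(f[2]), q_end=int(f[3]), q_size=int(f[4]),
        identity=float(f[5].rstrip("%")), chrom=f[6], strand=f[7],
        t_start=int(f[8]), t_end=int(f[9]), span=int(f[10]),
    )


def is_self_hit(hit: BlatHit) -> bool:
    """Assumed rule: a full-length, 100% identical hit is the query's own locus."""
    return hit.score == hit.q_size and hit.identity == 100.0


# A few rows from the downstream table.
ROWS: List[str] = [
    "YourSeq 3000 1 3000 3000 100.0% chr7 - 31111428 31114427 3000",
    "YourSeq 207 1437 1716 3000 94.5% chr7 - 16181058 16264640 83583",
    "YourSeq 200 1490 1716 3000 94.8% chr7 - 30304283 30304993 711",
    "YourSeq 200 1490 1848 3000 89.8% chr5 + 140011957 140012417 461",
]

hits = [parse_hit(row) for row in ROWS]
secondary = [h for h in hits if not is_self_hit(h)]
best = max(secondary, key=lambda h: h.score)

# Query interval covered by the secondary hits (1-based, inclusive, as in the table).
covered = (min(h.q_start for h in secondary), max(h.q_end for h in secondary))

print("secondary hits:", len(secondary))
print("best secondary score / query size:", round(best.score / best.q_size, 3))
print("query interval covered by secondary hits:", covered)
```

Applied to the full table, the same parsing would show whether secondary hits cluster in one short stretch of the query, which can help when choosing primer positions for PCR genotyping.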

Gene and protein information:
Hpn hepsin [ Mus musculus (house mouse) ]
Gene ID: 15451, updated on 12-Aug-2019

Gene summary
Official Symbol: Hpn (provided by MGI)
Official Full Name: hepsin (provided by MGI)
Primary source: MGI:MGI:1196620
See related: Ensembl:ENSMUSG00000001249
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Hlb320
Summary: This gene encodes a type II transmembrane serine protease that may function in diverse processes, including regulation of cell growth. Deficiency in this gene results in hearing loss. The protein is cleaved into a catalytic serine protease chain and a non-catalytic scavenger receptor cysteine-rich chain, which associate via a single disulfide bond. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. [provided by RefSeq, Jan 2013]
Expression: Biased expression in liver adult (RPKM 275.8), kidney adult (RPKM 197.4) and 4 other tissues
Orthologs: human, all

Genomic context
Location: 7; 7 B1
Exon count: 15

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (31098725..31115326, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (31883744..31900309, complement)

[Figure: genomic context, Chromosome 7 - NC_000073.6]

Transcript information:
This gene has 11 transcripts
Gene: Hpn ENSMUSG00000001249
Description: hepsin [Source:MGI Symbol;Acc:MGI:1196620]
Gene Synonyms: Hlb320
Location: Chromosome 7: 31,098,725-31,115,290 reverse strand.
GRCm38:CM001000.2
About this gene: This gene has 11 transcripts (splice variants), 179 orthologues, 20 paralogues, is a member of 1 Ensembl protein family and is associated with 18 phenotypes.

Transcripts
Name     Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Hpn-202  ENSMUST00000108102.8   1830  436aa       ENSMUSP00000103737.2  Protein coding           CCDS52188  O35453      TSL:1, GENCODE basic
Hpn-209  ENSMUST00000168884.7   1762  416aa       ENSMUSP00000131658.1  Protein coding           CCDS52187  G3UWE8      TSL:1, GENCODE basic, APPRIS P1
Hpn-201  ENSMUST00000039435.14  1743  445aa       ENSMUSP00000038149.8  Protein coding           CCDS71936  E9Q5P0      TSL:1, GENCODE basic
Hpn-204  ENSMUST00000164929.2   1110  82aa        ENSMUSP00000127229.1  Protein coding           -          E9Q3X9      CDS 3' incomplete, TSL:3
Hpn-211  ENSMUST00000171259.1   621   179aa       ENSMUSP00000132307.1  Protein coding           -          F6W6S4      CDS 5' incomplete, TSL:5
Hpn-210  ENSMUST00000171225.1   469   32aa        ENSMUSP00000130966.1  Protein coding           -          F7C9T6      CDS 5' incomplete, TSL:3
Hpn-205  ENSMUST00000165124.7   1861  137aa       ENSMUSP00000145624.1  Nonsense mediated decay  -          A0A0U1RNM3  TSL:2
Hpn-203  ENSMUST00000164340.1   920   No protein  -                     Retained intron          -          -           TSL:5
Hpn-207  ENSMUST00000167719.7   679   No protein  -                     Retained intron          -          -           TSL:3
Hpn-206  ENSMUST00000165480.1   541   No protein  -                     Retained intron          -          -           TSL:3
Hpn-208  ENSMUST00000168623.1   351   No protein  -                     Retained intron          -          -           TSL:3

[Figure: Ensembl region view, 36.57 kb, 31.09Mb to 31.12Mb, forward and reverse strands. Tracks: Contigs (AC158993.2), Genes (Comprehensive set...), Regulatory Build.]
Transcripts shown: Hpn-202 (protein coding), Hpn-209 (protein coding), Hpn-205 (nonsense mediated decay), Hpn-201 (protein coding), Hpn-211 (protein coding), Hpn-204 (protein coding), Hpn-210 (protein coding), Hpn-208 (retained intron), Hpn-203 (retained intron), Hpn-207 (retained intron), Hpn-206 (retained intron), Scn1b-201 (protein coding), Scn1b-203 (protein coding), Scn1b-202 (retained intron).
Regulation Legend: CTCF; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene Legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (processed transcript).

Transcript: ENSMUST00000108102
[Figure: protein domain view of Hpn-202 (protein coding), reverse strand, 16.57 kb, for ENSMUSP00000103...; scale bar 0 to 436.]
Transmembrane heli...
MobiDB lite
Superfamily: SRCR-like domain superfamily; Peptidase S1, PA clan
SMART: SRCR-like domain; Serine proteases, trypsin domain
Prints: Peptidase S1A, chymotrypsin family
Pfam: Hepsin, SRCR domain; Serine proteases, trypsin domain
PROSITE profiles: Serine proteases, trypsin domain
PROSITE patterns: Serine proteases, trypsin family, histidine active site; Serine proteases, trypsin family, serine active site
PANTHER: Hepsin; PTHR24253
Gene3D: SRCR-like domain superfamily; 2.40.10.10
CDD: Serine proteases, trypsin domain
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant Legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.