Mouse Hs2st1 Conditional Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective:
To create an Hs2st1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Hs2st1 gene (NCBI Reference Sequence: NM_011828; Ensembl: ENSMUSG00000040151) is located on mouse chromosome 3. Seven exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000043325).
Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Hs2st1 gene.
To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-32G13 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
A mutation in this gene causes bilateral renal agenesis, bone defects, eye development abnormalities and cataracts in homozygous mice.
Exon 2 starts at about 11.7% of the coding region. The knockout of exon 2 will result in a frameshift of the gene.
The size of intron 1 for 5'-loxP site insertion: 104448 bp, and the size of intron 2 for 3'-loxP site insertion: 11009 bp.
The size of the effective cKO region: ~739 bp.
The cKO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: schematic of the wildtype allele (exons labeled 1, 2 and 7, with 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Hs2st1; homology arm; cKO region; loxP site.]

Overview of the Dot Plot
[Figure: dot plot. Window size: 10 bp. Forward and reverse complement matches. Axis label: Sequence 12.]
Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix.
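The self-alignment described in this note can be illustrated with a minimal Python sketch. It is not the tool used to produce the report: the function names (`revcomp`, `dot_plot`, `repeat_offsets`) and the toy sequence are hypothetical, and exact window matching is assumed, whereas a real dot plot program may tolerate mismatches. An off-diagonal offset that recurs many times suggests a repeated segment at that spacing.

```python
from collections import Counter, defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def dot_plot(query, target, window=10):
    """List (i, j) pairs where query[i:i+window] == target[j:j+window]."""
    index = defaultdict(list)
    for j in range(len(target) - window + 1):
        index[target[j:j + window]].append(j)
    dots = []
    for i in range(len(query) - window + 1):
        for j in index.get(query[i:i + window], []):
            dots.append((i, j))
    return dots

def repeat_offsets(seq, window=10):
    """Count off-diagonal forward self-matches, grouped by offset j - i."""
    return Counter(j - i for i, j in dot_plot(seq, seq, window) if j > i)

# Hypothetical toy sequence (not from the report); a short window is used
# only because the toy sequence is short.
toy = "ACGTACGTAC"
forward_repeats = repeat_offsets(toy, window=4)
reverse_matches = dot_plot(toy, revcomp(toy), window=4)
print(forward_repeats)
print(len(reverse_matches))
```

With the report's setting, `window=10` would be passed instead, and the input would be the concatenated homology arm and cKO sequence.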
Because of these tandem repeats, it may be difficult to construct this targeting vector.

Overview of the GC Content Distribution
[Figure: GC content distribution. Window size: 300 bp. Axis label: Sequence 12.]
Summary: Full Length (7239 bp) | A (26.63%, 1928) | C (20.82%, 1507) | T (31.86%, 2306) | G (20.69%, 1498)
Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   -       144465536  144468535  3000
YourSeq  144    1414   1651  3000   87.9%     chr11  -       94888089   94888341   253
YourSeq  143    1420   1636  3000   88.3%     chrX   +       105225534  105225752  219
YourSeq  139    1413   1653  3000   88.9%     chr10  +       74319987   74320241   255
YourSeq  136    1414   1663  3000   84.7%     chr11  -       25890878   25891126   249
YourSeq  134    1413   1652  3000   85.8%     chr16  -       52407321   52407563   243
YourSeq  132    1422   1652  3000   88.4%     chr1   -       67790068   67790318   251
YourSeq  130    927    1615  3000   78.6%     chr14  +       87011357   87011610   254
YourSeq  129    1412   1647  3000   86.1%     chr5   +       66821725   66821957   233
YourSeq  125    1413   1646  3000   88.5%     chr5   +       141339476  141339716  241
YourSeq  123    1411   1651  3000   83.9%     chr2   -       123359857  123360373  517
YourSeq  121    1424   1647  3000   87.5%     chr3   +       67649182   67649438   257
YourSeq  120    1416   1652  3000   91.1%     chr15  -       101467555  101467832  278
YourSeq  120    1414   1788  3000   84.4%     chr12  +       92811102   92811836   735
YourSeq  120    1427   1663  3000   87.9%     chr11  +       91370774   91371022   249
YourSeq  119    1420   1652  3000   92.4%     chr14  -       72191045   72191278   234
YourSeq  119    1425   1652  3000   91.7%     chr10  +       14799219   14799450   232
YourSeq  118    1421   1636  3000   82.4%     chrX   -       135137139  135137342  204
YourSeq  118    1421   1636  3000   82.4%     chrX   +       135534852  135535055  204
YourSeq  117    1421   1652  3000   84.0%     chr18  +       52432823   52433038   216

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM                 STRAND  START      END        SPAN
--------------------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3                  -       144461797  144464796  3000
YourSeq  94     2836   2990  3000   83.1%     chr1                  +       85280889   85281262   374
YourSeq  88     2864   2990  3000   87.4%     chr11                 -       101868996  101869518  523
YourSeq  86     2865   2990  3000   84.8%     chr10                 -       67242622   67242746   125
YourSeq  83     2836   2963  3000   86.1%     chr1_GL456221_random  -       26172      26518      347
YourSeq  83     2864   2990  3000   82.7%     chr10                 -       59833424   59833550   127
YourSeq  82     2841   2961  3000   80.2%     chr15                 -       100796920  100797030  111
YourSeq  81     2864   2976  3000   85.9%     chr1                  +       179150696  179150808  113
YourSeq  80     2865   2990  3000   81.8%     chr19                 -       57330486   57330611   126
YourSeq  80     2870   2961  3000   93.5%     chr11                 +       96589147   96589238   92
YourSeq  79     2857   2969  3000   85.9%     chr15                 -       37949866   38133479   183614
YourSeq  78     2864   2977  3000   84.3%     chr4                  -       134723694  134723807  114
YourSeq  78     2870   2976  3000   87.0%     chr18                 +       80039590   80039697   108
YourSeq  78     2864   2961  3000   89.8%     chr12                 +       108662689  108662786  98
YourSeq  77     2857   2961  3000   87.4%     chr4                  -       133797132  133797428  297
YourSeq  77     2864   2966  3000   87.4%     chr1                  -       180720895  180720997  103
YourSeq  76     2864   2968  3000   81.9%     chr11                 +       4028329    4028416    88
YourSeq  75     2880   2985  3000   82.3%     chr12                 -       108331432  108331521  90
YourSeq  75     2855   2960  3000   86.5%     chr11                 -       94195866   94195972   107
YourSeq  75     2864   2954  3000   91.3%     chr10                 +       61893622   61893712   91

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:
Hs2st1 heparan sulfate 2-O-sulfotransferase 1 [Mus musculus (house mouse)]
Gene ID: 23908, updated on 27-Aug-2019

Gene summary
Official Symbol: Hs2st1 (provided by MGI)
Official Full Name: heparan sulfate 2-O-sulfotransferase 1 (provided by MGI)
Primary source: MGI:MGI:1346049
See related: Ensembl:ENSMUSG00000040151
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 2OST; Hs2st; AW214369; mKIAA0448
Expression: Ubiquitous expression in lung adult (RPKM 8.3), whole brain E14.5 (RPKM 6.1) and 28 other tissues
Orthologs: human

Genomic context
Location: 3; 3 H2
Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (144429701..144570216, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (144094071..144233180, complement)

[Figure: genomic context, Chromosome 3 - NC_000069.6]

Transcript information:
This gene has 3 transcripts
Gene: Hs2st1 ENSMUSG00000040151
Description: heparan sulfate 2-O-sulfotransferase 1 [Source:MGI Symbol;Acc:MGI:1346049]
Gene Synonyms: Hs2st
Location: Chromosome 3: 144,429,706-144,570,181 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 3 transcripts (splice variants), 250 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 16 phenotypes.
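The genomic coordinates quoted in this report can be cross-checked with simple interval arithmetic. The sketch below is illustrative only: the helper names are hypothetical, coordinates are treated as 1-based and inclusive (consistent with the 3000 bp spans reported by BLAT), and it assumes that the two 3000 bp BLAT query sections directly abut the cKO region, so that the gap between their top hits corresponds to the cKO region itself.

```python
# Coordinates copied from this report (GRCm38, chromosome 3, reverse strand).
GENE = (144429706, 144570181)        # Ensembl location of Hs2st1
UP_FLANK = (144465536, 144468535)    # top BLAT hit, section upstream of exon 2
DOWN_FLANK = (144461797, 144464796)  # top BLAT hit, section downstream of exon 2

def length(interval):
    """Length in bp of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1

def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def gap(lower, upper):
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1

# On the reverse strand, "upstream" has the higher genomic coordinates.
gene_length = length(GENE)
cko_length = gap(DOWN_FLANK, UP_FLANK)
flanks_in_gene = contains(GENE, UP_FLANK) and contains(GENE, DOWN_FLANK)

print(gene_length, cko_length, flanks_in_gene)
```

Under these assumptions the gap between the two flanks comes to 739 bp, which agrees with the effective cKO region size of ~739 bp stated in the strategy summary, and both flanks fall inside the Ensembl gene span.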
Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Hs2st1-201  ENSMUST00000043325.8  6177  356aa       ENSMUSP00000043066.7  Protein coding           CCDS17883  Q8R3H7   TSL:1, GENCODE basic, APPRIS P1
Hs2st1-202  ENSMUST00000160690.1  598   75aa        ENSMUSP00000123816.1  Nonsense mediated decay  -          E0CYX6   TSL:3
Hs2st1-203  ENSMUST00000199680.1  2321  No protein  -                     Retained intron          -          -        TSL:NA

[Figure: Ensembl region view, 160.48 kb, chromosome 3 from 144.42 Mb to 144.58 Mb. Forward strand genes: Selenof-201, Selenof-202 and Selenof-204 (protein coding), Selenof-203 (lncRNA), Selenof-206 (retained intron). Contigs: AC159976.4, AC123880.18. Reverse strand genes: Gm5857-201 (processed pseudogene), Gm43707-201 (lncRNA), Gm43560-201 (lncRNA), Hs2st1-201 (protein coding), Hs2st1-202 (nonsense mediated decay), Hs2st1-203 (retained intron). Additional track: Regulatory Build. Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene; pseudogene).]

Transcript: ENSMUST00000043325
[Figure: protein feature view of Hs2st1-201 (protein coding, reverse strand, 140.48 kb), translation ENSMUSP00000043... Tracks: Low complexity (Seg); Coiled-coils (Ncoils); Superfamily: P-loop containing nucleoside triphosphate hydrolase; Pfam: Sulfotransferase; PANTHER: PTHR12129:SF14, Heparan sulphate 2-O-sulfotransferase; Gene3D: 3.40.50.300; sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant; stop retained variant. Scale bar: 0 to 356 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.