https://www.alphaknockout.com
Mouse Hgsnat Conditional Knockout Project (CRISPR/Cas9)
Objective: To create an Hgsnat conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.
Strategy summary: The Hgsnat gene (NCBI Reference Sequence: NM_029884; Ensembl: ENSMUSG00000037260) is located on mouse chromosome 8. Eighteen exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 18 (Transcript: ENSMUST00000037609). Exons 3~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Hgsnat gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-294F8 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.
Note: Mice homozygous for a knock-out allele exhibit progressive storage pathology in the CNS and peripheral organs, glycosaminoglycan accumulation in brain and most somatic organs, lysosomal distension and dysfunction, astrocytosis, microgliosis, hepatosplenomegaly, behavioral deficits and premature death.
Exon 3 starts at about 15.45% of the coding region. Knockout of exons 3~4 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 1139 bp, and the size of intron 4 for 3'-loxP site insertion is 2614 bp. The size of the effective cKO region is ~1156 bp. The cKO region does not contain any other known gene.
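The logic of a floxed allele can be illustrated with a minimal Python sketch. It models Cre recombination between two loxP sites in the same orientation as excision of the intervening DNA, leaving a single loxP site, and checks whether a deleted coding length would shift the reading frame. The loxP sequence is the canonical 34 bp site; the flanking and floxed sequences and the example lengths are toy values chosen for illustration, since this report does not give the exon sequences or their coding lengths.

```python
# Canonical 34 bp loxP site: 13 bp repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Return the allele after excision between the first two same-orientation sites."""
    first = allele.find(site)
    second = allele.find(site, first + len(site)) if first != -1 else -1
    if first == -1 or second == -1:
        return allele  # fewer than two sites: nothing to excise
    # Recombination removes the intervening DNA and leaves one site behind.
    return allele[: first + len(site)] + allele[second + len(site):]


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# Toy sequences (hypothetical, not taken from the Hgsnat locus).
upstream, floxed, downstream = "GGGG", "ACGTACG", "TTTT"
targeted_allele = upstream + LOXP + floxed + LOXP + downstream
ko_allele = cre_excise(targeted_allele)
```

In this toy case `ko_allele` retains both flanks and one loxP site, mirroring the constitutive KO allele obtained after Cre recombination.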
Overview of the Targeting Strategy
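[Figure: schematic of the targeting strategy, drawn 5' to 3'. Panels: wildtype allele (exons 1, 2, 3, 4 ... 18, with the two gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Hgsnat; homology arm; cKO region; loxP site.]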
Overview of the Dot Plot
Window size: 10 bp
[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]
Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
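A self-comparison of this kind can be sketched in Python as follows. The function lists every pair of positions whose 10 bp windows match exactly, either in the forward orientation or as reverse complements. This is only an illustration of the idea behind a dot plot, not the tool used to produce the figure; the function names and the exact-match criterion are assumptions.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def dot_plot_matches(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) pairs, i < j, with matching windows."""
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        # Same word further downstream: a direct repeat.
        forward.extend((i, j) for j in index[word] if j > i)
        # Reverse complement further downstream: an inverted repeat.
        reverse.extend((i, j) for j in index.get(revcomp(word), []) if j > i)
    return forward, reverse


# Toy input (hypothetical): one 10 bp unit repeated after a 5 bp spacer.
unit = "ACGGTCATTG"
forward_hits, reverse_hits = dot_plot_matches(unit + "CCCCC" + unit)
```

A sequence free of repeats yields two empty lists; runs of forward pairs lying on a common diagonal would indicate a direct or tandem repeat.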
Overview of the GC Content Distribution
Window size: 300 bp
[Figure: GC content distribution along Sequence 12.]
Summary: Full length (7656 bp) | A (25.54%, 1955) | C (22.13%, 1694) | T (28.20%, 2159) | G (24.14%, 1848)
Note: The sequence of the homologous arms and cKO region is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
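The base counts in the summary can be checked with a short Python sketch that also shows how a windowed GC profile might be computed. The counts are taken from the summary line; the `windowed_gc` helper, its step size and the toy input are assumptions for illustration, since the sequence itself is not reproduced in this report.

```python
# Base counts as reported in the summary line.
counts = {"A": 1955, "C": 1694, "T": 2159, "G": 1848}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each full window of `window` bp, advancing by `step` bp."""
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy usage (hypothetical 8 bp sequence, 4 bp non-overlapping windows).
profile = windowed_gc("GGCCAATT", window=4, step=4)
```

The counts sum to the stated full length of 7656 bp and give an overall GC content of about 46.26%.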
BLAT Search Results (up)
QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr8   -        25971953   25974952    3000
YourSeq    304   2375  2993   3000     88.2%  chr11  -       101675897  101676558     662
YourSeq    262   2419  2960   3000     90.8%  chr7   +       141397070  141397779     710
YourSeq    248   2356  2692   3000     92.0%  chr1   +        87525375   87525732     358
YourSeq    239   2375  2663   3000     93.2%  chr2   -        21346648   21346975     328
YourSeq    233   2496  2969   3000     92.7%  chr18  +        42258981   42259608     628
YourSeq    231   2355  2692   3000     95.0%  chr12  +        69663538   69664004     467
YourSeq    228   2375  2668   3000     91.0%  chr2   -       155441579  155441872     294
YourSeq    225   2355  2673   3000     92.9%  chr5   +        21670903   21671253     351
YourSeq    222   2385  2669   3000     91.2%  chr15  +        37012734   37013363     630
YourSeq    220   2375  2671   3000     93.4%  chr11  -        51803257   51803666     410
YourSeq    213   2375  2692   3000     89.9%  chr2   -       119055957  119056478     522
YourSeq    212   2375  2679   3000     90.7%  chr11  +       100322896  100323441     546
YourSeq    209   2375  2679   3000     87.5%  chr11  -       102524069  102524352     284
YourSeq    209   2375  2692   3000     93.4%  chr16  +        22106644   22217234  110591
YourSeq    206   2411  2692   3000     93.3%  chr17  -        65622710   65623058     349
YourSeq    203   2403  2688   3000     94.4%  chr9   +        72917412   72917981     570
YourSeq    201   2411  2692   3000     91.8%  chr10  +        20131202   20131686     485
YourSeq    200   2391  2672   3000     91.7%  chr6   -        88475183   88475491     309
YourSeq    196   2421  2674   3000     86.9%  chr11  +        53831787   53832032     246
Note: The 3000 bp section upstream of exon 3 is BLAT searched against the genome. No significant similarity is found.
BLAT Search Results (down)
QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr8   -        25967797   25970796    3000
YourSeq    341    848  1188   3000    100.0%  chr8   -        25969556   25969896     341
YourSeq    336    904  1239   3000    100.0%  chr8   -        25969611   25969946     336
YourSeq    286    850  1135   3000    100.0%  chr8   -        25969556   25969841     286
YourSeq     55    921  1145   3000     69.9%  chr1   -       189032823  189032911      89
YourSeq     52    420   501   3000     85.4%  chr12  -         3392053    3392140      88
YourSeq     46    719   776   3000     94.4%  chr6   -       137511914  137511988      75
YourSeq     42    707   768   3000     83.0%  chr9   -       120924636  120924693      58
YourSeq     41    715   766   3000     95.7%  chr1   -       119385728  119385782      55
YourSeq     41    698   761   3000     88.0%  chr1   +       161567974  161568036      63
YourSeq     40    718   760   3000     97.7%  chr8   -       107235808  107235851      44
YourSeq     40    720   766   3000     93.7%  chr2   +        59714326   59714374      49
YourSeq     39    718   766   3000     97.7%  chr9   -        54905469   54905524      56
YourSeq     36    718   761   3000     92.9%  chr1   -        89411096   89411140      45
YourSeq     36    713   761   3000     95.0%  chr5   +       114440398  114440447      50
YourSeq     35    718   761   3000     94.9%  chr12  -        86462273   86462317      45
YourSeq     35    398   439   3000     97.3%  chr11  -        74980264   74980307      44
YourSeq     35    430   500   3000     94.9%  chr10  -        81645523   81645604      82
YourSeq     35    868   986   3000     61.6%  chr1   -       189032823  189032873      51
YourSeq     35    720   766   3000     95.0%  chr5   +       125193390  125193439      50
Note: The 3000 bp section downstream of exon 4 is BLAT searched against the genome. No significant similarity is found.
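Rows like those in the BLAT tables can be parsed and summarized with a small Python sketch. It reads the ten trailing fields of each row, computes the fraction of the query covered by the hit, and separates the full-length self-match from secondary hits. The `BlatHit` type and the helper name are hypothetical, and the sketch deliberately does not define what counts as "significant", since the report does not state a threshold.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by the aligned interval."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_blat_row(row: str) -> BlatHit:
    # Keep the last ten fields so leading labels such as "YourSeq" are ignored.
    f = row.split()[-10:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


# First two rows of the upstream BLAT table.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr8 - 25971953 25974952 3000",
    "YourSeq 304 2375 2993 3000 88.2% chr11 - 101675897 101676558 662",
]
hits = [parse_blat_row(r) for r in rows]
secondary = [h for h in hits
             if not (h.identity == 100.0 and h.query_coverage == 1.0)]
```

For these two rows the chr8 hit is the query matching its own locus, and the chr11 hit covers roughly a fifth of the query.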
Gene and protein information:
Hgsnat heparan-alpha-glucosaminide N-acetyltransferase [Mus musculus (house mouse)]
Gene ID: 52120, updated on 10-Oct-2019
Gene summary
Official Symbol: Hgsnat (provided by MGI)
Official Full Name: heparan-alpha-glucosaminide N-acetyltransferase (provided by MGI)
Primary source: MGI:MGI:1196297
See related: Ensembl:ENSMUSG00000037260
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Tmem76; AW208455; D8Ertd354e; 9430010M12Rik
Expression: Ubiquitous expression in cerebellum adult (RPKM 12.4), subcutaneous fat pad adult (RPKM 10.2) and 28 other tissues
Genomic context
Location: 8 A2; 8 14.22 cM
Exon count: 18
Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (25944459..25976744, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (27054931..27087216, complement)
[Figure: genomic context of Hgsnat on chromosome 8 (NC_000074.6).]
Transcript information: This gene has 4 transcripts
Gene: Hgsnat ENSMUSG00000037260
Description: heparan-alpha-glucosaminide N-acetyltransferase [Source:MGI Symbol;Acc:MGI:1196297]
Gene Synonyms: 9430010M12Rik, D8Ertd354e, Tmem76
Location: Chromosome 8: 25,944,453-25,976,753 reverse strand. GRCm38:CM001001.2
About this gene: This gene has 4 transcripts (splice variants), 257 orthologues, is a member of 1 Ensembl protein family and is associated with 54 phenotypes.
Transcripts
Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Hgsnat-201  ENSMUST00000037609.7  2694  656aa       ENSMUSP00000040356.6  Protein coding   CCDS40309  Q3UDW8      TSL:1 GENCODE basic APPRIS P1
Hgsnat-204  ENSMUST00000211550.1  485   119aa       ENSMUSP00000147675.1  Protein coding   -          A0A1B0GRV1  CDS 3' incomplete TSL:5
Hgsnat-203  ENSMUST00000210894.1  870   No protein  -                     Retained intron  -          -           TSL:2
Hgsnat-202  ENSMUST00000209420.1  489   No protein  -                     lncRNA           -          -           TSL:3
[Figure: Ensembl genome browser view of a 52.30 kb region of chromosome 8 (25.94 Mb to 25.98 Mb). Forward strand: Kcnu1-201 and Kcnu1-202 (protein coding). Contigs: AC122752.10 and AC093366.9. Reverse strand: Hgsnat-201 (protein coding), Hgsnat-203 (retained intron), Hgsnat-204 (protein coding), Hgsnat-202 (lncRNA) and Pomk-201 (protein coding). Regulatory Build legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana) and non-protein coding (RNA gene; processed transcript).]
Transcript: ENSMUST00000037609
[Figure: protein feature view of Hgsnat-201 (protein coding; reverse strand, 32.30 kb), translation ENSMUSP00000040... Tracks: transmembrane helices; MobiDB lite; low complexity (Seg); Pfam domain of unknown function DUF1624; PANTHER heparan-alpha-glucosaminide N-acetyltransferase (PTHR31061); sequence variants (dbSNP and all other sources). Variant legend: inframe deletion, missense variant, splice region variant, synonymous variant. Scale bar: 0 to 656.]
We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.