Browsing Genes and Genomes with Ensembl

Total Page:16

File Type:pdf, Size:1020Kb

Load more

The Bioinformatics Roadshow
Copenhagen, Denmark
16 & 17 June 2011

BROWSING GENES AND GENOMES
WITH ENSEMBL

EXERCISES AND ANSWERS

Note: These exercises are based on Ensembl version 62 (13 April 2011). After in future a new version has gone live, version 62 will still be available at http://e62.ensembl.org. If your answer doesn’t correspond with the given answer, please consult the instructor.

______________________________________________________________ BROWSER ______________________________________________________________

Exercise 1 – Exploring a gene

(a) Find the human F9 (Coagulation factor IX) gene. On which chromosome and which strand of the genome is this gene located? How many transcripts (splice variants) have been annotated for it?

(b) What is the longest transcript? How long is the protein it encodes? How many exons does it have? Are any of the exons completely or partially untranslated?

(c) Have a look at the external references for ENST00000218099. What is the function of F9?

(d) Is it possible to monitor expression of ENST00000218099 with the CodeLink microarray? If so, can it also be used to monitor expression of the other two transcripts?

(e) In which part (i.e. the N-terminal or C-terminal half) of the protein encoded by ENST00000218099 does its peptidase activity, responsible for the cleavage of factor X to yield its active form factor Xa, reside?

(f) How many non-synonymous variants have been discovered for the protein encoded by ENST00000218099?

(g) Is there a mouse ortholog predicted for the human F9 gene? (h) On which cytogenetic band and on which contig is the F9 gene located? Is there a BAC clone that contains the complete F9 gene?

(i) If you have yourself a gene of interest, explore what information Ensembl displays about it!

______________________________________________________________

Answer

(a)

 Go to the Ensembl homepage (http://www.ensembl.org).  Select ‘Search: Human’ and type ‘f9’ or ‘factor IX’ in the ‘for’ text box.  Click [Go].  Click on ‘Gene’ on the page with search results.  Click on ‘Homo sapiens’.  Click on ‘F9 [ Ensembl/Havana merge gene: ENSG00000101981 ]’.

The human F9 gene is located on the X chromosome on the forward strand. There have been three transcripts annotated for this gene, ENST00000218099 (F9-001), ENST00000394090 (F9-201) and ENST00000479617 (F9-002).

(b) The longest transcript is ENST00000218099. The length of this transcript is 2780 base pairs and the length of the encoded protein is 461 amino acids.

 Click on the Ensembl Transcript ID ‘ENST00000218099’ in the list of transcripts.

It has eight exons.

 Click on ‘Sequence - Exons’ in the side menu.

The first and last exon are partially untranslated (sequence shown in purple). (c)

 Click on ‘External References - General identifiers’ in the side menu.  Explore some of the links (good places to start are usually ‘UniProtKB/Swiss-Prot’ and ‘WikiGene’).  Do the same for ‘Ontology – Ontology table’.

Factor IX is a vitamin K-dependent plasma protein that participates in the intrinsic pathway of blood coagulation by converting factor X to its active form in the presence of Ca2+ ions, phospholipids, and factor VIIIa (this is the function description as found in UniProtKB/Swiss-Prot).

(d)

 Click on ‘External References - Oligo probes’ in the side menu.

The CodeLink microarray contains one probe, GE80895, that maps to ENST00000218099, so it is possible to monitor expression of this transcript using this array.

 Click on ‘ENST00000394090’ and ‘ENST00000479617’ in the list of transcripts.

The probe GE80895 also maps to ENST00000394090. There is no CodeLink probe that maps to ENST00000479617 though, so expression of this transcript cannot be monitored using this array.

(e)

 Click on ‘ENST00000218099’ in case you are not already on the ‘Transcript: F9-001’ tab.  Click on ‘Protein Information – Protein summary’ in the side menu.  Click on ‘Protein Information - Domains & features’ in the side menu.

The peptidase activity of the protein resides in the peptidase domain that is located in the C-terminal half of the protein.

(f)

 Click on ‘Protein Information - Variations’ in the side menu.

For the protein encoded by ENST00000218099 738 variants are shown. The vast majority of these are mutations from the Human Gene Mutation Database (HGMD). As the HGMD is a subscription database, Ensembl is only allowed to show the ID and the position of these mutations (if you want to get more information, you have to register at the HGMD). Apart from these a small number of variants from other sources are shown with more comprehensive information. Four of these are non-synonymous, i.e. rs104894807, rs6048, rs1801202 and rs4149751.

(g)

 Click on the ‘Gene: F9’ tab.  Click on ‘Comparative Genomics - Orthologues’ in the side menu.  Type ‘mouse’ in the ‘Filter’ text box at the top of the ‘Orthologues’ table.

There is one mouse ortholog predicted for human F9, ENSMUSG00000031138.

(h)

 Click on the ‘Location: X:138,612,917-138,645,617’ tab.

The F9 gene is located on cytogenetic band q27.1 and on contig AL033403.1.

 Click [Configure this page] in the side menu.  Type ‘clone’ in the ‘Find a track’ text box.  Select ‘1 Mb clone set’, ‘30k clone set’, ‘32k clone set’ and ‘Tilepath’.  Click ().

There are several BAC clones that contain the complete F9 gene, e.g. the tilepath clone RP6-88D7 and the clones RP11-1144I12 and RP13-520K15.

______________________________________________________________ Exercise 2 – Exploring a region

(a) Go to the region from bp 52,600,000 to 53,400,000 on human chromosome 4. How many contigs make up this portion of the assembly (contigs are contiguous stretches of DNA sequence that have been assembled solely based on direct sequencing information)? What does the open (i.e. non-blue) part in the ‘Contigs’ track represent?

(b) Do the tilepath clones (i.e. the BAC clones that were sequenced for the human genome assembly) correspond with the contigs?

(c) Zoom in on the SGCB transcripts, including a bit of flanking sequence on both sides.

(d) CpG islands are genomic regions that contain a high frequency of CG dinucleotides and are often located near the promoter of mammalian genes. Is there a CpG island located at the 5’ end of the SGCB transcripts? Possible transcription start sites can be predicted using the Eponine program

(http://www.sanger.ac.uk/Software/analysis/eponine/). Is there a transcription

start site predicted by Eponine annotated for the SGCB transcripts? (e) Export the genomic sequence of the region you are looking at in FASTA format.

(f) If you have yourself a genomic region of interest, explore what information Ensembl displays about it!

______________________________________________________________

Answer

(a)

 Go to the Ensembl homepage (http://www.ensembl.org).  Select ‘Search: Human’ and type ‘4:52600000-53400000’ in the ‘for’ text box (or alternatively leave the ‘Search’ drop-down list like it is and type ’human 4:52600000-53400000’ in the ‘for’ text box).  Click [Go].

This genomic region is made up of seven contigs, indicated by the alternating light and dark blue colored bars in the ‘Contigs’ track. One of them, AC107402.4, located between AC093858.2 and AC093880.4, is very small and possibly not visible. The open part in the ‘Contigs’ track represents a gap in the genome assembly (although the human genome is called ‘finished’, there are still gaps!). Note that this region is very close to the centromere of the chromosome.

If you cannot see the very tiny contig AC107402.4, you may want to increase the width of your display as follows:

 Click [Configure this page] in the side menu.  Click on the ‘Configure page’ tab.  Select a larger pixels value or ‘best fit’ from the ‘Width of image’ drop-down list.  Click ().

(b) If the tilepath clones are not already shown:

 Click [Configure this page] in the side menu.  Type ‘tilepath’ in the ‘Find a track’ text box.  Select ‘Tilepath’.  Click ().

The tilepath clones do correspond to the contigs and it is easy to see from which BAC clone which contig sequence in the assembly is derived, e.g. AC027271.7 is derived from RP11-365H22, AC104784.5 is derived from RP11-61F5 etc.

(c)

 Draw with your mouse a box around the SGCB transcripts.  Click on ‘Jump to region’ in the pop-up menu.

(d)

 Click [Configure this page] in the side menu.  Type ‘cpg’ in the ‘Find a track’ text box.  Select ‘CpG islands’.  Click ()

There is indeed a CpG island located at the 5’ end of the SGCB-001, SGCB- 201 and SGCB-002 transcripts. Note that you also clearly can see this island in the ‘%GC’ track, that should be shown by default.

 Click [Configure this page] in the side menu.  Type ‘eponine’ in the ‘Find a track’ text box.  Select ‘TSS (Eponine)’.  Click ().

Eponine TSS-finder predicts a transcription start site close to the starts of the SGCB-001, SGCB-201 and SGCB-002 transcripts.

(e)

 Click [Export data] in the side menu.  Click [Next>].  Click on ‘Text’.

Note that the sequence has a header that provides information about the genome assembly (GRCh37), the chromosome, the start and end coordinates and the strand. For example:

>4 dna:chromosome chromosome:GRCh37:4:52886013:52905920:1

______________________________________________________________ ______________________________________________________________ BIOMART ______________________________________________________________

Exercise 1

Generate a list of all human genes that are located on the Y chromosome, that are protein-coding and that encode for a protein containing one or more transmembrane domains. Do a count after selection of each filter to check the number of genes remaining in your dataset.

______________________________________________________________

Answer

 Go to the Ensembl homepage (http://www.ensembl.org).  Click on the ‘BioMart’ link on the toolbar.

Start with all human Ensembl genes:

 Choose the ‘Ensembl Genes 62’ database.  Choose the ‘Homo sapiens genes (GRCh37.p3)’ dataset.

Now filter for the genes on the Y chromosome:

 Click on ‘Filters’ in the left panel.  Expand the ‘REGION’ section by clicking on the + box.  Select ‘Chromosome – Y’. Make sure the check box in front of the filter is ticked, otherwise the filter won’t work.  Click the [Count] button on the toolbar.

This should give you 511 / 53334 Genes. Now filter further for genes that are protein-coding:

 Expand the ‘GENE’ section by clicking on the + box.  Select ‘Gene type – protein_coding’.  Click the [Count] button on the toolbar.

This should give you 55 / 53334 Genes. Finally filter for genes that encode proteins containing one or more transmembrane domains:

 Expand the ‘PROTEIN DOMAINS’ section by clicking on the + box.  Select ‘Transmembrane domains – Only’.  Click the [Count] button on the toolbar.

This should give you 5 / 53334 Genes. Specify the attributes to be included in the output (note that a number of attributes will already be default selected):

 Click on ‘Attributes’ in the left panel.  Expand the ‘GENE’ section by clicking on the + box.  Select, in addition to the attributes ‘Ensembl Gene ID’ and ‘Ensembl Transcript ID’ that are already default selected, for instance ‘Associated Gene Name’ and ‘Description’.

Have a look at a preview of the results (only 10 rows of the results will be shown):

 Click the [Results] button on the toolbar.

If you are happy with how the results look in the preview, output all the results:

 Select ‘View All rows as HTML’ or export all results to a file.

Note: When you select ‘View All rows as HTML’, your results will be shown under a new tab or in a new window in your internet browser.

Although you have filtered for only five genes, your results will contain more than five rows. This is because several of the genes have more than one transcript and consequently the results contain a separate row for each transcript.

______________________________________________________________ Exercise 2

BioMart is a very handy tool when you want to map between different databases. The following is a list of 19 accession numbers from the UniProtKB/Swiss-Prot database (http://www.uniprot.org/) of human proteins that are supposedly involved in the sensory perception of pain:

Q99608, P34913, P28482, P28223, Q96LB1, P10997, P01210, P25929, P17481, P43681, P29460, Q9HC23, P20366, Q9Y2W7, Q99572, Q99571, Q00535, P01138, P17787

Generate a list that shows to which Ensembl Gene IDs and to which HGNC symbols these UniProtKB/Swiss-Prot accession numbers map. Also include the gene description.

Hint: Instead of doing a lot of typing (which is not only a waste of time but also introduces the risk of making typos), copy-paste the above list from the digital version of this document, which you can find at

http://www.ebi.ac.uk/training/roadshow/110616_copenhagen.html.

______________________________________________________________

Answer

 Go to the Ensembl homepage (http://www.ensembl.org).  Click on the ‘BioMart’ link on the toolbar.

 Choose the ‘Ensembl Genes 62’ database.  Choose the ‘Homo sapiens genes (GRCh37.p3)’ dataset.

 Click on ‘Filters’ in the left panel.  Expand the ‘GENE’ section by clicking on the + box.  Select ‘ID list limit – UniProt/Swissprot Accession(s)’.  Enter the list of IDs in the text box (either comma separated or as a list).

 Click on ‘Attributes’ in the left panel.  Expand the ‘GENE’ section by clicking on the + box.  Deselect ‘Ensembl Transcript ID’.  Select ‘Description’.  Expand the ‘EXTERNAL’ section by clicking on the + box.  Select ‘HGNC symbol’ and ‘UniProt/SwissProt Accession’.

 Click the [Results] button on the toolbar.  Select ‘View All rows as HTML’ or export all results to a file. Tick the box ‘Unique results only’.

Note: BioMart is ‘transcript-centric’, which means that it will often give a separate row of output for each transcript of a gene, even if you don’t include the Ensembl Transcript ID in your output. To get rid of redundant rows, use the ‘Unique results only’ option.

Your results should show 19 genes.

______________________________________________________________ Exercise 3

The paper ‘Fine mapping of the usher syndrome type IC to chromosome 11p14 and identification of flanking markers by haplotype analysis’ (Ayyagari et al. Mol Vis. 1995 Oct 25;1:2) describes the mapping of the human Usher Syndrome type I C to the genomic region between the markers D11S1397 and D11S1310.

Confirm this finding by generating a list of the genes located in this region. Include the Ensembl Gene ID, name and description.

______________________________________________________________

Answer

 Go to the Ensembl homepage (http://www.ensembl.org).  Click on the ‘BioMart’ link on the toolbar.

 Choose the ‘Ensembl Genes 62’ database.  Choose the ‘Homo sapiens genes (GRCh37.p3)’ dataset.

 Click on ‘Filters’ in the left panel.  Expand the ‘REGION’ section by clicking on the + box.  Enter ‘Marker Start: d11s1397’ and ‘Marker End: d11s1310’.

 Click on ‘Attributes’ in the left panel.  Expand the ‘GENE’ section by clicking on the + box.  Deselect ‘Ensembl Transcript ID’.  Select ‘Associated Gene Name’ and ‘Description’.

 Click the [Results] button on the toolbar.  Select ‘View All rows as HTML’ or export all results to a file.

Your results should show 32 genes. Among these there should be one gene (ENSG00000006611) named ‘USH1C’ with the description ‘Usher syndrome 1C (autosomal recessive, severe) [Source:HGNC Symbol;Acc:12597]’. This confirms that Ayyagari et al. correctly mapped Usher Syndrome type I C to this genomic region.

______________________________________________________________ Exercise 4

In the paper ‘Discovery of novel biomarkers by microarray analysis of peripheral blood mononuclear cell gene expression in benzene-exposed workers’ (Forrest et al. Environ Health Perspect. 2005 June; 113(6): 801–807) the effect of benzene exposure on peripheral blood mononuclear cell gene expression in a population of shoe factory workers with well-characterized occupational exposures was examined using microarrays. The microarray used was the Affymetrix U133A/B GeneChip (also called ‘U133 plus 2’). The top 25 probe sets up-regulated by benzene exposure were:

207630_s_at, 221840_at, 219228_at, 204924_at 227613_at, 223454_at, 228962_at, 214696_at, 210732_s_at, 212371_at, 225390_s_at, 227645_at, 226652_at, 221641_s_at, 202055_at, 226743_at, 228393_s_at, 225120_at, 218515_at, 202224_at, 200614_at, 212014_x_at, 223461_at, 209835_x_at, 213315_x_at (a) Generate a list of the genes to which these probesets map. Include the Ensembl Gene ID, name and description as well as the probeset name.

(b) As a first step in order to be able to analyse them for possible regulatory features they have in common, retrieve the 250 bp upstream of the transcripts of these genes. Include the Ensembl Gene and Transcript ID, name and description in the sequence header.

(c) In order to be able to study these human genes in mouse, generate a list of the human genes and their mouse orthologs. Include the Ensembl Gene ID for both the human and mouse genes in your list.

(d) Generate the same list as in (c), but now also include the name and description of both the human and mouse genes. As the name and description of the mouse genes are not available as attributes in the Ensembl human genes dataset, you have to add the Ensembl mouse genes as a second dataset.

______________________________________________________________

Answer

(a)

 Go to the Ensembl homepage (http://www.ensembl.org).  Click on the ‘BioMart’ link on the toolbar.

 Choose the ‘Ensembl Genes 62’ database.  Choose the ‘Homo sapiens genes (GRCh37.p3)’ dataset.

 Click on ‘Filters’ in the left panel.  Expand the ‘GENE’ section by clicking on the + box.  Select ‘ID list limit - Affy hg u133 plus 2 ID(s)’.  Enter the list of probeset IDs in the text box (either comma separated or as a list).

 Click on ‘Attributes’ in the left panel.  Expand the ‘GENE’ section by clicking on the + box.  Deselect ‘Ensembl Transcript ID’.  Select ‘Associated Gene Name’ and ‘Description’.  Expand the ‘EXTERNAL’ section by clicking on the + box.  Select ‘Affy HG U133-PLUS-2’.

 Click the [Results] button on the toolbar.  Select ‘View All rows as HTML’ or export all results to a file. Tick the box ‘Unique results only’.

Your results should show 23 genes. In most cases one probeset maps to one gene. Exceptions are 227613_at and 219228_at, that both map to ENSG00000130844, 212014_x_at and 209835_x_at, that both map to ENSG00000026508, and 213315_x_at, that maps to both ENSG00000197620 and ENSG00000197021. One probeset (226652_at) doesn’t map to an Ensembl gene at all and is therefore not shown in the list of results.

(b) You can leave the dataset and filters the same, so you can directly specify the attributes:

 Click on ‘Attributes’ in the left panel.  Select the ‘Sequences’ attributes page.  Expand the ‘SEQUENCES’ section by clicking on the + box.  Select ‘Flank (Transcript)’.  Enter ‘250’ in the ‘Upstream flank’ text box.  Expand the ‘Header Information’ section by clicking on the + box.  Select ‘Associated Gene Name’ and ‘Description’.

Note: ‘Flank (Transcript)’ will give the flanks for all the transcripts of a gene with multiple transcripts. ‘Flank (Gene)’ will only give the flank for the transcript with the outermost 5’ (or 3’) end.

 Click the [Results] button on the toolbar.  Select ‘View All rows as FASTA’ or export all results to a file.

(c) You can leave the dataset and filters the same, so you can directly specify the attributes:

 Click on ‘Attributes’ in the left panel.  Select the ‘Homologs’ attributes page.  Expand the ‘GENE’ section by clicking on the + box.  Deselect ‘Ensembl Transcript ID’.  Expand the ‘ORTHOLOGS’ section by clicking on the + box.  Select ‘Mouse Ensembl Gene ID’.

 Click the [Results] button on the toolbar.  Select ‘View All rows as HTML’ or export all results to a file. Tick the box ‘Unique results only’.

Your results should show that for 18 out of the 23 human genes one mouse ortholog has been identified, while ENSG00000123130 has two mouse orthologs and ENSG00000172716 has three mouse orthologs. For three human genes (ENSG00000186594, ENSG00000130844 and ENSG00000089335) no mouse ortholog has been identified.

(d) You can leave the Ensembl human genes dataset and filters the same, so you can directly specify the attributes:

 Click on ‘Attributes’ in the left panel.  Select the ‘Features’ attributes page.  Make sure that in the ‘GENE’ section the attributes ‘Ensembl Gene ID’, ‘Associated Gene Name’ and ‘Description’ are selected.  Deselect ‘Affy HG U133-PLUS-2’ in the ‘EXTERNAL’ section.

Add the Ensembl mouse genes as a second dataset:

 Click on ‘Dataset’ at the bottom of the left panel.  Choose the ‘[Ensembl Genes 62] Mus musculus genes (NCBIM37)’ dataset.

Specify the same attributes for mouse as for human:

 Click on ‘Attributes’ in the left panel.  Expand the ‘GENE’ section by clicking on the + box.  Deselect ‘Ensembl Transcript ID’.  Select ‘Associated Gene Name’ and ‘Description’.

 Click the [Results] button on the toolbar.  Select ‘View All rows as HTML’ or export all results to a file. Tick the box ‘Unique results only’.

Your results should show the same list as in (c), but now with the name and description added for both the human and mouse genes.

______________________________________________________________ Exercise 5

Recommended publications
  • Elucidating Biological Roles of Novel Murine Genes in Hearing Impairment in Africa

    Elucidating Biological Roles of Novel Murine Genes in Hearing Impairment in Africa

    Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 19 September 2019 doi:10.20944/preprints201909.0222.v1 Review Elucidating Biological Roles of Novel Murine Genes in Hearing Impairment in Africa Oluwafemi Gabriel Oluwole,1* Abdoulaye Yal 1,2, Edmond Wonkam1, Noluthando Manyisa1, Jack Morrice1, Gaston K. Mazanda1 and Ambroise Wonkam1* 1Division of Human Genetics, Department of Pathology, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa. 2Department of Neurology, Point G Teaching Hospital, University of Sciences, Techniques and Technology, Bamako, Mali. *Correspondence to: [email protected]; [email protected] Abstract: The prevalence of congenital hearing impairment (HI) is highest in Africa. Estimates evaluated genetic causes to account for 31% of HI cases in Africa, but the identification of associated causative genes mutations have been challenging. In this study, we reviewed the potential roles, in humans, of 38 novel genes identified in a murine study. We gathered information from various genomic annotation databases and performed functional enrichment analysis using online resources i.e. genemania and g.proflier. Results revealed that 27/38 genes are express mostly in the brain, suggesting additional cognitive roles. Indeed, HERC1- R3250X had been associated with intellectual disability in a Moroccan family. A homozygous 216-bp deletion in KLC2 was found in two siblings of Egyptian descent with spastic paraplegia. Up to 27/38 murine genes have link to at least a disease, and the commonest mode of inheritance is autosomal recessive (n=8). Network analysis indicates that 20 other genes have intermediate and biological links to the novel genes, suggesting their possible roles in HI.
  • Identification of Key Genes and Pathways in Pancreatic Cancer

    Identification of Key Genes and Pathways in Pancreatic Cancer

    G C A T T A C G G C A T genes Article Identification of Key Genes and Pathways in Pancreatic Cancer Gene Expression Profile by Integrative Analysis Wenzong Lu * , Ning Li and Fuyuan Liao Department of Biomedical Engineering, College of Electronic and Information Engineering, Xi’an Technological University, Xi’an 710021, China * Correspondence: [email protected]; Tel.: +86-29-86173358 Received: 6 July 2019; Accepted: 7 August 2019; Published: 13 August 2019 Abstract: Background: Pancreatic cancer is one of the malignant tumors that threaten human health. Methods: The gene expression profiles of GSE15471, GSE19650, GSE32676 and GSE71989 were downloaded from the gene expression omnibus database including pancreatic cancer and normal samples. The differentially expressed genes between the two types of samples were identified with the Limma package using R language. The gene ontology functional and pathway enrichment analyses of differentially-expressed genes were performed by the DAVID software followed by the construction of a protein–protein interaction network. Hub gene identification was performed by the plug-in cytoHubba in cytoscape software, and the reliability and survival analysis of hub genes was carried out in The Cancer Genome Atlas gene expression data. Results: The 138 differentially expressed genes were significantly enriched in biological processes including cell migration, cell adhesion and several pathways, mainly associated with extracellular matrix-receptor interaction and focal adhesion pathway in pancreatic cancer. The top hub genes, namely thrombospondin 1, DNA topoisomerase II alpha, syndecan 1, maternal embryonic leucine zipper kinase and proto-oncogene receptor tyrosine kinase Met were identified from the protein–protein interaction network.
  • Namyoung Jung

    Namyoung Jung

    EPIGENETIC BASIS OF STEM CELL IDENTITY IN NORMAL AND MALIGNANT HEMATOPOIETIC DEVELOPMENT by Namyoung Jung A dissertation submitted to Johns Hopkins University in conformity with the requirements for the degree of Doctor of Philosophy Baltimore, Maryland July, 2015 © 2015 Namyoung Jung All Rights Reserved Abstract Acute myeloid leukemia (AML) is a heterogeneous hematologic malignancy characterized by subpopulations of leukemia-initiating or leukemia stem cells (LSC) that give rise to clonally related non-stem leukemic blasts. The LSC model proposes that since LSC and their blast progeny are clonally related, their functional properties must be due to epigenetic differences. In addition, the cell of origin of LSC among normal hematopoietic stem and progenitor cells (HSPCs) has yet to be clearly demonstrated. In order to investigate the role of epigenetics in LSC function and hematopoietic development, we profiled DNA methylation and gene expression of CD34+CD38-, CD34+CD38+ and CD34- cells from 15 AML patients, along with 6 well-defined HSPC populations from 5 normal bone marrows using Illumina Infinium HumanMethylation450 BeadChip and Affymetrix Human Genome U133 Plus 2.0 Array. To define LSC and blast functionally, we performed engraftment assays on the three subpopulations from 15 AML patients and defined 20 LSCs and 24 blast samples. We identified the key functional LSC epigenetic signature able to distinguish LSC from blasts that consisted of 84 differential methylations regions (DMRs) in 70 genes that correlated with differential gene expression. HOXA cluster genes were enriched within the LSC epigenetic signature. We found that most of these DMRs involve epigenetic alteration independent of underlying mutations, although several are downstream targets of genetic mutation in epigenome modifying enzymes and upstream regulators.
  • Oral Administration of Lactobacillus Plantarum 299V

    Oral Administration of Lactobacillus Plantarum 299V

    Genes Nutr (2015) 10:10 DOI 10.1007/s12263-015-0461-7 RESEARCH PAPER Oral administration of Lactobacillus plantarum 299v modulates gene expression in the ileum of pigs: prediction of crosstalk between intestinal immune cells and sub-mucosal adipocytes 1 1,4 1,5 1 Marcel Hulst • Gabriele Gross • Yaping Liu • Arjan Hoekman • 2 1,3 1,3 Theo Niewold • Jan van der Meulen • Mari Smits Received: 19 November 2014 / Accepted: 28 March 2015 / Published online: 11 April 2015 Ó The Author(s) 2015. This article is published with open access at Springerlink.com Abstract To study host–probiotic interactions in parts of ileum. A higher expression level of several B cell-specific the intestine only accessible in humans by surgery (je- transcription factors/regulators was observed, suggesting junum, ileum and colon), pigs were used as model for that an influx of B cells from the periphery to the ileum humans. Groups of eight 6-week-old pigs were repeatedly and/or the proliferation of progenitor B cells to IgA-com- orally administered with 5 9 1012 CFU Lactobacillus mitted plasma cells in the Peyer’s patches of the ileum was plantarum 299v (L. plantarum 299v) or PBS, starting with stimulated. Genes coding for enzymes that metabolize a single dose followed by three consecutive daily dosings leukotriene B4, 1,25-dihydroxyvitamin D3 and steroids 10 days later. Gene expression was assessed with pooled were regulated in the ileum. Bioinformatics analysis pre- RNA samples isolated from jejunum, ileum and colon dicted that these metabolites may play a role in the scrapings of the eight pigs per group using Affymetrix crosstalk between intestinal immune cells and sub-mucosal porcine microarrays.
  • Calorie Restriction and the Aging Muscle: a Multispecies Approach

    Calorie Restriction and the Aging Muscle: a Multispecies Approach

    CALORIE RESTRICTION AND THE AGING MUSCLE: A MULTISPECIES APPROACH Melissa C. Orenduff A dissertation submitted to the faculty at the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Nutrition in the Gillings School of Global Public Health. Chapel Hill 2020 Approved by: Stephen D. Hursting Melinda A. Beck Saame Raza Shaikh Kim M. Huffman Jane F. Pendergast © 2020 Melissa C. Orenduff ALL RIGHTS RESERVED ii ABSTRACT Melissa C. Orenduff: Calorie Restriction and the Aging Muscle: A Multispecies Approach (Under the direction of Stephen D. Hursting) Aging is associated with a progressive decline in muscle mass, strength, and physical function termed sarcopenia. Sarcopenia is an important risk factor for loss of independence, reduced quality of life, and increased mortality. Calorie restriction (CR) is an effective dietary treatment for several age-related conditions, including sarcopenia. Recent work conducted in our laboratory suggests insulin-like growth factor-1 (IGF-1) is involved in the anti-aging effects of CR. The aim of this work is to understand the role of circulating IGF-1 in CR-induced effects on aging skeletal muscle. Using a multispecies approach, we sought to determine if IGF-1- mediated effects of CR are conserved among mice, nonhuman primates, and humans. As an extension into our investigation of CR and IGF-1 on aging muscle, we conducted an in vivo experiment to examine the role of a CR mimetic, rapamycin, on age-related loss of muscle strength and function. Our results suggest that the effects of CR on aging skeletal muscle are realized through biological changes consistent with regenerative and repair-related mechanisms and a shift in transcriptional profiles towards improved metabolic and inflammatory signaling pathways that promote enhanced mitochondrial function and biogenesis.
  • Hyperactivity Disorder

    Hyperactivity Disorder

    Molecular Psychiatry (2011) 16, 491–503 & 2011 Macmillan Publishers Limited All rights reserved 1359-4184/11 www.nature.com/mp ORIGINAL ARTICLE Genome-wide copy number variation analysis in attention-deficit/hyperactivity disorder: association with neuropeptide Y gene dosage in an extended pedigree K-P Lesch1,6, S Selch1,6, TJ Renner2,6, C Jacob1, TT Nguyen3, T Hahn1, M Romanos2, S Walitza2, S Shoichet4, A Dempfle3, M Heine1, A Boreatti-Hu¨mmer1, J Romanos1, S Gross-Lesch1, H Zerlaut3, T Wultsch1, S Heinzel1, M Fassnacht5, A Fallgatter1, B Allolio5, H Scha¨fer4, A Warnke2, A Reif1, H-H Ropers4 and R Ullmann4 1ADHD Clinical Research Network, Unit for Molecular Psychiatry, Laboratory of Translational Neuroscience, Department of Psychiatry, Psychosomatics and Psychotherapy, University of Wuerzburg, Wuerzburg, Germany; 2ADHD Clinical Research Network, Department of Child and Adolescent Psychiatry, Psychosomatics, and Psychotherapy, University of Wuerzburg, Wuerzburg, Germany; 3Institute of Medical Biometry and Epidemiology, University of Marburg, Marburg, Germany; 4Max Planck Institute for Molecular Genetics, Berlin, Germany and 5Department of Endocrinology and Diabetes, Internal Medicine I, University of Wuerzburg, Wuerzburg, Germany Attention-deficit/hyperactivity disorder (ADHD) is a common, highly heritable neurodevelop- mental syndrome characterized by hyperactivity, inattention and increased impulsivity. To detect micro-deletions and micro-duplications that may have a role in the pathogenesis of ADHD, we carried out a genome-wide screen for copy number variations (CNVs) in a cohort of 99 children and adolescents with severe ADHD. Using high-resolution array comparative genomic hybridization (aCGH), a total of 17 potentially syndrome-associated CNVs were identified. The aberrations comprise 4 deletions and 13 duplications with approximate sizes ranging from 110 kb to 3 Mb.
  • Molecular Targeting and Enhancing Anticancer Efficacy of Oncolytic HSV-1 to Midkine Expressing Tumors

    Molecular Targeting and Enhancing Anticancer Efficacy of Oncolytic HSV-1 to Midkine Expressing Tumors

    University of Cincinnati Date: 12/20/2010 I, Arturo R Maldonado , hereby submit this original work as part of the requirements for the degree of Doctor of Philosophy in Developmental Biology. It is entitled: Molecular Targeting and Enhancing Anticancer Efficacy of Oncolytic HSV-1 to Midkine Expressing Tumors Student's name: Arturo R Maldonado This work and its defense approved by: Committee chair: Jeffrey Whitsett Committee member: Timothy Crombleholme, MD Committee member: Dan Wiginton, PhD Committee member: Rhonda Cardin, PhD Committee member: Tim Cripe 1297 Last Printed:1/11/2011 Document Of Defense Form Molecular Targeting and Enhancing Anticancer Efficacy of Oncolytic HSV-1 to Midkine Expressing Tumors A dissertation submitted to the Graduate School of the University of Cincinnati College of Medicine in partial fulfillment of the requirements for the degree of DOCTORATE OF PHILOSOPHY (PH.D.) in the Division of Molecular & Developmental Biology 2010 By Arturo Rafael Maldonado B.A., University of Miami, Coral Gables, Florida June 1993 M.D., New Jersey Medical School, Newark, New Jersey June 1999 Committee Chair: Jeffrey A. Whitsett, M.D. Advisor: Timothy M. Crombleholme, M.D. Timothy P. Cripe, M.D. Ph.D. Dan Wiginton, Ph.D. Rhonda D. Cardin, Ph.D. ABSTRACT Since 1999, cancer has surpassed heart disease as the number one cause of death in the US for people under the age of 85. Malignant Peripheral Nerve Sheath Tumor (MPNST), a common malignancy in patients with Neurofibromatosis, and colorectal cancer are midkine- producing tumors with high mortality rates. In vitro and preclinical xenograft models of MPNST were utilized in this dissertation to study the role of midkine (MDK), a tumor-specific gene over- expressed in these tumors and to test the efficacy of a MDK-transcriptionally targeted oncolytic HSV-1 (oHSV).
  • The Genetics of Heterotaxy Syndrome

    The Genetics of Heterotaxy Syndrome

    The Genetics of Heterotaxy Syndrome A dissertation submitted to the Division of Graduate Studies and Research, University of Cincinnati in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Molecular and Developmental Biology by Jason R. Cowan Master of Science, University of British Columbia, Vancouver, 2007 Committee Chair: Stephanie M. Ware, M.D., Ph.D. Robert B. Hinton Jr., M.D. Linda M. Parysek, Ph.D. S. Steven Potter, Ph.D. Aaron M. Zorn, Ph.D. Molecular & Developmental Biology Graduate Program College of Medicine, University of Cincinnati Division of Developmental Biology, Cincinnati Children’s Hospital Medical Center Cincinnati, Ohio, USA, 2015 ABSTRACT Congenital heart defects (CHDs) are the greatest cause of infant morbidity and mortality worldwide, occurring in roughly 8 per 1000 live births (~1%). Heterotaxy, a multiple congenital anomaly syndrome resulting from failure to establish left-right (L-R) asymmetry, is characterized by diverse, complex CHDs. Heterogeneous in presentation and etiology, heterotaxy serves as a complex and growing focal point for cardiovascular genetic research. In the two decades since the zinc finger transcription factor, ZIC3, was first identified as a cause of X-linked heterotaxy, mutations in nearly twenty genes with L-R patterning functions have been detected among patients with heterotaxy. Nevertheless, despite considerable progress, genetic causes for heterotaxy remain largely uncharacterized. With an estimated 70-80% of heterotaxy cases still unexplained, there remains enormous potential for novel gene discovery. In this dissertation, we have balanced gene discovery efforts aimed at identifying and characterizing novel causes of heterotaxy with studies into the mechanisms governing ZIC3- related heterotaxy.
  • WO 2014/089124 Al 12 June 20 14 ( 12.06.20 14) W P O P C T

    WO 2014/089124 Al 12 June 20 14 ( 12.06.20 14) W P O P C T

    (12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) (19) World Intellectual Property Organization International Bureau (10) International Publication Number (43) International Publication Date WO 2014/089124 Al 12 June 20 14 ( 12.06.20 14) W P O P C T (51) International Patent Classification: BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, A61K 48/00 (2006.01) A61P 37/00 (2006.01) DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, A61K 39/395 (2006.01) HN, HR, HU, ID, IL, IN, IR, IS, JP, KE, KG, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, (21) International Application Number: MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, PCT/US2013/072938 OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, (22) International Filing Date: SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, 3 December 2013 (03.12.2013) TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW. (25) Filing Language: English (84) Designated States (unless otherwise indicated, for every (26) Publication Language: English kind of regional protection available): ARIPO (BW, GH, (30) Priority Data: GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, SZ, TZ, 61/732,746 3 December 2012 (03. 12.2012) US UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, 61/747,653 31 December 2012 (3 1. 12.2012) US TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, ΓΓ, LT, LU, LV, (71) Applicants: CENEXYS, INC.
  • Supplemental Data Inter-Individual Variability in Gene Expression

    Supplemental Data Inter-Individual Variability in Gene Expression

    DMD #42028 Supplemental data Inter-individual variability in gene expression profiles in human hepatocytes and comparison with HepaRG cells Alexandra ROGUE, Carine LAMBERT, Catherine SPIRE, Nancy CLAUDE and André GUILLOUZO Drug and metabolism disposition Supplemental Table 3: Genes expressed only in HepaRG cells located on the chromosome 7 Gene symbol Gene description SEPT13 septin 13 SEPT14 hCG_18833 ACHE acetylcholinesterase (Yt blood group) AMPH amphiphysin ANKIB1 ankyrin repeat and IBR domain containing 1 ANKMY2 ankyrin repeat and MYND domain containing 2 ANLN anillin, actin binding protein AP1S1 adaptor-related protein complex 1, sigma 1 subunit ATXN7L1 ataxin 7-like 1 C1GALT1 core 1 synthase, glycoprotein-N-acetylgalactosamine 3-beta-galactosyltransferase, 1 C7orf31 hCG_39028 C7orf38 chromosome 7 open reading frame 38 C7orf41 chromosome 7 open reading frame 41 C7orf53 chromosome 7 open reading frame 53 CADPS2 Ca++-dependent secretion activator 2 CALCR calcitonin receptor CALN1 calneuron 1 CASD1 CAS1 domain containing 1 CBLL1 Cas-Br-M (murine) ecotropic retroviral transforming sequence-like 1 CCDC132 coiled-coil domain containing 132 CD36 CD36 molecule (thrombospondin receptor) CDC14C CDC14 cell division cycle 14 homolog C (S. cerevisiae) CHN2 chimerin (chimaerin) 2 CHRM2 cholinergic receptor, muscarinic 2 CLDN15 claudin 15 CLIP2 CAP-GLY domain containing linker protein 2 CNPY4 canopy 4 homolog (zebrafish) COX19 COX19 cytochrome c oxidase assembly homolog (S. cerevisiae) CROT carnitine O-octanoyltransferase DBF4 DBF4 homolog (S. cerevisiae)
  • T-, B-And NK-Lymphoid, but Not Myeloid Cells Arise from Human

    T-, B-And NK-Lymphoid, but Not Myeloid Cells Arise from Human

    Leukemia (2007) 21, 311–319 & 2007 Nature Publishing Group All rights reserved 0887-6924/07 $30.00 www.nature.com/leu ORIGINAL ARTICLE T-, B- and NK-lymphoid, but not myeloid cells arise from human CD34 þ CD38ÀCD7 þ common lymphoid progenitors expressing lymphoid-specific genes I Hoebeke1,3, M De Smedt1, F Stolz1,4, K Pike-Overzet2, FJT Staal2, J Plum1 and G Leclercq1 1Department of Clinical Chemistry, Microbiology and Immunology, Ghent University Hospital, Ghent, Belgium and 2Department of Immunology, Erasmus University Medical Center, Rotterdam, The Netherlands Hematopoietic stem cells in the bone marrow (BM) give rise to share a direct common progenitor either, as CLPs were not all blood cells. According to the classic model of hematopoi- found in the fetal liver.5 Instead, fetal B and T cells would esis, the differentiation paths leading to the myeloid and develop through B/myeloid and T/myeloid intermediates. lymphoid lineages segregate early. A candidate ‘common 6 lymphoid progenitor’ (CLP) has been isolated from The first report of a human CLP came from Galy et al. who À þ CD34 þ CD38À human cord blood cells based on CD7 expres- showed that a subpopulation of adult and fetal BM Lin CD34 sion. Here, we confirm the B- and NK-differentiation potential of cells expressing the early B- and T-cell marker CD10 is not þ À þ CD34 CD38 CD7 cells and show in addition that this capable of generating monocytic, granulocytic, erythroid or population has strong capacity to differentiate into T cells. As megakaryocytic cells, but can differentiate into dendritic cells, CD34 þ CD38ÀCD7 þ cells are virtually devoid of myeloid B, T and NK cells.
  • 2011 Kornaus.Pdf (3.272Mb)

    2011 Kornaus.Pdf (3.272Mb)

    COVER SHEET TITLE: Validation of Optical Structural Alterations in Two Oligodendroglioma Tumors AUTHOR’S NAME: Andrew Kornaus MAJOR: Biology DEPARTMENT: Biology MENTOR: Prof. D. C. Schwartz DEPARTMENT: Genetics YEAR: 2011 The author hereby grants to University of Wisconsin-Madison the permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created. ABSTRACT Validation of Optical Structural Alterations in Two Oligodendroglioma Tumors Many large- and small-scale changes occur in the genomes of tumor cells. The optical mapping system was used previously to discover structural variations (OSAs; Optical Structural Alterations) in tumor genomes. OS As are structural variants which include single basepair changes, resulting in missing or extra restriction enzyme sites in the genome, or small insertions and deletions, which can change the local size of individual restriction fragments as well as addition or deletion of entire restriction fragments at a time. OSAs were validated by polymerase chain reaction (PCR) of the regions that differ from reference human sequence and subsequent restriction digest of amplicons to verify the restriction map reported by optical mapping. This study looks at OSAs found in two oligodendrogliomas-tumors that have arisen from oligodendrocytes, the cells that insulate axons within the brain. 4,_J re-u1 /<ort'\ti l-l-S / G~oloj / Author Name/Major ~~ Author Signature Date Kornaus 1 Validation of Optical Structural Alterations in Two Oligodendroglioma Tumors Andy Kornaus Mentor: Prof. D. C. Schwartz Laboratory for Molecular and Computational Genomics Laboratory of Genetics, Department of Chemistry Introduction Tumor cells are known to have lost the ability to control the cell cycle as well as many other normal cellular functions, such as cell adhesion, metabolic rate, and expression of cell membrane antigens (Suzuki et al.