The Design and Realisation of the myExperiment Virtual Research Environment for Social Sharing of Workflows
Recommended publications
-
Data Curation + Process Curation = Data Integration + Science
BRIEFINGS IN BIOINFORMATICS. VOL 9. NO 6. 506–517 doi:10.1093/bib/bbn034 Advance Access publication December 6, 2008 Data curation + process curation = data integration + science Carole Goble, Robert Stevens, Duncan Hull, Katy Wolstencroft and Rodrigo Lopez Submitted: 16th May 2008; Received (in revised form): 25th July 2008 Abstract In bioinformatics, we are familiar with the idea of curated data as a prerequisite for data integration. We neglect, often to our cost, the curation and cataloguing of the processes that we use to integrate and analyse our data. Programmatic access to services, for data and processes, means that compositions of services can be made that represent the in silico experiments or processes that bioinformaticians perform. Data integration through workflows depends on being able to know what services exist and where to find those services. The large number of services and the operations they perform, their arbitrary naming and lack of documentation, however, mean that they can be difficult to use. The workflows themselves are composite processes that could be pooled and reused but only if they too can be found and understood. Thus appropriate curation, including semantic mark-up, would enable processes to be found, maintained and consequently used more easily. This broader view on semantic annotation is vital for full data integration that is necessary for the modern scientific analyses in biology. This article will brief the community on the current state of the art and the current challenges for process curation, both within and without the Life Sciences. -
Data Management in Systems Biology I
Data management in systems biology I – Overview and bibliography Gerhard Mayer, University of Stuttgart, Institute of Biochemical Engineering (IBVT), Allmandring 31, D-70569 Stuttgart Abstract Large systems biology projects can encompass several workgroups often located in different countries. An overview about existing data standards in systems biology and the management, storage, exchange and integration of the generated data in large distributed research projects is given, the pros and cons of the different approaches are illustrated from a practical point of view, the existing software – open source as well as commercial – and the relevant literature is extensively overviewed, so that the reader should be enabled to decide which data management approach is the best suited for his special needs. An emphasis is laid on the use of workflow systems and of TAB-based formats. The data in this format can be viewed and edited easily using spreadsheet programs which are familiar to the working experimental biologists. The use of workflows for the standardized access to data in either own or publicly available databanks and the standardization of operation procedures is presented. The use of ontologies and semantic web technologies for data management will be discussed in a further paper. Keywords: MIBBI; data standards; data management; data integration; databases; TAB-based formats; workflows; Open Data INTRODUCTION The large amount of data produced by biological research projects grows at a fast rate. The 2009 special edition of the annual Nucleic Acids Research database issue mentions 1170 databases [1]; alone … the foundation of a new journal about biological databases [24], the foundation of the ISB (International Society for Biocuration) and conferences like DILS (Data Integration in the Life Sciences) [25]. -
Semantic Web: A Review of the Field
Semantic Web: A Review Of The Field Pascal Hitzler, Kansas State University, Manhattan, Kansas, USA ABSTRACT We review two decades of Semantic Web research and applications, discuss relationships to some other disciplines, and current challenges in the field. CCS CONCEPTS • Information systems → Graph-based database models; Information integration; Semantic web description languages; Ontologies; • Computing methodologies → Description logics; Ontology engineering. KEYWORDS Semantic Web, ontology, knowledge graph, linked data ACM Reference Format: Pascal Hitzler. 2020. Semantic Web: A Review Of The Field. In Proceedings of . ACM, New York, NY, USA, 7 pages. … which would probably produce a rather different narrative of the history and the current state of the art of the field. I therefore do not strive to achieve the impossible task of presenting something close to a consensus – such a thing seems still elusive. However I do point out here, and sometimes within the narrative, that there are a good number of alternative perspectives. The review is also necessarily very selective, because Semantic Web is a rich field of diverse research and applications, borrowing from many disciplines within or adjacent to computer science, and a brief review like this one cannot possibly be exhaustive or give due credit to all important individual contributions. I do hope that I have captured what many would consider key areas of the Semantic Web field. For the reader interested in obtaining a more detailed overview, I recommend perusing the major publication outlets in the field: The Semantic Web journal, the Journal of Web Semantics, and the proceedings of the annual International -
Distributed Computing Environments and Workflows for Systems
Going with the Flow: Distributed Computing for Systems Biology Using Taverna Prof Carole Goble, The University of Manchester, UK http://www.mygrid.org.uk http://www.omii.ac.uk Data pipelines in bioinformatics [Figure: pipeline diagram; labels: Resources/Services, Genscan, EMBL, BLAST, Clustal-W, Query, Protein sequences, Multiple sequence alignment] Example in silico experiment: Investigate the evolutionary relationships between proteins [Peter Li] • Manual creation • Semi-automation using bespoke software • Issues: • Volatility of data in life sciences • Data and metadata storage • Integration of heterogeneous biological data • Visualisation of models • Brittleness -
myGrid: A Collection of Web Pages
Home The myGrid team produce and use a suite of tools designed to “help e-Scientists get on with science and get on with scientists”. The tools support the creation of e-laboratories and have been used in domains as diverse as systems biology, social science, music, astronomy, multimedia and chemistry. The tools have been adopted by a large number of projects and institutions. The team has developed tools and infrastructure to allow: • the design, editing and execution of workflows in Taverna • the sharing of workflows and related data by myExperiment • the cataloguing and annotation of services in BioCatalogue and Feta • the creation of user-friendly rich clients such as UTOPIA These tools help to form the basis for the team’s work on e-Labs. Taverna Workbench 2.1.2 available for download The myGrid team have released Taverna Workbench 2.1.2. It supports secure access to data on the web, secure web services and secured private workflows. Taverna 2.1.2 includes • Copy/paste, shortcuts, undo/redo, drag and drop • Animated workflow diagram • Remembers added/removed services • Secure Web services support • Secure access to resources on the web • Up-to-date R support • Intermediate values during workflow runs • myExperiment integration • Excel and csv spreadsheet support Hosted at the School of Computer Science, University of Manchester, UK. Copyright 2008 - All Rights Reserved. Sponsored by Microsoft, EPSRC, BBSRC, ESRC and JISC. From www.mygrid.org.uk/, 27 May 2010. Outreach The myGrid consortium put a lot of effort into reaching out to users. The team do not just “preach to the converted” but seek new users by offering Tutorials and giving talks, papers and posters in a very wide variety of places. -
Hosts: Monash eResearch Centre and MessageLab Seminar: The Long Tail Scientist Presenter: Prof Carole Goble, Computer Science
Hosts: Monash eResearch Centre and MessageLab Seminar: The Long Tail Scientist Presenter: Prof Carole Goble, Computer Science, University of Manchester Venue: Seminar Room 135, Building 26 Clayton Time and Date: Wed 3 August 2011, 5-6pm Abstract Big science with big, coordinated and collaborative programmes – the Large Hadron Collider, the Sloan Sky Survey, the Human Genome and its successor the 1000 Genomes project – hogs headlines and fascinates funders. But this big science makes up a small fraction of research being done. Whilst the big journals – Nature, Science – are often the first to publish breakthrough research, work in a vast array of smaller journals still contributes to scientific knowledge. Every day, PhD students and post-docs are slaving away in small labs building up the bulk of scientific data. In disciplines like chemistry, biology and astronomy they are taking advantage of the multitude of public datasets and analytical tools to make their own investigations. Jim Downing at the Unilever Centre for Molecular Informatics was one of the first to coin the term of “Long Tail Science” – that large numbers of small researcher units is an important concept. We are not just standing on the shoulders of a few giants but standing on the shoulders of a multitude of the average sized. Ten years ago I started up the myGrid e-Science project (http://www.mygrid.org.uk) specifically to help the long tail bioinformatician, and later other long tail scientists from other disciplines. And it turns out that the software, services and methods we develop and deploy (the Taverna workflow system, myExperiment, BioCatalogue, SysMO-SEEK, MethodBox) apply just as well to the big science projects. -
Scientific Social Objects: The Social Objects and Multidimensional Network of the myExperiment Website
Scientific Social Objects: The Social Objects and Multidimensional Network of the myExperiment Website David De Roure (Oxford e-Research Centre, University of Oxford, Oxford, UK), Sean Bechhofer and Carole Goble (School of Computer Science, The University of Manchester, Manchester, UK), David Newman (Electronics and Computer Science, University of Southampton, Southampton, UK) Abstract: Scientific research is increasingly conducted digitally and online, and consequently we are seeing the emergence of new digital objects shared as part of the conduct and discourse of science. These Scientific Social Objects are more than lumps of domain-specific data: they may comprise multiple components which can also be shared separately and independently, and some contain descriptions of scientific processes from which new objects will be generated. Using the myExperiment social website as a case study we explore Scientific Social Objects and discuss their evolution. Keywords: Scientific Social Object; workflow; Research Object … content, and with nearly 2000 workflows it provides the largest collection available. It is, however, characteristically a ‘boutique’ site with a specialist audience. We propose that workflows, and myExperiment as a resource, provide a useful case study in scientific social objects. Workflows are indeed social objects that are shared and used by researchers, but significantly they are composite objects containing heterogeneous components which can be shared separately and independently. Since they capture process they are also prescriptions for the creation of other objects. This distinguishes them from an object type such as a photo or collection of photos. -
Conceptual Open Hypermedia = Semantic Web?
Conceptual Open Hypermedia = Semantic Web? Carole Goble, Sean Bechhofer (Information Management Group, University of Manchester, UK); Leslie Carr, Wendy Hall, David De Roure (Intelligence, Agents, Multimedia Group, University of Southampton, UK) Take Home Message • The Semantic Web is a Web – a collection of linked nodes. • People will use it as well as agents, and for them, navigation of links is a key mechanism for exploring the space. Question • Can we use semantic metadata to support the construction of our hypertexts… • …and if so, does it help? Technologies for the Semantic Web Three pieces needed for implementing the Semantic Web: • Ontologies • Hypertext Architecture • Web Framework Ontologies • The Semantic Web relies on the provision of Semantics. • Representations and tools are thus required for the – delivery; – construction; – maintenance; – management of ontologies Hypertext Architecture • The Semantic Web is a Web. • Thus we need an underlying architecture supporting the notion of nodes and links. Web Framework • A delivery mechanism that: – conforms to existing standards; – can scale Historical Combinations • Open Hypermedia Systems and Link Services. – DLS – XLink • Ontology Services for document metadata. – DAML+OIL, RDFS – SHOE, On2Broker • Conceptual Hypermedia – Nanard – Tudhope • The Semantic Web? COHSE Philosophy • Metadata can provide a mechanism not only for the support of resource discovery, but also for the provision of source anchors. • Annotation allows both linking into and out of a resource. [Figure labels: Resource Discovery, Link Construction] COHSE Prototype System • A software agent that generates and presents links on behalf of both an author and a reader. -
James A. Hendler, Phd
James A. Hendler, PhD Tetherless World Professor & Director, Rensselaer Institute for Data Exploration and Applications Rensselaer Polytechnic Institute Phone: 518-276-4401 Fax: 518-276-4464 http://www.cs.rpi.edu/~hendler Twitter: @jahendler US citizen, Security Clearance (on request), Social Security # (on request) Education May, 1986 Department of Computer Science, Brown University, Providence, Rhode Island. PhD -- Computer Science, Artificial Intelligence. Thesis title: Integrating Marker-Passing and Problem-Solving: A Spreading Activation Approach to Improved Choice in Planning. May, 1983 Department of Computer Science, Brown University, Providence, Rhode Island. ScM -- Computer Science, Artificial Intelligence. Thesis project: Implementation of a pseudo-parallel marker-passing system within a frame representation language. May, 1982 Psychology Department, Southern Methodist University, Dallas, Tx. MS -- Cognitive Psychology, Human Factors Engineering. Thesis title: The Effects of a System-Imposed Grammatical Restriction on Interactive Natural Language Dialog May, 1978 Department of Computer Science, Yale University, New Haven, Ct. BS -- Computer Science, Artificial Intelligence. Experience in Higher Education 2007–pres. Rensselaer Polytechnic Institute (RPI) 1/07 – pres, Tetherless World Constellation Chair (Endowed), joint positions in computer and cognitive science departments 9/10 – 7/12, Program Director, Information Technology and Web Science 7/12 – 9/13, Department Head, Computer Science 9/13 – pres, Director, Institute for Data Exploration and Applications Affiliate faculty, Experimental Media and Performing Arts Center (2009- ) Affiliate, Industrial and System Engineering (2015- ) Affiliate faculty, Center for Materials, Devices, and Integrated Systems (2015- ) 1986-12/06 University of Maryland, College Park 8/99-12/06, Full Professor 8/92-7/99, Associate Professor 1/86-8/92, Assistant Professor Director, Joint Institute for Knowledge Discovery, 1/04-12/06. -
Lessons from myExperiment: Two Insights into Emerging e-Research Practice
Lessons from myExperiment: Two insights into emerging e-Research practice David De Roure (University of Southampton, Southampton, UK) and Carole Goble (University of Manchester, Manchester, UK) Abstract. The design of the myExperiment social web site for scientists adopted the Web 2.0 design principles, investigating the question: does Web 2.0 work for scientists? Two years later the site has thousands of users and the largest public collection of scientific workflows of its kind. Here we reflect on two of the Web 2.0 design principles and the insights we have gained into Web 2.0 for researchers. They reveal new forms of sharable object and new ways of assembling data and services. 1. Introduction The myExperiment Virtual Research Environment [1] has successfully adopted a Web 2.0 approach in delivering a social web site where scientists can safely publish their scientific workflows and other artefacts, share them with groups and find those of others. While it shares many characteristics with other Web 2.0 sites, myExperiment’s distinctive features to meet the needs of its research user base are support for credit, attributions and licensing, fine control over privacy, a federation model and the ability to execute workflows. The myExperiment design was consistent with the Web 2.0 design patterns described by O’Reilly [2], a process we reported at the outset of the project [3]. A prototype was launched in July 2007 to gather requirements, and the current www.myexperiment.org site went live after a further 5 months of development and has since evolved substantially in “perpetual beta”. -
A Web 2.0 Virtual Research Environment
myExperiment – A Web 2.0 Virtual Research Environment David De Roure (Electronics and Computer Science, University of Southampton, UK, +44 23 8059 2418) and Carole Goble (School of Computer Science, The University of Manchester, UK, +44 161 275 6195) ABSTRACT e-Science has given rise to new forms of digital object in the Virtual Research Environment which can usefully be shared amongst collaborating scientists to assist in generating new scientific results. In particular, descriptions of Scientific Workflows capture pieces of scientific knowledge which may transcend their immediate application and can be shared and reused in other experiments. We are building the myExperiment Virtual Research Environment to support scientists in sharing and collaboration with workflows and other objects. Rather than adopting traditional methods prevalent in the e-Science developer community, our approach draws upon the social software techniques characterised as Web 2.0. In this paper we report on the preliminary design work of myExperiment. … established as a means of automating the processing of scientific data in a scalable and reusable way. The myExperiment Virtual Research Environment (VRE) provides a personalised environment which enables users to share, re-use and repurpose experiments. Our vision is that scientists should be able to swap workflows and other scientific objects as easily as citizens can share documents, photos and videos on the Web. Hence myExperiment owes far more to social networking websites such as MySpace (www.myspace.com) and YouTube (www.youtube.com) than to the traditional portals of Grid computing, and is immediately familiar to the new generation of scientists. Where many e-Science projects have focused on bringing -
Conceptual Open Hypermedia = the Semantic Web?
Conceptual Open Hypermedia = The Semantic Web? Carole Goble, Sean Bechhofer (Information Management Group, Department of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, UK, +44 161 275 2000, {carole, seanb}@cs.man.ac.uk); Les Carr, David De Roure, Wendy Hall (Intelligence, Agents, Multimedia, Department of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, UK, +44 23 8059 3255, {lac, dder, wh}@ecs.soton.ac.uk) ABSTRACT The Semantic Web is still a web, a collection of linked nodes. Navigation of links is currently, and will remain for humans if not machines, a key mechanism for exploring the space. The Semantic Web is viewed by many as a knowledge base, a database or an indexed and searchable document collection; in the work discussed here we view it as a hypertext. The aim of the COHSE project is to research into methods to improve significantly the quality, consistency and breadth of linking of Web documents at retrieval time (as readers browse … 1. INTRODUCTION The Semantic Web, a web of data defined and linked in such a way that its meaning is explicitly interpretable by software processes rather than just being implicitly interpretable by humans, is a vision held by many and articulated by Tim Berners-Lee [5]. The issues surrounding the implementation of the Semantic Web are the focus of much current research. Much of this work, however, focuses on particular issues such as the implementation and representation of metadata, for example the OIL and DAML web languages [12, 15] or the activity of resource discovery, i.e.