Oracle Health Sciences Omics Data Bank Programmer’S Guide Release 3.0.2.1 E35680-12
Total Page:16
File Type:pdf, Size:1020Kb
Oracle® Health Sciences Omics Data Bank Programmer’s Guide Release 3.0.2.1 E35680-12 March 2016 Oracle Health Sciences Omics Data Bank Programmer’s Guide Release 3.0.2.1 E35680-12 Copyright © 2013, 2016, Oracle and/or its affiliates. All rights reserved. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government. This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group. This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services, except as set forth in an applicable agreement between you and Oracle. Contents Preface ................................................................................................................................................................. xi Audience....................................................................................................................................................... xi Disclaimer Regarding Third Party Data .................................................................................................. xi Documentation Accessibility..................................................................................................................... xi Finding Information and Patches on My Oracle Support..................................................................... xi Finding Documentation on Oracle Technology Network................................................................... xiii Related Documents ................................................................................................................................... xiii Conventions ............................................................................................................................................... xiv 1 Omics Data Model 1.1 Introduction ................................................................................................................................. 1-1 1.1.1 Reference Data...................................................................................................................... 1-2 1.1.2 Result Data............................................................................................................................ 1-3 1.2 Logical Data Model..................................................................................................................... 1-3 1.3 Reference Data Tables ................................................................................................................ 1-5 1.4 Result Data Tables ................................................................................................................... 1-12 1.4.1 Result Tables for Qualifier Metadata............................................................................. 1-16 1.4.2 Result Tables for Differential Expression...................................................................... 1-17 1.5 Table for Logging..................................................................................................................... 1-18 1.6 Aggregate Tables for Gene Expression................................................................................. 1-18 1.7 File and File Load Tables ........................................................................................................ 1-18 2 Prerequisites for Loading Data 2.1 Setting Up a Directory Object ................................................................................................... 2-1 2.2 Setting Up an Oracle Wallet ...................................................................................................... 2-2 2.3 Setting Up User Privileges for Querying or Loading Data................................................... 2-3 2.4 Integration with Oracle Health Sciences Cohort Explorer Data Model or Another External Data Model 2-4 2.4.1 Specimen and Vendor Number Requirement ................................................................ 2-6 2.5 Migrating W_EHA_RSLT_STUDY, W_EHA_SPEC_PATIENT and W_EHA_SPEC_ SUBJECT Tables 2-6 2.6 Reference Version Compatibility.............................................................................................. 2-7 2.7 Handling Newline Characters in Input Files.......................................................................... 2-8 2.8 Periodically Purge the Recycle Bin........................................................................................... 2-8 v 3 Loaders for Reference Data 3.1 Ensembl and SwissProt Loaders............................................................................................... 3-1 3.1.1 Installing the Loaders.......................................................................................................... 3-1 3.1.2 Files to Load.......................................................................................................................... 3-2 3.1.3 Loading the Data.................................................................................................................. 3-3 3.1.4 Running the Embl/Swissprot Loader with Named Command-Line Arguments..... 3-6 3.1.5 Index-Organized Tables Loader ........................................................................................ 3-8 3.1.6 Gathering Optimizer Statistics........................................................................................... 3-9 3.2 HUGO Loader ............................................................................................................................. 3-9 3.2.1 Description and Files to Load ............................................................................................ 3-9 3.2.2 Running the Loader............................................................................................................. 3-9 3.2.3 Command-Line Argument List ...................................................................................... 3-10 3.3 GVF Ensembl Loader .............................................................................................................. 3-11 3.3.1 Description and Files to Load ......................................................................................... 3-12 3.3.2 Running the Loader.......................................................................................................... 3-12 3.3.3 Command-Line Argument List ...................................................................................... 3-12 3.3.4 Gathering Optimizer Statistics........................................................................................ 3-14 3.4 Pathway Loader ......................................................................................................................