Architecting HBase Applications: A Guidebook for Successful Development and Design

Jean-Marc Spaggiari and Kevin O’Dell

Beijing, Boston, Farnham, Sebastopol, Tokyo

Architecting HBase Applications
by Jean-Marc Spaggiari and Kevin O’Dell

Copyright © 2016 Jean-Marc Spaggiari and Kevin O’Dell. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or [email protected].

Editor: Marie Beaugureau
Production Editor: Nicholas Adams
Copyeditor: Jasmine Kwityn
Proofreader: Amanda Kersey
Indexer: WordCo Indexing Services, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

August 2016: First Edition

Revision History for the First Edition
2016-07-14: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781491915813 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Architecting HBase Applications, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work.
Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-91581-3
[LSI]

To my father. I wish you could have seen it...
- Jean-Marc Spaggiari

To my mother, who I think about every day; my father, who has always been there for me; and my beautiful wife Melanie and daughter Scotland, for putting up with all my complaining and the extra long hours.
- Kevin O’Dell

Table of Contents

Foreword
Preface

Part I. Introduction to HBase

1. What Is HBase?
    Column-Oriented Versus Row-Oriented
    Implementation and Use Cases

2. HBase Principles
    Table Format
    Table Layout
    Table Storage
    Internal Table Operations
    Compaction
    Splits (Auto-Sharding)
    Balancing
    Dependencies
    HBase Roles
    Master Server
    RegionServer
    Thrift Server
    REST Server

3. HBase Ecosystem
    Monitoring Tools
    Cloudera Manager
    Apache Ambari
    Hannibal
    SQL
    Apache Phoenix
    Apache Trafodion
    Splice Machine
    Honorable Mentions (Kylin, Themis, Tephra, Hive, and Impala)
    Frameworks
    OpenTSDB
    Kite
    HappyBase
    AsyncHBase

4. HBase Sizing and Tuning Overview
    Hardware
    Storage
    Networking
    OS Tuning
    Hadoop Tuning
    HBase Tuning
    Different Workload Tuning

5. Environment Setup
    System Requirements
    Operating System
    Virtual Machine
    Resources
    Java
    HBase Standalone Installation
    HBase in a VM
    Local Versus VM
    Local Mode
    Virtual Linux Environment
    QuickStart VM (or Equivalent)
    Troubleshooting
    IP/Name Configuration
    Access to the /tmp Folder
    Environment Variables
    Available Memory
    First Steps
    Basic Operations
    Import Code Examples
    Testing the Examples
    Pseudodistributed and Fully Distributed

Part II. Use Cases

6. Use Case: HBase as a System of Record
    Ingest/Pre-Processing
    Processing/Serving
    User Experience

7. Implementation of an Underlying Storage Engine
    Table Design
    Table Schema
    Table Parameters
    Implementation
    Data conversion
    Generate Test Data
    Create Avro Schema
    Implement MapReduce Transformation
    HFile Validation
    Bulk Loading
    Data Validation
    Table Size
    File Content
    Data Indexing
    Data Retrieval
    Going Further

8. Use Case: Near Real-Time Event Processing
    Ingest/Pre-Processing
    Near Real-Time Event Processing
    Processing/Serving

9. Implementation of Near Real-Time Event Processing
    Application Flow
    Kafka
    Flume
    HBase
    Lily
    Solr
    Implementation
    Data Generation
    Kafka
    Flume
    Serializer
    HBase
    Lily
    Solr
    Testing
    Going Further

10. Use Case: HBase as a Master Data Management Tool
    Ingest
    Processing

11. Implementation of HBase as a Master Data Management Tool
    MapReduce Versus Spark
    Get Spark Interacting with HBase
    Run Spark over an HBase Table
    Calling HBase from Spark
    Implementing Spark with HBase
    Spark and HBase: Puts
    Spark on HBase: Bulk Load
    Spark Over HBase
    Going Further

12. Use Case: Document Store
    Serving
    Ingest
    Clean Up

13. Implementation of Document Store
    MOBs
    Storage
    Usage
    Too Big
    Consistency
    Going Further

Part III. Troubleshooting

14. Too Many Regions
    Consequences
    Causes
    Misconfiguration
    Misoperation
    Solution
    Before 0.98
    Starting with 0.98
    Prevention
    Regions Size
    Key and Table Design

15. Too Many Column Families
    Consequences
    Memory
    Compactions
    Split
    Causes, Solution, and Prevention
    Delete a Column Family
    Merge a Column Family
    Separate a Column Family into a New Table

16. Hotspotting
    Consequences
    Causes
    Monotonically Incrementing Keys
    Poorly Distributed Keys
    Small Reference Tables
    Applications Issues
    Meta Region Hotspotting
    Prevention and Solution

17. Timeouts and Garbage Collection
    Consequences
    Causes
    Storage Failure
    Power-Saving Features
    Network Failure
    Solutions
    Prevention
    Reduce Heap Size
    Off-Heap BlockCache
    Using the G1GC Algorithm
    Configure Swappiness to 0 or 1
    Disable Environment-Friendly Features
    Hardware Duplication

18. HBCK and Inconsistencies
    HBase Filesystem Layout
    Reading META
    Reading HBase on HDFS
    General HBCK Overview
    Using HBCK

Index

Foreword

Throughout my 25 years in the software industry, I have experienced many disruptive changes: the Internet, the World Wide Web, mainframes, the client–server model, and more.

I once worked on a team that was implementing software to make oil refineries safer. Our 40-person team shared a single DEC VAX, a machine no more powerful than the cell phone I use today.

I still remember a day in the early 90s when we were scheduled to receive new machines. The machines being replaced were located on the third floor, and were large, heavy, washing machine–sized behemoths.
A bunch of us waited inside the “machine room” to see how they would heave the new machines up all those flights of stairs. We imagined a giant crane, the street outside being blocked off…in short, a big operation! But what actually happened was quite different. A man entered the room carrying a small box under his arm. He placed it on top of one of the old “washing machines,” switched some cables around, did some tests, and left.

That was it? Wow. Things change!

That is the joy of being part of the tech industry: if we are willing to learn new things and move with it, we will never be bored, and will never cease to be amazed. What seemed impossible just a few years ago is suddenly commonplace.

Big data is such a change. Big data is everywhere. The revolution started by Google with the Google File System and BigTable has now reached almost every tech company, bank, government, and startup. Directly or indirectly, for better or for worse, these systems touch the lives of almost every human being on the planet.

Apache HBase, like BigTable before it, has a unique place in this ecosystem: it provides an updatable view into essentially unlimited datasets stored in an immutable, distributed filesystem. As such, it bridges the gap between pure file storage and OLTP/OLAP databases.

HBase is everywhere: Facebook, Apple, Salesforce.com, Adobe, Yahoo!, Bloomberg, Huawei, The Gap, and many other companies use it. Google adopted the HBase API for its public cloud BigTable offering, a testament to the popularity of HBase.

Despite its ubiquity, HBase is not plug and play. Distributed systems are hard. Terms such as partition tolerance, consistency, and availability inevitably creep into every discussion, soon followed by even more esoteric terms such as hotspotting and salting. Scaling to hundreds or thousands of machines requires painful trade-offs, and these trade-offs make it harder to use these systems optimally. HBase is no exception.
In my years in the HBase and Hadoop communities, I have experienced these challenges firsthand. Use cases must be designed and architected carefully in order to play to the strengths of HBase.

This book, written by two insiders who have been on the ground supporting customers, is a much needed guide that details how to architect applications that will work well with HBase and that will scale to thousands of machines.

If you are building or are planning to build new applications that need highly scalable and reliable storage, this book is for you.