Hortonworks Data Platform for HDInsight
Date of Publish: 2020-07-20


Release Notes for HDInsight
https://docs.hortonworks.com

Contents

HDP 2.6.5 Release Notes
    Component Versions
    New Features
    Deprecation Notices
        Terminology
    Unsupported Features
        Technical Preview Features
    Behavioral Changes
    Apache Patch Information
        Accumulo
        Atlas
        Calcite
        DataFu
        Falcon
        Flume
        Hadoop
        HBase
        Hive
        Kafka
        Mahout
        Knox
        Oozie
        Phoenix
        Pig
        Ranger
        Spark
        Livy
        Sqoop
        Storm
        Slider
        Tez
        Zeppelin
        ZooKeeper
    Fixed Common Vulnerabilities and Exposures
    Fixed Issues
    HDInsight Fixed Issues
    Known Issues
    HDInsight Known Issues
    Documentation Errata
    Legal Information

HDP 2.6.5 Release Notes

This document provides you with the latest information about the Hortonworks Data Platform (HDP) 2.6.5 release and its product documentation.

Component Versions

List of the official Apache component versions for this version of Hortonworks Data Platform (HDP) specific for this HDInsight release.
The official Apache versions of all HDP 2.6.5 components are listed below. All components listed here are official Apache releases of the most recent stable versions available.

The Cloudera approach is to provide patches only when necessary, to ensure the interoperability of components. Unless you are explicitly directed by Hortonworks Support to take a patch update, each of the HDP components should remain at the following package version levels, to ensure a certified and supported copy of HDP 2.6.5.

Official Apache versions for HDP 2.6.5:

• Apache Accumulo 1.7.0
• Apache Atlas 0.8.0
• Apache Calcite 1.2.0
• Apache DataFu 1.3.0
• Apache Falcon 0.10.0
• Apache Flume 1.5.2 (deprecated)
• Apache Hadoop 2.7.3
• Apache HBase 1.1.2
• Apache Hive 1.2.1
• Apache Hive 2.1.0
• Apache Kafka 1.1.0
• Apache Knox 0.12.0
• Apache Mahout 0.9.0+ (deprecated)
• Apache Oozie 4.2.0
• Apache Phoenix 4.7.0
• Apache Pig 0.16.0
• Apache Ranger 0.7.0
• Apache Slider 0.92.0 (deprecated)
• Apache Spark 1.6.3
• Apache Spark 2.3.2
• Apache Sqoop 1.4.6
• Apache Storm 1.1.0
• Apache TEZ 0.7.0
• Apache Zeppelin 0.7.3
• Apache ZooKeeper 3.4.6

Later versions of a few Apache components are sometimes bundled in the HDP distribution in addition to the versions listed above. In this case, these later versions are listed in the Technical Preview Features table and should not substitute for the Apache component versions of the above list in a production environment.

Additional component versions:

• Cascading 3.0.0 (deprecated)
• Druid 0.10.1
• Hue 2.6.1 (deprecated)

Note: For information on open source software licensing and notices, please refer to the Licenses and Notices files included with the software install package.

New Features

This section highlights new features in HDP 2.6.5.

Deprecation Notices

This section points out any technology from previous releases that has been deprecated, moved, or removed from this release.
Use this section as a guide for your implementation plans.

Terminology

Items in this section are designated as follows:

Deprecated
    Technology that Hortonworks is removing in a future HDP release. Marking an item as deprecated gives you time to plan for removal in a future HDP release.

Moving
    Technology that Hortonworks is moving from a future HDP release and is making available through an alternative Hortonworks offering or subscription. Marking an item as moving gives you time to plan for removal in a future HDP release and plan for the alternative Hortonworks offering or subscription for the technology.

Removed
    Technology that Hortonworks has removed from HDP and is no longer available or supported as of this release. Take note of technology marked as removed since it can potentially affect