Hortonworks Data Platform Apache Solr Search Installation (August 16, 2019)

Hortonworks Data Platform Apache Solr Search Installation (August 16, 2019) docs.cloudera.com Hortonworks Data Platform August 16, 2019 Hortonworks Data Platform: Apache Solr Search Installation Copyright © 2012-2019 Hortonworks, Inc. Some rights reserved. The Hortonworks Data Platform, powered by Apache Hadoop, is a massively scalable and 100% open source platform for storing, processing and analyzing large volumes of data. It is designed to deal with data from many sources and formats in a very quick, easy and cost-effective manner. The Hortonworks Data Platform consists of the essential set of Apache Hadoop projects including MapReduce, Hadoop Distributed File System (HDFS), HCatalog, Pig, Hive, HBase, ZooKeeper and Ambari. Hortonworks is the major contributor of code and patches to many of these projects. These projects have been integrated and tested as part of the Hortonworks Data Platform release process and installation and configuration tools have also been included. Unlike other providers of platforms built using Apache Hadoop, Hortonworks contributes 100% of our code back to the Apache Software Foundation. The Hortonworks Data Platform is Apache-licensed and completely open source. We sell only expert technical support, training and partner-enablement services. All of our technology is, and will remain free and open source. Please visit the Hortonworks Data Platform page for more information on Hortonworks technology. For more information on Hortonworks services, please visit either the Support or Training page. Feel free to Contact Us directly to discuss your specific needs. Except where otherwise noted, this document is licensed under Creative Commons Attribution ShareAlike 4.0 License. http://creativecommons.org/licenses/by-sa/4.0/legalcode ii Hortonworks Data Platform August 16, 2019 Table of Contents 1. Introduction ................................................................................................................. 1 2. HDP Search 5.0 ............................................................................................................ 2 2.1. HDP Search 5.0 Release Notes ........................................................................... 2 2.1.1. HDP Search 5.0 New Features ................................................................ 2 2.1.2. HDP Search 5.0 Behavioral Changes ........................................................ 2 2.1.3. HDP Search 5.0 Apache Solr Version Information .................................... 2 2.1.4. HDP Search 5.0 Known Issues ................................................................. 4 2.2. HDP Search 5.0 - Getting Ready ........................................................................ 5 2.2.1. HDP Search 5.0 Minimum System Requirements ..................................... 5 2.2.2. Using a Local Repository ........................................................................ 6 2.3. Installing HDP Search 5.0 Management Pack ..................................................... 8 2.4. Installing HDP Search 5.0 Manually ................................................................... 9 3. HDP Search 4.0 .......................................................................................................... 10 3.1. HDP Search 4.0 Release Notes ......................................................................... 10 3.1.1. HDP Search 4.0 New Features .............................................................. 10 3.1.2. HDP Search 4.0 Behavioral Changes ...................................................... 10 3.1.3. HDP Search 4.0 Apache Solr Version Information .................................. 11 3.1.4. HDP Search 4.0 Known Issues ............................................................... 11 3.2. HDP Search 4.0 - Getting Ready ...................................................................... 11 3.2.1. HDP Search 4.0 Minimum System Requirements .................................... 12 3.2.2. Using a Local Repository ....................................................................... 12 3.3. Installing HDP Search 4.0 Management Pack ................................................... 15 3.4. Installing HDP Search 4.0 Manually .................................................................. 16 4. HDP Search 3.0 .......................................................................................................... 17 4.1. HDP Search 3.0 Release Notes ......................................................................... 17 4.1.1. HDP Search 3.0 New Features .............................................................. 17 4.1.2. HDP Search 3.0 Behavioral Changes ...................................................... 17 4.1.3. HDP Search 3.0 Apache Solr Version Information .................................. 18 4.1.4. HDP Search 3.0 Known Issues ............................................................... 18 4.2. HDP Search 3.0 - Getting Ready ...................................................................... 18 4.2.1. HDP Search 3.0 Minimum System Requirements .................................... 18 4.2.2. Using a Local Repository ....................................................................... 19 4.3. Installing HDP Search 3.0 Management Pack ................................................... 22 4.4. Installing HDP Search 3.0 Manually .................................................................. 23 5. Upgrading HDP Search ............................................................................................... 26 6. Applying Minor Upgrades to HDP Search ................................................................... 27 iii Hortonworks Data Platform August 16, 2019 List of Tables 3.1. HDP Search ............................................................................................................. 10 3.2. HDP Search ............................................................................................................. 10 4.1. HDP Search ............................................................................................................. 17 4.2. HDP Search ............................................................................................................. 17 iv Hortonworks Data Platform August 16, 2019 1. Introduction HDP Search is a full-text search server, designed for enterprise-level performance, flexibility, scalability, and fault-tolerance. HDP Search exposes REST-like HTTP/XML and JSON APIs for use with a wide range of programming languages. HDP Search includes: • Apache Solr 7.4.0 (for HDP Search 5.0) • Banana 1.6.12 • JARs for integration with Hadoop, Spark, Hive and Pig. Note HDP Search is a separate product, not packaged with the HDP platform. The high-level steps for using HDP search are as follows: 1. Install and deploy HDP Search, either manually or by using Ambari. 2. Ingest documents from sources such as HDFS. 3. Index the data. Documents, and updates to documents, will be available for search almost immediately after being indexed. 4. Perform a wide range of basic and advanced operations on the indexed documents. Resources: • This document describes software requirements for HDP Search, followed by installation instructions for specific operating systems. • For information about using Solr, see the Solr Reference Guide and the Solr tutorial. 1 Hortonworks Data Platform August 16, 2019 2. HDP Search 5.0 2.1. HDP Search 5.0 Release Notes The HDP Search 5.0 Release Notes summarize and describe the following information released in HDP Search 5.0: • HDP Search 5.0 New Features [2] • HDP Search 5.0 Behavioral Changes [2] • HDP Search 5.0 Apache Solr Version Information [2] • HDP Search 5.0 Known Issues [4] Note HDP Search is a separate product and does NOT come with the HDP platform. 2.1.1. HDP Search 5.0 New Features HDP 5.0 includes no significant new features. 2.1.2. HDP Search 5.0 Behavioral Changes HDP Search 5.0 introduces no significant behavioral changes. 2.1.3. HDP Search 5.0 Apache Solr Version Information HDP Search 5.0 is based on Apache Solr 7.4 (Apache Solr Release Notes). The following patches are applied on top of Apache Solr 7.4. Apache Patch Link to the Jira SOLR-12343 https://issues.apache.org/jira/browse/SOLR-12343 JSON Field Facet refinement can return incorrect counts/stats for sorted buckets. SOLR-12516 https://issues.apache.org/jira/browse/SOLR-12516 JSON range facets can incorrectly refine subfacets for buckets SOLR-12541 https://issues.apache.org/jira/browse/SOLR-12541 Metrics handler throws an error if there are transient cores. SOLR-12594 https://issues.apache.org/jira/browse/SOLR-12594 MetricsHistoryHandler.getOverseerLeader fails when hostname contains hyphen. SOLR-12683 https://issues.apache.org/jira/browse/SOLR-12683 HashQuery will throw an exception if more than 4 partitionKeys is specified. SOLR-12704 https://issues.apache.org/jira/browse/SOLR-12704 Guard AddSchemaFieldsUpdateProcessorFactory against null field names and field values. SOLR-12750 https://issues.apache.org/jira/browse/SOLR-12750 Migrate API should lock the collection instead of shared. 2 Hortonworks Data Platform August 16, 2019 Apache Patch Link to the Jira SOLR-12765 https://issues.apache.org/jira/browse/SOLR-12765 Incorrect format of JMX cache stats. SOLR-12836 https://issues.apache.org/jira/browse/SOLR-12836 ZkController creates a cloud solr client with no connection or read timeouts. SOLR-12615 https://issues.apache.org/jira/browse/SOLR-12615 HashQParserPlugin won't throw an NPE for string hash key and documents with empty

Load more