Kunpeng BoostKit for Big Data

Porting Guide (CDH)

Issue 08 Date 2021-07-13

HUAWEI TECHNOLOGIES CO., LTD.

Copyright © Huawei Technologies Co., Ltd. 2021. All rights reserved. No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice

The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.


Contents

1 Avro-1.8.2-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)
  1.1 Introduction
  1.2 Environment Requirements
  1.3 Configuring the Compilation Environment
    1.3.1 Installing Basic Libraries
    1.3.2 Installing OpenJDK
    1.3.3 Installing Maven
    1.3.4 Installing Ant
    1.3.5 Installing Forrest
  1.4 Compiling Avro
  1.5 Troubleshooting
    1.5.1 Certificate Error Reported After git clone Is Executed
    1.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget
2 Flume-ng-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)
  2.1 Introduction
  2.2 Environment Requirements
  2.3 Configuring the Compilation Environment
    2.3.1 Configuring the Local Yum Source
    2.3.2 Installing OpenJDK
    2.3.3 Installing Maven
  2.4 Compiling Flume-NG
  2.5 Troubleshooting
    2.5.1 Certificate Error Reported After git clone Is Executed
    2.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget
3 Parquet-format-2.4.0-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)
  3.1 Introduction
  3.2 Environment Requirements
  3.3 Configuring the Compilation Environment
    3.3.1 Installing Basic Libraries
    3.3.2 Installing OpenJDK
    3.3.3 Installing Maven
    3.3.4 Installing Thrift


  3.4 Compiling Parquet-format
  3.5 Troubleshooting
    3.5.1 Certificate Error Reported After git clone Is Executed
    3.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget
4 Parquet-mr-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)
  4.1 Introduction
  4.2 Environment Requirements
  4.3 Configuring the Compilation Environment
    4.3.1 Installing Basic Libraries
    4.3.2 Installing OpenJDK
    4.3.3 Installing Maven
    4.3.4 Installing Protobuf
    4.3.5 Installing Thrift
  4.4 Compiling Parquet-MR
  4.5 Troubleshooting
    4.5.1 Certificate Error Reported After git clone Is Executed
    4.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget
5 Sentry-2.1.0-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)
  5.1 Introduction
  5.2 Environment Requirements
  5.3 Configuring the Compilation Environment
    5.3.1 Configuring the Local Yum Source
    5.3.2 Installing OpenJDK
    5.3.3 Installing Maven
  5.4 Compiling Sentry
  5.5 Troubleshooting
    5.5.1 Certificate Error Reported After git clone Is Executed
    5.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget
6 Solr-7.4.0-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)
  6.1 Introduction
  6.2 Environment Requirements
  6.3 Configuring the Compilation Environment
    6.3.1 Configuring the Local Yum Source
    6.3.2 Installing OpenJDK
    6.3.3 Installing Maven
    6.3.4 Installing Ant
  6.4 Compiling Solr
  6.5 Troubleshooting
    6.5.1 Certificate Error Reported After git clone Is Executed
    6.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget
7 Hive-1.1.0-cdh5.13.3 Porting Guide (CentOS 7.6 & openEuler 20.03)


  7.1 Introduction
  7.2 Environment Requirements
  7.3 Configuring the Compilation Environment
    7.3.1 Configuring the Local Yum Source
    7.3.2 Installing OpenJDK
    7.3.3 Installing Maven
  7.4 Performing Porting Analysis
  7.5 Compiling Hive
  7.6 Troubleshooting
    7.6.1 Compilation Error: "Could not find artifact com.google.protobuf:protoc:exe:linux-aarch_64:2.5.0"
8 Spark-1.6.0-cdh5.13.3 Porting Guide (CentOS 7.6 & openEuler 20.03)
  8.1 Introduction
  8.2 Environment Requirements
  8.3 Configuring the Compilation Environment
    8.3.1 Installing Basic Libraries
    8.3.2 Installing OpenJDK
    8.3.3 Installing Maven
    8.3.4 Installing the R Language
  8.4 Performing Porting Analysis
  8.5 Compiling Spark
  8.6 Troubleshooting
    8.6.1 "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed" Is Reported During the Spark Compilation
    8.6.2 "error: cannot compile a simple Fortran program" Reported During the R Language Compilation
    8.6.3 "configure: error: --with-x=yes (default) and X11 headers/libs are not available" Reported During the R Language Compilation
    8.6.4 "/usr/bin/install: cannot stat 'NEWS.pdf': No such file or directory" Reported During the R Language Compilation
9 Avro-1.7.6-cdh5.12.1 Porting Guide (CentOS 7.6)
  9.1 Introduction
  9.2 Environment Requirements
  9.3 Configuring the Compilation Environment
    9.3.1 Installing Basic Libraries
    9.3.2 Installing OpenJDK
    9.3.3 Installing Maven
    9.3.4 Installing Ant
    9.3.5 Installing Forrest
  9.4 Compiling Avro
  9.5 Troubleshooting
    9.5.1 Failed to Verify the github.com Certificate Downloaded by Using wget
10 Bigtop-jsvc-1.0.10-cdh5.12.1 Porting Guide (CentOS 7.6)
  10.1 Introduction


  10.2 Environment Requirements
  10.3 Configuring the Compilation Environment
    10.3.1 Installing Basic Libraries
    10.3.2 Installing OpenJDK
  10.4 Compiling bigtop-jsvc
11 Flume-ng-1.6.0-cdh5.12.1 Porting Guide (CentOS 7.6)
  11.1 Introduction
  11.2 Environment Requirements
  11.3 Configuring the Compilation Environment
    11.3.1 Installing Basic Libraries
    11.3.2 Installing OpenJDK
    11.3.3 Installing Maven
  11.4 Compiling Flume-NG
  11.5 Troubleshooting
    11.5.1 Failed to Verify the github.com Certificate Downloaded by Using wget
    11.5.2 "java.lang.OutOfMemoryError: PermGen space" Is Reported During Compilation
12 Hadoop-2.6.0-cdh5.12.1 Porting Guide (CentOS 7.6)
  12.1 Introduction
  12.2 Environment Requirements
  12.3 Configuring the Compilation Environment
    12.3.1 Installing Basic Libraries
    12.3.2 Installing OpenJDK
    12.3.3 Installing Maven
    12.3.4 Installing Ant
    12.3.5 Installing Protobuf
  12.4 Compiling Hadoop
  12.5 Troubleshooting
    12.5.1 Failed to Verify the github.com Certificate Downloaded by Using wget
    12.5.2 Failed to Compile or Download Hadoop
13 HBase-1.2.0-cdh5.12.1 Porting Guide (CentOS 7.6)
  13.1 Introduction
  13.2 Environment Requirements
  13.3 Configuring the Compilation Environment
    13.3.1 Installing Basic Libraries
    13.3.2 Installing OpenJDK
    13.3.3 Installing Maven
    13.3.4 Installing Protobuf
  13.4 Compiling HBase
  13.5 Troubleshooting
    13.5.1 Can't find bundle for base name org.apache.jasper.resources.LocalStrings
    13.5.2 could not be resolved: Could not transfer artifact XXX


14 HBase-indexer-1.5-cdh5.12.1 Porting Guide (CentOS 7.6)
  14.1 Introduction
  14.2 Environment Requirements
  14.3 Configuring the Compilation Environment
    14.3.1 Installing Basic Libraries
    14.3.2 Installing OpenJDK
    14.3.3 Installing Maven
  14.4 Compiling HBase-indexer
  14.5 Troubleshooting
    14.5.1 org.apache.rat:apache-rat-plugin:0.8:check (default) on project hbase: Too many unapproved license
    14.5.2 java.lang.OutOfMemoryError: PermGen space
15 Hive-1.1.0-cdh5.12.1 Porting Guide (CentOS 7.6)
  15.1 Introduction
  15.2 Environment Requirements
  15.3 Configuring the Compilation Environment
    15.3.1 Installing Basic Libraries
    15.3.2 Installing OpenJDK
    15.3.3 Installing Maven
    15.3.4 Installing Protobuf
  15.4 Compiling Hive
  15.5 Troubleshooting
    15.5.1 Compilation Error: "Could not find artifact com.google.protobuf:protoc:exe:linux-aarch_64:2.5.0"
16 Hue-3.9.0-cdh5.12.1 Porting Guide (CentOS 7.6)
  16.1 Introduction
  16.2 Environment Requirements
  16.3 Configuring the Compilation Environment
    16.3.1 Installing Basic Libraries
    16.3.2 Installing OpenJDK
    16.3.3 Installing Maven
    16.3.4 Installing Dependent Components
  16.4 Compiling Hue
17 Kafka-cdh5-0.10.2_2.2.0 Porting Guide (CentOS 7.6)
  17.1 Introduction
  17.2 Environment Requirements
  17.3 Configuring the Compilation Environment
    17.3.1 Installing Basic Libraries
    17.3.2 Installing OpenJDK
    17.3.3 Installing Maven
    17.3.4 Installing Gradle
  17.4 Compiling Kafka


18 Kite-1.0.0-cdh5.12.1 Porting Guide (CentOS 7.6)
  18.1 Introduction
  18.2 Environment Requirements
  18.3 Configuring the Compilation Environment
    18.3.1 Installing Basic Libraries
    18.3.2 Installing OpenJDK
    18.3.3 Installing Maven
    18.3.4 Installing Protobuf
  18.4 Compiling Kite
19 Lucene-solr-cdh5.12.1 Porting Guide (CentOS 7.6)
  19.1 Introduction
  19.2 Environment Requirements
  19.3 Configuring the Compilation Environment
    19.3.1 Installing Basic Libraries
    19.3.2 Installing OpenJDK
    19.3.3 Installing Maven
    19.3.4 Installing Ant
  19.4 Compiling Solr
20 Oozie-4.1.0-cdh5.12.1 Porting Guide (CentOS 7.6)
  20.1 Introduction
  20.2 Environment Requirements
  20.3 Configuring the Compilation Environment
    20.3.1 Installing Basic Libraries
    20.3.2 Installing OpenJDK
  20.4 Porting Oozie
21 Pig-0.12.0-cdh5.12.1 Porting Guide (CentOS 7.6)
  21.1 Introduction
  21.2 Environment Requirements
  21.3 Configuring the Compilation Environment
    21.3.1 Installing Basic Libraries
    21.3.2 Installing OpenJDK
    21.3.3 Installing Maven
    21.3.4 Installing Ant
    21.3.5 Installing Forrest
  21.4 Compiling Pig
  21.5 Troubleshooting
    21.5.1 Failed to Download ivy-2.2.0.jar Due to Connection Timeout
    21.5.2 "GC overhead limit exceeded" Is Displayed During Compilation
22 Spark-1.6.0-cdh5.12.1 Porting Guide (CentOS 7.6)
  22.1 Introduction
  22.2 Environment Requirements


  22.3 Configuring the Compilation Environment
    22.3.1 Installing Basic Libraries
    22.3.2 Installing OpenJDK
    22.3.3 Installing Maven
    22.3.4 Installing the R Language
  22.4 Compiling Spark
  22.5 Troubleshooting
    22.5.1 "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed" Is Reported During the Spark Compilation
    22.5.2 "error: cannot compile a simple Fortran program" Reported During the R Language Compilation
    22.5.3 "configure: error: --with-x=yes (default) and X11 headers/libs are not available" Reported During the R Language Compilation
    22.5.4 "/usr/bin/install: cannot stat 'NEWS.pdf': No such file or directory" Reported During the R Language Compilation
23 Sqoop-1.4.6-cdh5.12.1 Porting Guide (CentOS 7.6)
  23.1 Introduction
  23.2 Environment Requirements
  23.3 Configuring the Compilation Environment
    23.3.1 Installing Basic Libraries
    23.3.2 Installing OpenJDK
    23.3.3 Installing Maven
    23.3.4 Installing Ant
  23.4 Compiling Sqoop
  23.5 Troubleshooting
    23.5.1 Error Message "asciidoc: command not found" Is Displayed During Compilation
24 ZooKeeper-3.4.5-cdh5.12.1 Porting Guide (CentOS 7.6)
  24.1 Introduction
  24.2 Environment Requirements
  24.3 Configuring the Compilation Environment
    24.3.1 Installing Basic Libraries
    24.3.2 Installing OpenJDK
    24.3.3 Installing Maven
    24.3.4 Installing Ant
  24.4 Compiling ZooKeeper
  24.5 Troubleshooting
    24.5.1 Error Message "configure.ac:37: error: possibly undefined macro: AM_PATH_CPPUNIT" Is Displayed During Compilation
25 Hadoop-3.0.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  25.1 Introduction
  25.2 Environment Requirements
  25.3 Configuring the Compilation Environment
    25.3.1 Installing Basic Libraries


    25.3.2 Installing OpenJDK
    25.3.3 Installing Maven
    25.3.4 Installing CMake
    25.3.5 Installing Protobuf
  25.4 Compiling Hadoop
26 HBase-2.1.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  26.1 Introduction
  26.2 Environment Requirements
  26.3 Configuring the Compilation Environment
    26.3.1 Installing Basic Libraries
    26.3.2 Installing OpenJDK
    26.3.3 Installing Maven
    26.3.4 Installing Protobuf
  26.4 Compiling HBase
27 Hive-2.1.1-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  27.1 Introduction
  27.2 Environment Requirements
  27.3 Configuring the Compilation Environment
    27.3.1 Installing Basic Libraries
    27.3.2 Installing OpenJDK
    27.3.3 Installing Maven
    27.3.4 Installing Protobuf
  27.4 Compiling Hive
28 Spark-2.4.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  28.1 Introduction
  28.2 Environment Requirements
  28.3 Configuring the Compilation Environment
    28.3.1 Installing Basic Libraries
    28.3.2 Installing OpenJDK
    28.3.3 Installing Maven
    28.3.4 Installing the R Language
  28.4 Compiling Spark
  28.5 Troubleshooting
    28.5.1 "error: cannot compile a simple Fortran program" Reported During the R Language Compilation
    28.5.2 "configure: error: --with-x=yes (default) and X11 headers/libs are not available" Reported During the R Language Compilation
    28.5.3 "/usr/bin/install: cannot stat 'NEWS.pdf': No such file or directory" Reported During the R Language Compilation
    28.5.4 "git clone: error: RPC failed; result=18, HTTP code = 200" Is Reported During Source Code Downloading by Using git clone
29 ZooKeeper-3.4.6-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)


  29.1 Introduction
  29.2 Environment Requirements
  29.3 Configuring the Compilation Environment
    29.3.1 Configuring the Local Yum Source
    29.3.2 Installing OpenJDK
    29.3.3 Installing Maven
    29.3.4 Installing Ant
  29.4 Compiling ZooKeeper
  29.5 Troubleshooting
    29.5.1 Error Message "configure.ac:37: error: possibly undefined macro: AM_PATH_CPPUNIT" Is Displayed During Compilation
30 Avro-1.8.2-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  30.1 Introduction
  30.2 Environment Requirements
  30.3 Configuring the Compilation Environment
    30.3.1 Installing Basic Libraries
    30.3.2 Installing OpenJDK
    30.3.3 Installing Maven
    30.3.4 Installing Ant
    30.3.5 Installing Forrest
  30.4 Compiling Avro
31 Flume-ng-1.9.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  31.1 Introduction
  31.2 Environment Requirements
  31.3 Configuring the Compilation Environment
    31.3.1 Configuring the Local Yum Source
    31.3.2 Installing OpenJDK
    31.3.3 Installing Maven
  31.4 Compiling Flume-NG
  31.5 Troubleshooting
    31.5.1 "java.lang.OutOfMemoryError: PermGen space" Reported During Compilation
32 HBase-indexer-1.5-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  32.1 Introduction
  32.2 Environment Requirements
  32.3 Configuring the Compilation Environment
    32.3.1 Configuring the Local Yum Source
    32.3.2 Installing OpenJDK
    32.3.3 Installing Maven
  32.4 Compiling HBase Indexer
33 Hue-4.4.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  33.1 Introduction


  33.2 Environment Requirements
  33.3 Configuring the Compilation Environment
    33.3.1 Installing Basic Libraries
    33.3.2 Installing OpenJDK
    33.3.3 Installing Maven
    33.3.4 Installing Dependent Components
    33.3.5 Installing CMake
  33.4 Compiling Hue
  33.5 Troubleshooting
    33.5.1 An Error Reported During Compilation
    33.5.2 "cannot uninstall 'enum34'" Reported During Installation
34 Kafka-2.11-2.2.1-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  34.1 Introduction
  34.2 Environment Requirements
  34.3 Configuring the Compilation Environment
    34.3.1 Configuring the Local Yum Source
    34.3.2 Installing OpenJDK
    34.3.3 Installing Maven
    34.3.4 Installing Gradle
  34.4 Compiling Kafka
35 Lucene-solr-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  35.1 Introduction
  35.2 Environment Requirements
  35.3 Configuring the Compilation Environment
    35.3.1 Configuring the Local Yum Source
    35.3.2 Installing OpenJDK
    35.3.3 Installing Maven
    35.3.4 Installing Ant
  35.4 Compiling Solr
36 Oozie-5.1.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  36.1 Introduction
  36.2 Environment Requirements
  36.3 Configuring the Compilation Environment
    36.3.1 Configuring the Local Yum Source
    36.3.2 Installing OpenJDK
    36.3.3 Installing Maven
  36.4 Compiling Oozie
37 Parquet-format-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  37.1 Introduction
  37.2 Environment Requirements
  37.3 Configuring the Compilation Environment


    37.3.1 Installing Basic Libraries
    37.3.2 Installing OpenJDK
    37.3.3 Installing Maven
    37.3.4 Installing Thrift
  37.4 Compiling Parquet-format
38 Parquet-mr-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  38.1 Introduction
  38.2 Environment Requirements
  38.3 Configuring the Compilation Environment
    38.3.1 Installing Basic Libraries
    38.3.2 Installing OpenJDK
    38.3.3 Installing Maven
    38.3.4 Installing Protobuf
    38.3.5 Installing Thrift
  38.4 Compiling Parquet-mr
39 Pig-0.17.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  39.1 Introduction
  39.2 Environment Requirements
  39.3 Configuring the Compilation Environment
    39.3.1 Configuring the Local Yum Source
    39.3.2 Installing OpenJDK
    39.3.3 Installing Maven
    39.3.4 Installing Ant
    39.3.5 Installing Forrest
  39.4 Compiling Pig
  39.5 Troubleshooting
    39.5.1 Failed to Download ivy-2.2.0.jar Due to Connection Timeout
    39.5.2 "GC overhead limit exceeded" Reported During Compilation
40 Search-1.0.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  40.1 Introduction
  40.2 Environment Requirements
  40.3 Configuring the Compilation Environment
    40.3.1 Configuring the Local Yum Source
    40.3.2 Installing OpenJDK
    40.3.3 Installing Maven
  40.4 Compiling Search
  40.5 Troubleshooting
    40.5.1 "java.lang.OutOfMemoryError: PermGen space" Reported During Compilation
41 Sentry-2.1.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)
  41.1 Introduction
  41.2 Environment Requirements


  41.3 Configuring the Compilation Environment
    41.3.1 Configuring the Local Yum Source
    41.3.2 Installing OpenJDK
    41.3.3 Installing Maven
  41.4 Compiling Sentry
A Change History


1 Avro-1.8.2-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)

1.1 Introduction
1.2 Environment Requirements
1.3 Configuring the Compilation Environment
1.4 Compiling Avro
1.5 Troubleshooting

1.1 Introduction

Avro provides data serialization and data exchange services for big data systems and applications. It supports binary serialization, which allows large amounts of data to be processed simply and quickly, and it integrates with dynamic languages so that Avro data can be processed flexibly.

This document describes how to adapt the Avro component in CDH to TaiShan servers. For more information about CDH, visit https://www.cloudera.com/.

1.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet


Software Requirements

Item        Version
JDK         1.8.0_252
Maven       3.5.4
Ant         1.7.1
Forrest     0.9

CentOS

Item        Version
CentOS      7.6
OS kernel   4.14.0
GCC         4.8.5

openEuler

Item        Version
openEuler   20.03 LTS SP1
OS kernel   4.19.90
GCC         7.3.0
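Before starting, it can help to compare the build host with the tables above. The following is a minimal sketch and is not part of the original procedure; the function name report_build_env is illustrative. It prints the CPU architecture, the kernel release, and whether each build tool is already on the PATH.

```shell
#!/bin/sh
# Sketch: summarize the build host so that it can be compared with the
# requirement tables. report_build_env is an illustrative name.
report_build_env() {
    echo "arch=$(uname -m)"      # aarch64 is expected on a Kunpeng server
    echo "kernel=$(uname -r)"
    for tool in gcc java mvn ant; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool=found at $(command -v "$tool")"
        else
            echo "$tool=not found"
        fi
    done
}

report_build_env
```

A tool reported as "not found" is installed in section 1.3. For tools that are found, the version can then be checked with gcc --version, java -version, mvn -v, and ant -version.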

1.3 Configuring the Compilation Environment

1.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*


NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to delete each file.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install the GCC-related packages.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC). On AArch64, GCC treats plain char as unsigned by default, whereas on x86 it is signed. The wrapper created in this step passes -fsigned-char on every invocation so that code written for x86 behaves the same way.
1. Locate GCC. Generally, the path is /usr/bin/gcc.
    command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
    vi /usr/bin/gcc
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the new GCC file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
   On both CentOS and openEuler, the installation is successful if the GCC version information is displayed (4.8.5 on CentOS 7.6 and 7.3.0 on openEuler 20.03, as listed in section 1.2).

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate G++. Generally, the path is /usr/bin/g++.
    command -v g++
2. Rename the original G++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
    vi /usr/bin/g++
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the new G++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
   On both CentOS and openEuler, the installation is successful if the G++ version information is displayed.

----End
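The rename-and-wrap procedure in Step 6 and Step 7 can also be scripted. The following is a minimal sketch under stated assumptions: wrap_with_signed_char is an illustrative helper name, the wrapper uses exec (the guide's version calls the renamed binary directly), and the demonstration runs against a stand-in script in a scratch directory so that nothing under /usr/bin is touched.

```shell
#!/bin/sh
# Sketch: wrap a compiler so that every invocation adds -fsigned-char.
# wrap_with_signed_char is an illustrative helper, not part of the guide.
wrap_with_signed_char() {
    compiler=$1                      # for example /usr/bin/gcc
    impl="${compiler}-impl"
    # Refuse to wrap twice: renaming the wrapper itself to *-impl would
    # make it call itself in an endless loop.
    if [ -e "$impl" ]; then
        echo "$impl already exists; skipping" >&2
        return 0
    fi
    mv "$compiler" "$impl"
    printf '#!/bin/sh\nexec "%s" -fsigned-char "$@"\n' "$impl" > "$compiler"
    chmod +x "$compiler"
}

# Demonstration with a stand-in "compiler" that only echoes its arguments.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
printf '#!/bin/sh\necho "args: $*"\n' > "$demo_dir/gcc"
chmod +x "$demo_dir/gcc"

wrap_with_signed_char "$demo_dir/gcc"
"$demo_dir/gcc" -c hello.c           # prints: args: -fsigned-char -c hello.c
```

On a real host the calls would be wrap_with_signed_char /usr/bin/gcc and wrap_with_signed_char /usr/bin/g++, run as root. The guard on the -impl file makes the helper safe to run more than once.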

Installing Dependencies

Use the Yum source to install dependencies.
    yum install -y boost.aarch64 boost-devel.aarch64 make cmake wget openssl-devel zlib-devel automake libtool libstdc++-static glibc-static git snappy snappy-devel jansson-devel.aarch64 asciidoc.noarch doxygen

1.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package, and move it to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile
Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile


Step 4 Check whether OpenJDK is successfully installed.
    java -version
The installation is successful if the OpenJDK version information (1.8.0_252) is displayed.

----End

1.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v
The installation is successful if the Maven version information (3.5.4) is displayed.

Step 5 Modify the local repository path and the remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. (Replace the repository with the Maven repository that you have built. If you do not have one, use the following example.)
    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>https://mirrors.huaweicloud.com/repository/maven/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to the <proxies> tag in settings.xml:
    <proxy>
      <id>optional</id>
      <active>true</active>
      <protocol>http</protocol>
      <username>Username</username>
      <password>Password</password>
      <host>Proxy server URL</host>
      <port>Proxy server port</port>
      <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
    </proxy>
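After editing settings.xml, a quick text check can confirm that the mirror entry is present. The following is a minimal sketch, not part of the original procedure: settings_has_mirror is an illustrative helper name, and the scratch file assumes the standard Maven settings layout, in which a mirror element with id, name, url, and mirrorOf children sits inside mirrors.

```shell
#!/bin/sh
# Sketch: check that a Maven settings file declares a given mirror URL.
# settings_has_mirror is an illustrative helper name.
settings_has_mirror() {
    grep -qF "<url>$2</url>" "$1"
}

# Demonstration on a scratch file; on a real host the file would be
# /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.
demo_settings=$(mktemp)
trap 'rm -f "$demo_settings"' EXIT
cat > "$demo_settings" <<'EOF'
<settings>
  <mirrors>
    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>https://mirrors.huaweicloud.com/repository/maven/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>
  </mirrors>
</settings>
EOF

if settings_has_mirror "$demo_settings" "https://mirrors.huaweicloud.com/repository/maven/"; then
    echo "mirror configured"
else
    echo "mirror missing"
fi
```

This is only a text match. It does not prove that the file is well-formed XML or that the entry sits inside the mirrors element.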

----End

1.3.4 Installing Ant

Step 1 Download the installation package and install Ant to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.7.1-bin.tar.gz
    tar -zxf apache-ant-1.7.1-bin.tar.gz
    mv apache-ant-1.7.1 /opt/tools/installed/

Step 2 Modify environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export ANT_HOME=/opt/tools/installed/apache-ant-1.7.1
    export PATH=$ANT_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether the configuration takes effect.
    ant -version
    Apache Ant version 1.7.1 compiled on June 27 2008

----End

1.3.5 Installing Forrest

Step 1 Download and install Forrest to a specified directory, for example, /opt/tools/installed.
    wget http://archive.apache.org/dist/forrest/0.9/apache-forrest-0.9.tar.gz
    tar -zxf apache-forrest-0.9.tar.gz
    mv apache-forrest-0.9 /opt/tools/installed/

Step 2 Modify environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export FORREST_HOME=/opt/tools/installed/apache-forrest-0.9
    export PATH=$FORREST_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether the environment variables take effect.
    forrest -projecthelp

----End
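The Ant and Forrest steps above both edit /etc/profile by hand. As a minimal sketch, assuming you prefer to script the change, the hypothetical helper append_line_once (not part of this guide or of any tool) adds a line to a file only if the same line is not already there, so the step can be repeated safely:

```shell
#!/bin/sh
# append_line_once FILE LINE
# Hypothetical helper: appends LINE to FILE only if FILE does not
# already contain exactly that line.
append_line_once() {
    file=$1
    line=$2
    # -x: match the whole line, -F: fixed string (no regex), -q: no output
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (run as root; values taken from the steps above):
# append_line_once /etc/profile 'export ANT_HOME=/opt/tools/installed/apache-ant-1.7.1'
# append_line_once /etc/profile 'export PATH=$ANT_HOME/bin:$PATH'
```

Single quotes keep $ANT_HOME and $PATH unexpanded, so they are resolved when /etc/profile is sourced rather than when the helper runs.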

1.4 Compiling Avro

NOTE

This section describes the compilation for CDH 6.3.0. Use it as a reference when compiling other versions.

Prerequisites

You have downloaded and decompressed the Avro-cdh6.3.0 source package.

Procedure

Step 1 Go to the Avro-cdh6.3.0 source code directory.
    cd avro-cdh6.3.0-release

Step 2 Modify the pom.xml file.
    vim pom.xml

Add the Kunpeng Maven repository at line 62. (The Kunpeng repository must be listed first.)
    <repository>
      <id>kunpengmaven</id>
      <name>kunpeng maven</name>
      <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
    </repository>

Step 3 Modify the build.sh file.
    vim build.sh

1. Comment out lines 89 to 100 and line 115.
    89 #rm -rf build/${SRC_DIR}
    90 #if [ -d .svn ];
    ...
    100 #fi
    115 #(cd lang/py3; ./build.sh dist)

NOTE

In CentOS 7.6, the Python version is 2.7.5, so the Python 3 build must be removed from the script.

2. Add the following command at line 88:
    88 mkdir -p build/${SRC_DIR}
    ...

3. Comment out lines 121 to 131.
    121 #(cd lang/csharp; ./build.sh dist)
    122
    123 #(cd lang/js; ./build.sh dist)
    124
    125 #(cd lang/ruby; ./build.sh dist)
    126
    127 #(cd lang/php; ./build.sh dist)
    128
    129 #mkdir -p dist/perl
    130 #(cd lang/perl; perl ./Makefile.PL && make dist)
    131 #cp lang/perl/Avro-$VERSION.tar.gz dist/perl/
    ...

NOTE

C#, JavaScript, Ruby, PHP, and Perl are rarely used and do not need to be compiled.

4. Modify the docs compilation command at line 134 to add the Forrest configuration.
    134 (cd doc; ant -Dforrest.home=/opt/tools/installed/apache-forrest-0.9)

Step 4 Modify the lang/py/build.xml file.
    vim lang/py/build.xml

Change http://repo2.maven.org/maven2/... to https://mirrors.huaweicloud.com/repository/maven/....
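The line-range edits to build.sh in Step 3 can also be scripted instead of being made in vim. The following is a sketch only: comment_out_lines is a hypothetical helper, and the line numbers quoted in Step 3 should be checked against your copy of build.sh before it is used.

```shell
#!/bin/sh
# comment_out_lines FILE RANGE...
# Hypothetical helper: prefixes every line in each RANGE (for example
# "89,100" or "115") with '#'. All ranges are applied in one sed pass, so
# the numbers refer to the file as it is before the call.
comment_out_lines() {
    file=$1
    shift
    script=
    for range in "$@"; do
        script="${script}${range}s/^/#/;"
    done
    # -i.bak edits the file in place and keeps the original as FILE.bak
    sed -i.bak "$script" "$file"
}

# Example, using the line numbers quoted in Step 3:
# comment_out_lines build.sh 89,100 115 121,131
```

Because a backup is kept, the original script can be restored by copying build.sh.bak over build.sh if the line numbers turn out not to match.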

Procedure

Set http.sslVerify to false.

    git config --global http.sslVerify false

1.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget

Symptom

During compilation, wget reports that the certificate issued for github.com cannot be verified because the issuer's authority cannot be verified locally.

Procedure

To connect to github.com in an insecure manner, specify the --no-check-certificate parameter in the wget command.


2 Flume-ng-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)

2.1 Introduction
2.2 Environment Requirements
2.3 Configuring the Compilation Environment
2.4 Compiling Flume-NG
2.5 Troubleshooting

2.1 Introduction

Flume-NG is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. The system is centrally managed and supports intelligent dynamic management. It uses a simple, extensible data model that allows for online analytic applications.

Flume was originally launched by Cloudera. It later underwent landmark changes: the core components, core configuration, and code architecture were refactored, and the refactored version is collectively referred to as Flume NG (next generation). Another reason for the changes is that Flume was incorporated into Apache, and Cloudera Flume was renamed Apache Flume.

For more information about CDH, visit https://www.cloudera.com/.

2.2 Environment Requirements

Hardware Requirements

    Item              Remarks
    Server            TaiShan server
    CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
    Drive partition   No requirement for drive partitions
    Network           Accessible to the Internet

Software Requirements

    Item    Version
    JDK     1.8.0_252
    Maven   3.5.4

CentOS

    Item        Version
    CentOS      7.6
    OS kernel   4.14.0
    GCC         4.8.5

openEuler

    Item        Version
    openEuler   20.03 LTS SP1
    OS kernel   4.19.90
    GCC         7.3.0

2.3 Configuring the Compilation Environment

2.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual .iso package name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to delete each file.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vim /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install related software.
    yum -y install wget git
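If you would rather generate the repo file from Step 3 than type it in an editor, a here-document can write the same content. This is a sketch under stated assumptions: write_local_repo is a hypothetical helper name, and the mount point defaults to /media as used in Step 1.

```shell
#!/bin/sh
# write_local_repo DEST [MOUNT_POINT]
# Hypothetical helper: writes a Yum repo definition that points at a locally
# mounted OS image. MOUNT_POINT defaults to /media, as used in Step 1.
write_local_repo() {
    dest=$1
    mount_point=${2:-/media}
    cat > "$dest" <<EOF
[Local]
name=Local
baseurl=file://${mount_point}/
enabled=1
gpgcheck=0
EOF
}

# Example (run as root):
# write_local_repo /etc/yum.repos.d/Local.repo /media
```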

----End

2.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if the OpenJDK 1.8.0_252 version information is displayed.

----End

2.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile

Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v

The installation is successful if the Maven 3.5.4 version information is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to change this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. If you have built your own Maven repository, use it instead; otherwise, configure the repository based on the following example:

    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>https://mirrors.huaweicloud.com/repository/maven/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxies>
      <proxy>
        <id>optional</id>
        <active>true</active>
        <protocol>http</protocol>
        <username>Username</username>
        <password>Password</password>
        <host>Proxy server URL</host>
        <port>Proxy server port</port>
        <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
      </proxy>
    </proxies>

----End
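After editing settings.xml, a quick text check can confirm that the mirror URL was saved. The sketch below assumes the mirror is declared with a url element, as in a standard Maven settings.xml. The helper settings_has_mirror is hypothetical and only searches for text: it does not parse the XML or ignore commented-out sections.

```shell
#!/bin/sh
# settings_has_mirror FILE URL
# Hypothetical helper: succeeds if FILE contains a <url> element whose text
# is exactly URL. Plain text search only, not an XML parser.
settings_has_mirror() {
    grep -qF -- "<url>$2</url>" "$1"
}

# Example, using the paths and URL from Step 5:
# settings_has_mirror /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml \
#     https://mirrors.huaweicloud.com/repository/maven/ || echo "mirror not configured" >&2
```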

2.4 Compiling Flume-NG

NOTE

This section describes the compilation for CDH 6.3.0. Use it as a reference when compiling other versions.

Prerequisites

You have downloaded and decompressed the Flume-ng-cdh6.3.0 source package.

Procedure

Step 1 Go to the Flume-ng-cdh6.3.0 source code directory.
    cd flume-ng-cdh6.3.0-release

Step 2 Modify the pom.xml file by adding remote Maven repositories.
    vim pom.xml

Add the following three remote Maven repositories to the repositories tag. The Kunpeng Maven repository must be listed first.

    <repository>
      <id>Kunpeng.repo</id>
      <url>https://mirrors.huaweicloud.com/kunpeng/maven/</url>
      <name>Kunpeng Repositories</name>
    </repository>
    <repository>
      <id>huaweicloud.repo</id>
      <name>HuaweiCloud Repositories</name>
      <url>https://mirrors.huaweicloud.com/repository/maven</url>
    </repository>
    <repository>
      <id>wso2.repo</id>
      <url>http://maven.wso2.org/nexus/content/groups/wso2-public/</url>
      <name>wso2 Repositories</name>
    </repository>

Step 3 Perform the compilation.
    mvn package -DskipTests

The compilation is successful if BUILD SUCCESS is displayed at the end of the Maven output.

The compilation result is apache-flume-1.9.0-cdh6.3.0-bin.tar.gz in flume-ng-dist/target/.

Step 4 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled apache-flume-1.9.0-cdh6.3.0-bin.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see Kunpeng Porting Advisor Case Study.

----End
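Before running the Kunpeng Porting Advisor scan in Step 4, it can be useful to see which native libraries the tarball contains at the top level. The sketch below is a rough manual pre-check and not a substitute for the Porting Advisor: list_native_libs is a hypothetical helper, and it does not look inside JAR files nested in the tarball.

```shell
#!/bin/sh
# list_native_libs TARBALL
# Hypothetical helper: prints the paths of shared libraries (names ending in
# .so, optionally followed by a numeric version such as .so.1.2) stored in a
# gzip-compressed tarball. Files packed inside nested JARs are not inspected.
list_native_libs() {
    tar -tzf "$1" | grep -E '\.so(\.[0-9]+)*$'
}

# Example, using the package name from Step 3:
# list_native_libs flume-ng-dist/target/apache-flume-1.9.0-cdh6.3.0-bin.tar.gz
```

The function returns a non-zero status when no matching file is found, which makes it easy to use in a condition.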

2.5 Troubleshooting

2.5.1 Certificate Error Reported After git clone Is Executed

Symptom

In the compilation process, "fatal: unable to access 'https://github.com/ariya/phantomjs.git/': Peer's Certificate issuer is not recognized." is displayed.

Procedure

Set http.sslVerify to false.
    git config --global http.sslVerify false

2.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget

Symptom

During compilation, wget reports that the certificate issued for github.com cannot be verified because the issuer's authority cannot be verified locally.

Procedure

To connect to github.com in an insecure manner, specify the --no-check-certificate parameter in the wget command.


3 Parquet-format-2.4.0-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)

3.1 Introduction
3.2 Environment Requirements
3.3 Configuring the Compilation Environment
3.4 Compiling Parquet-format
3.5 Troubleshooting

3.1 Introduction

Parquet uses the columnar storage format and supports data nesting. The Parquet-format project, implemented in Java, defines all Parquet metadata objects (data types and storage formats in Parquet). Parquet metadata is serialized using Apache Thrift and stored at the end of the Parquet file.

This document describes how to adapt Parquet-format components in CDH to TaiShan servers.

For more information about CDH, visit https://www.cloudera.com/.

3.2 Environment Requirements

Hardware Requirements

    Item              Remarks
    Server            TaiShan server
    CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
    Drive partition   No requirement for drive partitions
    Network           Accessible to the Internet

Software Requirements

    Item     Version
    JDK      1.8.0_252
    Maven    3.5.4
    Thrift   0.9.3

CentOS

    Item        Version
    CentOS      7.6
    OS kernel   4.14.0
    GCC         4.8.5

openEuler

    Item        Version
    openEuler   20.03 LTS SP1
    OS kernel   4.19.90
    GCC         7.3.0

3.3 Configuring the Compilation Environment

3.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual .iso package name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to delete each file.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).

1. Locate the gcc executable. It is generally /usr/bin/gcc.
    command -v gcc
2. Rename the original gcc file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new gcc file.
    vi /usr/bin/gcc
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the gcc file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
   - CentOS: The installation is successful if the GCC 4.8.5 version information is displayed.
   - openEuler: The installation is successful if the GCC 7.3.0 version information is displayed.

Step 7 Resolve the -fsigned-char problem (by modifying G++).

1. Locate the g++ executable. It is generally /usr/bin/g++.
    command -v g++
2. Rename the original g++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new g++ file.
    vi /usr/bin/g++
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the g++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
   - CentOS: The installation is successful if the G++ 4.8.5 version information is displayed.
   - openEuler: The installation is successful if the G++ 7.3.0 version information is displayed.

----End
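On AArch64, GCC treats plain char as unsigned by default, whereas on x86 it is signed, which is why Steps 6 and 7 force -fsigned-char through a wrapper. The wrapper files can be generated instead of typed. The sketch below uses a hypothetical helper, make_flag_wrapper, and differs from the steps above in one detail: it uses exec, so the wrapper shell is replaced by the compiler process and its exit status is passed through unchanged.

```shell
#!/bin/sh
# make_flag_wrapper REAL_COMPILER WRAPPER_PATH
# Hypothetical helper: writes an executable wrapper at WRAPPER_PATH that runs
# REAL_COMPILER with -fsigned-char placed before the caller's own arguments.
make_flag_wrapper() {
    real=$1
    wrapper=$2
    # $real is expanded now; \$@ is written literally so that the wrapper
    # forwards its own arguments when it runs.
    cat > "$wrapper" <<EOF
#!/bin/sh
exec "$real" -fsigned-char "\$@"
EOF
    chmod +x "$wrapper"
}

# Example (run as root, after renaming the original binaries as in Steps 6 and 7):
# make_flag_wrapper /usr/bin/gcc-impl /usr/bin/gcc
# make_flag_wrapper /usr/bin/g++-impl /usr/bin/g++
```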

Installing Dependencies

Install dependencies using the Yum source.
    yum -y install wget make

3.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if the OpenJDK 1.8.0_252 version information is displayed.

----End

3.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile

Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v

The installation is successful if the Maven 3.5.4 version information is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to change this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. If you have built your own Maven repository, use it instead; otherwise, configure the repository based on the following example:

    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>https://mirrors.huaweicloud.com/repository/maven/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxies>
      <proxy>
        <id>optional</id>
        <active>true</active>
        <protocol>http</protocol>
        <username>Username</username>
        <password>Password</password>
        <host>Proxy server URL</host>
        <port>Proxy server port</port>
        <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
      </proxy>
    </proxies>

----End

3.3.4 Installing Thrift

Step 1 Download the Thrift package.
    wget http://archive.apache.org/dist/thrift/0.9.3/thrift-0.9.3.tar.gz

Step 2 Decompress the package.
    tar -xvf thrift-0.9.3.tar.gz

Step 3 Switch to the folder generated after the Thrift package is decompressed.
    cd thrift-0.9.3

Step 4 Add the execute permission to the configure file.
    chmod +x configure

Step 5 Install Thrift.
    ./configure --disable-gen-erl --disable-gen-hs --without-ruby --without-haskell --without-erlang
    make
    make install

Step 6 Run the following command to check whether Thrift has been successfully installed.
    thrift -version

----End
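The check in Step 6 can be turned into a pass/fail test in a build script. The sketch below assumes that thrift -version prints a single line ending in the version number (for example, "Thrift version 0.9.3"). Treat that format as an assumption and compare it with the output on your system. version_output_matches is a hypothetical helper.

```shell
#!/bin/sh
# version_output_matches OUTPUT EXPECTED
# Hypothetical helper: succeeds if the last whitespace-separated field on the
# last line of OUTPUT equals EXPECTED.
version_output_matches() {
    last=$(printf '%s\n' "$1" | awk '{ v = $NF } END { print v }')
    [ "$last" = "$2" ]
}

# Example, using the version from the Software Requirements table:
# version_output_matches "$(thrift -version)" 0.9.3 || echo "unexpected Thrift version" >&2
```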

3.4 Compiling Parquet-format

NOTE

This section describes the compilation for CDH 6.3.0. Use it as a reference when compiling other versions.

Prerequisites

You have downloaded and decompressed the Parquet-format-cdh6.3.0 source package.

Procedure

Step 1 Go to the Parquet-format-cdh6.3.0 source code directory.
    cd parquet-format-cdh6.3.0-release

Step 2 Perform the compilation.
    mvn package -DskipTests

The compilation is successful if BUILD SUCCESS is displayed at the end of the Maven output.

The compilation result is parquet-format-2.4.0-cdh6.3.0.jar in target/.

Step 3 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The generated package parquet-format-2.4.0-cdh6.3.0.jar must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End

3.5 Troubleshooting


3.5.1 Certificate Error Reported After git clone Is Executed

Symptom

In the compilation process, "fatal: unable to access 'https://github.com/ariya/phantomjs.git/': Peer's Certificate issuer is not recognized." is displayed.

Procedure

Set http.sslVerify to false.
    git config --global http.sslVerify false

3.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget

Symptom

During compilation, wget reports that the certificate issued for github.com cannot be verified because the issuer's authority cannot be verified locally.

Procedure

To connect to github.com in an insecure manner, specify the --no-check-certificate parameter in the wget command.


4 Parquet-mr-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)

4.1 Introduction
4.2 Environment Requirements
4.3 Configuring the Compilation Environment
4.4 Compiling Parquet-MR
4.5 Troubleshooting

4.1 Introduction

Parquet uses the columnar storage format and supports data nesting. As a subproject of Parquet, the Parquet-MR project implements the object model converters of Parquet, which map external object models to the internal data types of Parquet.

This document describes how to adapt Parquet-MR components in CDH to TaiShan servers.

For more information about CDH, visit https://www.cloudera.com/.

4.2 Environment Requirements

Hardware Requirements

    Item              Remarks
    Server            TaiShan server
    CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
    Drive partition   No requirement for drive partitions
    Network           Accessible to the Internet

Software Requirements

    Item     Version
    JDK      1.8.0_252
    Maven    3.5.4
    Thrift   0.9.3

CentOS

    Item        Version
    CentOS      7.6
    OS kernel   4.14.0
    GCC         4.8.5

openEuler

    Item        Version
    openEuler   20.03 LTS SP1
    OS kernel   4.19.90
    GCC         7.3.0

4.3 Configuring the Compilation Environment

4.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual .iso package name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to delete each file.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).

1. Locate the gcc executable. It is generally /usr/bin/gcc.
    command -v gcc
2. Rename the original gcc file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new gcc file.
    vi /usr/bin/gcc
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the gcc file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
   - CentOS: The installation is successful if the GCC 4.8.5 version information is displayed.
   - openEuler: The installation is successful if the GCC 7.3.0 version information is displayed.

Step 7 Resolve the -fsigned-char problem (by modifying G++).

1. Locate the g++ executable. It is generally /usr/bin/g++.
    command -v g++
2. Rename the original g++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new g++ file.
    vi /usr/bin/g++
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the g++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
   - CentOS: The installation is successful if the G++ 4.8.5 version information is displayed.
   - openEuler: The installation is successful if the G++ 7.3.0 version information is displayed.

----End

Installing Dependencies

Install dependencies using the Yum source.
    yum -y install wget make

4.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if the OpenJDK 1.8.0_252 version information is displayed.

----End

4.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile

Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v

The installation is successful if the Maven 3.5.4 version information is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to change this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. If you have built your own Maven repository, use it instead; otherwise, configure the repository based on the following example:

    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>https://mirrors.huaweicloud.com/repository/maven/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxies>
      <proxy>
        <id>optional</id>
        <active>true</active>
        <protocol>http</protocol>
        <username>Username</username>
        <password>Password</password>
        <host>Proxy server URL</host>
        <port>Proxy server port</port>
        <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
      </proxy>
    </proxies>

----End

4.3.4 Installing Protobuf

CentOS

Step 1 Install Protobuf.
    yum install -y protobuf protobuf-devel

Step 2 Check whether Protobuf is installed successfully.
    protoc --version

The installation is successful if the Protobuf version information is displayed.

Step 3 Deploy protoc in the local Maven repository.
    mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End

openEuler

Step 1 Download and decompress the source code.
    wget https://github.com/protocolbuffers/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
    tar -zxf protobuf-2.5.0.tar.gz

Step 2 Move the decompressed directory to the /opt/tools/installed/ directory.
    mv protobuf-2.5.0 /opt/tools/installed/

Step 3 Go to the /opt/tools/installed/ directory.
    cd /opt/tools/installed

Step 4 Download the protoc.zip package, decompress it to obtain the protoc.patch file, and copy the patch to the Protobuf source tree. You can download the package to any directory, for example, /opt/tools/installed/.
    wget https://mirrors.huaweicloud.com/kunpeng/archive/kunpeng_solution/bigdata/Patch/protoc.zip
    unzip protoc.zip
    cp ./protoc/protoc.patch ./protobuf-2.5.0/src/google/protobuf/stubs/

Step 5 Go to the protobuf-2.5.0/src/google/protobuf/stubs/ directory and apply the patch.
    cd protobuf-2.5.0/src/google/protobuf/stubs/
    patch -p1 < protoc.patch

Step 6 Go back to the root directory of protobuf-2.5.0, compile Protobuf, and install it in the default directory.
    cd /opt/tools/installed/protobuf-2.5.0
    ./autogen.sh && ./configure CFLAGS='-fsigned-char' && make -j8 && make install

Step 7 Deploy protoc in the local Maven repository.
    mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/local/bin/protoc
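To confirm that the mvn install:install-file command placed protoc where the build will look for it, you can compute the expected file path from the coordinates. The sketch below follows the standard Maven repository layout (group ID as directories, then artifact ID, version, and a file named artifact-version-classifier.packaging). maven_artifact_path is a hypothetical helper, and the repository root is whatever localRepository is set to (~/.m2/repository by default).

```shell
#!/bin/sh
# maven_artifact_path REPO GROUP_ID ARTIFACT_ID VERSION CLASSIFIER PACKAGING
# Hypothetical helper: prints the path at which a classified artifact is
# stored in a local Maven repository rooted at REPO.
maven_artifact_path() {
    # Dots in the group ID become directory separators.
    group_dir=$(printf '%s' "$2" | tr '.' '/')
    printf '%s/%s/%s/%s/%s-%s-%s.%s\n' "$1" "$group_dir" "$3" "$4" "$3" "$4" "$5" "$6"
}

# Example, using the coordinates from the command above:
# ls -l "$(maven_artifact_path "$HOME/.m2/repository" com.google.protobuf protoc 2.5.0 linux-aarch_64 exe)"
```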

----End

4.3.5 Installing Thrift

Step 1 Download the Thrift package.
    wget http://archive.apache.org/dist/thrift/0.9.3/thrift-0.9.3.tar.gz

Step 2 Decompress the package.
    tar -xvf thrift-0.9.3.tar.gz

Step 3 Switch to the folder generated after the Thrift package is decompressed.
    cd thrift-0.9.3

Step 4 Add the execute permission to the configure file.
    chmod +x configure

Step 5 Install Thrift.
    ./configure --disable-gen-erl --disable-gen-hs --without-ruby --without-haskell --without-erlang
    make
    make install

Step 6 Run the following command to check whether Thrift has been successfully installed.
    thrift -version

----End

4.4 Compiling Parquet-MR

NO TE

This section explains the compilation for CDH 6.3.0. Refer to this section when compiling for another version.

Issue 08 (2021-07-13) Copyright © Huawei Technologies Co., Ltd. 36 Kunpeng BoostKit for Big Data 4 Parquet-mr-cdh6.3.0 Porting Guide (CentOS 7.6 & Porting Guide (CDH) openEuler 20.03)

Prerequisites You have downloaded and decompressed the Parquet-mr-cdh6.3.0 source package.

Procedure

Step 1 Go to the Parquet-mr-cdh6.3.0 source code directory. cd parquet-mr-cdh6.3.0-release Step 2 Modify the pom.xml file to configure the Kunpeng repository. vim pom.xml To be specific, add the Kunpeng Maven repository to the repositories tag. The Kunpeng Maven repository must be placed first.

Kunpengmaven Kunpeng maven https://mirrors.huaweicloud.com/kunpeng/maven/ huaweicloud.repo HuaweiCloud Repositories https://mirrors.huaweicloud.com/repository/maven wso2.repo http://maven.wso2.org/nexus/content/groups/wso2-public/ wso2 Repositories pentaho-repo pentaho-repo https://public.nexus.pentaho.org/content/groups/omni/ bsdn-repo bsdn Repositories http://nexus.bsdn.org/content/repositories/public/

Step 3 Perform the compilation.
mvn package apache-rat:check -Drat.numUnapprovedLicenses=1 -DskipTests
The compilation is successful if the following information is displayed:

Parquet-MR is built as a set of JAR packages. After compilation, each component package is stored in the directory of the corresponding component.

Step 4 Use the Kunpeng Porting Advisor to scan the JAR packages in the directory of each component shown above and ensure that no x86 .so or .jar packages are contained.

NOTE

The compiled component must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
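
As a quick manual spot check before the Kunpeng Porting Advisor scan, the architecture of native libraries packed in a JAR can be read with the standard unzip and file utilities. The following is a minimal sketch and does not replace the Kunpeng Porting Advisor. The function names elf_arch and scan_jar are hypothetical, and the matching assumes that file describes AArch64 libraries with the text "ARM aarch64" and x86 libraries with "x86-64" or "Intel 80386".

```shell
#!/bin/sh
# elf_arch DESCRIPTION
# Maps a description string, as printed by "file -b", to a short label.
# Hypothetical helper; the matched substrings are assumptions about file's wording.
elf_arch() {
    case $1 in
        *"ARM aarch64"*) echo aarch64 ;;
        *x86-64*|*"Intel 80386"*) echo x86 ;;
        *) echo unknown ;;
    esac
}

# scan_jar JAR
# Unpacks JAR into a scratch directory and prints "<arch> <path>" for each .so file.
scan_jar() {
    jar_path=$1
    scratch=$(mktemp -d)
    unzip -q -o "$jar_path" -d "$scratch" || { rm -rf "$scratch"; return 1; }
    find "$scratch" -name '*.so' | while IFS= read -r lib; do
        arch=$(elf_arch "$(file -b "$lib")")
        echo "$arch ${lib#"$scratch"/}"
    done
    rm -rf "$scratch"
}

# Example, with a placeholder path:
# scan_jar path/to/component.jar
```

Any line that starts with x86 points to a library that still needs to be ported. The sketch inspects only .so files and does not look inside nested JAR packages.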

4.5 Troubleshooting

4.5.1 Certificate Error Reported After git clone Is Executed

Symptom

In the compilation process, "fatal: unable to access 'https://github.com/ariya/phantomjs.git/': Peer's Certificate issuer is not recognized." is displayed.

Procedure

Set http.sslVerify to false.
git config --global http.sslVerify false

4.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget

Symptom

During compilation, an error is reported, indicating that the certificate issued by github.com cannot be verified. The permission of the issuer cannot be verified locally.

Procedure

To connect to github.com in an insecure manner, specify the --no-check-certificate parameter in the wget command.

5 Sentry-2.1.0-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)

5.1 Introduction 5.2 Environment Requirements 5.3 Configuring the Compilation Environment 5.4 Compiling Sentry 5.5 Troubleshooting

5.1 Introduction

Sentry is a fine-grained role-based authorization component in Hadoop. Sentry provides access control for authenticated users and applications in a Hadoop cluster. This document describes how to adapt Sentry components in CDH to TaiShan servers.

For more information about CDH, visit https://www.cloudera.com/.

5.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

JDK 1.8.0_252

Maven 3.5.4

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

5.3 Configuring the Compilation Environment

5.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual .iso package name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vim /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:

[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install related software.
yum -y install wget git
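
The repo file can also be generated from a script instead of being typed in an editor. The following is a minimal sketch; the function name write_local_repo and its optional mount point argument are hypothetical, and the file content is the local Yum source configuration described in this procedure.

```shell
#!/bin/sh
# write_local_repo TARGET_FILE [MOUNT_POINT]
# Writes a local Yum source definition to TARGET_FILE.
# MOUNT_POINT defaults to /media, the mount point used for the OS image.
# Hypothetical helper, not part of Yum.
write_local_repo() {
    target=$1
    mount_point=${2:-/media}
    cat > "$target" <<EOF
[Local]
name=Local
baseurl=file://${mount_point}/
enabled=1
gpgcheck=0
EOF
}

# Example, run as root after the OS image is mounted:
# write_local_repo /etc/yum.repos.d/Local.repo
```

Passing the target path as an argument allows the output to be reviewed in a temporary file before it is placed in /etc/yum.repos.d/.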

----End

5.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
vim /etc/profile

Add the following to the end of the file:
export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if information similar to the following is displayed:

----End

5.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. If you want to change the directory to a specified one, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. If you have built your own Maven repository, use its address instead. Otherwise, configure the repository based on the following example:

<mirror>
    <id>huaweimaven</id>
    <name>huawei maven</name>
    <url>https://mirrors.huaweicloud.com/repository/maven/</url>
    <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

<proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>

----End

5.4 Compiling Sentry

NOTE

This section explains the compilation for CDH 6.3.0. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Sentry-2.1.0-cdh6.3.0 source package.

Procedure

Step 1 Go to the Sentry-2.1.0-cdh6.3.0 source code directory.
cd sentry-cdh6.3.0-release

Step 2 Modify the pom.xml file to configure the Kunpeng repository.
vim pom.xml

1. To be specific, add the Kunpeng Maven repository to the repositories tag. The Kunpeng Maven repository must be placed first.

<repository>
    <id>Kunpengmaven</id>
    <name>Kunpeng maven</name>
    <url>https://mirrors.huaweicloud.com/kunpeng/maven/</url>
</repository>
<repository>
    <id>huaweicloud.repo</id>
    <name>HuaweiCloud Repositories</name>
    <url>https://mirrors.huaweicloud.com/repository/maven</url>
</repository>
<repository>
    <id>wso2.repo</id>
    <url>http://maven.wso2.org/nexus/content/groups/wso2-public/</url>
    <name>wso2 Repositories</name>
</repository>

2. Change the repository whose ID is apache to the Cloudera repository.

<repository>
    <id>cloudera.repo</id>
    <url>https://repository.cloudera.com/artifactory/cdh-releases-rcs/</url>
</repository>

3. Comment out a repository that cannot be accessed.

Step 3 Modify the sentry-tests/pom.xml file.
vim sentry-tests/pom.xml
Change the value of activeByDefault in the profile whose ID is hive-authz1 to false so that it is disabled by default.

<activeByDefault>false</activeByDefault>

Step 4 Perform the compilation.
mvn package -DskipTests
The compilation is successful if the following information is displayed:

The compilation result is apache-sentry-2.1.0-cdh6.3.0-bin.tar.gz in sentry-dist/target/.

Step 5 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled apache-sentry-2.1.0-cdh6.3.0-bin.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see Kunpeng Porting Advisor Case Study.

----End

5.5 Troubleshooting

5.5.1 Certificate Error Reported After git clone Is Executed

Symptom

In the compilation process, "fatal: unable to access 'https://github.com/ariya/phantomjs.git/': Peer's Certificate issuer is not recognized." is displayed.

Procedure

Set http.sslVerify to false.
git config --global http.sslVerify false

5.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget

Symptom

During compilation, an error is reported, indicating that the certificate issued by github.com cannot be verified. The permission of the issuer cannot be verified locally.

Procedure

To connect to github.com in an insecure manner, specify the --no-check-certificate parameter in the wget command.

6 Solr-7.4.0-cdh6.3.0 Porting Guide (CentOS 7.6 & openEuler 20.03)

6.1 Introduction 6.2 Environment Requirements 6.3 Configuring the Compilation Environment 6.4 Compiling Solr 6.5 Troubleshooting

6.1 Introduction

Solr (pronounced "solar") is an open-source enterprise-search platform from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document handling. Providing distributed search and index replication, Solr is designed for scalability.

6.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

JDK 1.8.0_252

Ant 1.8.4

Maven 3.5.4

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

6.3 Configuring the Compilation Environment

6.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual .iso package name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vim /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:

[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install related software.
yum -y install wget git

----End

6.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
vim /etc/profile

Add the following to the end of the file:
export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if information similar to the following is displayed:

----End

6.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. If you want to change the directory to a specified one, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. If you have built your own Maven repository, use its address instead. Otherwise, configure the repository based on the following example:

<mirror>
    <id>huaweimaven</id>
    <name>huawei maven</name>
    <url>https://mirrors.huaweicloud.com/repository/maven/</url>
    <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

<proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>

----End

6.3.4 Installing Ant

Step 1 Download Ant 1.8.4.
wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.8.4-bin.tar.gz
tar -zxf apache-ant-1.8.4-bin.tar.gz
mv apache-ant-1.8.4 /opt/tools/installed/

Step 2 Modify environment variables.
vim /etc/profile
Add Ant configuration at the end of the /etc/profile file.
export ANT_HOME=/opt/tools/installed/apache-ant-1.8.4
export PATH=$ANT_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether the configuration takes effect:
ant -version

----End
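
If the environment variable steps in this chapter are repeated, the same export lines are appended to /etc/profile more than once. The following is a minimal sketch of an idempotent alternative to editing the file by hand; the function name append_once is hypothetical.

```shell
#!/bin/sh
# append_once FILE LINE
# Appends LINE to FILE unless an identical line is already present.
# Hypothetical helper for adding export statements to a profile file.
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x matches whole lines, -F treats LINE as a fixed string, not a regex.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example, run as root:
# append_once /etc/profile 'export ANT_HOME=/opt/tools/installed/apache-ant-1.8.4'
# append_once /etc/profile 'export PATH=$ANT_HOME/bin:$PATH'
```

The single quotes in the example keep $ANT_HOME and $PATH unexpanded, so the variables are resolved when /etc/profile is sourced rather than when the line is written.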

6.4 Compiling Solr

NOTE

This section explains the compilation for CDH 6.3.0. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Solr-7.4.0-cdh6.3.0 source package.

Procedure

Step 1 Go to the Solr-7.4.0-cdh6.3.0 source code directory.
cd lucene-solr-cdh6.3.0-release

Step 2 Modify the lucene/default-nested-ivy-settings.xml file.
vim lucene/default-nested-ivy-settings.xml

1. Replace http://repo1.maven.org/maven2 in the file with https://mirrors.huaweicloud.com/repository/maven.

2. Comment out the default ivy repository.
a. Comment out all include tags.

b. Comment out the default ivy repository in the default chain.

c. Comment out the default ivy repository in the cloudera chain.

3. Add the central repository configuration.
a. Add a central repository to the first line of the default chain.
b. Add a central repository to the first line of the cloudera chain.

Step 3 Modify the lucene/common-build.xml file.
vim lucene/common-build.xml
Replace http://repo1.maven.org/maven2 in the file with https://repo1.maven.org/maven2.

Step 4 Modify the dev-tools/scripts/poll-mirrors.py file.
vim dev-tools/scripts/poll-mirrors.py
Replace http://repo1.maven.org/maven2 in the file with https://repo1.maven.org/maven2.
maven_url = None if args.version is None else "https://repo1.maven.org/maven2/";

Step 5 Modify the cloudera/templates/cdh.build.properties file.
vim cloudera/templates/cdh.build.properties
Modify the repository addresses in lines 4 and 5 as follows:
4 snapshots.cloudera.com=https://repository.cloudera.com/content/repositories/snapshots/
5 releases.cloudera.com=https://repository.cloudera.com/content/groups/cdh-releases-rcs/

Step 6 Perform the compilation.

1. Perform the compilation.
ant ivy-bootstrap
ant compile

NOTE

The first compilation may fail. If the following error information is displayed, modify the cdh.build.properties file.

Modify the cdh.build.properties file.
vim cdh.build.properties
Adjust the values of the corresponding names as follows:
snapshots.cloudera.com=https://repository.cloudera.com/content/repositories/snapshots/
releases.cloudera.com=https://repository.cloudera.com/content/groups/cdh-releases-rcs/
The compilation is successful if the following information is displayed:

2. Go to the solr directory and perform the compilation.
cd solr
ant create-package

NOTE

The compilation result is stored as solr/package/solr-7.4.0-SNAPSHOT.tgz under the root directory of the source code.

The compilation is successful if the following information is displayed:

Step 7 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled solr-7.4.0-SNAPSHOT.tgz must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see Kunpeng Porting Advisor Case Study.

----End
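
The file edits made before the compilation in this procedure are plain text substitutions, so they can be scripted instead of being made in an editor. The following is a minimal sketch, assuming it is run from the root directory of the source code. The function names replace_in_file and set_property are hypothetical.

```shell
#!/bin/sh
# replace_in_file FILE OLD NEW
# Replaces every occurrence of OLD with NEW in FILE and keeps FILE.bak as a backup.
# "|" is the sed delimiter because URLs contain "/". OLD is treated as a basic
# regular expression, so "." matches any character; that is harmless for these URLs.
replace_in_file() {
    file=$1
    old=$2
    new=$3
    sed -i.bak "s|$old|$new|g" "$file"
}

# set_property FILE KEY VALUE
# Sets KEY=VALUE in a properties file, replacing an existing KEY line or appending one.
set_property() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp)
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "="; done = 0 }
        $1 == k { print k "=" v; done = 1; next }
        { print }
        END { if (!done) print k "=" v }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Examples:
# replace_in_file lucene/common-build.xml \
#     'http://repo1.maven.org/maven2' 'https://repo1.maven.org/maven2'
# set_property cloudera/templates/cdh.build.properties releases.cloudera.com \
#     'https://repository.cloudera.com/content/groups/cdh-releases-rcs/'
```

Running replace_in_file a second time leaves the file unchanged, because the https URL does not contain the http:// prefix that is searched for.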

6.5 Troubleshooting

6.5.1 Certificate Error Reported After git clone Is Executed

Symptom

In the compilation process, "fatal: unable to access 'https://github.com/ariya/phantomjs.git/': Peer's Certificate issuer is not recognized." is displayed.

Procedure

Set http.sslVerify to false.
git config --global http.sslVerify false

6.5.2 Failed to Verify the github.com Certificate Downloaded by Using wget

Symptom

During compilation, an error is reported, indicating that the certificate issued by github.com cannot be verified. The permission of the issuer cannot be verified locally.

Procedure

To connect to github.com in an insecure manner, specify the --no-check-certificate parameter in the wget command.

7 Hive-1.1.0-cdh5.13.3 Porting Guide (CentOS 7.6 & openEuler 20.03)

7.1 Introduction 7.2 Environment Requirements 7.3 Configuring the Compilation Environment 7.4 Performing Porting Analysis 7.5 Compiling Hive 7.6 Troubleshooting

7.1 Introduction

Hive is a data warehouse tool running on Hadoop. It maps structured data files to a database table, provides simple SQL search functions, and converts SQL statements into MapReduce tasks.

7.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

OpenJDK 1.8.0_252

Maven 3.5.4

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

7.3 Configuring the Compilation Environment

7.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual .iso package name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vim /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:

[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install related software.
yum -y install wget git

----End

7.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
vim /etc/profile

Add the following to the end of the file:
export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if information similar to the following is displayed:

----End

7.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. If you want to change the directory to a specified one, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. If you have built your own Maven repository, use its address instead. Otherwise, configure the repository based on the following example:

<mirror>
    <id>huaweimaven</id>
    <name>huawei maven</name>
    <url>https://mirrors.huaweicloud.com/repository/maven/</url>
    <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

<proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>

----End

7.4 Performing Porting Analysis

Use the Kunpeng Porting Advisor to scan the Hive installation package and obtain information about the third-party dependencies to be ported. For details, see the Kunpeng Porting Advisor Case Study. Table 7-1 lists the third-party dependencies to be ported.

Table 7-1 Third-party dependencies to be ported

Original JAR Package          SO File
jline-2.12.jar                libjansi.so
leveldbjni-all-1.8.jar        libleveldbjni.so
netty-all-4.0.23.Final.jar    libnetty-transport-native-epoll.so
snappy-java-1.0.4.1.jar       libsnappyjava.so
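
The SO files bundled in a given JAR package can be cross-checked against Table 7-1 by listing the archive entries. The following is a minimal sketch that uses the zipinfo mode of the standard unzip utility; the function names so_entries and list_jar_so are hypothetical.

```shell
#!/bin/sh
# so_entries
# Reads archive entry names from standard input, one per line,
# and prints only the entries that end in ".so".
so_entries() {
    grep -E '\.so$' || true
}

# list_jar_so JAR
# Prints the .so entries of one JAR package.
# "unzip -Z1" prints the bare entry names of a ZIP archive, one per line.
list_jar_so() {
    unzip -Z1 "$1" | so_entries
}

# Example, with a placeholder path:
# list_jar_so path/to/snappy-java-1.0.4.1.jar
```

The listing shows only which native libraries are present. Whether a library is built for x86 or AArch64 still has to be determined by the Kunpeng Porting Advisor scan.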

7.5 Compiling Hive

NOTE

This section explains the compilation for CDH 5.13.3. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Hive-cdh5.13.3 source package.

Procedure

Step 1 Go to the Hive-cdh5.13.3 source code directory.
cd hive-cdh5.13.3-release

Step 2 Modify the pom.xml file.
vim pom.xml
Add the Maven repository source in line 191.

<repository>
    <id>kunpengmaven</id>
    <name>kunpeng maven</name>
    <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
</repository>

Step 3 Press Esc and run :wq to save the configuration and exit.

Step 4 Perform the compilation.
mvn package -DskipTests -Pdist -Dtar -Phadoop-2
Obtain the TAR packages in the packaging/target directory.

Step 5 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled apache-hive-1.1.0-cdh5.13.3-bin.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End

7.6 Troubleshooting

7.6.1 Compilation Error: "Could not find artifact com.google.protobuf:protoc:exe:linux-aarch_64:2.5.0"

Symptom

During the compilation process, "Could not find artifact com.google.protobuf:protoc:exe:linux-aarch_64:2.5.0" is displayed.

Solution

Step 1 Install Protobuf.
yum install -y protobuf protobuf-devel

Step 2 Deploy Protoc in the local Maven repository.
mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End

8 Spark-1.6.0-cdh5.13.3 Porting Guide (CentOS 7.6 & openEuler 20.03)

8.1 Introduction 8.2 Environment Requirements 8.3 Configuring the Compilation Environment 8.4 Performing Porting Analysis 8.5 Compiling Spark 8.6 Troubleshooting

8.1 Introduction

Spark is a unified analysis engine used for large-scale data processing. It features scalability and memory-based computing and has become a unified platform for quick processing of lightweight big data. Spark can be used to build the data store and running system for various applications, such as real-time stream processing, machine learning, and interactive query.

For more information about Spark, see the official Spark documentation.

8.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

OpenJDK 1.8.0_252

Maven 3.5.4

R 3.1.1

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

8.3 Configuring the Compilation Environment

8.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual .iso package name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate the gcc executable. Generally, the path is /usr/bin/gcc.
    command -v gcc
2. Rename the original gcc file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new gcc file.
    vi /usr/bin/gcc
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the new gcc file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
   – CentOS: The installation is successful if information similar to the following is displayed:

   – openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the g++ executable. Generally, the path is /usr/bin/g++.
    command -v g++
2. Rename the original g++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new g++ file.
    vi /usr/bin/g++
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the new g++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
   – CentOS: The installation is successful if information similar to the following is displayed:

   – openEuler: The installation is successful if information similar to the following is displayed:

----End
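
The manual edits in Step 6 and Step 7 can also be scripted. The following is a minimal sketch and not part of the original procedure: the function name make_signed_char_wrapper is hypothetical, and it assumes that the real compiler has already been renamed (for example, to /usr/bin/gcc-impl). The wrapper is needed because plain char is unsigned by default on AArch64 but signed on x86, and -fsigned-char restores the x86 behavior.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): create a wrapper script
# at $2 that calls the real compiler at $1 with -fsigned-char prepended.
make_signed_char_wrapper() {
    real_compiler="$1"
    wrapper_path="$2"
    # Refuse to create a wrapper that would call itself.
    if [ "$real_compiler" = "$wrapper_path" ]; then
        echo "error: wrapper and real compiler paths are identical" >&2
        return 1
    fi
    # "$@" is written literally into the wrapper so that all arguments
    # are forwarded unchanged at run time.
    printf '#!/bin/sh\nexec "%s" -fsigned-char "$@"\n' "$real_compiler" > "$wrapper_path"
    chmod +x "$wrapper_path"
}

# Example (run as root after 'mv /usr/bin/gcc /usr/bin/gcc-impl'):
#   make_signed_char_wrapper /usr/bin/gcc-impl /usr/bin/gcc
#   make_signed_char_wrapper /usr/bin/g++-impl /usr/bin/g++
```

Using exec in the wrapper is a design choice of this sketch: it replaces the shell process with the compiler, so exit codes and signals are passed through directly.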

Installing Dependencies

Install dependencies using the Yum source.

    yum install -y wget git make

8.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version
The installation is successful if information similar to the following is displayed:

----End

8.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v
The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to change this parameter unless otherwise required.

Add the following content to the <mirrors> tag to configure the remote repository. If you have built your own Maven repository, use it instead; otherwise, configure the repository based on the following example:

    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>https://mirrors.huaweicloud.com/repository/maven/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxies>
      <proxy>
        <id>optional</id>
        <active>true</active>
        <protocol>http</protocol>
        <username>Username</username>
        <password>Password</password>
        <host>Proxy server URL</host>
        <port>Proxy server port</port>
        <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
      </proxy>
    </proxies>

----End

8.3.4 Installing the R Language

Step 1 Download the R language source code package and decompress it.
    wget http://cran.rstudio.com/src/base/R-3/R-3.1.1.tar.gz
    tar -zxf R-3.1.1.tar.gz

Step 2 Go to the directory generated after the decompression.
    cd R-3.1.1

Step 3 Compile and install the R language to the specified directory.
    ./configure --enable-R-shlib --enable-R-static-lib --with-libpng --with-jpeglib --prefix=/opt/tools/installed/R-3.1.1
    make all -j8 && make install

Step 4 Configure the R language environment variables.
    vim /etc/profile

Add the following content at the end of the /etc/profile file:
    export R_HOME=/opt/tools/installed/R-3.1.1
    export PATH=$R_HOME/bin:$PATH

Step 5 Make the environment variables take effect.
    source /etc/profile

Step 6 Verify the R language installation.
    R --version
    R version 3.1.1 (2014-07-10) -- "Sock it to Me"
    Copyright (C) 2014 The R Foundation for Statistical Computing
    Platform: aarch64-unknown-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under the terms of the
    GNU General Public License versions 2 or 3.
    For more information about these matters see
    http://www.gnu.org/licenses/.

----End
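
Before running ./configure in Step 3, it can help to confirm that the required build tools are on PATH, because a missing gfortran only surfaces partway through configuration. The following is a minimal sketch under stated assumptions: the function name missing_tools and the list of tools in the example are illustrative and are not taken from the original guide.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): print every command
# name from the argument list that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example: check the compilers that the R configure step relies on.
#   missing=$(missing_tools gcc g++ gfortran make)
#   [ -z "$missing" ] || echo "Install before running ./configure: $missing"
```

The check covers commands only; missing development headers (such as the X11 headers) still have to be handled as described in the troubleshooting section.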

8.4 Performing Porting Analysis

Use the Kunpeng Porting Advisor to scan the Spark installation package (x86 official version) and obtain information about the third-party dependencies to be ported. For details, see the Kunpeng Porting Advisor Case Study. Table 8-1 lists the third-party dependencies to be ported.

Table 8-1 Software porting analysis result

Original JAR Package            SO File
chimera-0.9.2.jar               libchimera.so
jline-2.10.5.jar                libjansi.so
lz4-1.3.0.jar                   liblz4-java.so
jna-3.0.9.jar                   libjnidispatch.so
netty-all-4.0.29.Final.jar      libnetty-transport-native-epoll.so

NOTE

The purpose of this analysis is to identify the .jar packages that contain x86 .so files and therefore must be recompiled before they can be used on the Kunpeng platform.

8.5 Compiling Spark

NOTE

This section describes the compilation for CDH 5.13.3. The procedure for other versions is similar and can be performed by referring to this section.

Prerequisites

You have downloaded and decompressed the Spark-cdh5.13.3 source package.

Procedure

Step 1 Go to the Spark-cdh5.13.3 source code directory.
    cd spark-cdh5.13.3-release

Step 2 Modify the pom.xml file.
    vim pom.xml

Add the Kunpeng Maven repository at line 224.
    <repository>
      <id>kunpengmaven</id>
      <name>kunpeng maven</name>
      <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
    </repository>

Step 3 Compile the source code.
    ./make-distribution.sh --tgz -Pyarn,hive,sparkr -DskipTests
The operation is successful if the following information is displayed:

Obtain the .tgz package generated in the root directory of the source code.

Step 4 Use the Kunpeng Porting Advisor to scan the spark-1.6.0-cdh5.13.3-bin-2.6.0-cdh5.13.3.tgz package and ensure that it contains no x86 .so files or .jar packages.

NOTE

The compiled output must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so files or .jar packages. Any x86 .so files or .jar packages that remain may affect component functions. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
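
As a quick spot check that complements (and does not replace) the Kunpeng Porting Advisor scan, the target architecture of a single extracted .so file can be read from its ELF header. The following is a minimal sketch: the function name elf_machine is hypothetical, and it assumes a little-endian ELF file, in which the low byte of the e_machine field at offset 18 is 0x3e for x86-64 and 0xb7 for AArch64.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): report the target
# architecture of a file based on its ELF header.
elf_machine() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 0
    fi
    # e_machine starts at byte offset 18; on little-endian files the low
    # byte is enough to tell x86-64 (0x3e) from AArch64 (0xb7).
    machine=$(od -An -tx1 -j18 -N1 "$1" | tr -d ' \n')
    case "$machine" in
        3e) echo "x86_64" ;;
        b7) echo "aarch64" ;;
        *)  echo "other" ;;
    esac
}

# Example: after extracting a .jar package into a work directory,
#   find workdir -name '*.so' | while read -r f; do
#       echo "$f: $(elf_machine "$f")"
#   done
```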

8.6 Troubleshooting

8.6.1 "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed" Is Reported During the Spark Compilation

Symptom

During the Spark compilation, the error "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed" is reported.

Cause Analysis

R language support is enabled during the Spark compilation, so the R language must be compiled and installed in the environment.

Handling Procedure

Compile and install the R language in the /opt/tools/installed directory, set R_HOME, and run the compilation command again.

    export R_HOME=/opt/tools/installed/R-3.1.1

8.6.2 "error: cannot compile a simple Fortran program" Reported During the R Language Compilation

Symptom

During the R language compilation, "error: cannot compile a simple Fortran program" is displayed.

Cause Analysis

The gfortran package is not installed in the system.

Solution

Run the yum command to install the gfortran package from the OS image.

    yum -y install gcc-gfortran.aarch64

8.6.3 "configure: error: --with-x=yes (default) and X11 headers/libs are not available" Reported During the R Language Compilation

Symptom

During the R language compilation, "configure: error: --with-x=yes (default) and X11 headers/libs are not available" is displayed.

Cause Analysis

The --with-x=yes option (use the X Window System) is enabled by default, so the X11 headers and libraries are required. Install the libXt-devel package to provide them.

Solution

Run the yum command to install the package from the OS image.

    yum -y install libXt-devel.aarch64

8.6.4 "/usr/bin/install: cannot stat 'NEWS.pdf': No such file or directory" Reported During the R Language Compilation

Symptom

During the R language compilation, "/usr/bin/install: cannot stat 'NEWS.pdf': No such file or directory" is displayed.

Cause Analysis

The NEWS.pdf file is not found in the source code directory R-3.1.1. Therefore, the file cannot be copied when the make install command is executed.

Solution

The NEWS text file exists in the doc directory. Copy its contents to doc/NEWS.pdf as a placeholder so that make install can find the file.

cat doc/NEWS > doc/NEWS.pdf
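
The workaround above overwrites doc/NEWS.pdf unconditionally. A slightly more defensive variant is sketched below; the function name ensure_news_pdf is hypothetical and not part of the original guide. It creates the placeholder only when the file is missing, so a real NEWS.pdf is never replaced.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): create doc/NEWS.pdf
# from doc/NEWS only when the PDF is missing, so an existing file is kept.
ensure_news_pdf() {
    src_dir="$1"
    if [ -f "$src_dir/doc/NEWS.pdf" ]; then
        return 0
    fi
    if [ ! -f "$src_dir/doc/NEWS" ]; then
        echo "error: $src_dir/doc/NEWS not found" >&2
        return 1
    fi
    cat "$src_dir/doc/NEWS" > "$src_dir/doc/NEWS.pdf"
}

# Example: run from the parent directory of the R source tree.
#   ensure_news_pdf R-3.1.1 && (cd R-3.1.1 && make install)
```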

9 Avro-1.7.6-cdh5.12.1 Porting Guide (CentOS 7.6)

9.1 Introduction
9.2 Environment Requirements
9.3 Configuring the Compilation Environment
9.4 Compiling Avro
9.5 Troubleshooting

9.1 Introduction

Apache Avro offers data serialization capabilities and provides data exchange services in big data-based systems and applications. It uses binary serialization to process large amounts of data simply and quickly, and integrates with dynamic languages so that Avro data can be processed flexibly.

This document describes how to adapt Avro components in CDH to TaiShan servers.

For more information about CDH, visit https://www.cloudera.com/.

9.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

CentOS 7.6

OS Kernel 4.14.0-115

GCC 4.8.5

OpenJDK 1.7.0_191

Maven 3.5.4

Ant 1.7.1

9.3 Configuring the Compilation Environment

9.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up before deleting them. When rm prompts for confirmation, enter y to delete each file.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate the gcc executable. Generally, the path is /usr/bin/gcc.
    command -v gcc
2. Rename the original gcc file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new gcc file.
    vi /usr/bin/gcc
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the new gcc file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
   – CentOS: The installation is successful if information similar to the following is displayed:

   – openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the g++ executable. Generally, the path is /usr/bin/g++.
    command -v g++
2. Rename the original g++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new g++ file.
    vi /usr/bin/g++
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the new g++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
   – CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

----End
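
The local Yum source file from Step 3 can also be generated without an editor. The following is a minimal sketch: the function name write_local_repo and its two parameters (target directory and mount point) are hypothetical and chosen for illustration; the file content matches the configuration shown in Step 3 when the mount point is /media.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): write Local.repo into
# the directory $1, pointing at the OS image mounted at $2.
write_local_repo() {
    repo_dir="$1"
    mount_point="$2"
    cat > "$repo_dir/Local.repo" <<EOF
[Local]
name=Local
baseurl=file://$mount_point/
enabled=1
gpgcheck=0
EOF
}

# Example (run as root after mounting the OS image on /media):
#   write_local_repo /etc/yum.repos.d /media
#   yum clean all && yum makecache
```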

Installing Dependencies

Use the Yum source to install dependencies.

    yum install -y boost.aarch64 boost-devel.aarch64 make cmake wget vim openssl-devel zlib-devel automake libtool libstdc++-static glibc-static git snappy snappy-devel

9.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
    yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
    vim /etc/profile
Add the following content to the end of the file:
    export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version
The installation is successful if information similar to the following is displayed:

----End

9.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v
The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to change this parameter unless otherwise required.

Add the following content to the <mirrors> tag to configure the remote repository. If you have built your own Maven repository, use it instead; otherwise, configure the repository based on the following example:

    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>https://mirrors.huaweicloud.com/repository/maven/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxies>
      <proxy>
        <id>optional</id>
        <active>true</active>
        <protocol>http</protocol>
        <username>Username</username>
        <password>Password</password>
        <host>Proxy server URL</host>
        <port>Proxy server port</port>
        <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
      </proxy>
    </proxies>

----End

9.3.4 Installing Ant

Step 1 Download and install the package to the specified directory.
    wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.7.1-bin.tar.gz
    tar -zxf apache-ant-1.7.1-bin.tar.gz
    mv apache-ant-1.7.1 /opt/tools/installed/

Step 2 Modify environment variables.
    vim /etc/profile

Add the following content at the end of the /etc/profile file:
    export ANT_HOME=/opt/tools/installed/apache-ant-1.7.1
    export PATH=$ANT_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether the configuration takes effect.
    ant -version
    Apache Ant version 1.7.1 compiled on June 27 2008

----End

9.3.5 Installing Forrest

Step 1 Download and install Forrest to a specified directory, for example, /opt/tools/installed.
    wget http://archive.apache.org/dist/forrest/0.9/apache-forrest-0.9.tar.gz
    tar -zxf apache-forrest-0.9.tar.gz
    mv apache-forrest-0.9 /opt/tools/installed/

Step 2 Modify environment variables.
    vim /etc/profile

Add the following content at the end of the /etc/profile file:
    export FORREST_HOME=/opt/tools/installed/apache-forrest-0.9
    export PATH=$FORREST_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether the environment variables take effect.
    forrest -projecthelp

----End
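
The sections above each append export lines to /etc/profile by hand. If the steps are repeated, the same line can end up in the file several times. The following is a minimal sketch of an idempotent alternative; the function name append_once is hypothetical and not part of the original guide.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): append the line $2 to
# the file $1 only if the file does not already contain exactly that line.
append_once() {
    file="$1"
    line="$2"
    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! grep -Fxq -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (run as root); single quotes keep $FORREST_HOME and $PATH literal:
#   append_once /etc/profile 'export FORREST_HOME=/opt/tools/installed/apache-forrest-0.9'
#   append_once /etc/profile 'export PATH=$FORREST_HOME/bin:$PATH'
#   source /etc/profile
```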

9.4 Compiling Avro

NOTE

This section describes the compilation for CDH 5.12.1. The procedure for other versions is similar and can be performed by referring to this section.

Prerequisites

You have downloaded and decompressed the Avro-cdh5.12.1 source package.

Procedure

Step 1 Go to the decompressed Avro-cdh5.12.1 directory.
    cd avro-cdh5.12.1-release

Step 2 Modify the pom.xml file.
    vim pom.xml

Add the Kunpeng Maven repository at line 58.
    <repository>
      <id>kunpengmaven</id>
      <name>kunpeng maven</name>
      <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
    </repository>

Step 3 Modify the build.sh file.
    vim build.sh
1. Comment out lines 88, 89, and 105.
    88 #rm -rf build/${SRC_DIR}
    89 #svn export --force . build/${SRC_DIR}
    ...
    105 #(cd lang/py3; python3 setup.py sdist; cp -r dist ../../dist/py3)

NOTE

The Python version in CentOS 7.6 is 2.7.5, so the Python 3 build must be disabled.

2. Add the following command at line 90:
    90 mkdir -p build/${SRC_DIR}
    ...
3. Comment out lines 111 to 121.
    111 #(cd lang/csharp; ./build.sh dist)
    112
    113 #(cd lang/js; ./build.sh dist)
    114
    115 #(cd lang/ruby; ./build.sh dist)
    116
    117 #(cd lang/php; ./build.sh dist)
    118
    119 #mkdir -p dist/perl
    120 #(cd lang/perl; perl ./Makefile.PL && make dist)
    121 #cp lang/perl/Avro-$VERSION.tar.gz dist/perl/
    ...

NOTE

C#, JavaScript, Ruby, PHP, and Perl are rarely used and do not need to be compiled.

4. Modify the docs compilation command in line 124 and add the Forrest configuration.
    124 (cd doc; ant -Dforrest.home=/opt/tools/installed/apache-forrest-0.9)

Step 4 Perform the compilation.
    ./build.sh dist
The operation is successful if the following information is displayed:

Step 5 Obtain the compiled Java deployment package from the dist/java directory.

NOTE

The compiled avro-tools-1.7.6-cdh5.12.1.jar file is stored in dist/java/avro-tools-1.7.6-cdh5.12.1.jar.

Step 6 Compile and install avro-c.
1. Go to the directory and decompress the package.
    cd dist/c
    tar -zxf avro-c-1.7.6-cdh5.12.1.tar.gz
2. Go to the directory generated after the decompression.
    cd avro-c-1.7.6-cdh5.12.1
3. Create a compilation directory and go to the directory to perform compilation. Replace $PREFIX with the target installation prefix.
    mkdir build
    cd build
    cmake .. -DCMAKE_INSTALL_PREFIX=$PREFIX -DCMAKE_BUILD_TYPE=RelWithDebInfo
    make
    make install
4. The operation is successful if the following information is displayed:

5. Go to the root directory of the source code and prepare for the next step.
    cd ../../../../

NOTE

For example, after the compilation is complete, you can find the Avro-related commands in the /usr/bin directory.

You can find libavro.so in the /usr/lib directory.

Step 7 Compile and install avro-cpp.
1. Go to the directory and decompress the package.

    cd dist/cpp
    tar -zxf avro-cpp-1.7.6-cdh5.12.1.tar.gz
2. Go to the directory generated after the decompression.
    cd avro-cpp-1.7.6-cdh5.12.1
3. Perform the compilation.
    ./build.sh install
The operation is successful if the following information is displayed:

NOTE

For example, after the compilation is complete, you can find the Avro-related commands in the /usr/bin directory.

You can find libavrocpp.so in the /usr/local/lib directory.

Step 8 Use the Kunpeng Porting Advisor to scan the directory generated after compilation and ensure that it contains no x86 .so files or .jar packages.

NOTE

The compiled output must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so files or .jar packages. Any x86 .so files or .jar packages that remain may affect component functions. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
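
The build.sh edits in Step 3 that only comment out lines can be applied with sed instead of an editor. The following is a minimal sketch: the function name comment_lines is hypothetical, and the line numbers in the example are the ones quoted in this section for CDH 5.12.1. They may differ in other versions, so review the affected lines (for example, with sed -n '88,89p' build.sh) before changing them.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): prefix lines $2..$3 of
# the file $1 with '#'. A copy of the original is kept as <file>.orig.
comment_lines() {
    file="$1"
    first="$2"
    last="$3"
    tmp_file="$file.tmp.$$"
    cp -p "$file" "$file.orig" || return 1
    sed "${first},${last}s/^/#/" "$file" > "$tmp_file" || return 1
    # Write back through redirection so the file keeps its mode
    # (build.sh must stay executable).
    cat "$tmp_file" > "$file"
    rm -f "$tmp_file"
}

# Example: commenting out does not shift line numbers, so the order of
# these calls does not matter.
#   comment_lines build.sh 88 89
#   comment_lines build.sh 105 105
#   comment_lines build.sh 111 121
```

The command that is added at line 90 and the change to line 124 are not covered by this sketch and still need to be made as described in Step 3.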

9.5 Troubleshooting

9.5.1 Failed to Verify the github.com Certificate Downloaded by Using wget

Symptom

During compilation, an error is reported indicating that the certificate issued for github.com cannot be verified because the certificate issuer cannot be verified locally.

Procedure

To connect to github.com in an insecure manner, specify the --no-check-certificate parameter in the wget command.

10 Bigtop-jsvc-1.0.10-cdh5.12.1 Porting Guide (CentOS 7.6)

10.1 Introduction
10.2 Environment Requirements
10.3 Configuring the Compilation Environment
10.4 Compiling bigtop-jsvc

10.1 Introduction

The primary goal of Bigtop is to build a community around the packaging, deployment, and interoperability testing of Hadoop-related projects. This includes testing at various levels (packaging, platform, runtime, upgrade, and so on) developed by a community with a focus on the system as a whole, rather than individual projects. JSVC is a sub-module of Bigtop.

10.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

CentOS 7.6

OS Kernel 4.14.0

GCC 4.8.5

OpenJDK 1.7.0_191

10.3 Configuring the Compilation Environment

10.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up before deleting them. When rm prompts for confirmation, enter y to delete each file.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo

Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).

1. Locate the gcc executable. Generally, the path is /usr/bin/gcc.
    command -v gcc
2. Rename the original gcc file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new gcc file.
    vi /usr/bin/gcc
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the new gcc file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
   – CentOS: The installation is successful if information similar to the following is displayed:

   – openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the g++ executable. Generally, the path is /usr/bin/g++.
    command -v g++
2. Rename the original g++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new g++ file.
    vi /usr/bin/g++
   Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the new g++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
   – CentOS: The installation is successful if information similar to the following is displayed:

   – openEuler: The installation is successful if information similar to the following is displayed:

----End
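
The NOTICE in Step 2 asks you to confirm the backup before the repo files are deleted. That check can be automated. The following is a minimal sketch: the function name backup_and_clear is hypothetical and not part of the original guide. It deletes the files only after diff -r confirms that the backup matches the source, and it refuses to run if the backup directory already exists.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): copy directory $1 to
# $2, verify the copy, and only then delete the files in $1.
backup_and_clear() {
    src="$1"
    dst="$2"
    # An existing backup directory is never overwritten or merged into.
    if [ -e "$dst" ]; then
        echo "error: backup target $dst already exists" >&2
        return 1
    fi
    cp -r "$src" "$dst" || return 1
    # Abort without deleting anything if the copy differs from the source.
    diff -r "$src" "$dst" >/dev/null || return 1
    rm -f "$src"/*
}

# Example (run as root):
#   backup_and_clear /etc/yum.repos.d /etc/yum.repos.d-bak
```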

Installing Dependencies

Use the Yum source to install dependencies.

    yum install -y cppunit-devel

10.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
    yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following content to the end of the file:
    export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if information similar to the following is displayed:

----End

10.4 Compiling bigtop-jsvc

NOTE

This section describes the compilation for CDH 5.12.1. The procedure for other versions is similar and can be performed by referring to this section.

Prerequisites

You have downloaded and decompressed the bigtop-jsvc-1.0.10-cdh5.12.1 source package.

Procedure

Step 1 Go to the unix directory in the decompressed bigtop-jsvc-1.0.10-cdh5.12.1 directory.
    cd bigtop-jsvc-1.0.10-cdh5.12.1/unix/

Step 2 Generate a makefile. Replace the JDK path in this step and the next step with the path of the JDK installed in your environment (the JAVA_HOME value configured in 10.3.2).
    ./configure --with-java=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.191-2.6.15.5.el7.aarch64 -build=arm-linux

Step 3 Change the value of INCLUDES in the Makedefs file to the following:
    vim Makedefs
    INCLUDES = -I/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.191-2.6.15.5.el7.aarch64/include -I/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.191-2.6.15.5.el7.aarch64/include/linux

Step 4 Perform compilation.
    make

NOTE

The compiled jsvc file is stored in the current compilation directory, that is, the unix directory under the root directory of the source code.

----End
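
Because the JDK directory name depends on the installed package version, the INCLUDES value for Step 3 can be derived from JAVA_HOME instead of being typed by hand. The following is a minimal sketch: the function name jni_includes is hypothetical and not part of the original guide. It only prints the line; the Makedefs file must still be edited as described in Step 3.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): print the INCLUDES line
# for Makedefs based on the JDK directory given as $1.
jni_includes() {
    # Strip one trailing slash so the printed paths have no double slash.
    java_home="${1%/}"
    # jni.h is part of every JDK; its absence means the path is wrong or
    # only a JRE is installed.
    if [ ! -f "$java_home/include/jni.h" ]; then
        echo "error: $java_home/include/jni.h not found" >&2
        return 1
    fi
    printf 'INCLUDES = -I%s/include -I%s/include/linux\n' "$java_home" "$java_home"
}

# Example:
#   jni_includes "$JAVA_HOME"
```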

11 Flume-ng-1.6.0-cdh5.12.1 Porting Guide (CentOS 7.6)

11.1 Introduction
11.2 Environment Requirements
11.3 Configuring the Compilation Environment
11.4 Compiling Flume-NG
11.5 Troubleshooting

11.1 Introduction

Flume-NG is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. The system is centrally managed and allows intelligent dynamic management. It uses a simple extensible data model that allows for online analytic applications.

Cloudera made landmark changes to Flume by refactoring its core components, core configuration, and code architecture. The refactored version is collectively referred to as Flume NG (next generation). Another reason for the changes is that Flume has been incorporated into Apache, and Cloudera Flume was renamed Apache Flume.

For more information about CDH, visit https://www.cloudera.com/.

11.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

CentOS 7.6

OS Kernel 4.14.0

GCC 4.8.5

OpenJDK 1.7.0_191

Maven 3.5.4

11.3 Configuring the Compilation Environment

11.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up before deleting them. When rm prompts for confirmation, enter y to delete each file.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo

Configure the local Yum source.

[Local] name=Local baseurl=file:///media/ enabled=1 gpgcheck=0

Step 4 Make the Yum source configuration take effect. yum clean all yum makecache

Step 5 Use the Yum source to install GCC-related software. yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate GCC. Generally, the path is /usr/bin/gcc.
       command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
       mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
       vi /usr/bin/gcc
   Add the following content to the file and save the file:
       #!/bin/sh
       /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the GCC file.
       chmod +x /usr/bin/gcc
5. Check whether GCC is available.
       gcc --version
   - CentOS: The installation is successful if the GCC version information is displayed.
   - openEuler: The installation is successful if the GCC version information is displayed.

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate G++. Generally, the path is /usr/bin/g++.
       command -v g++
2. Rename the original G++ file, for example, to g++-impl.
       mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
       vi /usr/bin/g++
   Add the following content to the file and save the file:
       #!/bin/sh
       /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the G++ file.
       chmod +x /usr/bin/g++
5. Check whether G++ is available.
       g++ --version
   - CentOS: The installation is successful if the G++ version information is displayed.
   - openEuler: The installation is successful if the G++ version information is displayed.

----End
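The wrapper steps for GCC and G++ can also be scripted. The following is a minimal sketch, not part of the original procedure: the function name wrap_with_signed_char is hypothetical, and the guard against running twice is an added assumption (a second run would otherwise rename the wrapper itself).

```shell
#!/bin/sh
# Hypothetical helper (not part of any Kunpeng tool): rename a compiler
# binary to <name>-impl and install a wrapper that always passes
# -fsigned-char, mirroring the manual steps for GCC and G++.
wrap_with_signed_char() {
    compiler=$1                       # for example /usr/bin/gcc
    impl="${compiler}-impl"

    # Assumption: if <name>-impl already exists, the wrapper is in place.
    if [ -e "$impl" ]; then
        echo "skip: $impl already exists" >&2
        return 0
    fi
    [ -x "$compiler" ] || { echo "not executable: $compiler" >&2; return 1; }

    mv "$compiler" "$impl" || return 1
    # Write the two-line wrapper script; "$@" is kept literal on purpose.
    printf '#!/bin/sh\nexec "%s" -fsigned-char "$@"\n' "$impl" > "$compiler" || return 1
    chmod +x "$compiler"
}

# Usage (as root):
#   wrap_with_signed_char /usr/bin/gcc
#   wrap_with_signed_char /usr/bin/g++
```

After running it, gcc --version and g++ --version should still print the compiler version, because the wrapper forwards all arguments to the renamed binary.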

Installing Dependencies

Use the Yum source to install dependencies.

    yum install -y wget vim

11.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
    yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following content to the end of the file:
    export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if the OpenJDK version information is displayed.

----End
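The JAVA_HOME value above contains a specific package build number, and the directory name on a given system may differ. The following sketch derives the JDK home from the installed javac instead of hardcoding it. The function name jdk_home_of is hypothetical, and the sketch assumes GNU readlink (as shipped with CentOS) and a JDK layout where javac sits in <JDK home>/bin.

```shell
# Hypothetical helper: print the JDK home directory for a given javac path
# by resolving symbolic links and stripping the trailing /bin/javac.
jdk_home_of() {
    real=$(readlink -f "$1") || return 1
    [ -n "$real" ] || return 1
    dirname "$(dirname "$real")"
}

# Usage:
#   export JAVA_HOME=$(jdk_home_of "$(command -v javac)")
#   export PATH=$JAVA_HOME/bin:$PATH
```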


11.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mkdir -p /opt/tools/installed
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile

Add the following content to the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v

The installation is successful if the Maven version information is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. (Change the repository to the Maven repository that you have built. If you do not have your own Maven repository, configure it based on the following example.)
    <mirror>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven/</url>
        <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:
    <proxies>
        <proxy>
            <id>optional</id>
            <active>true</active>
            <protocol>http</protocol>
            <username>Username</username>
            <password>Password</password>
            <host>Proxy server URL</host>
            <port>Proxy server port</port>
            <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
        </proxy>
    </proxies>

----End

11.4 Compiling Flume-NG

NOTE

This section explains the compilation for CDH 5.12.1. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Flume-ng-1.6.0-cdh5.12.1 source package.

Procedure

Step 1 Go to the decompressed Flume-ng-1.6.0-cdh5.12.1 directory.
    cd flume-ng-cdh5.12.1-release

Step 2 Configure the Kunpeng repository by modifying the pom.xml file.
    vim pom.xml

Add the Kunpeng Maven repository to the repositories tag. The Kunpeng Maven repository must be placed first.
    <repository>
        <id>Kunpeng.repo</id>
        <url>https://mirrors.huaweicloud.com/kunpeng/maven/</url>
        <name>Kunpeng Repositories</name>
    </repository>
    <repository>
        <id>huaweicloud.repo</id>
        <url>https://mirrors.huaweicloud.com/repository/maven</url>
        <name>HuaweiCloud Repositories</name>
    </repository>
    <repository>
        <id>wso2.repo</id>
        <url>http://maven.wso2.org/nexus/content/groups/wso2-public/</url>
        <name>wso2 Repositories</name>
    </repository>

Step 3 Perform compilation.
    mvn package -DskipTests

The compilation is successful if BUILD SUCCESS is displayed.

The compilation result is apache-flume-1.6.0-cdh5.12.1-bin.tar.gz in flume-ng-dist/target/.

Step 4 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled apache-flume-1.6.0-cdh5.12.1-bin.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
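As a quick pre-check before running the Kunpeng Porting Advisor, the extracted package can be searched for x86 shared objects. The sketch below is an illustration only and does not replace the Porting Advisor scan: the function names are hypothetical, it looks only at files named *.so* in an already extracted directory (it does not open .jar files), and it reads the ELF header directly with od, assuming little-endian ELF files.

```shell
# Hypothetical helper: print the e_machine field (2 bytes at offset 18,
# in stored byte order) of an ELF file, or fail if the file is not ELF.
elf_machine() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || return 1
    od -An -tx1 -j18 -N2 "$1" | tr -d ' \n'
}

# Hypothetical helper: list shared objects under a directory whose ELF
# machine type is x86-64 (0x3e) or i386 (0x03). AArch64 is 0xb7.
find_x86_objects() {
    find "$1" -type f -name '*.so*' | while IFS= read -r so_file; do
        machine=$(elf_machine "$so_file") || continue
        case $machine in
            3e00|0300) echo "$so_file" ;;
        esac
    done
}

# Usage:
#   tar -zxf apache-flume-1.6.0-cdh5.12.1-bin.tar.gz -C /tmp/flume-check
#   find_x86_objects /tmp/flume-check
```

An empty result only means that no loose x86 .so file was found; native libraries embedded in .jar files still need the Porting Advisor scan.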

11.5 Troubleshooting

11.5.1 Failed to Verify the github.com Certificate Downloaded by Using wget

Symptom

During compilation, an error is reported, indicating that the certificate issued by github.com cannot be verified. The authority of the certificate issuer cannot be verified locally.

Procedure

To connect to github.com in an insecure manner, specify the --no-check-certificate parameter in the wget command.

11.5.2 "java.lang.OutOfMemoryError: PermGen space" Is Reported During Compilation

Symptom

During compilation, the error "java.lang.OutOfMemoryError: PermGen space" is displayed.

Handling Procedure

Before compilation, run the following command or add the following command to the end of the /etc/profile file:


export MAVEN_OPTS="-Xmx10240m -XX:MaxPermSize=768m"
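The -XX:MaxPermSize option applies to JDK 7 and earlier; JDK 8 removed the permanent generation and ignores the option with a warning. The following sketch chooses MAVEN_OPTS according to the Java version. The function name maven_opts_for is hypothetical, and the heap size is simply the value used above.

```shell
# Hypothetical helper: print MAVEN_OPTS for a given Java version string
# such as 1.7.0_261. PermGen options are added only for JDK 7 and earlier.
maven_opts_for() {
    case $1 in
        1.[0-7]|1.[0-7].*) echo "-Xmx10240m -XX:MaxPermSize=768m" ;;
        *)                 echo "-Xmx10240m" ;;
    esac
}

# Usage:
#   ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p')
#   export MAVEN_OPTS=$(maven_opts_for "$ver")
```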


12 Hadoop-2.6.0-cdh5.12.1 Porting Guide (CentOS 7.6)

12.1 Introduction
12.2 Environment Requirements
12.3 Configuring the Compilation Environment
12.4 Compiling Hadoop
12.5 Troubleshooting

12.1 Introduction

Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. It allows users to develop distributed programs that use the high-speed computing and storage provided by clusters, without knowing the underlying details of the distributed system.

This document describes how to adapt Hadoop components in CDH to TaiShan servers.

For more information about CDH, visit https://www.cloudera.com/.

12.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet

Software Requirements

Item              Version
OS                CentOS 7.6
Kernel            4.14.0-115
JDK               1.7.0_191
GCC               4.8.5
Maven             3.5.4
Ant               1.7.1
Protobuf          2.5.0

12.3 Configuring the Compilation Environment

12.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create and edit the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo

Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate GCC. Generally, the path is /usr/bin/gcc.
       command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
       mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
       vi /usr/bin/gcc
   Add the following content to the file and save the file:
       #!/bin/sh
       /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the GCC file.
       chmod +x /usr/bin/gcc
5. Check whether GCC is available.
       gcc --version
   - CentOS: The installation is successful if the GCC version information is displayed.
   - openEuler: The installation is successful if the GCC version information is displayed.

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate G++. Generally, the path is /usr/bin/g++.
       command -v g++
2. Rename the original G++ file, for example, to g++-impl.
       mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
       vi /usr/bin/g++
   Add the following content to the file and save the file:
       #!/bin/sh
       /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the G++ file.
       chmod +x /usr/bin/g++
5. Check whether G++ is available.
       g++ --version
   - CentOS: The installation is successful if the G++ version information is displayed.
   - openEuler: The installation is successful if the G++ version information is displayed.

----End
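The local Yum source file from Step 3 can be written without an interactive editor. The following is a minimal sketch; the function name write_local_repo is hypothetical, and the mount point is passed as a parameter so that a directory other than /media can be used.

```shell
# Hypothetical helper: write a local Yum repository definition that points
# at the directory where the OS image is mounted.
write_local_repo() {
    repo_file=$1       # for example /etc/yum.repos.d/Local.repo
    mount_point=$2     # for example /media
    # ${mount_point%/} drops one trailing slash so the URL ends with a single /.
    cat > "$repo_file" <<EOF
[Local]
name=Local
baseurl=file://${mount_point%/}/
enabled=1
gpgcheck=0
EOF
}

# Usage (as root, after mounting the image and backing up the old repo files):
#   write_local_repo /etc/yum.repos.d/Local.repo /media
#   yum clean all && yum makecache
```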

Installing Dependencies

Install dependencies using the Yum source.

    yum install -y wget openssl-devel zlib-devel automake libtool make cmake libstdc++-static glibc-static git snappy snappy-devel fuse fuse-devel

12.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
    yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following content to the end of the file:
    export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if the OpenJDK version information is displayed.

----End

12.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mkdir -p /opt/tools/installed
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile

Add the following content to the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v

The installation is successful if the Maven version information is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. (Change the repository to the Maven repository that you have built. If you do not have your own Maven repository, configure it based on the following example.)
    <mirror>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven/</url>
        <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:
    <proxies>
        <proxy>
            <id>optional</id>
            <active>true</active>
            <protocol>http</protocol>
            <username>Username</username>
            <password>Password</password>
            <host>Proxy server URL</host>
            <port>Proxy server port</port>
            <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
        </proxy>
    </proxies>

----End

12.3.4 Installing Ant

Step 1 Download the installation package and install Ant to the specified directory.
    wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.7.1-bin.tar.gz
    tar -zxf apache-ant-1.7.1-bin.tar.gz
    mkdir -p /opt/tools/installed
    mv apache-ant-1.7.1 /opt/tools/installed/

Step 2 Modify the environment variables.
    vim /etc/profile

Add the following content to the end of the /etc/profile file:
    export ANT_HOME=/opt/tools/installed/apache-ant-1.7.1
    export PATH=$ANT_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether the configuration takes effect.
    ant -version

The configuration is successful if the following information is displayed:
    Apache Ant version 1.7.1 compiled on June 27 2008

----End

12.3.5 Installing Protobuf

CentOS

Step 1 Install Protobuf.
    yum install -y protobuf protobuf-devel

Step 2 Check whether Protobuf is installed successfully.
    protoc --version

The installation is successful if the Protobuf version information is displayed.

Step 3 Deploy Protoc in the local Maven repository.
    mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End

openEuler

Step 1 Download and decompress the source code.
    wget https://github.com/protocolbuffers/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
    tar -zxf protobuf-2.5.0.tar.gz

Step 2 Move the decompressed directory to the /opt/tools/installed/ directory.
    mv protobuf-2.5.0 /opt/tools/installed/

Step 3 Go to the /opt/tools/installed/ directory.
    cd /opt/tools/installed

Step 4 Download the protoc.zip package to a directory of your choice (for example, /opt/tools/installed/), decompress it to obtain the protoc.patch file, and copy the patch to the Protobuf source tree.
    wget https://mirrors.huaweicloud.com/kunpeng/archive/kunpeng_solution/bigdata/Patch/protoc.zip
    unzip protoc.zip
    cp ./protoc/protoc.patch ./protobuf-2.5.0/src/google/protobuf/stubs/

Step 5 Go to the protobuf-2.5.0/src/google/protobuf/stubs/ directory and apply the patch.
    cd protobuf-2.5.0/src/google/protobuf/stubs/
    patch -p1 < protoc.patch

Step 6 Go back to the root directory of protobuf-2.5.0, compile the source code, and install Protobuf in the default directory.
    cd /opt/tools/installed/protobuf-2.5.0
    ./autogen.sh && ./configure CFLAGS='-fsigned-char' && make -j8 && make install

Step 7 Deploy Protoc in the local Maven repository.
    mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/local/bin/protoc

----End
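The two procedures above differ in the -Dfile value: /usr/bin/protoc for the Yum package and /usr/local/bin/protoc for the source build. The sketch below prints the deployment command for whichever protoc path is passed in, so that the command can be reviewed before it is run. The function name protoc_install_cmd is hypothetical; the Maven options are the ones used in the steps above.

```shell
# Hypothetical helper: print the mvn install:install-file command that
# deploys the given protoc binary to the local Maven repository.
protoc_install_cmd() {
    printf 'mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=%s\n' "$1"
}

# Usage: print the command for the protoc found in PATH, review it, then run it.
#   protoc_install_cmd "$(command -v protoc)"
```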

12.4 Compiling Hadoop

NOTE

This section explains the compilation for CDH 5.12.1. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Hadoop-cdh5.12.1 source package.

Procedure

Step 1 Go to the Hadoop-cdh5.12.1 source code directory.
    cd hadoop-common-cdh5.12.1-release

Step 2 Modify the pom.xml file in the root directory to add the Maven repository source.
    vim pom.xml

Add the Kunpeng Maven repository to the repositories tag. The Kunpeng Maven repository must be placed first.
    <repository>
        <id>Kunpeng.repo</id>
        <url>https://mirrors.huaweicloud.com/kunpeng/maven/</url>
        <name>Kunpeng Repositories</name>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>huaweicloud.repo</id>
        <url>http://mirrors.huaweicloud.com/repository/maven</url>
        <name>huaweicloud Repositories</name>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>

In addition to the dependency repository source, add the plug-in repository source. The pluginRepositories node is at the same level as the repositories node.
    <pluginRepositories>
        <pluginRepository>
            <id>huaweicloud-plugin</id>
            <url>http://mirrors.huaweicloud.com/repository/maven</url>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </pluginRepository>
    </pluginRepositories>

Step 3 Modify the bswap and bswap64 functions in primitives.h.
    vim hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/primitives.h

After the modification, the two functions (lines 96 to 127 of the file) read as follows:
    /**
     * little-endian to big-endian or vice versa
     */
    inline uint32_t bswap(uint32_t val) {
    #ifdef __aarch64__
      __asm__("rev %w[dst], %w[src]" : [dst]"=r"(val) : [src]"r"(val));
    #else
      __asm__("bswap %0" : "=r" (val) : "0" (val));
    #endif
      return val;
    }

    inline uint64_t bswap64(uint64_t val) {
    #ifdef __aarch64__
      __asm__("rev %[dst], %[src]" : [dst]"=r"(val) : [src]"r"(val));
    #else
    #ifdef __X64
      __asm__("bswapq %0" : "=r" (val) : "0" (val));
    #else

      uint64_t lower = val & 0xffffffffU;
      uint32_t higher = (val >> 32) & 0xffffffffU;

      lower = bswap(lower);
      higher = bswap(higher);

      return (lower << 32) + higher;

    #endif
    #endif
      return val;
    }

Step 4 Modify the Checksum.cc file.
    vim hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util/Checksum.cc

Add the following code at line 582:
    #ifdef __aarch64__
    // Awaiting HW implementation
    #define SOFTWARE_CRC
    #endif

Step 5 Compile task-controller.
1. Go to the hadoop-mapreduce1-project directory.
       cd hadoop-mapreduce1-project
2. Modify the build.xml file.
       vim build.xml
   Change http://repo2.maven.org/maven2 in the file to https://repo1.maven.org/maven2.
3. Modify the build-contrib.xml file.
       vim src/contrib/build-contrib.xml
   Change http://repo2.maven.org/maven2 in the file to https://repo1.maven.org/maven2. The modified attribute is as follows:
       value="https://repo1.maven.org/maven2/org/apache/ivy/ivy/${ivy.version}/ivy-${ivy.version}.jar" />
4. Modify the ivysettings.xml file.
       vim ivy/ivysettings.xml
   Change http://repo1.maven.org/maven2 in the file to https://repo1.maven.org/maven2. The modified attribute is as follows:
       value="https://repo1.maven.org/maven2/"
5. Perform compilation.
       ant task-controller

NOTE

The compiled task-controller is stored in build/hadoop-2.6.0-mr1-cdh5.12.1/sbin/Linux/task-controller.

6. Go back to the source code root directory (hadoop-common-cdh5.12.1-release).
       cd ..

Step 6 Run the compilation command. Set -Dsnappy.lib to the directory where libsnappy.so is located.
    mvn package -DskipTests -Pdist,native -Dtar -Dsnappy.lib=/usr/lib64 -Dbundle.snappy -Dmaven.javadoc.skip=true

The compilation is successful if BUILD SUCCESS is displayed.

NOTE

The compiled deployment package is stored in hadoop-dist/target/hadoop-2.6.0-cdh5.12.1.tar.gz.

NOTE

The compiled hadoop-2.6.0-cdh5.12.1.tar.gz must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
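The URL changes for build.xml, src/contrib/build-contrib.xml, and ivy/ivysettings.xml in the task-controller step can be applied with sed instead of editing each file by hand. The following is a minimal sketch; the function name use_https_central is hypothetical, and it assumes the old URLs appear in the files exactly as quoted in that step.

```shell
# Hypothetical helper: rewrite the old HTTP Maven Central URLs in the given
# files to https://repo1.maven.org/maven2, editing each file in place.
use_https_central() {
    for build_file in "$@"; do
        tmp_copy=$(mktemp) || return 1
        if sed -e 's#http://repo2\.maven\.org/maven2#https://repo1.maven.org/maven2#g' \
               -e 's#http://repo1\.maven\.org/maven2#https://repo1.maven.org/maven2#g' \
               "$build_file" > "$tmp_copy"; then
            # cat keeps the original file's owner and permissions.
            cat "$tmp_copy" > "$build_file"
            rm -f "$tmp_copy"
        else
            rm -f "$tmp_copy"
            return 1
        fi
    done
}

# Usage (from the hadoop-mapreduce1-project directory):
#   use_https_central build.xml src/contrib/build-contrib.xml ivy/ivysettings.xml
```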

12.5 Troubleshooting

12.5.1 Failed to Verify the github.com Certificate Downloaded by Using wget

Symptom

During compilation, an error is reported, indicating that the certificate issued by github.com cannot be verified. The authority of the certificate issuer cannot be verified locally.

Procedure

To connect to github.com in an insecure manner, specify the --no-check-certificate parameter in the wget command.

12.5.2 Failed to Compile or Download Hadoop

Symptom 1 During compilation, an error is reported, indicating that the certificate issued by github.com cannot be verified. The authority of the certificate issuer cannot be verified locally.

Solution 1 To connect to github.com in an insecure manner, specify the --no-check-certificate parameter in the wget command.
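Because --no-check-certificate disables certificate verification, it can be worth comparing the downloaded file against a checksum obtained from a trusted source. The sketch below is an optional addition, not part of the original solution: the function name verify_sha256 is hypothetical, and the URL and expected digest in the usage lines are placeholders that this guide does not provide.

```shell
# Hypothetical helper: succeed only if the SHA-256 digest of a file matches
# the expected hexadecimal value.
verify_sha256() {
    actual=$(sha256sum "$1" | awk '{print $1}') || return 1
    [ "$actual" = "$2" ]
}

# Usage (PKG_URL and EXPECTED_SHA256 are placeholders):
#   wget --no-check-certificate "$PKG_URL" -O pkg.tar.gz
#   verify_sha256 pkg.tar.gz "$EXPECTED_SHA256" || echo "checksum mismatch" >&2
```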

Symptom 2 During compilation, an error is reported: Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (dist) on project hadoop-kms: An Ant BuildException has occured: java.net.UnknownHostException: archive.apache.org.

Solution 2 The apache-tomcat-6.0.44.tar.gz package fails to be downloaded because the server cannot resolve archive.apache.org. Obtain the package in another way, for example, download it on a Windows PC, and copy it to the hadoop-common-project/hadoop-kms/downloads directory.

Symptom 3 During compilation, an error is reported: Could not resolve dependencies for project org.apache.hadoop:hadoop-auth:jar:2.7.1.2.3.4.7-4:Failed to collect dependencies at org.mortbay.jetty:jetty-util:jar:6.1.26.hwx.

Solution 3 The jetty-util-6.1.26.hwx.jar package fails to be downloaded from the remote repository because it is not available in https://repository.apache.org/content/repositories/snapshots. You can find the JAR package and its POM file in https://repo.spring.io/plugins-release/org/ and save them to the corresponding path in the local repository. According to the error information, the local path of the JAR package is ~/.m2/repository/org/mortbay/jetty/jetty-util/6.1.26.hwx and the remote repository path is https://repo.spring.io/plugins-release/org/mortbay/jetty/jetty-util/6.1.26.hwx/jetty-util-6.1.26.hwx.jar. The path in the remote repository corresponds to the path in the local repository. If a similar problem occurs for another package, use the same method, and remember to download the POM file of the JAR package to the local repository as well.
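The correspondence between the remote and local paths in Solution 3 follows the standard Maven repository layout: groupId with dots replaced by slashes, then artifactId, then version, then the file name artifactId-version.extension. The following sketch computes that relative path; the function name maven_artifact_path is hypothetical.

```shell
# Hypothetical helper: print the repository-relative path of a Maven
# artifact. Arguments: groupId artifactId version [extension, default jar].
maven_artifact_path() {
    group=$1 artifact=$2 version=$3 ext=${4:-jar}
    printf '%s/%s/%s/%s-%s.%s\n' \
        "$(printf '%s' "$group" | tr . /)" "$artifact" "$version" \
        "$artifact" "$version" "$ext"
}

# Usage: fetch the JAR and POM named in Solution 3 into the local repository.
#   for e in jar pom; do
#       rel=$(maven_artifact_path org.mortbay.jetty jetty-util 6.1.26.hwx "$e")
#       mkdir -p "$HOME/.m2/repository/$(dirname "$rel")"
#       wget -O "$HOME/.m2/repository/$rel" "https://repo.spring.io/plugins-release/$rel"
#   done
```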

Symptom 4 An out-of-memory error occurs during compilation.

Solution 4 Set the following environment variable before compilation.

    export MAVEN_OPTS="-Xms4096m -Xmx4096m -XX:PermSize=768M -XX:MaxPermSize=768m"

Symptom 5 An SSL-related error is reported during compilation.

Solution 5 Add the following parameters to the end of the mvn compilation command:

    -Dmaven.wagon.http.ssl.insecure=true -Dmaven.wagon.http.ssl.allowall=true


13 HBase-1.2.0-cdh5.12.1 Porting Guide (CentOS 7.6)

13.1 Introduction
13.2 Environment Requirements
13.3 Configuring the Compilation Environment
13.4 Compiling HBase
13.5 Troubleshooting

13.1 Introduction

HBase is a distributed, column-based, open-source database modeled after the Google paper "Bigtable: A Distributed Storage System for Structured Data" by Fay Chang et al. Just as Bigtable builds on the distributed data storage provided by Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase is a sub-project of the Apache Hadoop project. Unlike ordinary relational databases, HBase is suitable for storing unstructured data, and it is based on columns rather than rows.

This document describes how to port the HBase-related components of CDH 5.12.1 from x86 to TaiShan servers.

For more information about HBase, visit https://hbase.apache.org.

13.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet

Software Requirements

Item              Version
OS                CentOS 7.6
Kernel            4.14.0
JDK               1.7.0_191
GCC               4.8.5
Maven             3.1.0
Protobuf          2.5.0

13.3 Configuring the Compilation Environment

13.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create and edit the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo

Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate GCC. Generally, the path is /usr/bin/gcc.
       command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
       mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
       vi /usr/bin/gcc
   Add the following content to the file and save the file:
       #!/bin/sh
       /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the GCC file.
       chmod +x /usr/bin/gcc
5. Check whether GCC is available.
       gcc --version
   - CentOS: The installation is successful if the GCC version information is displayed.
   - openEuler: The installation is successful if the GCC version information is displayed.

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate G++. Generally, the path is /usr/bin/g++.
       command -v g++
2. Rename the original G++ file, for example, to g++-impl.
       mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
       vi /usr/bin/g++
   Add the following content to the file and save the file:
       #!/bin/sh
       /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the G++ file.
       chmod +x /usr/bin/g++
5. Check whether G++ is available.
       g++ --version
   - CentOS: The installation is successful if the G++ version information is displayed.
   - openEuler: The installation is successful if the G++ version information is displayed.

----End

Installing Dependencies

Install dependencies using the Yum source.

    yum install -y wget patch openssl-devel zlib-devel automake libtool make cmake libstdc++-static glibc-static git

13.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
    yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following content to the end of the file:
    export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if the OpenJDK version information is displayed.

----End


13.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed).
    wget https://archive.apache.org/dist/maven/maven-3/3.1.0/binaries/apache-maven-3.1.0-bin.tar.gz
    tar -zxf apache-maven-3.1.0-bin.tar.gz
    mkdir -p /opt/tools/installed
    mv apache-maven-3.1.0 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile

Add the following content to the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.1.0
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -version

The installation is successful if the Maven version information is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.1.0/conf/settings.xml

    vim /opt/tools/installed/apache-maven-3.1.0/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. (Change the repository to your own Maven repository. If you do not have your own Maven repository, configure it as follows.)
    <mirror>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven/</url>
        <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:
    <proxies>
        <proxy>
            <id>optional</id>
            <active>true</active>
            <protocol>http</protocol>
            <username>Username</username>
            <password>Password</password>
            <host>Proxy server URL</host>
            <port>Proxy server port</port>
            <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
        </proxy>
    </proxies>

----End

13.3.4 Installing Protobuf

Step 1 Install Protobuf.
    yum install -y protobuf protobuf-devel

Step 2 Check whether the installation is successful.
    protoc --version

The installation is successful if the Protobuf version information is displayed.

Step 3 Deploy Protoc in the local Maven repository.
    mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End

13.4 Compiling HBase

NOTE

This section explains the compilation for CDH 5.12.1. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the HBase-cdh5.12.1 source package.


Procedure

Step 1 Go to the HBase-cdh5.12.1 source code directory.
cd hbase-cdh5.12.1-release

Step 2 Modify the pom.xml file.
vim pom.xml
Add the Kunpeng Maven repository as the first entry in the repositories tag.
<repository>
    <id>kunpengmaven</id>
    <name>kunpeng maven</name>
    <url>http://mirrors.huaweicloud.com/kunpeng/maven</url>
</repository>
<repository>
    <id>huaweicloud.repo</id>
    <url>http://mirrors.huaweicloud.com/repository/maven</url>
    <name>huaweicloud Repositories</name>
</repository>

Step 3 Modify the pom.xml file.
vim pom.xml
Add the HUAWEI CLOUD Maven repository as the first entry under the pluginRepositories tag.
<pluginRepository>
    <id>huaweicloud-plugin</id>
    <url>https://mirrors.huaweicloud.com/repository/maven</url>
</pluginRepository>

Step 4 Perform compilation.
mvn package -DskipTests assembly:single -Pnative

After the compilation is complete, the .tar.gz package is generated in the hbase-assembly/target/ directory.

Step 5 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled hbase-1.2.0-cdh5.12.1-bin.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
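Before running the Kunpeng Porting Advisor, a quick local pre-check can flag obviously wrong native libraries. The sketch below is not a replacement for the Porting Advisor scan required in Step 5. It reads the e_machine field of the ELF header (2 bytes at offset 18) of each .so file in an unpacked directory. It assumes little-endian ELF files and does not look inside .jar archives. The names elf_machine and scan_dir are hypothetical helpers, and the two files created at the end are fake headers used only for demonstration.

```shell
#!/bin/sh
# Print the target architecture recorded in an ELF header.
# Assumes little-endian ELF files (true for x86, x86_64, and aarch64 Linux builds).
elf_machine() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 0
    fi
    # e_machine is the 2-byte field at offset 18.
    code=$(od -An -tx1 -j18 -N2 "$1" | tr -d ' \n')
    case "$code" in
        b700) echo "aarch64" ;;
        3e00) echo "x86_64" ;;
        0300) echo "x86" ;;
        *)    echo "other" ;;
    esac
}

# List every shared library under a directory together with its architecture.
scan_dir() {
    find "$1" -type f -name '*.so*' | while IFS= read -r lib; do
        printf '%s %s\n' "$(elf_machine "$lib")" "$lib"
    done
}

# Demonstration with two fake 20-byte ELF headers (not real libraries).
demo=$(mktemp -d)
printf '\177ELF\002\001\001\000\000\000\000\000\000\000\000\000\003\000\076\000' > "$demo/libx86.so"
printf '\177ELF\002\001\001\000\000\000\000\000\000\000\000\000\003\000\267\000' > "$demo/libarm.so"
scan_dir "$demo"
```

On the build host, unpack the generated .tar.gz package and run scan_dir on the unpacked directory. Any line that starts with x86 or x86_64 deserves a closer look.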


13.5 Troubleshooting

13.5.1 Can't find bundle for base name org.apache.jasper.resources.LocalStrings

Symptom The error message "Can't find bundle for base name org.apache.jasper.resources.LocalStrings" is displayed during compilation.

Method 1

Step 1 Go to the hbase-server directory.
cd hbase-server/

Step 2 Modify the pom.xml file in the current directory.
vim pom.xml
Add a compile scope to the jasper-runtime dependency.
<dependency>
    <groupId>tomcat</groupId>
    <artifactId>jasper-runtime</artifactId>
    <scope>compile</scope>
</dependency>

----End


Method 2

Step 1 In the hbase-cdh5.12.1-release source code directory, download the patch.
wget https://issues.apache.org/jira/secure/attachment/12899868/HBASE-19188.branch-1.2.002.patch

Step 2 Apply the patch.
patch -p1 < HBASE-19188.branch-1.2.002.patch

Step 3 Perform compilation.
mvn package -DskipTests assembly:single -Pnative

----End

13.5.2 could not be resolved: Could not transfer artifact XXX

Symptom The error message "could not be resolved: Could not transfer artifact XXX" is displayed during compilation.

Possible Causes The Maven repository network is unstable.

Solution Run the compilation command again.
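Because the failure is transient, rerunning the build can be automated. The following is a minimal sketch of a retry wrapper. The function name retry, the attempt count, and the delay are assumptions chosen for illustration and are not part of the original procedure.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Usage on the build host (not executed here):
#   retry 3 30 mvn package -DskipTests assembly:single -Pnative
```

Maven keeps artifacts that were already downloaded in the local repository, so each retry only fetches what is still missing.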


14 HBase-indexer-1.5-cdh5.12.1 Porting Guide (CentOS 7.6)

14.1 Introduction 14.2 Environment Requirements 14.3 Configuring the Compilation Environment 14.4 Compiling HBase-indexer 14.5 Troubleshooting

14.1 Introduction

Lily HBase Indexer, developed by NGDATA, stores HBase data of the Lily subsystem to Solr. NGDATA hosts the source code on GitHub. The HBase-Indexer project home page and code are available at https://github.com/NGDATA/hbase-indexer. This document describes how to compile the CDH release.

14.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet

Software Requirements

Item        Version
CentOS      7.6
OS Kernel   4.14.0
JDK         1.7.0_191
GCC         4.8.5
Maven       3.5.4

14.3 Configuring the Compilation Environment

14.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the Yum repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm prompt to confirm the deletion.

Step 3 Modify the /etc/yum.repos.d/Local.repo file to configure the local Yum repository.
vim /etc/yum.repos.d/Local.repo
[Local]
name=CentOS-7.6 Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Run the following commands to make the configuration take effect:
yum clean all
yum makecache


Step 5 Install the software related to GCC using the Yum source.

yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Search for the path where GCC is located. Generally, the path is /usr/bin/gcc.
command -v gcc
2. Change the name of GCC, for example, to gcc-impl.
mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new /usr/bin/gcc file, enter the following content, and save the file:
vim /usr/bin/gcc
#! /bin/sh
/usr/bin/gcc-impl -fsigned-char "$@"
4. Run the following command to add the execute permission for the script:
chmod +x /usr/bin/gcc
5. Run the following command to ensure that the command is available:
gcc --version

Step 7 Resolve the -fsigned-char problem (by modifying g++).
1. Search for the path where g++ is located. Generally, the path is /usr/bin/g++.
command -v g++
2. Change the name of g++, for example, to g++-impl.
mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new /usr/bin/g++ file, enter the following content, and save the file:
vim /usr/bin/g++
#! /bin/sh
/usr/bin/g++-impl -fsigned-char "$@"
4. Run the following command to add the execute permission for the script:
chmod +x /usr/bin/g++
5. Run the following command to ensure that the command is available:
g++ --version

----End
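Steps 6 and 7 replace each compiler with a small wrapper script that injects -fsigned-char and forwards all other arguments unchanged. The flag matters because GCC treats plain char as unsigned by default on AArch64 but as signed on x86. The sketch below reproduces the wrapper pattern in a scratch directory so that it can be tried without touching /usr/bin. The stand-in compiler cc-impl only prints its arguments and is a hypothetical placeholder for gcc-impl.

```shell
#!/bin/sh
# Demonstrate the wrapper pattern in a scratch directory (no root access needed).
workdir=$(mktemp -d)

# Stand-in for the renamed real compiler: prints each argument on its own line.
cat > "$workdir/cc-impl" <<'EOF'
#!/bin/sh
printf '%s\n' "$@"
EOF

# The wrapper: same shape as the /usr/bin/gcc script created in Step 6.
# "$@" forwards every argument as-is, including arguments that contain spaces.
cat > "$workdir/cc" <<EOF
#!/bin/sh
exec "$workdir/cc-impl" -fsigned-char "\$@"
EOF

chmod +x "$workdir/cc-impl" "$workdir/cc"

wrapper_output=$("$workdir/cc" -O2 "my file.c")
printf '%s\n' "$wrapper_output"
```

The output lists -fsigned-char first, followed by the original arguments, with "my file.c" still passed as a single argument.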


Installing Dependencies

Run the following command to install the dependencies:

yum install -y wget vim openssl-devel zlib-devel automake libtool make libstdc++-static glibc-static git snappy snappy-devel fuse fuse-devel

14.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
vim /etc/profile

Add the following content to the end of the file:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether the OpenJDK is successfully installed.
java -version

The installation is successful if information similar to the following is displayed:
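The JAVA_HOME path in Step 2 embeds the exact package build (1.7.0.261-2.6.22.2.el7_8), so the directory name on your system may differ if Yum installs another build. The following sketch derives the directory from the installed javac binary instead. It assumes that readlink -f is available (GNU coreutils, as on CentOS); java_home_from_binary is a hypothetical helper name.

```shell
#!/bin/sh
# Strip the trailing /bin/javac from a resolved javac path.
java_home_from_binary() {
    dirname "$(dirname "$1")"
}

if command -v javac >/dev/null 2>&1; then
    # Follow the alternatives symlinks to the real JDK directory.
    resolved=$(readlink -f "$(command -v javac)")
    echo "export JAVA_HOME=$(java_home_from_binary "$resolved")"
else
    echo "javac not found on PATH; install java-1.7.0-openjdk-devel first" >&2
fi
```

The printed export line can be compared with, or used in place of, the JAVA_HOME line added to /etc/profile.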

----End

14.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mkdir -p /opt/tools/installed
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v

The installation is successful if information similar to the following is displayed:


Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. If you want to change the directory to a specified one, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. (Change the repository to the Maven repository that you have built. If the Maven repository does not exist, configure it based on the following example.)

<mirror>
    <id>huaweimaven</id>
    <name>huawei maven</name>
    <url>https://mirrors.huaweicloud.com/repository/maven/</url>
    <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet, add the following proxy configuration to settings.xml:

<proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>

----End
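If the installation procedure is repeated, editing /etc/profile by hand can leave duplicate export lines. The sketch below appends a line only when an identical line is not already present. It operates on a temporary file as a stand-in for /etc/profile; append_once is a hypothetical helper name.

```shell
#!/bin/sh
# append_once FILE LINE: append LINE to FILE unless an identical line already exists.
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

profile=$(mktemp)   # stand-in for /etc/profile in this demonstration

# Run the same edits twice to show that the second pass changes nothing.
for pass in 1 2; do
    append_once "$profile" 'export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4'
    append_once "$profile" 'export PATH=$MAVEN_HOME/bin:$PATH'
done

cat "$profile"
```

The single quotes keep $MAVEN_HOME and $PATH unexpanded, so the profile file receives the same text as in Step 2.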

14.4 Compiling HBase-indexer

NOTE

This section explains the compilation for CDH 5.12.1. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the HBase-indexer-cdh5.12.1 source package.


Procedure

Step 1 Go to the HBase-indexer-cdh5.12.1 source code directory.
cd hbase-indexer-cdh5.12.1-release

Step 2 Modify the pom.xml file.
vim pom.xml

1. In the repositories tag, add the Kunpeng Maven repository source and HUAWEI CLOUD Maven repository. The Kunpeng Maven repository must be placed first.
<repository>
    <id>kunpengmaven</id>
    <name>kunpeng maven</name>
    <url>http://mirrors.huaweicloud.com/kunpeng/maven</url>
</repository>
<repository>
    <id>huaweicloud.repo</id>
    <name>huaweicloud Repositories</name>
    <url>http://mirrors.huaweicloud.com/repository/maven</url>
</repository>

2. Comment out the lilyproject.snapshot repository in the repositories tag.

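3. Change the repository addresses of cdh.repo and cdh.snapshots.repo.
<repository>
    <id>cdh.repo</id>
    <url>https://repository.cloudera.com/content/repositories/releases/</url>
    <name>Cloudera Repositories</name>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
</repository>
<repository>
    <id>cdh.snapshots.repo</id>
    <url>https://repository.cloudera.com/cloudera/libs-snapshot-local/</url>
    <name>Cloudera Snapshots Repository</name>
    <snapshots>
        <enabled>true</enabled>
    </snapshots>
    <releases>
        <enabled>false</enabled>
    </releases>
</repository>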


4. Add the Maven repository at the end of the repositories tag. wso2 maven wso2 repository http://maven.wso2.org/nexus/content/groups/wso2-public

5. Add the HUAWEI CLOUD mirror Maven repository as the first entry of the pluginRepositories tag.
<pluginRepository>
    <id>huaweicloud-plugin</id>
    <url>https://mirrors.huaweicloud.com/repository/maven</url>
</pluginRepository>

Step 3 Perform compilation.
mvn package apache-rat:check -Drat.numUnapprovedLicenses=600 -DskipTests -Dtar -Pdist

A tar.gz package is generated in the hbase-indexer-dist/target directory.

Step 4 Use the Kunpeng Porting Advisor to scan the .tar package generated after the compilation and ensure that the .tar package does not contain x86 .so or .jar packages.

NOTE

The compiled hbase-indexer-1.5-cdh5.12.1.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the compiled package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End

14.5 Troubleshooting


14.5.1 org.apache.rat:apache-rat-plugin:0.8:check (default) on project hbase: Too many unapproved license

Symptom org.apache.rat:apache-rat-plugin:0.8:check (default) on project hbase: Too many unapproved license

Solution Add apache-rat:check -Drat.numUnapprovedLicenses=600 to the mvn command.

14.5.2 java.lang.OutOfMemoryError: PermGen space

Symptom java.lang.OutOfMemoryError: PermGen space

Solution Export MAVEN_OPTS as follows and run the compilation again.

export MAVEN_OPTS="-Xms10240m -Xmx10240m -XX:PermSize=1280m -XX:MaxPermSize=1280m"
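The PermSize and MaxPermSize options apply to the JDK 1.7 used in this chapter. The permanent generation was removed in Java 8, where these options are ignored with a warning. The sketch below builds MAVEN_OPTS from a Java version string and adds the PermGen options only for Java 7 and earlier. The function names java_major and maven_opts_for are hypothetical helpers, and the heap sizes are the ones given above.

```shell
#!/bin/sh
# java_major VERSION: "1.7.0_191" -> 7, "1.8.0_252" -> 8, "11.0.2" -> 11
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# maven_opts_for VERSION: heap settings from this section, plus PermGen
# options only when the JVM still has a permanent generation.
maven_opts_for() {
    opts="-Xms10240m -Xmx10240m"
    if [ "$(java_major "$1")" -le 7 ]; then
        opts="$opts -XX:PermSize=1280m -XX:MaxPermSize=1280m"
    fi
    echo "$opts"
}

echo "export MAVEN_OPTS=\"$(maven_opts_for 1.7.0_191)\""
```

Lower the heap values if the build host has less than about 10 GB of free memory, because -Xms reserves the initial heap at JVM startup.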


15 Hive-1.1.0-cdh5.12.1 Porting Guide (CentOS 7.6)

15.1 Introduction 15.2 Environment Requirements 15.3 Configuring the Compilation Environment 15.4 Compiling Hive 15.5 Troubleshooting

15.1 Introduction

Hive is a data warehouse tool running on Hadoop. It maps structured data files to a database table, provides simple SQL search functions, and converts SQL statements into MapReduce tasks.

15.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet

Software Requirements

Item        Version
CentOS      7.6
OS kernel   4.14.0-115
GCC         4.8.5
OpenJDK     1.7.0_191
Maven       3.5.4

Protobuf 2.5.0

15.3 Configuring the Compilation Environment

15.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm prompt to confirm the deletion.

Step 3 Modify the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install GCC-related software.
yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64


Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Search for the directory where GCC is located. Generally, the directory is /usr/bin/gcc.
command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
vi /usr/bin/gcc
Add the following information to the file and save the file:
#! /bin/sh
/usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission for the GCC file.
chmod +x /usr/bin/gcc
5. Check whether GCC is available.
gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:
– openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Search for the directory where G++ is located. Generally, the directory is /usr/bin/g++.
command -v g++
2. Change the original G++ file name, for example, to g++-impl.
mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
vi /usr/bin/g++
Add the following information to the file and save the file:
#! /bin/sh
/usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission for the G++ file.
chmod +x /usr/bin/g++
5. Check whether G++ is available.
g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:
– openEuler: The installation is successful if information similar to the following is displayed:

----End
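Steps 2 and 3 above (backing up the repo files and creating Local.repo) can be scripted. The following is a minimal sketch, assuming that the OS image is mounted at /media as in Step 1. The function name configure_local_repo is hypothetical, and the demonstration at the end runs against a temporary directory rather than /etc/yum.repos.d. Unlike the interactive rm in Step 2, the sketch uses rm -f and refuses to run if a backup directory already exists.

```shell
#!/bin/sh
# configure_local_repo REPO_DIR MOUNT_POINT
# Backs up REPO_DIR to REPO_DIR-bak, clears it, and writes Local.repo.
configure_local_repo() {
    repo_dir=${1%/}
    mount_point=$2
    backup_dir="${repo_dir}-bak"
    # Refuse to overwrite an earlier backup.
    if [ -e "$backup_dir" ]; then
        echo "backup already exists: $backup_dir" >&2
        return 1
    fi
    cp -r "$repo_dir" "$backup_dir" || return 1
    rm -f "$repo_dir"/*
    cat > "$repo_dir/Local.repo" <<EOF
[Local]
name=Local
baseurl=file://$mount_point/
enabled=1
gpgcheck=0
EOF
}

# Demonstration against a scratch directory (stand-in for /etc/yum.repos.d).
demo=$(mktemp -d)
mkdir "$demo/yum.repos.d"
echo '[base]' > "$demo/yum.repos.d/CentOS-Base.repo"
configure_local_repo "$demo/yum.repos.d" /media
cat "$demo/yum.repos.d/Local.repo"
```

On the build host, the equivalent call would be configure_local_repo /etc/yum.repos.d /media, followed by the yum clean all and yum makecache commands in Step 4.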

Installing Dependencies

Install dependencies using the Yum source.

yum install -y wget vim patch openssl-devel zlib-devel automake libtool make cmake libstdc++-static glibc-static git

15.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
vim /etc/profile
Add the following content to the end of the file:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether the OpenJDK is successfully installed.
java -version
The installation is successful if information similar to the following is displayed:

----End

15.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mkdir -p /opt/tools/installed
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile
Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v
The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. If you want to change the directory to a specified one, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. (Change the repository to the Maven repository that you have built. If the Maven repository does not exist, configure it based on the following example.)

<mirror>
    <id>huaweimaven</id>
    <name>huawei maven</name>
    <url>https://mirrors.huaweicloud.com/repository/maven/</url>
    <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet, add the following proxy configuration to settings.xml:

<proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>

----End

15.3.4 Installing Protobuf

Step 1 Install Protobuf.
yum install -y protobuf protobuf-devel

Step 2 Check whether the installation is successful.
protoc --version

The installation is successful if information similar to the following is displayed:

Step 3 Deploy Protoc in the local Maven repository.
mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End

15.4 Compiling Hive

NOTE

This section explains the compilation for CDH 5.12.1. Refer to this section when compiling for another version.

Prerequisites You have downloaded and decompressed the Hive-cdh5.12.1 source package.

Procedure

Step 1 Go to the Hive-cdh5.12.1 source code directory. cd hive-cdh5.12.1-release

Step 2 Modify the pom.xml file. vim pom.xml

In the pom.xml file, add the Maven repository configuration to the repositories section. The Kunpeng repository source must be listed first.
<repository>
    <id>kunpengmaven</id>
    <name>kunpeng maven</name>
    <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
</repository>
<repository>
    <id>central-maven</id>
    <name>central maven</name>
    <url>https://repo1.maven.org/maven2</url>
</repository>

You also need to add the Maven plug-in repository.

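<pluginRepository>
    <id>huaweicloud-plugin</id>
    <url>http://mirrors.huaweicloud.com/repository/maven</url>
    <snapshots>
        <enabled>true</enabled>
    </snapshots>
</pluginRepository>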

Step 3 Compile the source code. mvn package -DskipTests -Pdist -Dtar -Phadoop-2

The operation is successful if the following information is displayed:

After the compilation is complete, the apache-hive-1.1.0-cdh5.12.1-bin.tar.gz package is generated in packaging/target.


Step 4 Use the Kunpeng Porting Advisor to scan the .tar package generated after the compilation and ensure that the .tar package does not contain the x86 .so or .jar packages.

NOTE

The compiled apache-hive-1.1.0-cdh5.12.1-bin.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the compiled package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End

15.5 Troubleshooting

15.5.1 Compilation Error: "Could not find artifact com.google.protobuf:protoc:exe:linux-aarch_64:2.5.0"

Symptom

During the compilation process, "Could not find artifact com.google.protobuf:protoc:exe:linux-aarch_64:2.5.0" is displayed.

Procedure

Step 1 Install Protobuf. yum install -y protobuf protobuf-devel

Step 2 Run the mvn command displayed in the error message.
mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End


16 Hue-3.9.0-cdh5.12.1 Porting Guide (CentOS 7.6)

16.1 Introduction 16.2 Environment Requirements 16.3 Configuring the Compilation Environment 16.4 Compiling Hue

16.1 Introduction

Hue is an open-source SQL Assistant for querying databases and data warehouses. For more information about Hue, visit the official Hue website.

16.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet

Software Requirements

Item        Version
CentOS      7.6
OS Kernel   4.14.0
GCC         4.8.5
OpenJDK     1.7.0_191
Maven       3.5.4
Node        14.4.0
Npm         6.14.5
Pip         20.1.1

16.3 Configuring the Compilation Environment

16.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm prompt to confirm the deletion.

Step 3 Modify the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo

Configure the local Yum source.
[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install GCC-related software.
yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Search for the directory where GCC is located. Generally, the directory is /usr/bin/gcc.
command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
vi /usr/bin/gcc
Add the following information to the file and save the file:
#! /bin/sh
/usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission for the GCC file.
chmod +x /usr/bin/gcc
5. Check whether GCC is available.
gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:
– openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Search for the directory where G++ is located. Generally, the directory is /usr/bin/g++.
command -v g++
2. Change the original G++ file name, for example, to g++-impl.
mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
vi /usr/bin/g++
Add the following information to the file and save the file:
#! /bin/sh
/usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission for the G++ file.
chmod +x /usr/bin/g++
5. Check whether G++ is available.
g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:
– openEuler: The installation is successful if information similar to the following is displayed:

----End

Installing Dependencies

Install dependencies using the Yum source.

yum install -y wget vim krb5-devel cyrus-sasl-gssapi cyrus-sasl-plain cyrus-sasl-devel libffi-devel libxml2-devel libxslt-devel make mysql mysql-devel openldap-devel python-devel sqlite-devel openssl-devel gmp-devel asciidoc

16.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
vim /etc/profile
Add the following content to the end of the file:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether the OpenJDK is successfully installed.
java -version
The installation is successful if information similar to the following is displayed:

----End

16.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mkdir -p /opt/tools/installed
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. If you want to change the directory to a specified one, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. (Change the repository to the Maven repository that you have built. If the Maven repository does not exist, configure it based on the following example.)

<mirror>
    <id>huaweimaven</id>
    <name>huawei maven</name>
    <url>https://mirrors.huaweicloud.com/repository/maven/</url>
    <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet, add the following proxy configuration to settings.xml:

<proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>

----End


16.3.4 Installing Dependent Components

Installing Node.js and npm

Step 1 Download Node.js. wget https://nodejs.org/dist/v14.4.0/node-v14.4.0-linux-arm64.tar.gz tar -zxf node-v14.4.0-linux-arm64.tar.gz mv node-v14.4.0-linux-arm64 /opt/tools/installed/

Step 2 Set environment variables.
vim /etc/profile

Add the following to the end of the /etc/profile file:
export NODE_HOME=/opt/tools/installed/node-v14.4.0-linux-arm64
export PATH=$NODE_HOME/bin:$PATH

Step 3 Make the environment variables take effect. source /etc/profile

----End

Installing setuptools

Install setuptools.

yum install -y python-setuptools.noarch

Installing pip

Step 1 Download the installer script.
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

Step 2 Install pip.
python get-pip.py

----End

Installing Python Dependencies

Step 1 Install ipython. pip install ipython==5.2.0

Step 2 Install astroid. pip install astroid==1.5.3

Step 3 Install logilab-astng. pip install logilab-astng==0.24.3

Step 4 Install cryptography. pip install cryptography==2.1.4

----End
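The four pinned packages above can also be recorded in a requirements file so that the same versions are installed on every build host. The following is a minimal sketch. The file location is an assumption (a temporary file is used here), and installing the packages in one pip run instead of four separate runs is a convenience of the sketch rather than part of the original procedure. The script only prints the pip command and does not run it.

```shell
#!/bin/sh
# Write the pinned versions from Steps 1 to 4 into a requirements file.
requirements=$(mktemp)
cat > "$requirements" <<'EOF'
ipython==5.2.0
astroid==1.5.3
logilab-astng==0.24.3
cryptography==2.1.4
EOF

# Print the install command instead of running it.
echo "pip install -r $requirements"
```

Running the printed command installs all four packages at the pinned versions.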


16.4 Compiling Hue

NOTE

This section explains the compilation for CDH 5.12.1. Refer to this section when compiling for another version.

Prerequisites You have downloaded and decompressed the Hue-cdh5.12.1 source package.

Procedure

Step 1 Go to the Hue-cdh5.12.1 source code directory.
cd hue-cdh5.12.1-release

Step 2 Modify the maven/pom.xml file in the root directory to add the Maven repository sources.
vim ./maven/pom.xml
The Huawei Kunpeng repository source must be placed first. There is no requirement on the sequence of other sources.
<repository>
    <id>mirrors.huaweicloud.com</id>
    <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
    <name>mirrors huaweicloud com</name>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
</repository>
<repository>
    <id>repository.huaweicloud.com</id>
    <url>https://mirrors.huaweicloud.com/repository/maven</url>
    <name>repository huaweicloud com</name>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
</repository>

Step 3 Perform compilation.
make apps
The operation is successful if the following information is displayed:

The compiled files are stored in the build directory. The executable binary files are stored in the build/env/bin directory.

Step 4 Use the Kunpeng Porting Advisor to scan the build directory generated after compilation and ensure that there are no x86 .so or .jar packages in the directory.

NOTE

● The compiled directory must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.
● Run the following command to start Hue:
build/env/bin/hue runserver

----End


17 Kafka-cdh5-0.10.2_2.2.0 Porting Guide (CentOS 7.6)

17.1 Introduction
17.2 Environment Requirements
17.3 Configuring the Compilation Environment
17.4 Compiling Kafka

17.1 Introduction

Kafka is an open-source streaming platform developed by the Apache Software Foundation and written in Scala and Java. It is a high-throughput distributed publish-subscribe messaging system that can process all the action stream data of a customer-facing website. Such actions (web browsing, search, and other user actions) are key factors in many social functions on the modern web. Because of throughput requirements, this data is usually handled through log processing and log aggregation. Kafka is a feasible solution for systems such as Hadoop that need offline analysis of log data as well as real-time processing. Kafka aims to unify online and offline message processing through the parallel loading mechanism of Hadoop, and to provide real-time messages through clusters.

For more information about Kafka, visit https://kafka.apache.org/.

17.2 Environment Requirements

Hardware Requirements

Item             Remarks
Server           TaiShan server
CPU              Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition  No requirement for drive partitions
Network          Accessible to the Internet

Software Requirements

Item       Version
CentOS     7.6
OS Kernel  4.14.0
OpenJDK    1.8.0_252
Maven      3.1.0
Gradle     4.10

17.3 Configuring the Compilation Environment

17.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0


Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install GCC-related software.
yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Find the path of the gcc executable. Generally, the path is /usr/bin/gcc.
command -v gcc
2. Rename the original gcc file, for example, to gcc-impl.
mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new gcc file.
vi /usr/bin/gcc
Add the following content to the file and save the file:
#! /bin/sh
/usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the new gcc file.
chmod +x /usr/bin/gcc
5. Check whether GCC is available.
gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Find the path of the g++ executable. Generally, the path is /usr/bin/g++.
command -v g++
2. Rename the original g++ file, for example, to g++-impl.
mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new g++ file.
vi /usr/bin/g++
Add the following content to the file and save the file:
#! /bin/sh
/usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the new g++ file.
chmod +x /usr/bin/g++
5. Check whether G++ is available.
g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:


– openEuler: The installation is successful if information similar to the following is displayed:

----End
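The wrapper approach in Steps 6 and 7 exists because plain char is unsigned by default on AArch64 but signed on x86, so code that assumes signed char can misbehave unless -fsigned-char is passed. The following is a minimal sketch of how the wrapper could be generated by a script instead of being typed in an editor. The helper name make_signed_char_wrapper is hypothetical and is not part of GCC or this guide.

```shell
#!/bin/sh
# Sketch only: make_signed_char_wrapper is a hypothetical helper name.
# It writes a small script that forwards every argument to the real
# compiler and always adds -fsigned-char.
make_signed_char_wrapper() {
    real=$1      # renamed original compiler, for example /usr/bin/gcc-impl
    wrapper=$2   # path that builds will invoke, for example /usr/bin/gcc
    # Unquoted heredoc: $real is expanded now, \$@ stays literal in the wrapper.
    cat > "$wrapper" <<EOF
#!/bin/sh
exec "$real" -fsigned-char "\$@"
EOF
    chmod +x "$wrapper"
}

# Example (run as root, after renaming gcc to gcc-impl as in Step 6):
# make_signed_char_wrapper /usr/bin/gcc-impl /usr/bin/gcc
```

Quoting "$@" in the generated script keeps arguments that contain spaces intact, which matters for build systems that pass quoted paths or defines.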

Installing Dependencies

Use the Yum source to install dependencies.

yum install -y wget patch openssl-devel zlib-devel automake libtool make cmake libstdc++-static glibc-static git unzip

17.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
vim /etc/profile
Add the following to the end of the file:
export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version
The installation is successful if information similar to the following is displayed:

----End

17.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed).


wget https://archive.apache.org/dist/maven/maven-3/3.1.0/binaries/apache-maven-3.1.0-bin.tar.gz
tar -zxf apache-maven-3.1.0-bin.tar.gz
mkdir -p /opt/tools/installed
mv apache-maven-3.1.0 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following code at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.1.0
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -version

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file. Configuration file path: /opt/tools/installed/apache-maven-3.1.0/conf/settings.xml.

vim /opt/tools/installed/apache-maven-3.1.0/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Configure the remote repository in the mirrors tag. (Use your own Maven repository if you have one. Otherwise, configure it as follows.)

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

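<proxies>
  <proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
  </proxy>
</proxies>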
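As an alternative to editing the global conf/settings.xml by hand, the mirror entry can be written to a per-user settings file, which Maven merges with the global one. The sketch below assumes this per-user approach is acceptable in your environment; write_maven_mirror_settings is a hypothetical helper name, and the file it writes contains only the mirror entry, not the proxy section.

```shell
#!/bin/sh
# Sketch only: write_maven_mirror_settings is a hypothetical helper.
# It overwrites the target file with a minimal settings.xml that
# redirects requests for the "central" repository to the given mirror URL.
write_maven_mirror_settings() {
    target=$1   # for example "$HOME/.m2/settings.xml"
    url=$2      # mirror URL
    mkdir -p "$(dirname "$target")"
    cat > "$target" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<settings>
  <mirrors>
    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>$url</url>
      <mirrorOf>central</mirrorOf>
    </mirror>
  </mirrors>
</settings>
EOF
}

# Example:
# write_maven_mirror_settings "$HOME/.m2/settings.xml" \
#     "https://mirrors.huaweicloud.com/repository/maven/"
```

After writing the file, running mvn help:effective-settings is one way to confirm that Maven picks up the mirror.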


----End

17.3.4 Installing Gradle

Step 1 Download and decompress the Gradle binary distribution package to a directory (for example, /opt/tools/installed/).
wget https://downloads.gradle.org/distributions/gradle-4.10-bin.zip --no-check-certificate
unzip gradle-4.10-bin.zip
mv gradle-4.10 /opt/tools/installed/

Step 2 Modify the Gradle environment variables.
vim /etc/profile
Add the following code at the end of the /etc/profile file:
export GRADLE_HOME=/opt/tools/installed/gradle-4.10
export PATH=$GRADLE_HOME/bin:$PATH

Step 3 Press Esc and run :wq to save the configuration and exit.

Step 4 Make the environment variables take effect.
source /etc/profile

----End
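The /etc/profile edits in this chapter (JAVA_HOME, MAVEN_HOME, GRADLE_HOME) are made by hand in an editor. If the steps are scripted or repeated, the same export line can end up in the file several times. The following sketch shows one way to append a line only when it is not already present; append_once is a hypothetical helper name, not a standard command.

```shell
#!/bin/sh
# Sketch only: append_once is a hypothetical helper.
# It appends a line to a file unless an identical line already exists.
append_once() {
    file=$1
    line=$2
    # -F: fixed string, -x: match the whole line, -q: no output.
    # If the file does not exist yet, grep fails and the line is appended.
    grep -Fxq -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (run as root):
# append_once /etc/profile 'export GRADLE_HOME=/opt/tools/installed/gradle-4.10'
# append_once /etc/profile 'export PATH=$GRADLE_HOME/bin:$PATH'
```

The single quotes in the example keep $GRADLE_HOME and $PATH unexpanded, so the variables are resolved when /etc/profile is sourced rather than when the line is written.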

17.4 Compiling Kafka

NOTE

This section explains the compilation for CDH5. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Kafka source package.

Procedure

Step 1 Switch to the directory where the Kafka source code package is decompressed.
cd kafka-cdh5-0.10.2_2.2.0

Step 2 Modify the build.gradle file.
vim build.gradle
Change the repository configuration in lines 37 to 42 of the file to:
maven {
    url "https://mirrors.huaweicloud.com/kunpeng/maven"
}
maven {
    url "https://mirrors.huaweicloud.com/repository/maven"
}
maven {
    url "https://repository.cloudera.com/artifactory/libs-snapshot-local"
}
maven {
    url "https://repository.cloudera.com/artifactory/cloudera-repos"
}
maven {
    url "https://repo1.maven.org/maven2"
}
mavenCentral()

Step 3 Compile the source code.
gradle releaseTarGz -info

NOTE

You can use the -g (--gradle-user-home) option to specify the Gradle user home directory, which holds the dependency cache. For example, to use ~/.m2/repository, run the following command:
gradle -g ~/.m2/repository releaseTarGz -info

Obtain the kafka_2.11-0.10.2-kafka-2.2.0.tgz package generated in the core/build/distributions directory.

Step 4 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled kafka_2.11-0.10.2-kafka-2.2.0.tgz must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
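Before running the Kunpeng Porting Advisor, a rough manual pre-check of loose .so files is possible by reading the machine field of each ELF header. The sketch below is not a substitute for the Porting Advisor: it assumes little-endian ELF files, looks only at files named *.so* in an already extracted directory, and does not open .jar archives. The function names elf_machine and find_x86_libs are hypothetical.

```shell
#!/bin/sh
# Sketch only: elf_machine and find_x86_libs are hypothetical helpers.

# Print the target architecture recorded in an ELF header.
elf_machine() {
    f=$1
    # An ELF file starts with the bytes 7f 45 4c 46 ("\177ELF").
    magic=$(od -An -tx1 -N4 "$f" 2>/dev/null | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 0
    fi
    # e_machine is a 16-bit field at byte offset 18 (little-endian assumed).
    lo=$(od -An -tu1 -j18 -N1 "$f" 2>/dev/null | tr -d ' \n')
    hi=$(od -An -tu1 -j19 -N1 "$f" 2>/dev/null | tr -d ' \n')
    case $(( ${lo:-0} + 256 * ${hi:-0} )) in
        3)   echo "i386" ;;
        62)  echo "x86-64" ;;
        183) echo "aarch64" ;;
        *)   echo "other" ;;
    esac
}

# List shared libraries under a directory that were built for x86.
find_x86_libs() {
    find "$1" -type f -name '*.so*' | while IFS= read -r lib; do
        case $(elf_machine "$lib") in
            i386|x86-64) printf '%s\n' "$lib" ;;
        esac
    done
}

# Example: extract the package first, then scan the extracted directory.
# find_x86_libs ./kafka_2.11-0.10.2-kafka-2.2.0
```

An empty result from find_x86_libs only means that no loose x86 shared libraries were found. Native libraries packed inside .jar files still need the Porting Advisor scan.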


18 Kite-1.0.0-cdh5.12.1 Porting Guide (CentOS 7.6)

18.1 Introduction
18.2 Environment Requirements
18.3 Configuring the Compilation Environment
18.4 Compiling Kite

18.1 Introduction

Kite is an open-source set of libraries for building data-oriented systems and applications. With the Kite dataset API, you can perform tasks such as reading a dataset, defining and reading a dataset view, and processing a dataset using MapReduce.

18.2 Environment Requirements

Hardware Requirements

Item             Remarks
Server           TaiShan server
CPU              Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition  No requirement for drive partitions
Network          Accessible to the Internet


Software Requirements

Item       Version
CentOS     7.6
OS Kernel  4.14.0
GCC        4.8.5
OpenJDK    1.7.0_191
Maven      3.5.4
Protobuf   2.5.0

18.3 Configuring the Compilation Environment

18.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the Yum repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to confirm the deletion.

Step 3 Create the /etc/yum.repos.d/Local.repo file to configure the local Yum repository.
vim /etc/yum.repos.d/Local.repo
[Local]
name=CentOS-7.6 Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Run the following commands to make the configuration take effect:
yum clean all
yum makecache

Step 5 Install the software related to GCC using the Yum source.

yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Find the path of the gcc executable. Generally, the path is /usr/bin/gcc.
command -v gcc
2. Rename the original gcc file, for example, to gcc-impl.
mv /usr/bin/gcc /usr/bin/gcc-impl
3. Run the following command, enter the following content, and save the file:
vim /usr/bin/gcc
#! /bin/sh
/usr/bin/gcc-impl -fsigned-char "$@"
4. Run the following command to add the execute permission to the script:
chmod +x /usr/bin/gcc
5. Run the following command to ensure that gcc is available:
gcc --version

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Find the path of the g++ executable. Generally, the path is /usr/bin/g++.
command -v g++
2. Rename the original g++ file, for example, to g++-impl.
mv /usr/bin/g++ /usr/bin/g++-impl
3. Run the following command, enter the following content, and save the file:
vim /usr/bin/g++
#! /bin/sh
/usr/bin/g++-impl -fsigned-char "$@"
4. Run the following command to add the execute permission to the script:
chmod +x /usr/bin/g++
5. Run the following command to ensure that g++ is available:
g++ --version

----End
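The local Yum repository file from Step 3 can also be generated by a script instead of an editor. The following is a minimal sketch under the assumption that the ISO is already mounted; write_local_repo is a hypothetical helper name, and the repository name matches the simpler [Local] example used elsewhere in this guide.

```shell
#!/bin/sh
# Sketch only: write_local_repo is a hypothetical helper.
# It writes a Local.repo file that points Yum at a mounted OS image.
write_local_repo() {
    repo_dir=$1      # normally /etc/yum.repos.d
    mount_point=$2   # where the ISO is mounted, for example /media
    cat > "$repo_dir/Local.repo" <<EOF
[Local]
name=Local
baseurl=file://$mount_point/
enabled=1
gpgcheck=0
EOF
}

# Example (run as root, after backing up and clearing /etc/yum.repos.d):
# write_local_repo /etc/yum.repos.d /media
# yum clean all
# yum makecache
```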


Installing Dependencies

Run the following command to install the dependencies:

yum install -y wget vim openssl-devel zlib-devel automake libtool make libstdc++-static glibc-static git snappy snappy-devel fuse fuse-devel

18.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
vim /etc/profile

Add the following content to the end of the file. Replace the JAVA_HOME directory name with the one actually installed under /usr/lib/jvm/.
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if information similar to the following is displayed:

----End

18.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mkdir -p /opt/tools/installed
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v

The installation is successful if information similar to the following is displayed:


Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. (Use your own Maven repository if you have built one. Otherwise, configure it based on the following example.)

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

<proxies>
  <proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
  </proxy>
</proxies>

----End

18.3.4 Installing Protobuf

CentOS

Step 1 Install Protobuf.
yum install -y protobuf protobuf-devel

Step 2 Check whether Protobuf is installed successfully.
protoc --version

The installation is successful if information similar to the following is displayed:


Step 3 Deploy protoc in the local Maven repository.
mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End

openEuler

Step 1 Download and decompress the source code.
wget https://github.com/protocolbuffers/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
tar -zxf protobuf-2.5.0.tar.gz

Step 2 Move the decompressed directory to the /opt/tools/installed/ directory.
mv protobuf-2.5.0 /opt/tools/installed/

Step 3 Go to the /opt/tools/installed/ directory.
cd /opt/tools/installed

Step 4 Download the protoc.zip package to a directory of your choice (for example, /opt/tools/installed/), decompress it to obtain the protoc.patch file, and copy the patch to the Protobuf source tree.
wget https://mirrors.huaweicloud.com/kunpeng/archive/kunpeng_solution/bigdata/Patch/protoc.zip
unzip protoc.zip
cp ./protoc/protoc.patch ./protobuf-2.5.0/src/google/protobuf/stubs/

Step 5 Go to the protobuf-2.5.0/src/google/protobuf/stubs/ directory and apply the patch.
cd protobuf-2.5.0/src/google/protobuf/stubs/
patch -p1 < protoc.patch

Step 6 Go back to the root directory of protobuf-2.5.0, compile Protobuf, and install it in the default directory.
cd /opt/tools/installed/protobuf-2.5.0
./autogen.sh && ./configure CFLAGS='-fsigned-char' && make -j8 && make install

Step 7 Deploy protoc in the local Maven repository.
mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/local/bin/protoc

----End
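The mvn install:install-file command copies the protoc binary into the local Maven repository under a path derived from its coordinates (group ID, artifact ID, version, classifier, and packaging). The sketch below computes that path so you can check that the file landed where the build will look for it. It assumes the standard Maven repository layout and the default local repository ~/.m2/repository. maven_local_path is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: maven_local_path is a hypothetical helper.
# It prints where an artifact with the given coordinates is stored
# in a local Maven repository that uses the standard layout.
maven_local_path() {
    group=$1
    artifact=$2
    version=$3
    classifier=$4
    packaging=$5
    repo=${6:-$HOME/.m2/repository}   # default local repository
    # Dots in the group ID become directory separators.
    group_path=$(printf '%s' "$group" | tr '.' '/')
    printf '%s/%s/%s/%s/%s-%s-%s.%s\n' \
        "$repo" "$group_path" "$artifact" "$version" \
        "$artifact" "$version" "$classifier" "$packaging"
}

# Example: check that the installed protoc is present.
# ls -l "$(maven_local_path com.google.protobuf protoc 2.5.0 linux-aarch_64 exe)"
```

If you changed the localRepository tag in settings.xml, pass that directory as the sixth argument.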

18.4 Compiling Kite

NOTE

This section explains the compilation for CDH 5.12.1. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the kite-1.0.0-cdh5.12.1 source package.


Procedure

Step 1 Go to the kite-1.0.0-cdh5.12.1 source code directory after the decompression.
cd kite-1.0.0-cdh5.12.1

Step 2 Configure the Kunpeng repository.
vim pom.xml
Add the Kunpeng Maven repository to the repositories tag. The Kunpeng Maven repository must be placed first. If a dependency cannot be found during compilation, you can add other Maven repositories.
Arm.repo Arm Repositories https://mirrors.huaweicloud.com/kunpeng/maven/ false
cloudera.repo cloudera Maven Repo http://repository.cloudera.com/artifactory/cloudera-repos
repo1.maven repo1 Maven Repo https://repo1.maven.org/maven2
wso2.repo wso2 Repositories http://maven.wso2.org/nexus/content/groups/wso2-public/
...


Comment out the maven-twttr repository in the pom.xml file.

Add the following configuration items to the pom.xml file in the root directory to configure the plug-in repository. Note that the pluginRepositories node is at the same level as the repositories node.

central-repo https://repo1.maven.org/maven2

Step 3 Go to the root directory of the source code and perform the compilation.
mvn package -DskipTests
The operation is successful if the following information is displayed:


The compiled kite-tools-cdh5-1.0.0-cdh5.12.1.tar.gz package is in the kite-tools-parent/kite-tools-cdh5/target directory.

Step 4 Use the Kunpeng Porting Advisor to scan the .tar package generated after the compilation and ensure that the .tar package does not contain the x86 .so or .jar packages.

NOTE

The compiled kite-tools-cdh5-1.0.0-cdh5.12.1.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End


19 Lucene-solr-cdh5.12.1 Porting Guide (CentOS 7.6)

19.1 Introduction
19.2 Environment Requirements
19.3 Configuring the Compilation Environment
19.4 Compiling Solr

19.1 Introduction

Solr (pronounced "solar") is an open-source enterprise-search platform from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document handling. Providing distributed search and index replication, Solr is designed for scalability.

19.2 Environment Requirements

Hardware Requirements

Item             Remarks
Server           TaiShan server
CPU              Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition  No requirement for drive partitions
Network          Accessible to the Internet


OS Requirements

Item       Version
CentOS     7.6
OS Kernel  4.14.0
JDK        1.7.0_191
GCC        4.8.5
Ant        1.8.4
Maven      3.5.4

19.3 Configuring the Compilation Environment

19.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install GCC-related software.
yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64


Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Find the path of the gcc executable. Generally, the path is /usr/bin/gcc.
command -v gcc
2. Rename the original gcc file, for example, to gcc-impl.
mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new gcc file.
vi /usr/bin/gcc
Add the following content to the file and save the file:
#! /bin/sh
/usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the new gcc file.
chmod +x /usr/bin/gcc
5. Check whether GCC is available.
gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Find the path of the g++ executable. Generally, the path is /usr/bin/g++.
command -v g++
2. Rename the original g++ file, for example, to g++-impl.
mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new g++ file.
vi /usr/bin/g++
Add the following content to the file and save the file:
#! /bin/sh
/usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the new g++ file.
chmod +x /usr/bin/g++
5. Check whether G++ is available.
g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:


– openEuler: The installation is successful if information similar to the following is displayed:

----End

Installing Dependencies

Install dependencies using the Yum source.

yum install -y wget vim patch openssl-devel zlib-devel automake libtool make cmake libstdc++-static glibc-static git

19.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
vim /etc/profile
Add the following content to the end of the file. Replace the JAVA_HOME directory name with the one actually installed under /usr/lib/jvm/.
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version
The installation is successful if information similar to the following is displayed:

----End

19.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile
Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH


Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v
The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file. Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. (Use your own Maven repository if you have built one. Otherwise, configure it based on the following example.)

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

<proxies>
  <proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
  </proxy>
</proxies>

----End

19.3.4 Installing Ant

Step 1 Download Ant 1.8.4 and install it to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.8.4-bin.tar.gz
tar -zxf apache-ant-1.8.4-bin.tar.gz
mv apache-ant-1.8.4 /opt/tools/installed/


Step 2 Modify environment variables.
vim /etc/profile
Add the Ant configuration at the end of the /etc/profile file.
export ANT_HOME=/opt/tools/installed/apache-ant-1.8.4
export PATH=$ANT_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether the configuration takes effect.
ant -version

----End

19.4 Compiling Solr

NOTE

This section explains the compilation for CDH 5.12.1. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Solr-cdh5.12.1 source package.

Procedure

Step 1 Go to the Solr source code directory after the decompression.
cd lucene-solr-cdh5.12.1-release

Step 2 Modify the lucene/ivy-settings.xml file.
vim lucene/ivy-settings.xml
1. Change all addresses http://repo1.maven.org/maven2 in the file to https://repo1.maven.org/maven2.
2. Comment out the Ivy default repository.
a. Comment out all include tags.
b. Comment out the default Ivy repository in the default chain.
c. Comment out the default Ivy repository in the cloudera chain.
3. Add the central repository configuration.
a. Add a central repository to the first line of the default chain.
b. Add a central repository to the first line of the cloudera chain.

Step 3 Modify the lucene/common-build.xml file.
vim lucene/common-build.xml
Change all addresses http://repo1.maven.org/maven2 in the file to https://repo1.maven.org/maven2.

Step 4 Modify the dev-tools/scripts/poll-mirrors.pl file.
vim dev-tools/scripts/poll-mirrors.pl
Change all addresses http://repo1.maven.org/maven2 in the file to https://repo1.maven.org/maven2.
$maven_url = "https://repo1.maven.org/maven2/org/apache/lucene/lucene-core/$version/lucene-core-$version.pom.asc";

Step 5 Modify the cloudera/templates/cdh.build.properties file.
vim cloudera/templates/cdh.build.properties
Modify the repository addresses in lines 4, 7, and 8 as follows:
4 repository.root=https://repository.cloudera.com
5
6 # These override the settings in ivysettings.xml
7 snapshots.cloudera.com=https://repository.cloudera.com/content/repositories/snapshots/
8 releases.cloudera.com=https://repository.cloudera.com/content/groups/cdh-releases-rcs/

Step 6 Compile Solr.
1. Download and install Ivy, and then compile the source code.
ant ivy-bootstrap
ant compile

NOTE

The first compilation may fail. If the following error information is displayed, modify the cdh.build.properties file.

Solution:
vim cdh.build.properties
Adjust the value of the corresponding name as follows:
repository.root=https://repo1.maven.org/maven2

# These override the settings in ivysettings.xml
snapshots.cloudera.com=https://repository.cloudera.com/cloudera/libs-snapshot-local/
releases.cloudera.com=https://repository.cloudera.com/content/repositories/releases

2. Replace the JAR package with the Kunpeng JAR package.
wget http://mirrors.huaweicloud.com/kunpeng/maven/org/apache/hadoop/hadoop-hdfs/2.6.0-cdh5.12.1/hadoop-hdfs-2.6.0-cdh5.12.1.jar -O solr/core/lib/hadoop-hdfs-2.6.0-cdh5.12.1.jar
3. Go to the solr directory.
cd solr


4. Compile Solr.
ant dist

NOTE

The compilation result is stored as solr/dist/solr-4.10.3-cdh5.12.1.war under the root directory of the source code.

The operation is successful if the following information is displayed:

Step 7 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled solr-4.10.3-cdh5.12.1.war must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
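Steps 2 to 4 of this procedure make the same manual change in three files: replacing http://repo1.maven.org/maven2 with https://repo1.maven.org/maven2. The following sketch shows how that replacement could be applied with sed instead of an editor. force_https_maven_central is a hypothetical helper name, and the sketch does not cover the other edits in Step 2 (commenting out repositories and adding the central repository).

```shell
#!/bin/sh
# Sketch only: force_https_maven_central is a hypothetical helper.
# It rewrites every http://repo1.maven.org/maven2 address in the given
# files to https, leaving addresses that already use https untouched.
force_https_maven_central() {
    for target_file in "$@"; do
        scratch="$target_file.tmp.$$"
        sed 's#http://repo1\.maven\.org/maven2#https://repo1.maven.org/maven2#g' \
            "$target_file" > "$scratch" || return 1
        # Write back through the original file so its permissions are kept.
        cat "$scratch" > "$target_file"
        rm -f "$scratch"
    done
}

# Example (run from the root directory of the Solr source code):
# force_https_maven_central lucene/ivy-settings.xml \
#     lucene/common-build.xml dev-tools/scripts/poll-mirrors.pl
```

Running grep -rn 'http://repo1.maven.org' on the same files afterwards is a quick way to confirm that no plain http address remains.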


20 Oozie-4.1.0-cdh5.12.1 Porting Guide (CentOS 7.6)

20.1 Introduction 20.2 Environment Requirements 20.3 Configuring the Compilation Environment 20.4 Porting Oozie

20.1 Introduction

Oozie is a Hadoop-based workflow engine (scheduler). Workflows are defined in XML and can schedule MapReduce, Pig, Hive, shell, JAR, and Spark jobs. This is useful when a series of operations must be performed on data: instead of writing processing code to chain the operations, you define actions and connect them into a workflow that runs automatically. This makes Oozie well suited to big data analysis.

For more information about Oozie, visit https://oozie.apache.org/.

20.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet


OS Requirements

Item Version

CentOS 7.6

OS Kernel 4.14.0-115

JDK 1.8.0_252

20.3 Configuring the Compilation Environment

20.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Modify the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying the GCC).
1. Search for the directory where GCC is located. Generally, the directory is /usr/bin/gcc.
    command -v gcc


2. Rename the original GCC file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
    vi /usr/bin/gcc
Add the following information to the file and save the file:
    #! /bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission for the GCC file.
    chmod +x /usr/bin/gcc
5. Check whether the GCC is available.
    gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Search for the directory where G++ is located. Generally, the directory is /usr/bin/g++.
    command -v g++
2. Change the original G++ file name, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
    vi /usr/bin/g++
Add the following information to the file and save the file:
    #! /bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission for the G++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:


----End
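The wrapper approach in Steps 6 and 7 can be tried out safely in a scratch directory before touching /usr/bin. The sketch below is illustrative only: make_signed_char_wrapper is a hypothetical helper name, and it uses exec (an addition not present in the steps above) so that the wrapper process is replaced by the real compiler. The key point is that "$@" forwards every caller argument unchanged, including arguments containing spaces.

```shell
#!/bin/sh
# Hypothetical helper: write a wrapper script at $2 that calls the real
# compiler at $1 with -fsigned-char placed before the caller's arguments.
make_signed_char_wrapper() {
    real_compiler=$1
    wrapper_path=$2
    {
        printf '%s\n' '#!/bin/sh'
        # "$@" in the generated script keeps each argument intact.
        printf 'exec "%s" -fsigned-char "$@"\n' "$real_compiler"
    } > "$wrapper_path"
    chmod +x "$wrapper_path"
}
```

Calling make_signed_char_wrapper /usr/bin/gcc-impl /tmp/gcc-test, for instance, produces a wrapper that can be inspected with cat and exercised with /tmp/gcc-test --version.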

Installing Dependencies

Use the Yum source to install dependencies.
    yum install -y boost.aarch64 boost-devel.aarch64 make cmake wget vim openssl-devel zlib-devel automake libtool libstdc++-static glibc-static git snappy snappy-devel

20.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile
Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version
The installation is successful if information similar to the following is displayed:

----End
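This guide edits /etc/profile by hand for each tool. If the procedure is scripted or repeated, the same export line can end up in the file more than once. The sketch below shows one way to avoid that; append_once is a hypothetical helper name, not a command provided by the OS, and it should be tried on a copy of the profile first.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so repeated runs do not duplicate exports.
append_once() {
    line=$1
    target=$2
    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! grep -Fxq -- "$line" "$target" 2>/dev/null; then
        printf '%s\n' "$line" >> "$target"
    fi
}
```

For example, append_once 'export JAVA_HOME=/opt/tools/installed/jdk8u252-b09' /etc/profile adds the line on the first call and does nothing on later calls.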

20.4 Porting Oozie

NOTE

This section explains the compilation for CDH 5.12.1. Refer to this section when compiling for another version.

Prerequisites Create an /opt/cdh/oozie directory and save the downloaded and decompressed Oozie official installation package to the directory.


Procedure

Step 1 Go to the Oozie directory after the decompression.
    cd oozie-4.1.0-cdh5.12.1

Step 2 Use the Kunpeng Porting Advisor to analyze the package. The following JAR packages need to be replaced with the Kunpeng JAR packages:

NOTE

The binary packages available at the Kunpeng mirror site are compiled based on the open source code, and do not involve vulnerability or bug fixes. When using open source software, comply with the applicable license agreements.

JAR Package Name                     SO File Name                          How to Obtain

chimera-0.9.2.jar                    libchimera.so                         Download address

hadoop-distcp-2.6.0-cdh5.12.1.jar    libnetty-transport-native-epoll.so    Download address

hadoop-hdfs-2.6.0-cdh5.12.1.jar      libnetty-transport-native-epoll.so    Download address

jansi-1.9.jar                        libjansi.so                           Download address

jline-2.10.5.jar                     libjansi.so                           Download address

jline-2.11.jar                       libjansi.so                           Download address

jython-standalone-2.5.3.jar          libjffi-1.0.so                        Download address

leveldbjni-all-1.8.jar               libleveldbjni.so                      Download address

lz4-1.3.0.jar                        liblz4-java.so                        Download address

netty-all-4.0.23.Final.jar           libnetty-transport-native-epoll.so    Download address

netty-all-4.0.29.Final.jar           libnetty-transport-native-epoll.so    Download address

snappy-java-1.0.4.1.jar              libsnappyjava.so                      Download address

Step 3 Create a directory and go to this directory.
    mkdir -p /opt/cdh/jars
    cd /opt/cdh/jars

Step 4 Download Kunpeng JAR packages.
    wget http://mirrors.huaweicloud.com/kunpeng/maven/com/intel/chimera/chimera/0.9.2/chimera-0.9.2.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/org/apache/hadoop/hadoop-distcp/2.6.0-cdh5.12.1/hadoop-distcp-2.6.0-cdh5.12.1.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/org/apache/hadoop/hadoop-hdfs/2.6.0-cdh5.12.1/hadoop-hdfs-2.6.0-cdh5.12.1.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/org/fusesource/jansi/jansi/1.9/jansi-1.9.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/org/scala-lang/jline/2.10.5/jline-2.10.5.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/jline/jline/2.11/jline-2.11.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/org/python/jython-standalone/2.5.3/jython-standalone-2.5.3.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/org/fusesource/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/net/jpountz/lz4/lz4/1.3.0/lz4-1.3.0.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/io/netty/netty-all/4.0.23.Final/netty-all-4.0.23.Final.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/io/netty/netty-all/4.0.29.Final/netty-all-4.0.29.Final.jar
    wget http://mirrors.huaweicloud.com/kunpeng/maven/org/xerial/snappy/snappy-java/1.0.4.1/snappy-java-1.0.4.1.jar
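The download URLs in Step 4 follow the standard Maven repository layout (group ID with dots replaced by slashes, then artifact ID, version, and file name). The sketch below builds such a URL from Maven coordinates; kunpeng_jar_url and KUNPENG_MAVEN_BASE are hypothetical names introduced for illustration, and the sketch assumes that every JAR on the mirror follows this layout, which holds for the twelve URLs listed above.

```shell
#!/bin/sh
# Base URL taken from the wget commands in Step 4.
KUNPENG_MAVEN_BASE="http://mirrors.huaweicloud.com/kunpeng/maven"

# Hypothetical helper: build a JAR URL from Maven coordinates, assuming
# the layout <group path>/<artifact>/<version>/<artifact>-<version>.jar.
# Usage: kunpeng_jar_url GROUP_ID ARTIFACT_ID VERSION
kunpeng_jar_url() {
    # Convert the dotted group ID (org.xerial.snappy) to a path.
    group_path=$(printf '%s' "$1" | tr '.' '/')
    printf '%s/%s/%s/%s/%s-%s.jar\n' \
        "$KUNPENG_MAVEN_BASE" "$group_path" "$2" "$3" "$2" "$3"
}
```

A download can then be written as wget "$(kunpeng_jar_url io.netty netty-all 4.0.29.Final)", which keeps the version number in one place.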

Step 5 Replace the JAR packages with the Kunpeng JAR packages.

NOTE

When you run the cp command to replace a file, the system asks you whether to overwrite the original file. In this case, you can enter yes or run the alias cp='cp' command to overwrite the original file without confirmation.

1. Replace the JAR packages.
    cd /opt/cdh/oozie
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/lib/jline-2.11.jar
    cp /opt/cdh/jars/hadoop-distcp-2.6.0-cdh5.12.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/libtools/hadoop-distcp-2.6.0-cdh5.12.1.jar
    cp /opt/cdh/jars/hadoop-hdfs-2.6.0-cdh5.12.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/libtools/hadoop-hdfs-2.6.0-cdh5.12.1.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/libtools/jline-2.11.jar
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/libtools/leveldbjni-all-1.8.jar
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/libtools/netty-all-4.0.23.Final.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/libtools/snappy-java-1.0.4.1.jar
2. Create the temporary directory /opt/cdh/oozie/tmp/ and decompress the oozie-hadooplibs-4.1.0-cdh5.12.1.tar.gz package to this directory.
    mkdir -p /opt/cdh/oozie/tmp
    tar -zxf /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/oozie-hadooplibs-4.1.0-cdh5.12.1.tar.gz -C /opt/cdh/oozie/tmp/
3. Replace the JAR packages in the oozie-hadooplibs-4.1.0-cdh5.12.1.tar.gz package.
    cp /opt/cdh/jars/hadoop-hdfs-2.6.0-cdh5.12.1.jar /opt/cdh/oozie/tmp/oozie-4.1.0-cdh5.12.1/hadooplibs/hadooplib-2.6.0-cdh5.12.1.oozie-4.1.0-cdh5.12.1/hadoop-hdfs-2.6.0-cdh5.12.1.jar
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/tmp/oozie-4.1.0-cdh5.12.1/hadooplibs/hadooplib-2.6.0-cdh5.12.1.oozie-4.1.0-cdh5.12.1/leveldbjni-all-1.8.jar
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/tmp/oozie-4.1.0-cdh5.12.1/hadooplibs/hadooplib-2.6.0-cdh5.12.1.oozie-4.1.0-cdh5.12.1/netty-all-4.0.23.Final.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/oozie-4.1.0-cdh5.12.1/hadooplibs/hadooplib-2.6.0-cdh5.12.1.oozie-4.1.0-cdh5.12.1/snappy-java-1.0.4.1.jar
    cp /opt/cdh/jars/hadoop-hdfs-2.6.0-cdh5.12.1.jar /opt/cdh/oozie/tmp/oozie-4.1.0-cdh5.12.1/hadooplibs/hadooplib-2.6.0-mr1-cdh5.12.1.oozie-4.1.0-cdh5.12.1/hadoop-hdfs-2.6.0-cdh5.12.1.jar
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/tmp/oozie-4.1.0-cdh5.12.1/hadooplibs/hadooplib-2.6.0-mr1-cdh5.12.1.oozie-4.1.0-cdh5.12.1/leveldbjni-all-1.8.jar
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/tmp/oozie-4.1.0-cdh5.12.1/hadooplibs/hadooplib-2.6.0-mr1-cdh5.12.1.oozie-4.1.0-cdh5.12.1/netty-all-4.0.23.Final.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/oozie-4.1.0-cdh5.12.1/hadooplibs/hadooplib-2.6.0-mr1-cdh5.12.1.oozie-4.1.0-cdh5.12.1/snappy-java-1.0.4.1.jar
    cd /opt/cdh/oozie/tmp
    tar -zcf oozie-hadooplibs-4.1.0-cdh5.12.1.tar.gz oozie-4.1.0-cdh5.12.1/
    cd /opt/cdh/oozie
    cp /opt/cdh/oozie/tmp/oozie-hadooplibs-4.1.0-cdh5.12.1.tar.gz /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/
4. Delete the tmp directory.
    rm -rf /opt/cdh/oozie/tmp
5. Create the temporary directory /opt/cdh/oozie/tmp/ and decompress the oozie-sharelib-4.1.0-cdh5.12.1.tar.gz package to this directory.


    mkdir -p /opt/cdh/oozie/tmp
    tar -zxf /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/oozie-sharelib-4.1.0-cdh5.12.1.tar.gz -C /opt/cdh/oozie/tmp/
6. Replace the JAR packages in the oozie-sharelib-4.1.0-cdh5.12.1.tar.gz package.
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/hcatalog/jline-2.11.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/share/lib/hcatalog/snappy-java-1.0.4.1.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/hive2/jline-2.11.jar
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/tmp/share/lib/hive2/leveldbjni-all-1.8.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/share/lib/hive2/snappy-java-1.0.4.1.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/hive/jline-2.11.jar
    cp /opt/cdh/jars/jansi-1.9.jar /opt/cdh/oozie/tmp/share/lib/pig/jansi-1.9.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/pig/jline-2.11.jar
    cp /opt/cdh/jars/jython-standalone-2.5.3.jar /opt/cdh/oozie/tmp/share/lib/pig/jython-standalone-2.5.3.jar
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/tmp/share/lib/pig/netty-all-4.0.23.Final.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/share/lib/pig/snappy-java-1.0.4.1.jar
    cp /opt/cdh/jars/chimera-0.9.2.jar /opt/cdh/oozie/tmp/share/lib/spark/chimera-0.9.2.jar
    cp /opt/cdh/jars/jansi-1.9.jar /opt/cdh/oozie/tmp/share/lib/spark/jansi-1.9.jar
    cp /opt/cdh/jars/jline-2.10.5.jar /opt/cdh/oozie/tmp/share/lib/spark/jline-2.10.5.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/spark/jline-2.11.jar
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/tmp/share/lib/spark/leveldbjni-all-1.8.jar
    cp /opt/cdh/jars/lz4-1.3.0.jar /opt/cdh/oozie/tmp/share/lib/spark/lz4-1.3.0.jar
    cp /opt/cdh/jars/netty-all-4.0.29.Final.jar /opt/cdh/oozie/tmp/share/lib/spark/netty-all-4.0.29.Final.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/sqoop/jline-2.11.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/share/lib/sqoop/snappy-java-1.0.4.1.jar
    cd /opt/cdh/oozie/tmp
    tar -zcf oozie-sharelib-4.1.0-cdh5.12.1.tar.gz share/
    cd /opt/cdh/oozie
    cp /opt/cdh/oozie/tmp/oozie-sharelib-4.1.0-cdh5.12.1.tar.gz /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/
7. Delete the tmp directory.
    rm -rf /opt/cdh/oozie/tmp
8. Create the temporary directory /opt/cdh/oozie/tmp/ and decompress the oozie-sharelib-4.1.0-cdh5.12.1-yarn.tar.gz package to this directory.
    mkdir -p /opt/cdh/oozie/tmp
    tar -zxf /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/oozie-sharelib-4.1.0-cdh5.12.1-yarn.tar.gz -C /opt/cdh/oozie/tmp/
9. Replace the JAR packages in the oozie-sharelib-4.1.0-cdh5.12.1-yarn.tar.gz package.
    cp /opt/cdh/jars/hadoop-distcp-2.6.0-cdh5.12.1.jar /opt/cdh/oozie/tmp/share/lib/distcp/hadoop-distcp-2.6.0-cdh5.12.1.jar
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/tmp/share/lib/distcp/netty-all-4.0.23.Final.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/hcatalog/jline-2.11.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/share/lib/hcatalog/snappy-java-1.0.4.1.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/hive2/jline-2.11.jar
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/tmp/share/lib/hive2/leveldbjni-all-1.8.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/share/lib/hive2/snappy-java-1.0.4.1.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/hive/jline-2.11.jar
    cp /opt/cdh/jars/jansi-1.9.jar /opt/cdh/oozie/tmp/share/lib/pig/jansi-1.9.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/pig/jline-2.11.jar
    cp /opt/cdh/jars/jython-standalone-2.5.3.jar /opt/cdh/oozie/tmp/share/lib/pig/jython-standalone-2.5.3.jar
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/tmp/share/lib/pig/netty-all-4.0.23.Final.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/share/lib/pig/snappy-java-1.0.4.1.jar
    cp /opt/cdh/jars/chimera-0.9.2.jar /opt/cdh/oozie/tmp/share/lib/spark/chimera-0.9.2.jar
    cp /opt/cdh/jars/jansi-1.9.jar /opt/cdh/oozie/tmp/share/lib/spark/jansi-1.9.jar
    cp /opt/cdh/jars/jline-2.10.5.jar /opt/cdh/oozie/tmp/share/lib/spark/jline-2.10.5.jar


    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/spark/jline-2.11.jar
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/tmp/share/lib/spark/leveldbjni-all-1.8.jar
    cp /opt/cdh/jars/lz4-1.3.0.jar /opt/cdh/oozie/tmp/share/lib/spark/lz4-1.3.0.jar
    cp /opt/cdh/jars/netty-all-4.0.29.Final.jar /opt/cdh/oozie/tmp/share/lib/spark/netty-all-4.0.29.Final.jar
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/share/lib/sqoop/jline-2.11.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/share/lib/sqoop/snappy-java-1.0.4.1.jar
10. Compress the oozie-sharelib-4.1.0-cdh5.12.1-yarn.tar.gz file again.
    cd /opt/cdh/oozie/tmp
    tar -zcf oozie-sharelib-4.1.0-cdh5.12.1-yarn.tar.gz share/
11. Copy the .tar package to the /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/ directory and delete the tmp directory.
    cd /opt/cdh/oozie
    cp /opt/cdh/oozie/tmp/oozie-sharelib-4.1.0-cdh5.12.1-yarn.tar.gz /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/
    rm -rf /opt/cdh/oozie/tmp
12. Create the temporary directory /opt/cdh/oozie/tmp/ and decompress the oozie.war package to this directory.
    mkdir -p /opt/cdh/oozie/tmp
    unzip /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/oozie.war -d /opt/cdh/oozie/tmp/
13. Replace the JAR packages in the oozie.war package.
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/tmp/WEB-INF/lib/jline-2.11.jar
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/tmp/WEB-INF/lib/leveldbjni-all-1.8.jar
    cp /opt/cdh/jars/netty-all-4.0.29.Final.jar /opt/cdh/oozie/tmp/WEB-INF/lib/netty-all-4.0.29.Final.jar
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/tmp/WEB-INF/lib/snappy-java-1.0.4.1.jar
14. Repack the oozie.war package.
    cd /opt/cdh/oozie/tmp
    jar -cvfM0 oozie.war ./
15. Copy the .war package to the /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/ directory and delete the tmp directory.
    cp /opt/cdh/oozie/tmp/oozie.war /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/
    rm -rf /opt/cdh/oozie/tmp
16. (Optional) Replace the JAR packages in the src directory.
    cp /opt/cdh/jars/hadoop-hdfs-2.6.0-cdh5.12.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/hadooplibs/hadoop-cdh5mr1/target/hadooplibs/hadooplib-2.6.0-mr1-cdh5.12.1.oozie-4.1.0-cdh5.12.1
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/hadooplibs/hadoop-cdh5mr1/target/hadooplibs/hadooplib-2.6.0-mr1-cdh5.12.1.oozie-4.1.0-cdh5.12.1
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/hadooplibs/hadoop-cdh5mr1/target/hadooplibs/hadooplib-2.6.0-mr1-cdh5.12.1.oozie-4.1.0-cdh5.12.1
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/hadooplibs/hadoop-cdh5mr1/target/hadooplibs/hadooplib-2.6.0-mr1-cdh5.12.1.oozie-4.1.0-cdh5.12.1
    cp /opt/cdh/jars/hadoop-hdfs-2.6.0-cdh5.12.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/hadooplibs/hadoop-cdh5mr2/target/hadooplibs/hadooplib-2.6.0-cdh5.12.1.oozie-4.1.0-cdh5.12.1
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/hadooplibs/hadoop-cdh5mr2/target/hadooplibs/hadooplib-2.6.0-cdh5.12.1.oozie-4.1.0-cdh5.12.1
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/hadooplibs/hadoop-cdh5mr2/target/hadooplibs/hadooplib-2.6.0-cdh5.12.1.oozie-4.1.0-cdh5.12.1
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/hadooplibs/hadoop-cdh5mr2/target/hadooplibs/hadooplib-2.6.0-cdh5.12.1.oozie-4.1.0-cdh5.12.1
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/hbaselibs/hbase-cdh5/target/hbaselibs/hbaselib-1.2.0-cdh5.12.1.oozie-4.1.0-cdh5.12.1
    cp /opt/cdh/jars/hadoop-distcp-2.6.0-cdh5.12.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/distcp/target/partial-sharelib/share/lib/distcp
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/distcp/target/partial-sharelib/share/lib/distcp
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/hcatalog/target/partial-sharelib/share/lib/hcatalog
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/hcatalog/target/partial-sharelib/share/lib/hcatalog
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/hive2/target/partial-sharelib/share/lib/hive2


    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/hive2/target/partial-sharelib/share/lib/hive2
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/hive2/target/partial-sharelib/share/lib/hive2
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/hive2/target/partial-sharelib/share/lib/hive
    cp /opt/cdh/jars/jansi-1.9.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/pig/target/partial-sharelib/share/lib/pig
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/pig/target/partial-sharelib/share/lib/pig
    cp /opt/cdh/jars/jython-standalone-2.5.3.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/pig/target/partial-sharelib/share/lib/pig
    cp /opt/cdh/jars/netty-all-4.0.23.Final.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/pig/target/partial-sharelib/share/lib/pig
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/pig/target/partial-sharelib/share/lib/pig
    cp /opt/cdh/jars/chimera-0.9.2.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/spark/target/partial-sharelib/share/lib/spark
    cp /opt/cdh/jars/jansi-1.9.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/spark/target/partial-sharelib/share/lib/spark
    cp /opt/cdh/jars/jline-2.10.5.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/spark/target/partial-sharelib/share/lib/spark
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/spark/target/partial-sharelib/share/lib/spark
    cp /opt/cdh/jars/leveldbjni-all-1.8.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/spark/target/partial-sharelib/share/lib/spark
    cp /opt/cdh/jars/lz4-1.3.0.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/spark/target/partial-sharelib/share/lib/spark
    cp /opt/cdh/jars/netty-all-4.0.29.Final.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/spark/target/partial-sharelib/share/lib/spark
    cp /opt/cdh/jars/jline-2.11.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/sqoop/target/partial-sharelib/share/lib/sqoop
    cp /opt/cdh/jars/snappy-java-1.0.4.1.jar /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1/src/sharelib/sqoop/target/partial-sharelib/share/lib/sqoop

NOTE

If this step is skipped, ignore the scan results for the src directory when scanning the repackaged installation package.

Step 6 Repackage the installation package.
    cd /opt/cdh/oozie
    mv oozie-4.1.0-cdh5.12.1.tar.gz oozie-4.1.0-cdh5.12.1-x86.tar.gz
    tar -zcf oozie-4.1.0-cdh5.12.1.tar.gz oozie-4.1.0-cdh5.12.1
The processed installation package is stored at /opt/cdh/oozie/oozie-4.1.0-cdh5.12.1.tar.gz.

Step 7 Use the Kunpeng Porting Advisor to scan the repackaged installation package and ensure that it does not contain x86 .so or .jar packages.

NOTE

The repackaged oozie-4.1.0-cdh5.12.1.tar.gz must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
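The cp commands in Step 5 all follow one pattern: a JAR from /opt/cdh/jars overwrites a file with the same name somewhere in the unpacked tree. As a sketch only, that pattern could be expressed as a loop. The function name replace_matching_jars is hypothetical, and unlike the explicit list above it overwrites every file with a matching name under the target directory, so its printed output should be compared with the list in Step 5. It does not look inside nested archives (.tar.gz, .war), which still need to be unpacked and repacked as described, and the JAR directory is assumed to lie outside the target directory.

```shell
#!/bin/sh
# Hypothetical helper: for every JAR in $1, overwrite each file with the
# same name found anywhere under $2, and print the paths it replaced.
replace_matching_jars() {
    jar_dir=$1
    target_root=$2
    for jar in "$jar_dir"/*.jar; do
        # Skip the unexpanded pattern when the directory holds no JARs.
        [ -f "$jar" ] || continue
        name=$(basename "$jar")
        find "$target_root" -type f -name "$name" | while IFS= read -r dest; do
            # Overwrite the existing JAR in place.
            cp "$jar" "$dest"
            printf '%s\n' "$dest"
        done
    done
}
```

For example, replace_matching_jars /opt/cdh/jars /opt/cdh/oozie/tmp would cover the replacements for one unpacked archive at a time.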


21 Pig-0.12.0-cdh5.12.1 Porting Guide (CentOS 7.6)

21.1 Introduction 21.2 Environment Requirements 21.3 Configuring the Compilation Environment 21.4 Compiling Pig 21.5 Troubleshooting

21.1 Introduction

Pig is a free open-source project on the Apache platform. Pig provides a high-level abstraction for processing large data sets. Data processing often requires several MapReduce jobs, which makes it difficult to fit the processing logic to the MapReduce pattern. With Pig, you can use richer data structures. Pig Latin is a relatively simple language in which each statement is an operation on a relation. A relation is similar to a table in a relational database, where a tuple represents a row and each tuple consists of fields. Pig has a large number of data types. It supports not only simple data types, such as int, long, float, double, chararray, and bytearray, but also complex types such as bag, tuple, and map. Pig also has a complete set of comparison operators, including rich pattern matching using regular expressions. For more information about Pig, visit https://pig.apache.org/.


21.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

CentOS 7.6

Kernel 4.14.0

Ant 1.7.1

Maven 3.5.4

JDK 1.7.0

forrest 0.9

21.3 Configuring the Compilation Environment

21.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual iso package name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*


NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Modify the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying the GCC).
1. Search for the directory where GCC is located. Generally, the directory is /usr/bin/gcc.
    command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
    vi /usr/bin/gcc
Add the following information to the file and save the file:
    #! /bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission for the GCC file.
    chmod +x /usr/bin/gcc
5. Check whether the GCC is available.
    gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Search for the directory where G++ is located. Generally, the directory is /usr/bin/g++.
    command -v g++
2. Change the original G++ file name, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
    vi /usr/bin/g++
Add the following information to the file and save the file:
    #! /bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission for the G++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

----End

Installing Dependencies

Install dependencies using the Yum source.

    yum install -y wget patch openssl-devel zlib-devel automake libtool make cmake libstdc++-static glibc-static git

21.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
    yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following content to the end of the file:
    export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether the OpenJDK is successfully installed.
    java -version

The installation is successful if information similar to the following is displayed:


----End

21.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mkdir -p /opt/tools/installed/
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v
The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file. Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag; otherwise, leave it unchanged. Add the following content to the <mirrors> tag to configure the remote repository. If you have built your own Maven repository, use its address instead; otherwise, configure the repository based on the following example:

    <mirror>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven/</url>
        <mirrorOf>central</mirrorOf>
    </mirror>


If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

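    <proxies>
        <proxy>
            <id>optional</id>
            <active>true</active>
            <protocol>http</protocol>
            <username>Username</username>
            <password>Password</password>
            <host>Proxy server URL</host>
            <port>Proxy server port</port>
            <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
        </proxy>
    </proxies>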

----End

21.3.4 Installing Ant

Step 1 Download and install the package to the specified directory.
    wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.7.1-bin.tar.gz
    tar -zxf apache-ant-1.7.1-bin.tar.gz
    mv apache-ant-1.7.1 /opt/tools/installed/

Step 2 Modify environment variables.
    vim /etc/profile

Add the following code at the end of the /etc/profile file:
    export ANT_HOME=/opt/tools/installed/apache-ant-1.7.1
    export PATH=$ANT_HOME/bin:$PATH

Step 3 Run the following command for the environment variables to take effect:
    source /etc/profile

Step 4 Check whether the configuration takes effect.
    ant -version
The configuration takes effect if information similar to the following is displayed:
    Apache Ant version 1.7.1 compiled on June 27 2008

----End

21.3.5 Installing Forrest

Step 1 Download and install Forrest to a specified directory, for example, /opt/tools/installed.
    wget http://archive.apache.org/dist/forrest/0.9/apache-forrest-0.9.tar.gz
    tar -zxf apache-forrest-0.9.tar.gz
    mv apache-forrest-0.9 /opt/tools/installed/

Step 2 Modify environment variables.
    vim /etc/profile

Add the following code at the end of the /etc/profile file:
    export FORREST_HOME=/opt/tools/installed/apache-forrest-0.9
    export PATH=$FORREST_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether the environment variables take effect.
    forrest -projecthelp


----End
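The /etc/profile edits in the Ant and Forrest procedures can also be scripted so that repeated runs do not add duplicate lines. The following is a minimal sketch and not part of the original procedure: append_once is a hypothetical helper name, and PROFILE defaults to a scratch file so that the demo does not touch /etc/profile.

```shell
#!/bin/sh
# Sketch: append export lines to a profile file only if they are not already present.
# PROFILE is an assumption for the demo; point it at /etc/profile on a real host.
PROFILE="${PROFILE:-$(mktemp)}"

append_once() {
    # $1 = exact line to add, $2 = target file
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

append_once 'export ANT_HOME=/opt/tools/installed/apache-ant-1.7.1' "$PROFILE"
append_once 'export PATH=$ANT_HOME/bin:$PATH' "$PROFILE"
# Calling the helper again with the same line leaves the file unchanged.
append_once 'export ANT_HOME=/opt/tools/installed/apache-ant-1.7.1' "$PROFILE"

cat "$PROFILE"
```

After writing to the real /etc/profile, run source /etc/profile as described in the steps above.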

21.4 Compiling Pig

NOTE

This section uses CDH 5.12.1 as an example. The procedure for other versions is similar.

Prerequisites
You have downloaded and decompressed the Pig-cdh5.12.1 source package.

Procedure

Step 1 Go to the Pig-cdh5.12.1 source code directory.
cd pig-cdh5.12.1-release

Step 2 Open the build.xml file.
vim build.xml

Change the mvnrepo repository source from http://repo2.maven.org/maven2 to https://repo1.maven.org/maven2.

Step 3 Modify the contrib/piggybank/java/build.xml file.
vim contrib/piggybank/java/build.xml

Change all addresses http://repo2.maven.org/maven2 to https://repo1.maven.org/maven2.

Step 4 Modify the ivy/ivysettings.xml file as follows:
vim ivy/ivysettings.xml

Add the Maven repository source as follows:
1. Add the following information before line 32 in the file:
2. Add the following information before line 53 in the file:
3. Add the following information before line 76 in the file:

Step 5 Compile the source code.
export MAVEN_OPTS="-Xmx10240m -XX:MaxPermSize=768m"
ant tar -Dforrest.home=/opt/tools/installed/apache-forrest-0.9

NOTE

The first compilation may fail. If the following error information is displayed, modify the build.properties file.
[ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-staging-local/org/slf4j/slf4j-parent/1.5.11/slf4j-parent-1.5.11.jar
[ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-staging-local/org/slf4j/slf4j-log4j12/1.5.11/slf4j-log4j12-1.5.11-javadoc.jar
[ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-staging-local/org/slf4j/slf4j-api/1.5.11/slf4j-api-1.5.11-javadoc.jar
[ivy:resolve]
[ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS

BUILD FAILED

Solution:
vim build.properties

Adjust the value of the corresponding name as follows:
repository.root=https://repo1.maven.org/maven2

# These override the settings in ivysettings.xml
snapshots.cloudera.com=https://repository.cloudera.com/content/repositories/snapshots
releases.cloudera.com=https://repository.cloudera.com/content/groups/cdh-releases-rcs

The operation is successful if the following information is displayed:

After the compilation succeeds, the resulting pig-0.12.0-cdh5.12.1.tar.gz package is stored in the build directory.

Step 6 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled pig-0.12.0-cdh5.12.1.tar.gz must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
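The repository address changes in Step 2 and Step 3 can be applied with sed instead of editing each file by hand. The following is a minimal sketch under stated assumptions: switch_repo is a hypothetical helper name, and the demo runs on a scratch file whose content is a made-up stand-in, not the real build.xml.

```shell
#!/bin/sh
# Sketch: replace the old repository URL in every file passed as an argument.
OLD='http://repo2.maven.org/maven2'
NEW='https://repo1.maven.org/maven2'

switch_repo() {
    for f in "$@"; do
        cp "$f" "$f.bak"                      # keep a backup next to the file
        sed "s#$OLD#$NEW#g" "$f.bak" > "$f"   # rewrite every occurrence
    done
}

# Demo on a scratch file. The line below is a stand-in, not real build.xml content.
WORK="$(mktemp -d)"
printf '%s\n' '<property name="example" value="http://repo2.maven.org/maven2/x.jar"/>' > "$WORK/build.xml"
switch_repo "$WORK/build.xml"
cat "$WORK/build.xml"
```

In the Pig source tree, the equivalent call would be switch_repo build.xml contrib/piggybank/java/build.xml. Review the result against the .bak copies before compiling.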

21.5 Troubleshooting

21.5.1 Failed to Download ivy-2.2.0.jar Due to Connection Timeout

Symptom
The download of ivy-2.2.0.jar fails due to a connection timeout.

Handling Procedure
Manually download the ivy-2.2.0.jar package to the directory indicated in the error message. Run the following command to download the file:

wget https://repo1.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar

21.5.2 "GC overhead limit exceeded" Is Displayed During Compilation

Symptom
"GC overhead limit exceeded" is displayed during compilation.

Handling Procedure
Run the following command before compilation.

export MAVEN_OPTS="-Xmx10240m -XX:MaxPermSize=768m"


22 Spark-1.6.0-cdh5.12.1 Porting Guide (CentOS 7.6)

22.1 Introduction
22.2 Environment Requirements
22.3 Configuring the Compilation Environment
22.4 Compiling Spark
22.5 Troubleshooting

22.1 Introduction

Spark is a unified analytics engine for large-scale data processing. It is scalable, performs in-memory computing, and has become a unified platform for fast, lightweight big data processing. Spark can be used to build data storage and runtime systems for various applications, such as real-time stream processing, machine learning, and interactive queries.

For more information about Spark, see the documentation on the official Spark website.

22.2 Environment Requirements

Hardware Requirements

Item             Remarks
Server           TaiShan server
CPU              Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition  No requirement for drive partitions
Network          Accessible to the Internet

Software Requirements

Item     Version
OS       CentOS 7.6
Kernel   4.14.0
GCC      4.8.5
OpenJDK  1.7.0
Maven    3.5.4
R        3.1.1

22.3 Configuring the Compilation Environment

22.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo

Configure the local Yum source.
[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache


Step 5 Use the Yum source to install GCC-related software.
yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
   command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
   mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
   vi /usr/bin/gcc
   Add the following information to the file and save the file:
   #! /bin/sh
   /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission for the GCC file.
   chmod +x /usr/bin/gcc
5. Check whether GCC is available.
   gcc --version
   – CentOS: The installation is successful if information similar to the following is displayed:
   – openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the G++ executable. Generally, the path is /usr/bin/g++.
   command -v g++
2. Rename the original G++ file, for example, to g++-impl.
   mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
   vi /usr/bin/g++
   Add the following information to the file and save the file:
   #! /bin/sh
   /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission for the G++ file.
   chmod +x /usr/bin/g++
5. Check whether G++ is available.
   g++ --version
   – CentOS: The installation is successful if information similar to the following is displayed:
   – openEuler: The installation is successful if information similar to the following is displayed:

----End
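The wrapper created in Step 6 and Step 7 exists because GCC treats plain char as unsigned by default on AArch64 but as signed on x86; forcing -fsigned-char keeps code that assumes the x86 behavior working. The following sketch shows how the wrapper forwards its arguments. It runs in a scratch directory, and gcc-impl here is a stand-in script that only prints its arguments, not the real compiler.

```shell
#!/bin/sh
# Sketch: show that the wrapper prepends -fsigned-char and passes all other arguments unchanged.
WORK="$(mktemp -d)"

# Stand-in for the renamed compiler: prints each argument on its own line.
cat > "$WORK/gcc-impl" <<'EOF'
#!/bin/sh
printf '%s\n' "$@"
EOF

# Wrapper with the same shape as in the steps above, pointing at the stand-in.
cat > "$WORK/gcc" <<EOF
#!/bin/sh
"$WORK/gcc-impl" -fsigned-char "\$@"
EOF

chmod +x "$WORK/gcc-impl" "$WORK/gcc"

# "$@" keeps an argument that contains a space as a single argument.
OUT="$("$WORK/gcc" -c "my file.c")"
printf '%s\n' "$OUT"
```

The quoted "$@" is what keeps file names with spaces intact; an unquoted $* or $@ would split them into separate arguments.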

Installing Dependencies
Install dependencies using the Yum source.

yum install -y wget vim openssl-devel zlib-devel automake libtool make cmake libstdc++-static glibc-static git readline-devel libXt-devel

22.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
vim /etc/profile

Add the following to the end of the file:
export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if information similar to the following is displayed:

----End

22.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. (If you have built your own Maven repository, use it instead. Otherwise, configure the repository based on the following example.)

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

<proxies>
  <proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
  </proxy>
</proxies>


----End

22.3.4 Installing the R Language

Step 1 Download the R language source code package and decompress it.
wget http://cran.rstudio.com/src/base/R-3/R-3.1.1.tar.gz
tar -zxf R-3.1.1.tar.gz

Step 2 Go to the directory generated after the decompression.
cd R-3.1.1

Step 3 Compile and install the R language to the specified directory.
./configure --enable-R-shlib --enable-R-static-lib --with-libpng --with-jpeglib --prefix=/opt/tools/installed/R-3.1.1
make all -j8 && make install

Step 4 Configure the R language environment variables.
vim /etc/profile

Add the following code at the end of the /etc/profile file:
export R_HOME=/opt/tools/installed/R-3.1.1
export PATH=$R_HOME/bin:$PATH

Step 5 Make the environment variables take effect.
source /etc/profile

Step 6 Verify the R language.
R --version
R version 3.1.1 (2014-07-10) -- "Sock it to Me"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: aarch64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
http://www.gnu.org/licenses/.

----End

22.4 Compiling Spark

NOTE

This section uses CDH 5.12.1 as an example. The procedure for other versions is similar.

Prerequisites

You have downloaded and decompressed the Spark-cdh5.12.1 source package.

Procedure

Step 1 Go to the Spark-cdh5.12.1 source code directory.
cd spark-cdh5.12.1-release

Step 2 Modify the pom.xml file.
vim pom.xml
1. Add the Kunpeng Maven repository to the repositories tag. The Kunpeng Maven repository must be placed first.
   <repository>
     <id>kunpengmaven</id>
     <name>kunpeng maven</name>
     <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
   </repository>
2. Comment out the mqtt-repo Maven source.

Step 3 Perform compilation.
export MAVEN_OPTS="-Xmx10240m -XX:MaxPermSize=768m"
./make-distribution.sh --tgz -Pyarn,hive,sparkr -DskipTests

The operation is successful if the following information is displayed:

After the compilation is complete, the spark-1.6.0-cdh5.12.1-bin-2.6.0-cdh5.12.1.tgz package is generated in the root directory of the source code.

Step 4 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

Use the Kunpeng Porting Advisor to scan the compiled spark-1.6.0-cdh5.12.1-bin-2.6.0-cdh5.12.1.tgz package and ensure that it contains no x86 .so or .jar packages. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
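Before running the Kunpeng Porting Advisor, a quick manual pre-check of loose .so files is possible by reading the machine field of each ELF header. The following is a rough sketch only and does not replace the Porting Advisor scan: it does not look inside .jar archives, elf_machine and fake_elf are hypothetical helper names, and the demo files are hand-made 20-byte headers rather than real libraries.

```shell
#!/bin/sh
# Sketch: classify a file by the e_machine field of its ELF header.
elf_machine() {
    # The first four bytes of an ELF file are 7f 45 4c 46.
    magic="$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')"
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 0
    fi
    # e_machine is the 2-byte field at offset 18 (little-endian on x86 and AArch64 Linux).
    mach="$(od -An -tx1 -j18 -N2 "$1" 2>/dev/null | tr -d ' \n')"
    case "$mach" in
        3e00) echo "x86-64" ;;
        0300) echo "x86" ;;
        b700) echo "aarch64" ;;
        *)    echo "other:$mach" ;;
    esac
}

# Demo files: 18 header bytes followed by the 2-byte machine value.
WORK="$(mktemp -d)"
fake_elf() {
    # $1 = e_machine bytes as printf octal escapes, $2 = output file
    { printf '\177ELF\002\001\001\000\000\000\000\000\000\000\000\000\003\000'; printf "$1"; } > "$2"
}
fake_elf '\076\000' "$WORK/libx86.so"
fake_elf '\267\000' "$WORK/libarm.so"

for f in "$WORK"/*.so; do
    printf '%s %s\n' "$(elf_machine "$f")" "$f"
done
```

On a real build, the loop would run over the .so files found in the directory where the compiled package was extracted; any line starting with x86 or x86-64 points to a library that needs to be rebuilt for AArch64.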

22.5 Troubleshooting

22.5.1 "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed" Is Reported During the Spark Compilation

Symptom
During the Spark compilation, the error "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed" is reported.

Cause Analysis
R language support is enabled during Spark compilation, so the R language must be compiled and installed in the environment.

Handling Procedure
Compile and install the R language in the /opt/tools/installed directory, set R_HOME, and run the compilation command again.

export R_HOME=/opt/tools/installed/R-3.1.1


22.5.2 "error: cannot compile a simple Fortran program" Reported During the R Language Compilation

Symptom

During the R language compilation, "error: cannot compile a simple Fortran program" is displayed.

Cause Analysis

The gfortran package is not installed in the system.

Solution

Run the yum command to install the gfortran package from the OS image.

yum -y install gcc-gfortran.aarch64

22.5.3 "configure: error: --with-x=yes (default) and X11 headers/libs are not available" Reported During the R Language Compilation

Symptom

During the R language compilation, "configure: error: --with-x=yes (default) and X11 headers/libs are not available" is displayed.

Cause Analysis

--with-x=yes (use the X Window System) is enabled by default, and the X11 headers and libraries are missing. Therefore, you need to install the libXt-devel package.

Solution

Run the yum command to install the package from the OS image.

yum -y install libXt-devel.aarch64

22.5.4 "/usr/bin/install: cannot stat 'NEWS.pdf': No such file or directory" Reported During the R Language Compilation

Symptom
During the R language compilation, "/usr/bin/install: cannot stat 'NEWS.pdf': No such file or directory" is displayed.

Cause Analysis
The NEWS.pdf file is not found in the source code directory R-3.1.1. Therefore, the file cannot be copied when the make install command is executed.

Solution
The NEWS text file exists in the doc directory. Copy its contents to a NEWS.pdf file in the same directory so that make install can find the file.

cat doc/NEWS > doc/NEWS.pdf


23 Sqoop-1.4.6-cdh5.12.1 Porting Guide (CentOS 7.6)

23.1 Introduction
23.2 Environment Requirements
23.3 Configuring the Compilation Environment
23.4 Compiling Sqoop
23.5 Troubleshooting

23.1 Introduction

Sqoop is a tool designed for efficiently transferring bulk data between Hadoop and structured datastores such as relational databases.

23.2 Environment Requirements

Hardware Requirements

Item             Remarks
Server           TaiShan server
CPU              Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition  No requirement for drive partitions
Network          Accessible to the Internet

Software Requirements

Item    Version
OS      CentOS 7.6
Kernel  4.14.0
Ant     1.7.1
Maven   3.1.0
JDK     1.7.0
GCC     4.8.5

23.3 Configuring the Compilation Environment

23.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo

Configure the local Yum source.
[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install GCC-related software.
yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64


Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
   command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
   mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
   vi /usr/bin/gcc
   Add the following information to the file and save the file:
   #! /bin/sh
   /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission for the GCC file.
   chmod +x /usr/bin/gcc
5. Check whether GCC is available.
   gcc --version
   – CentOS: The installation is successful if information similar to the following is displayed:
   – openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the G++ executable. Generally, the path is /usr/bin/g++.
   command -v g++
2. Rename the original G++ file, for example, to g++-impl.
   mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
   vi /usr/bin/g++
   Add the following information to the file and save the file:
   #! /bin/sh
   /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission for the G++ file.
   chmod +x /usr/bin/g++
5. Check whether G++ is available.
   g++ --version
   – CentOS: The installation is successful if information similar to the following is displayed:
   – openEuler: The installation is successful if information similar to the following is displayed:

----End

Installing Dependencies
Use the Yum source to install dependencies.

yum install -y wget patch openssl-devel zlib-devel automake libtool make cmake libstdc++-static glibc-static git redhat-lsb asciidoc python python-devel xmlto

23.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
vim /etc/profile

Add the following content to the end of the file:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if information similar to the following is displayed:

----End

23.3.3 Installing Maven

Step 1 Download the installation package and install it to a directory (for example, /opt/tools/installed).
wget https://archive.apache.org/dist/maven/maven-3/3.1.0/binaries/apache-maven-3.1.0-bin.tar.gz
tar -zxf apache-maven-3.1.0-bin.tar.gz
mkdir -p /opt/tools/installed
mv apache-maven-3.1.0 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following code at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.1.0
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -version

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.1.0/conf/settings.xml.

vim /opt/tools/installed/apache-maven-3.1.0/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Configure the remote repository. (If you have built your own Maven repository, use it instead. Otherwise, configure the repository as follows.)

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

<proxies>
  <proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
  </proxy>
</proxies>

----End

23.3.4 Installing Ant

Step 1 Download and install the package to the specified directory.
wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.7.1-bin.tar.gz
tar -zxf apache-ant-1.7.1-bin.tar.gz
mv apache-ant-1.7.1 /opt/tools/installed/

Step 2 Modify environment variables.
vim /etc/profile

Add the following code at the end of the /etc/profile file:
export ANT_HOME=/opt/tools/installed/apache-ant-1.7.1
export PATH=$ANT_HOME/bin:$PATH

Step 3 Run the following command for the environment variables to take effect:
source /etc/profile

Step 4 Check whether the configuration takes effect.
ant -version
Apache Ant version 1.7.1 compiled on June 27 2008

----End

23.4 Compiling Sqoop

NOTE

This section uses CDH 5.12.1 as an example. The procedure for other versions is similar.

Prerequisites
You have downloaded and decompressed the sqoop-cdh5.12.1 source package.

Procedure

Step 1 Go to the directory generated after the decompression.
cd sqoop-cdh5.12.1-release

Step 2 Modify the build.xml file.
vim build.xml

Change all http://repo2.maven.org/maven2 to https://repo1.maven.org/maven2.
value="https://repo1.maven.org/maven2/org/apache/ivy/ivy/${ivy.version}/ivy-${ivy.version}.jar" />
...
value="https://repo1.maven.org/maven2/org/apache/maven/maven-ant-tasks/${mvn.version}/maven-ant-tasks-${mvn.version}.jar"/>

Step 3 Modify the ivy/ivysettings.xml file as follows:
vim ivy/ivysettings.xml

Change all http://repo1.maven.org/maven2 to https://repo1.maven.org/maven2.
https://repo1.maven.org/maven2/
... tag in the file:

2. Add the following information to the tag in the file:
3. Add the following information to the tag in the file:
4. Add the following information to the tag in the file:
5. Change https://repository.cloudera.com/artifactory/cdh-releases-rcs to the following:
   https://repository.cloudera.com/content/repositories/releases/
6. Change https://repository.cloudera.com/content/repositories/snapshots to the following:
   https://repository.cloudera.com/cloudera/libs-snapshot-local/

Step 5 Modify the cloudera/maven-packaging/pom.xml file. vim cloudera/maven-packaging/pom.xml

1. Change https://repository.cloudera.com/content/groups/cdh-releases-rcs to the following:

https://repository.cloudera.com/content/repositories/releases

2. Change https://repository.cloudera.com/content/repositories/snapshots to the following:

https://repository.cloudera.com/content/repositories/libs-snapshot-local

Step 6 Modify the cloudera-pom.xml file.

1. Change https://repository.cloudera.com/content/groups/cdh-releases-rcs to the following:

https://repository.cloudera.com/content/repositories/releases

2. Change https://repository.cloudera.com/content/repositories/snapshots to the following:

https://repository.cloudera.com/cloudera/libs-snapshot-local

Step 7 Perform compilation.
ant tar

After the compilation succeeds, the resulting sqoop-1.4.6-cdh5.12.1.tar.gz package is stored in the build directory.

NOTE

The first compilation may fail. If the following error information is displayed, modify the cdh.build.properties file.
[ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-snapshot-local/org/aspectj/aspectjrt/1.6.11/aspectjrt-1.6.11.jar
[ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-staging-local/org/aspectj/aspectjrt/1.6.11/aspectjrt-1.6.11-sources.jar
[ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-snapshot-local/org/aspectj/aspectjrt/1.6.11/aspectjrt-1.6.11-sources.jar
[ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-staging-local/org/aspectj/aspectjrt/1.6.11/aspectjrt-1.6.11-javadoc.jar
[ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-snapshot-local/org/aspectj/aspectjrt/1.6.11/aspectjrt-1.6.11-javadoc.jar

vim cdh.build.properties

Adjust the value of the corresponding name as follows:
repository.root=https://repo1.maven.org/maven2

# These override the settings in ivysettings.xml
snapshots.cloudera.com=https://repository.cloudera.com/cloudera/libs-snapshot-local/
releases.cloudera.com=https://repository.cloudera.com/content/repositories/releases

The operation is successful if the following information is displayed:

Step 8 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled sqoop-1.4.6-cdh5.12.1.tar.gz must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
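The cdh.build.properties adjustment described in the NOTE for the compilation step can be scripted. The following is a minimal sketch, not part of the original procedure: set_prop is a hypothetical helper name, and the demo runs on a scratch file whose old values are placeholders rather than the values shipped in the Sqoop source.

```shell
#!/bin/sh
# Sketch: set key=value in a properties file, replacing the key if present or appending it.
set_prop() {
    # $1 = key, $2 = new value, $3 = properties file
    tmp="$(mktemp)"
    awk -v k="$1" -v v="$2" '
        BEGIN { found = 0 }
        index($0, k "=") == 1 { print k "=" v; found = 1; next }  # replace the existing key
        { print }                                                   # keep comments and other keys
        END { if (!found) print k "=" v }                           # append when the key is absent
    ' "$3" > "$tmp" && cat "$tmp" > "$3"
    rm -f "$tmp"
}

# Demo file. The old values are placeholders for illustration only.
PROPS="$(mktemp)"
cat > "$PROPS" <<'EOF'
repository.root=http://placeholder.invalid/old
# These override the settings in ivysettings.xml
snapshots.cloudera.com=http://placeholder.invalid/old-snapshots
EOF

set_prop repository.root https://repo1.maven.org/maven2 "$PROPS"
set_prop snapshots.cloudera.com https://repository.cloudera.com/cloudera/libs-snapshot-local/ "$PROPS"
set_prop releases.cloudera.com https://repository.cloudera.com/content/repositories/releases "$PROPS"

cat "$PROPS"
```

In the Sqoop source tree, the third argument would be cdh.build.properties. The helper matches keys only at the start of a line, so commented-out entries are left untouched.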

23.5 Troubleshooting


23.5.1 Error Message "asciidoc: command not found" Is Displayed During Compilation

Symptom
The error message "asciidoc: command not found" is displayed during compilation.

real-docs:
[exec] /bin/sh: lsb_release: command not found
[exec] make: Entering directory `/home/src/sqoop-cdh5.12.1-release/src/docs'
[exec] asciidoc --unsafe -b docbook -d manpage -a "author=Sqoop Team" man/sqoop-codegen.txt
[exec] /bin/sh: asciidoc: command not found

Handling Procedure
Run the following command to install lsb and AsciiDoc:

yum install -y redhat-lsb.aarch64 asciidoc
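Missing build tools such as lsb_release and asciidoc can be detected before the compilation starts. The following is a minimal sketch; missing_tools is a hypothetical helper name, and the demo checks one command that exists on any POSIX system and one made-up name.

```shell
#!/bin/sh
# Sketch: print every command name from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Demo: sh is always present; the second name is deliberately made up.
missing_tools sh no-such-tool-for-demo
```

Before compiling Sqoop, the call would be missing_tools lsb_release asciidoc xmlto; an empty result means that all listed commands are available.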


24 ZooKeeper-3.4.5-cdh5.12.1 Porting Guide (CentOS 7.6)

24.1 Introduction
24.2 Environment Requirements
24.3 Configuring the Compilation Environment
24.4 Compiling ZooKeeper
24.5 Troubleshooting

24.1 Introduction

ZooKeeper is a distributed and open-source coordination service for distributed applications, an open-source implementation of Google Chubby, and an important component of Hadoop and HBase. ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. For more information about ZooKeeper, visit the official ZooKeeper website.

24.2 Environment Requirements

Hardware Requirements

Item             Remarks
Server           TaiShan server
CPU              Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition  No requirement for drive partitions
Network          Accessible to the Internet

Software Requirements

Item     Version
OS       CentOS 7.6
Kernel   4.14.0
GCC      4.8.5
OpenJDK  1.7.0
Maven    3.5.4
Ant      1.7.1

24.3 Configuring the Compilation Environment

24.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo

Configure the local Yum source.
[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install GCC-related software.
yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64


Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
vi /usr/bin/gcc
Add the following information to the file and save the file:
#! /bin/sh
/usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission for the GCC file.
chmod +x /usr/bin/gcc
5. Check whether GCC is available.
gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the G++ executable. Generally, the path is /usr/bin/g++.
command -v g++
2. Rename the original G++ file, for example, to g++-impl.
mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
vi /usr/bin/g++
Add the following information to the file and save the file:
#! /bin/sh
/usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission for the G++ file.
chmod +x /usr/bin/g++
5. Check whether G++ is available.
g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:


– openEuler: The installation is successful if information similar to the following is displayed:

----End
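The wrapper steps above exist because GCC on AArch64 treats plain char as unsigned by default, whereas on x86 it is signed; -fsigned-char restores the x86 behavior for code that relies on it. The following is a minimal sketch of the same renaming and wrapping as a reusable function. The function name wrap_compiler and its arguments are illustrative assumptions, not part of the original procedure, and the sketch uses exec in the wrapper, which the original script does not.

```shell
#!/bin/sh
# wrap_compiler BIN_DIR NAME
# Hypothetical helper (not from the original guide): renames BIN_DIR/NAME to
# BIN_DIR/NAME-impl and installs a wrapper that always adds -fsigned-char.
wrap_compiler() {
    bin_dir=$1
    name=$2
    real="$bin_dir/$name-impl"

    # Skip if the wrapper is already in place. Without this guard, a second
    # run would rename the wrapper itself and create a script that calls itself.
    if [ -e "$real" ]; then
        return 0
    fi

    mv "$bin_dir/$name" "$real" || return 1
    printf '#!/bin/sh\nexec "%s" -fsigned-char "$@"\n' "$real" > "$bin_dir/$name"
    chmod +x "$bin_dir/$name"
}

# Example (run as root on the build host):
#   wrap_compiler /usr/bin gcc
#   wrap_compiler /usr/bin g++
```

The guard on the -impl file makes the function safe to run more than once, which the manual steps are not.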

Installing Dependencies

Install dependencies using the Yum source.

yum install -y cppunit-devel libtool

24.3.2 Installing OpenJDK

Step 1 Install JDK 1.7.
yum install -y java-1.7.0-openjdk java-1.7.0-openjdk-devel

Step 2 Configure Java environment variables.
vim /etc/profile

Add the following content to the end of the file:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.aarch64
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if information similar to the following is displayed:

----End

24.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mkdir -p /opt/tools/installed/
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH


Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v
The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. Replace the repository with the Maven repository that you have built. If you do not have one, use the following example:

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to the <proxies> tag in settings.xml:

<proxy>
  <id>optional</id>
  <active>true</active>
  <protocol>http</protocol>
  <username>Username</username>
  <password>Password</password>
  <host>Proxy server URL</host>
  <port>Proxy server port</port>
  <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>

----End

24.3.4 Installing Ant

Step 1 Download and install the package to the specified directory.
wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.7.1-bin.tar.gz
tar -zxf apache-ant-1.7.1-bin.tar.gz
mv apache-ant-1.7.1 /opt/tools/installed/


Step 2 Modify environment variables.
vim /etc/profile
Add the following code at the end of the /etc/profile file:
export ANT_HOME=/opt/tools/installed/apache-ant-1.7.1
export PATH=$ANT_HOME/bin:$PATH

Step 3 Run the following command for the environment variables to take effect:
source /etc/profile

Step 4 Check whether the configuration takes effect.
ant -version
Apache Ant version 1.7.1 compiled on June 27 2008

----End
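The OpenJDK, Maven, and Ant sections each append an export pair to /etc/profile by hand. As a rough sketch, the same edit can be scripted so that repeating it does not add duplicate lines. The function name add_tool_to_profile and its arguments are assumptions made for illustration.

```shell
#!/bin/sh
# add_tool_to_profile PROFILE VAR DIR
# Hypothetical helper: appends "export VAR=DIR" and the matching PATH line
# to PROFILE unless VAR is already exported there.
add_tool_to_profile() {
    profile=$1
    var=$2
    dir=$3

    # Do nothing if the variable is already exported in the file.
    if [ -f "$profile" ] && grep -q "^export $var=" "$profile"; then
        return 0
    fi

    {
        printf 'export %s=%s\n' "$var" "$dir"
        # The single quotes keep $PATH literal, so it is expanded when the
        # profile is sourced, not when this line is written.
        printf 'export PATH=$%s/bin:$PATH\n' "$var"
    } >> "$profile"
}

# Example (run as root):
#   add_tool_to_profile /etc/profile ANT_HOME /opt/tools/installed/apache-ant-1.7.1
#   source /etc/profile
```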

24.4 Compiling ZooKeeper

NOTE

This section uses CDH 5.12.1 as an example. The compilation procedure for other versions is similar.

Prerequisites

You have downloaded and decompressed the ZooKeeper-cdh5.12.1 source package.

Procedure

Step 1 Go to the ZooKeeper-cdh5.12.1 source code directory.
cd zookeeper-cdh5.12.1-release

Step 2 Modify the build.xml file.
vim build.xml
Change all http://repo2.maven.org in the file to https://repo1.maven.org.
value="https://repo1.maven.org/maven2/org/apache/ivy/ivy" />

Step 3 Modify the ivysettings.xml file.
vim ivysettings.xml
1. Change all http://repo1.maven.org in the file to https://repo1.maven.org.
value="https://repo1.maven.org/maven2/" override="false"/>
2. Add the following information to the tag in the file:
3. Add the following information to the tag in the file:
4. Add the following information to the tag in the file:

Step 4 Modify the src/contrib/build-contrib.xml file.
vim src/contrib/build-contrib.xml


Change all http://repo2.maven.org in the file to https://repo1.maven.org.
value="https://repo1.maven.org/maven2/org/apache/ivy/ivy" />

Step 5 Modify the cloudera/maven-packaging/pom.xml file.
vim cloudera/maven-packaging/pom.xml
1. Add the following repository configuration to the first line under the <repositories> tag:
<repository>
  <id>kunpeng.repo</id>
  <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
  <name>Kunpeng Repository</name>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>
<repository>
  <id>central.repo</id>
  <url>https://repo1.maven.org/maven2</url>
  <name>Central Repository</name>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>
2. Change all https://repository.cloudera.com/content/groups/cdh-releases-rcs in the tags to the following:
https://repository.cloudera.com/content/repositories/releases/
3. Change all https://repository.cloudera.com/content/repositories/snapshots in the tags to the following:
https://repository.cloudera.com/cloudera/libs-snapshot-local/
4. Add the Maven plug-in repository to the end of .
<pluginRepositories>
  <pluginRepository>
    <id>huaweicloud-plugin</id>
    <url>http://mirrors.huaweicloud.com/repository/maven</url>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </pluginRepository>
</pluginRepositories>

Step 6 Modify the cloudera-pom.xml file.
1. Add the following repository configuration to the first line under the <repositories> tag:
<repository>
  <id>kunpeng.repo</id>
  <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
  <name>Kunpeng Repository</name>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>
2. Change all https://repository.cloudera.com/content/groups/cdh-releases-rcs in the file to the following:
https://repository.cloudera.com/content/repositories/releases/
3. Change all https://repository.cloudera.com/content/repositories/snapshots in the file to the following:
https://repository.cloudera.com/cloudera/libs-snapshot-local/

Step 7 Modify the src/c/src/mt_adaptor.c file.
vim src/c/src/mt_adaptor.c


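Modify fetch_and_add as follows. The original implementation uses the x86 instruction lock xaddl in inline assembly, which is not available on the Kunpeng (AArch64) architecture, so it is replaced with the GCC built-in __sync_fetch_and_add:
int32_t fetch_and_add(volatile int32_t* operand, int incr)
{
#ifndef WIN32
    //int32_t result;
    //asm __volatile__(
    //    "lock xaddl %0,%1\n"
    //    : "=r"(result), "=m"(*(int *)operand)
    //    : "0"(incr)
    //    : "memory");
    //return result;
    return __sync_fetch_and_add(operand, incr);
#else

Step 8 Perform compilation by running the following commands in sequence. If a command is successful, the following information is displayed.
ant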

ant compile-native


ant package tar

Step 9 The compiled binary file is stored in the build/c/build/usr directory.

The compiled .tar package is build/zookeeper-3.4.5.tar.gz.

Step 10 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar files.


NOTE

The compiled zookeeper-3.4.5.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar files. If the package contains x86 .so or .jar files, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
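Steps 2 to 4 replace plain-HTTP Maven URLs by hand in three build files; Maven Central no longer accepts plain HTTP requests, which is why the URLs must be switched to HTTPS. The following is a minimal sketch of the same substitution with GNU sed. The function name use_https_central is an assumption made for illustration, and the sketch covers only the URL changes, not the other edits in those steps.

```shell
#!/bin/sh
# use_https_central FILE...
# Hypothetical helper: rewrites http://repo2.maven.org and
# http://repo1.maven.org to https://repo1.maven.org in each FILE (in place).
use_https_central() {
    for f in "$@"; do
        sed -i \
            -e 's#http://repo2\.maven\.org#https://repo1.maven.org#g' \
            -e 's#http://repo1\.maven\.org#https://repo1.maven.org#g' \
            "$f" || return 1
    done
}

# Example (run in the ZooKeeper source code directory):
#   use_https_central build.xml ivysettings.xml src/contrib/build-contrib.xml
```

URLs that already use https are left unchanged, so the function can be run repeatedly.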

24.5 Troubleshooting

24.5.1 Error Message "configure.ac:37: error: possibly undefined macro: AM_PATH_CPPUNIT" Is Displayed During Compilation

Symptom

The following error is reported during ZooKeeper compilation:

[exec] libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.ac and
[exec] libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.
[exec] libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
[exec] configure.ac:37: warning: macro 'AM_PATH_CPPUNIT' not found in library
[exec] configure.ac:37: error: possibly undefined macro: AM_PATH_CPPUNIT
[exec] If this token and others are legitimate, please use m4_pattern_allow.
[exec] See the Autoconf documentation.
[exec] autoreconf: /usr/bin/autoconf failed with exit status: 1

Cause Analysis

Not all dependency packages are installed before compilation. The AM_PATH_CPPUNIT macro is provided by the cppunit-devel package.

Handling Procedure

Run the following command to install the compilation dependency package cppunit-devel-1.12.1-11.el7.aarch64:

yum install -y cppunit-devel-1.12.1-11.el7.aarch64
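To confirm that the macro is available before rerunning the build, the m4 files visible to aclocal can be searched. This is a rough sketch that assumes GNU grep and assumes that the macros are installed under /usr/share/aclocal; the function name has_m4_macro is hypothetical.

```shell
#!/bin/sh
# has_m4_macro MACRO [DIR]
# Hypothetical helper: succeeds if any *.m4 file under DIR mentions MACRO.
# DIR defaults to /usr/share/aclocal, which is an assumption about where the
# distribution installs aclocal macros.
has_m4_macro() {
    macro=$1
    dir=${2:-/usr/share/aclocal}
    grep -rqs --include='*.m4' "$macro" "$dir"
}

# Example:
#   has_m4_macro AM_PATH_CPPUNIT || echo "install cppunit-devel first"
```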


25 Hadoop-3.0.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

25.1 Introduction
25.2 Environment Requirements
25.3 Configuring the Compilation Environment
25.4 Compiling Hadoop

25.1 Introduction

Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. It allows users to develop distributed programs without knowing the underlying details of the distributed system, and to use the power of a cluster for high-speed computing and storage. This document describes how to port the Hadoop component of CDH to TaiShan servers. For more information about CDH, visit https://www.cloudera.com/.

25.2 Environment Requirements

Hardware Requirements

Item             Remarks
Server           TaiShan server
CPU              Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition  No requirement for drive partitions
Network          Accessible to the Internet


Software Requirements

Item       Version
CentOS     7.6
OS kernel  4.14.0-115
OpenJDK    1.8.0_252
Maven      3.5.4
Hadoop     3.0.0
CMake      3.12.4
Protoc     2.5.0
GCC        4.8.5

CentOS

Item       Version
CentOS     7.6
OS kernel  4.14.0
GCC        4.8.5

openEuler

Item       Version
openEuler  20.03 LTS SP1
OS kernel  4.19.90
GCC        7.3.0

25.3 Configuring the Compilation Environment

25.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at the rm confirmation prompt to delete the files.

Step 3 Modify the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install GCC-related software.
yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
vi /usr/bin/gcc
Add the following information to the file and save the file:
#! /bin/sh
/usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission for the GCC file.
chmod +x /usr/bin/gcc
5. Check whether GCC is available.
gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:


Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the G++ executable. Generally, the path is /usr/bin/g++.
command -v g++
2. Rename the original G++ file, for example, to g++-impl.
mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
vi /usr/bin/g++
Add the following information to the file and save the file:
#! /bin/sh
/usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission for the G++ file.
chmod +x /usr/bin/g++
5. Check whether G++ is available.
g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

----End
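The local Yum source file described in Step 3 can also be written without an interactive editor. The following sketch assumes that the image is mounted at the mount point passed in; the function name write_local_repo and its parameters are illustrative only.

```shell
#!/bin/sh
# write_local_repo REPO_FILE MOUNT_POINT
# Hypothetical helper: writes a [Local] Yum repo definition that points at
# the directory where the OS image is mounted.
write_local_repo() {
    repo_file=$1
    mount_point=$2
    cat > "$repo_file" <<EOF
[Local]
name=Local
baseurl=file://$mount_point/
enabled=1
gpgcheck=0
EOF
}

# Example (run as root after mounting the image on /media):
#   write_local_repo /etc/yum.repos.d/Local.repo /media
#   yum clean all
#   yum makecache
```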

Installing Dependencies

Use the Yum source to install dependencies.

yum install -y wget patch openssl-devel zlib-devel automake libtool make libstdc++-static glibc-static git snappy snappy-devel

25.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/


Step 2 Configure Java environment variables.
vim /etc/profile

Add the following to the end of the file:
export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if information similar to the following is displayed:

----End

25.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml


NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. Replace the repository with the Maven repository that you have built. If you do not have one, use the following example:

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to the <proxies> tag in settings.xml:

<proxy>
  <id>optional</id>
  <active>true</active>
  <protocol>http</protocol>
  <username>Username</username>
  <password>Password</password>
  <host>Proxy server URL</host>
  <port>Proxy server port</port>
  <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>

----End

25.3.4 Installing CMake

CMake 3.12 or later is required for component compilation. This document uses CMake 3.12.4 as an example.

CentOS

The CMake version provided by CentOS is too old. Perform the following steps to install CMake manually:

Step 1 Download the CMake installation package.
wget https://cmake.org/files/v3.12/cmake-3.12.4.tar.gz

Step 2 Decompress the installation package.
tar -zxf cmake-3.12.4.tar.gz

Step 3 Compile and install CMake.
cd cmake-3.12.4
./bootstrap
make -j8
make install

----End

openEuler

yum install cmake
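Whether the installed CMake meets the 3.12 requirement can be checked with a version comparison. The following is a minimal sketch that assumes GNU sort (for the -V option); the function names version_ge and cmake_is_new_enough are hypothetical.

```shell
#!/bin/sh
# version_ge A B
# Hypothetical helper: succeeds if version A is greater than or equal to
# version B, using the version ordering of GNU sort -V.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# cmake_is_new_enough [MINIMUM]
# Succeeds if cmake is on PATH and its version is at least MINIMUM
# (3.12 by default). The first line of "cmake --version" has the form
# "cmake version X.Y.Z".
cmake_is_new_enough() {
    command -v cmake >/dev/null 2>&1 || return 1
    current=$(cmake --version | head -n 1 | awk '{print $3}')
    version_ge "$current" "${1:-3.12}"
}

# Example:
#   cmake_is_new_enough || echo "CMake 3.12 or later is required"
```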


25.3.5 Installing Protobuf

CentOS

Step 1 Install Protobuf.
yum install -y protobuf protobuf-devel

Step 2 Check whether Protobuf is installed successfully.
protoc --version

The installation is successful if information similar to the following is displayed:

Step 3 Deploy Protoc in the local Maven repository.
mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End

openEuler

Step 1 Download and decompress the source code.
wget https://github.com/protocolbuffers/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
tar -zxf protobuf-2.5.0.tar.gz

Step 2 Move the decompressed directory to the /opt/tools/installed/ directory.
mv protobuf-2.5.0 /opt/tools/installed/

Step 3 Go to the /opt/tools/installed/ directory.
cd /opt/tools/installed

Step 4 Download the protoc.zip package to a directory of your choice (for example, /opt/tools/installed/), decompress it, and copy the protoc.patch file to the Protobuf source directory.
wget https://mirrors.huaweicloud.com/kunpeng/archive/kunpeng_solution/bigdata/Patch/protoc.zip
unzip protoc.zip
cp ./protoc/protoc.patch ./protobuf-2.5.0/src/google/protobuf/stubs/

Step 5 Go to the protobuf-2.5.0/src/google/protobuf/stubs/ directory and apply the patch.
cd protobuf-2.5.0/src/google/protobuf/stubs/
patch -p1 < protoc.patch

Step 6 Go back to the root directory of protobuf-2.5.0, compile Protobuf, and install it in the default directory.
cd /opt/tools/installed/protobuf-2.5.0
./autogen.sh && ./configure CFLAGS='-fsigned-char' && make -j8 && make install

Step 7 Deploy Protoc in the local Maven repository.
mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/local/bin/protoc

----End
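The mvn install:install-file command copies protoc into the local Maven repository using the standard repository layout (group ID as a directory path, then artifact ID, version, and a file named artifactId-version-classifier.packaging). As a rough sketch, the expected location can be computed and checked as follows. The function name maven_artifact_path is hypothetical, and the default repository path assumes that localRepository has not been changed.

```shell
#!/bin/sh
# maven_artifact_path GROUP ARTIFACT VERSION CLASSIFIER PACKAGING [REPO]
# Hypothetical helper: prints the path that an artifact with a classifier
# occupies in a local Maven repository. REPO defaults to ~/.m2/repository.
maven_artifact_path() {
    group=$1
    artifact=$2
    version=$3
    classifier=$4
    packaging=$5
    repo=${6:-$HOME/.m2/repository}

    # Dots in the group ID become directory separators.
    group_path=$(printf '%s' "$group" | tr . /)
    printf '%s/%s/%s/%s/%s-%s-%s.%s\n' \
        "$repo" "$group_path" "$artifact" "$version" \
        "$artifact" "$version" "$classifier" "$packaging"
}

# Example:
#   ls -l "$(maven_artifact_path com.google.protobuf protoc 2.5.0 linux-aarch_64 exe)"
```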


25.4 Compiling Hadoop

NOTE

This section uses CDH 6.3.2 as an example. The compilation procedure for other versions is similar.

Prerequisites

You have downloaded and decompressed the Hadoop-cdh6.3.2 source package.

Procedure

Step 1 Compile and install ZSTD.
1. Download the source code and decompress it.
wget https://github.com/facebook/zstd/archive/v1.4.4.tar.gz -O zstd-v1.4.4.tar.gz
tar -zxf zstd-v1.4.4.tar.gz
2. Go to the decompressed directory.
cd zstd-1.4.4/
3. Use the Yum source to install required components.
yum install -y lz4 lz4-devel
4. Perform compilation and installation.
make -j40
make install

NOTE

The SO file of ZSTD is stored in /usr/local/lib/libzstd.so.1.4.4.

Step 2 Compile and install ISA-L.
1. Download the source code and decompress it.
wget https://github.com/intel/isa-l/archive/v2.29.0.tar.gz -O isa-l-v2.29.0.tar.gz
tar -zxf isa-l-v2.29.0.tar.gz
2. Go to the decompressed directory.
cd isa-l-2.29.0
3. Perform compilation and installation.
./autogen.sh
./configure
make -j40
make install

NOTE

The SO file is stored in /usr/lib/libisal.so.2.0.29.

Step 3 Go to the Hadoop-cdh6.3.2 source code directory.
cd hadoop-common-cdh6.3.2-release

Step 4 Modify the pom.xml file in the root directory of the source code to add the Maven repository sources.
vim pom.xml
Add the Kunpeng Maven repository to the repositories tag. The Kunpeng Maven repository must be placed first.
<repository>
  <id>mirrors.huaweicloud.com</id>
  <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
  <name>mirrors huaweicloud com</name>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>
<repository>
  <id>repository.huaweicloud.com</id>
  <url>https://mirrors.huaweicloud.com/repository/maven</url>
  <name>repository huaweicloud com</name>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>

NOTICE

In addition to the repository source, you also need to add the plug-in repository source. (Add the following information after the tag.)

<pluginRepositories>
  <pluginRepository>
    <id>huaweicloud-plugin</id>
    <url>http://mirrors.huaweicloud.com/repository/maven</url>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </pluginRepository>
</pluginRepositories>

Step 5 Perform compilation.
mvn install -Pdist,native -DskipTests -Dtar -Drequire.isal -Dbundle.isal -Disal.lib=YOUR_ISAL_LIB -Drequire.zstd -Dbundle.zstd -Dzstd.lib=YOUR_ZSTD_LIB -Drequire.snappy -Dbundle.snappy -Dsnappy.lib=YOUR_SNAPPY_LIB

NOTE

Replace YOUR_SNAPPY_LIB, YOUR_ISAL_LIB, and YOUR_ZSTD_LIB with the actual installation directory of each library on the local host. Example:
-Dzstd.lib=/usr/local/lib/ -Dsnappy.lib=/usr/lib64/ -Disal.lib=/usr/lib/

The operation is successful if the following information is displayed:


The compiled deployment package is stored in hadoop-dist/target/hadoop-3.0.0-cdh6.3.2.tar.gz.


Step 6 Use the Kunpeng Porting Advisor to scan the TAR file generated after the compilation and ensure that the TAR file does not contain x86 SO or JAR files.

NOTE

The compiled hadoop-3.0.0-cdh6.3.2.tar.gz file must be scanned using the Kunpeng Porting Advisor to ensure that no x86 SO or JAR files are contained. If the compiled file contains x86 SO or JAR files, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
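Before running the Kunpeng Porting Advisor, a quick local pre-check can flag x86 shared objects in an extracted package by reading the e_machine field of each ELF header (3 for x86, 62 for x86-64, 183 for AArch64). The following is only a sketch under stated assumptions: it handles little-endian ELF files, it looks at files named *.so or *.so.*, and it does not look inside JAR files, so it does not replace the Porting Advisor scan. The function names are hypothetical.

```shell
#!/bin/sh
# elf_machine FILE
# Hypothetical helper: prints the ELF e_machine value of FILE in decimal.
# Returns non-zero if FILE is not an ELF file. Assumes little-endian ELF.
elf_machine() {
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || return 1
    # e_machine is the 16-bit field at byte offset 18 of the ELF header.
    bytes=$(od -An -tu1 -j18 -N2 "$1")
    set -- $bytes
    [ $# -eq 2 ] || return 1
    echo $(( $1 + 256 * $2 ))
}

# find_x86_objects DIR
# Prints every shared object under DIR that was built for x86 or x86-64.
find_x86_objects() {
    find "$1" -type f \( -name '*.so' -o -name '*.so.*' \) |
    while IFS= read -r f; do
        machine=$(elf_machine "$f") || continue
        case $machine in
            3|62) printf '%s\n' "$f" ;;   # EM_386 or EM_X86_64
        esac
    done
}

# Example (after extracting the package into ./pkg):
#   find_x86_objects ./pkg
```

An empty result means only that no x86 shared objects were found among the extracted files; native libraries embedded in JAR files still need to be scanned.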


26 HBase-2.1.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

26.1 Introduction
26.2 Environment Requirements
26.3 Configuring the Compilation Environment
26.4 Compiling HBase

26.1 Introduction

HBase is a distributed, column-oriented, open-source database modeled after the Google paper "Bigtable: A Distributed Storage System for Structured Data" by Fay Chang et al. Just as Bigtable builds on the distributed data storage provided by Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase is a sub-project of the Apache Hadoop project. Unlike ordinary relational databases, HBase is suitable for storing unstructured data and is organized by columns rather than rows. For more information about HBase, visit https://hbase.apache.org.

26.2 Environment Requirements

Hardware Requirements

Item             Remarks
Server           TaiShan server
CPU              Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition  No requirement for drive partitions
Network          Accessible to the Internet


Software Requirements

Item       Version
OpenJDK    1.8.0_252
Maven      3.5.4
Protobuf   2.5.0

CentOS

Item       Version
CentOS     7.6
OS kernel  4.14.0
GCC        4.8.5

openEuler

Item       Version
openEuler  20.03 LTS SP1
OS kernel  4.19.90
GCC        7.3.0

26.3 Configuring the Compilation Environment

26.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.

mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the YUM repo file and clear the /etc/yum.repos.d/ directory.

cp -r /etc/yum.repos.d /etc/yum.repos.d-bak

rm /etc/yum.repos.d/*


NOTICE

Ensure that all repo files have been backed up. Enter y at the rm confirmation prompt to delete the files.

Step 3 Modify the /etc/yum.repos.d/Local.repo file to configure the local Yum repository.
vim /etc/yum.repos.d/Local.repo
[Local]
name=CentOS-7.6 Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Run the following commands to make the configuration take effect:
yum clean all
yum makecache

Step 5 Install the software related to GCC using the Yum source.
yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
command -v gcc
2. Change the name of GCC, for example, to gcc-impl.
mv /usr/bin/gcc /usr/bin/gcc-impl
3. Run the following command, enter the information, and save the file:
vim /usr/bin/gcc
#! /bin/sh
/usr/bin/gcc-impl -fsigned-char "$@"
4. Run the following command to add execution permissions for the script:
chmod +x /usr/bin/gcc
5. Run the following command to ensure that the command is available:
gcc --version

Step 7 Resolve the -fsigned-char problem (by modifying g++).
1. Locate the g++ executable. Generally, the path is /usr/bin/g++.
command -v g++
2. Change the name of g++, for example, to g++-impl.
mv /usr/bin/g++ /usr/bin/g++-impl


3. Run the following command, enter the information, and save the file:
vim /usr/bin/g++
#! /bin/sh
/usr/bin/g++-impl -fsigned-char "$@"
4. Run the following command to add execution permissions for the script:
chmod +x /usr/bin/g++
5. Run the following command to ensure that the command is available:
g++ --version

----End

Installing Dependencies

Run the following command to install the dependencies:

yum install -y wget vim openssl-devel zlib-devel automake libtool make libstdc++-static glibc-static git snappy snappy-devel fuse fuse-devel

26.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
vim /etc/profile

Add the following to the end of the file:
export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if information similar to the following is displayed:

----End


26.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile
Add the following at the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v
The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. Replace the repository with the Maven repository that you have built. If you do not have one, use the following example:

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to the <proxies> tag in settings.xml:

<proxy>
  <id>optional</id>
  <active>true</active>
  <protocol>http</protocol>
  <username>Username</username>
  <password>Password</password>
  <host>Proxy server URL</host>
  <port>Proxy server port</port>
  <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>

----End

26.3.4 Installing Protobuf

Step 1 Install Protobuf.
yum install -y protobuf protobuf-devel

Step 2 Check whether the installation is successful.
protoc --version

The installation is successful if information similar to the following is displayed:

Step 3 Deploy Protoc in the local Maven repository.
mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End

26.4 Compiling HBase

NOTE

This section uses CDH 6.3.2 as an example. The compilation procedure for other versions is similar.

Prerequisites

You have downloaded and decompressed the HBase-cdh6.3.2 source package.


Procedure

Step 1 Go to the HBase-cdh6.3.2 source code directory.
    cd hbase-cdh6.3.2-release

Step 2 Modify the pom.xml file.
    vim pom.xml

Add the Maven repository source before the tag at the end of the file. The Kunpeng repository source must be placed first.

    <repository>
        <id>kunpeng-repo</id>
        <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
        <name>kunpeng repo</name>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven</url>
    </repository>

Step 3 Perform compilation.
    export MAVEN_OPTS="-Xmx10240m -XX:MaxPermSize=768m"

NOTE

The value of MAVEN_OPTS can be changed based on site requirements.

    mvn package -DskipTests assembly:single -Pnative

The operation is successful if the following information is displayed:

After the compilation is complete, the tar.gz package is generated in the hbase-assembly/target/ directory.

Step 4 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled hbase-2.1.0-cdh6.3.2-bin.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
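Before running the Kunpeng Porting Advisor, a rough manual pre-check is possible with the standard file utility. The sketch below is not a replacement for the Porting Advisor: it only inspects .so files on disk (not those packed inside .jar files), and the function names and the /tmp/scan path are hypothetical.

```shell
#!/bin/sh
# Returns 0 if a `file` description refers to an x86 binary.
is_x86_description() {
    case "$1" in
        *x86-64*|*"Intel 80386"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Lists .so files under a directory whose `file` description looks like x86.
list_x86_shared_objects() {
    find "$1" -type f -name '*.so*' | while IFS= read -r so_file; do
        description=$(file -b "$so_file")
        if is_x86_description "$description"; then
            echo "$so_file"
        fi
    done
}

# Usage (paths are placeholders): extract the package, then scan it.
#   mkdir -p /tmp/scan && tar -zxf hbase-2.1.0-cdh6.3.2-bin.tar.gz -C /tmp/scan
#   list_x86_shared_objects /tmp/scan
```

An empty result only means that no loose x86 shared objects were found; the Porting Advisor scan is still required.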


27 Hive-2.1.1-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

27.1 Introduction
27.2 Environment Requirements
27.3 Configuring the Compilation Environment
27.4 Compiling Hive

27.1 Introduction

Hive is a data warehouse tool running on Hadoop. It maps structured data files to database tables, provides simple SQL query functions, and converts SQL statements into MapReduce tasks. For more information about Hive, visit the official Hive website.

27.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet


Software Requirements

Item Version

OpenJDK 1.8.0_252

Maven 3.5.4

Protobuf 2.5.0

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

27.3 Configuring the Compilation Environment

27.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to delete the files.

Step 3 Modify the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
    command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
    vi /usr/bin/gcc
Add the following content to the file and save the file:
    #!/bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the GCC file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the G++ executable. Generally, the path is /usr/bin/g++.
    command -v g++
2. Rename the original G++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
    vi /usr/bin/g++
Add the following content to the file and save the file:
    #!/bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the G++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

----End
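The wrapper approach in Steps 6 and 7 works because every compiler invocation passes through a small script that prepends -fsigned-char before the original arguments. On AArch64, GCC treats plain char as unsigned by default, whereas on x86 it is signed, so the flag restores the x86 behavior. The sketch below reproduces the mechanism in a scratch directory; fake-impl is a hypothetical stand-in for gcc-impl or g++-impl and only prints the arguments it receives.

```shell
#!/bin/sh
# Demonstrates the wrapper technique without touching /usr/bin.
demo_dir=$(mktemp -d)

# Stand-in for the renamed compiler: prints the arguments it receives.
cat > "$demo_dir/fake-impl" <<'EOF'
#!/bin/sh
printf '%s\n' "$*"
EOF

# Wrapper with the same shape as the /usr/bin/gcc script in Step 6.
cat > "$demo_dir/wrapper" <<EOF
#!/bin/sh
"$demo_dir/fake-impl" -fsigned-char "\$@"
EOF

chmod +x "$demo_dir/fake-impl" "$demo_dir/wrapper"

# The wrapper forwards the caller's arguments after the injected flag.
wrapper_output=$("$demo_dir/wrapper" -O2 -c hello.c)
echo "$wrapper_output"
rm -rf "$demo_dir"
```

The printed argument list starts with -fsigned-char, followed by the arguments given to the wrapper, which is what the real gcc-impl and g++-impl binaries receive.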

Installing Dependencies

Install dependencies using the Yum source.

    yum install -y wget vim patch openssl-devel zlib-devel automake libtool make cmake libstdc++-static glibc-static git unzip

27.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if information similar to the following is displayed:

----End

27.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile

Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. Replace the repository with the Maven repository that you have built. If you have not built one, configure it based on the following example:

    <mirror>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven/</url>
        <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxies>
        <proxy>
            <id>optional</id>
            <active>true</active>
            <protocol>http</protocol>
            <username>Username</username>
            <password>Password</password>
            <host>Proxy server URL</host>
            <port>Proxy server port</port>
            <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
        </proxy>
    </proxies>

----End

27.3.4 Installing Protobuf

Step 1 Install Protobuf.
    yum install -y protobuf protobuf-devel

Step 2 Check whether the installation is successful.
    protoc --version

The installation is successful if information similar to the following is displayed:

Step 3 Deploy Protoc in the local Maven repository.
    mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End


27.4 Compiling Hive

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Hive-cdh6.3.2 source package.

Procedure

Step 1 Go to the Hive-cdh6.3.2 source code directory.
    cd hive-cdh6.3.2-release

Step 2 Modify the pom.xml file in the root directory of the source code to add the Maven repository sources.
    vim pom.xml

Add the following repositories to the repositories tag. The Huawei Kunpeng repository source must be placed first.

    <repository>
        <id>mirrors.huaweicloud.com</id>
        <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
        <name>mirrors huaweicloud com</name>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>repository.huaweicloud.com</id>
        <url>https://mirrors.huaweicloud.com/repository/maven</url>
        <name>repository huaweicloud com</name>
    </repository>
    <repository>
        <id>cdh.repo</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
        <name>CDH Repository</name>
    </repository>

In addition to the repository source, you also need to add the plug-in repository source.

    <pluginRepositories>
        <pluginRepository>
            <id>huaweicloud-plugin</id>
            <url>http://mirrors.huaweicloud.com/repository/maven</url>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </pluginRepository>
    </pluginRepositories>

Step 3 Run the compilation command.
    mvn package -DskipTests -Pdist -Dtar

The operation is successful if the following information is displayed:


Obtain the tar.gz packages in the packaging/target directory.

Step 4 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled apache-hive-2.1.1-cdh6.3.2-bin.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End


28 Spark-2.4.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

28.1 Introduction
28.2 Environment Requirements
28.3 Configuring the Compilation Environment
28.4 Compiling Spark
28.5 Troubleshooting

28.1 Introduction

Spark is a unified analytics engine for large-scale data processing. It is scalable, performs in-memory computing, and has become a unified platform for fast, lightweight big data processing. Spark can be used to build the data store and running system for various applications, such as real-time stream processing, machine learning, and interactive query.

For more information about Spark, visit the Spark official website.

Programming language: Scala

Brief description: large-scale data computing engine

28.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor


Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

Maven 3.5.4

R 3.1.1

Spark 2.4.0

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

28.3 Configuring the Compilation Environment

28.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to delete the files.

Step 3 Modify the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
    command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
    vi /usr/bin/gcc
Add the following content to the file and save the file:
    #!/bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the GCC file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the G++ executable. Generally, the path is /usr/bin/g++.
    command -v g++
2. Rename the original G++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
    vi /usr/bin/g++
Add the following content to the file and save the file:
    #!/bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the G++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

----End

Installing Dependencies

Install dependencies using the Yum source.

    yum install -y make wget vim openssl-devel zlib-devel automake libtool git

28.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if information similar to the following is displayed:

----End

28.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile

Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. Replace the repository with the Maven repository that you have built. If you have not built one, configure it based on the following example:

    <mirror>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven/</url>
        <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxies>
        <proxy>
            <id>optional</id>
            <active>true</active>
            <protocol>http</protocol>
            <username>Username</username>
            <password>Password</password>
            <host>Proxy server URL</host>
            <port>Proxy server port</port>
            <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
        </proxy>
    </proxies>

----End

28.3.4 Installing the R Language

Step 1 Download the R language source code package and decompress it.
    wget http://cran.rstudio.com/src/base/R-3/R-3.1.1.tar.gz
    tar -zxf R-3.1.1.tar.gz

Step 2 Go to the directory generated after the decompression.
    cd R-3.1.1

Step 3 Compile and install the R language to the specified directory.
    ./configure --enable-R-shlib --enable-R-static-lib --with-libpng --with-jpeglib --prefix=/opt/tools/installed/R-3.1.1
    make all -j8 && make install

Step 4 Configure the R language environment variables.
    vim /etc/profile

Add the following code at the end of the /etc/profile file:
    export R_HOME=/opt/tools/installed/R-3.1.1
    export PATH=$R_HOME/bin:$PATH

Step 5 Make the environment variables take effect.
    source /etc/profile

Step 6 Verify the R language.
    R --version
    R version 3.1.1 (2014-07-10) -- "Sock it to Me"
    Copyright (C) 2014 The R Foundation for Statistical Computing
    Platform: aarch64-unknown-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under the terms of the
    GNU General Public License versions 2 or 3.
    For more information about these matters see
    http://www.gnu.org/licenses/.

----End
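The check in Step 6 can also be scripted. The sketch below extracts the version number from the first line of the R --version output shown above; the helper name parse_r_version is hypothetical, and the sample string is copied from that output so that the snippet runs without R installed.

```shell
#!/bin/sh
# Extracts the version number from the first line of `R --version` output.
parse_r_version() {
    printf '%s\n' "$1" | awk 'NR == 1 && $1 == "R" && $2 == "version" { print $3 }'
}

# Sample first line, taken from the expected output in Step 6.
sample='R version 3.1.1 (2014-07-10) -- "Sock it to Me"'
r_version=$(parse_r_version "$sample")
echo "$r_version"

# On a machine with R installed, the live output can be checked the same way:
#   [ "$(parse_r_version "$(R --version)")" = "3.1.1" ] && echo "R 3.1.1 is on PATH"
```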

28.4 Compiling Spark

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Spark-cdh6.3.2 source package.

Procedure

Step 1 Go to the Spark-cdh6.3.2 source code directory.
    cd spark-cdh6.3.2-release

Step 2 Modify the pom.xml file.
    vim pom.xml

Add the Maven repository source after line 233 so that it is the first entry under the repositories tag.

    <repository>
        <id>mirrors.huaweicloud.com</id>
        <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
        <name>mirrors huaweicloud com</name>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>

Step 3 Run the compilation script.
    ./dev/make-distribution.sh --tgz -Pyarn,hive,sparkr -DskipTests

After the compilation is complete, the spark-2.4.0-cdh6.3.2-bin-3.0.0-cdh6.3.2.tgz package is generated in the root directory.

Step 4 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled spark-2.4.0-cdh6.3.2-bin-3.0.0-cdh6.3.2.tgz package must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the compiled directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End

28.5 Troubleshooting

28.5.1 "error: cannot compile a simple Fortran program" Reported During the R Language Compilation

Symptom

During the R language compilation, "error: cannot compile a simple Fortran program" is displayed.

Cause Analysis

The gfortran package does not exist in the system.

Solution

Run the yum command to install the gfortran package from the OS image.

    yum -y install gcc-gfortran.aarch64

28.5.2 "configure: error: --with-x=yes (default) and X11 headers/libs are not available" Reported During the R Language Compilation

Symptom

During the R language compilation, "configure: error: --with-x=yes (default) and X11 headers/libs are not available" is displayed.

Cause Analysis

--with-x=yes (use the X Window System) is enabled by default. Therefore, you need to install the libXt-devel module.

Solution

Run the yum command to install the module in the OS image.

    yum -y install libXt-devel.aarch64

28.5.3 "/usr/bin/install: cannot stat 'NEWS.pdf': No such file or directory" Reported During the R Language Compilation

Symptom

During the R language compilation, "/usr/bin/install: cannot stat 'NEWS.pdf': No such file or directory" is displayed.

Cause Analysis

The NEWS.pdf file is not found in the source code directory R-3.1.1. Therefore, the file cannot be copied when the make install command is executed.

Solution

The NEWS text file exists in the doc directory. Therefore, copy the contents of the NEWS file in the doc directory to the NEWS.pdf file.

    cat doc/NEWS > doc/NEWS.pdf

28.5.4 "git clone: error: RPC failed; result=18, HTTP code = 200" Is Reported During Source Code Downloading by Using git clone

Symptom

During source code downloading by using git clone, "git clone: error: RPC failed; result=18, HTTP code = 200" is displayed.

Possible Cause

The file to be downloaded is large, but the default value of the git http.postBuffer parameter is small. As a result, the download fails.

Procedure

Set http.postBuffer to a larger value.

git config --global http.postBuffer 1024288000
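The value passed to http.postBuffer is a byte count. To put the number in perspective, the sketch below converts it to mebibytes; bytes_to_mib is a hypothetical helper name, and the commented git command at the end is the standard way to read a configuration value back.

```shell
#!/bin/sh
# Converts a byte count to whole mebibytes (integer division).
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

post_buffer_bytes=1024288000
echo "http.postBuffer = ${post_buffer_bytes} bytes (about $(bytes_to_mib "$post_buffer_bytes") MiB)"

# To confirm the setting after running the git config command above:
#   git config --global --get http.postBuffer
```

The configured value is therefore a little under 1 GiB and can be lowered if memory on the build host is limited.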


29 ZooKeeper-3.4.6-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

29.1 Introduction
29.2 Environment Requirements
29.3 Configuring the Compilation Environment
29.4 Compiling ZooKeeper
29.5 Troubleshooting

29.1 Introduction

ZooKeeper is a distributed and open-source coordination service for distributed applications, an open-source implementation of Google Chubby, and an important component of Hadoop and HBase. ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. For more information about ZooKeeper, visit the official ZooKeeper website.

29.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

JDK 1.8.0_252

Maven 3.5.4

Ant 1.10.8

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

29.3 Configuring the Compilation Environment

29.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the Yum repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to confirm the deletion.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:

    [Local]
    name=CentOS-7.6 Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install related software.
    yum -y install wget vim cppunit-devel libtool make

----End

29.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if information similar to the following is displayed:

----End

29.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile

Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. Replace the repository with the Maven repository that you have built. If you have not built one, configure it based on the following example:

    <mirror>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven/</url>
        <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxies>
        <proxy>
            <id>optional</id>
            <active>true</active>
            <protocol>http</protocol>
            <username>Username</username>
            <password>Password</password>
            <host>Proxy server URL</host>
            <port>Proxy server port</port>
            <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
        </proxy>
    </proxies>

----End

29.3.4 Installing Ant

Step 1 Download and decompress the Ant 1.10.8 installation package.
    wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.10.8-bin.tar.gz
    tar -zxvf apache-ant-1.10.8-bin.tar.gz -C /opt/tools/installed/

Step 2 Modify the /etc/profile file.
    vim /etc/profile

Add Ant environment variables at the end of the file.
    export ANT_HOME=/opt/tools/installed/apache-ant-1.10.8
    export PATH=$ANT_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check the Ant version.
    ant -version

----End
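The OpenJDK, Maven, and Ant procedures each append export lines to /etc/profile by hand. If the steps are repeated, the same lines may be added twice. The following sketch shows one way to append a line only when it is not already present; append_once is a hypothetical helper, and the demonstration writes to a scratch file rather than to /etc/profile.

```shell
#!/bin/sh
# Appends a line to a file only if an identical line is not already present.
append_once() {
    target_file=$1
    line=$2
    grep -qxF -- "$line" "$target_file" 2>/dev/null || printf '%s\n' "$line" >> "$target_file"
}

# Demonstration on a scratch file; the third call repeats the first line
# and is therefore skipped.
profile_file=$(mktemp)
append_once "$profile_file" 'export ANT_HOME=/opt/tools/installed/apache-ant-1.10.8'
append_once "$profile_file" 'export PATH=$ANT_HOME/bin:$PATH'
append_once "$profile_file" 'export ANT_HOME=/opt/tools/installed/apache-ant-1.10.8'
cat "$profile_file"
```

To apply the same idea to the real file, pass /etc/profile as the first argument and then run source /etc/profile as described in Step 3.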

29.4 Compiling ZooKeeper

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the ZooKeeper-cdh6.3.2 source package.

Procedure

Step 1 Go to the ZooKeeper-cdh6.3.2 source code directory.
    cd zookeeper-cdh6.3.2-release

Step 2 Modify the build.xml file.
    vim build.xml

Change all occurrences of http://repo2.maven.org in the file to https://repo1.maven.org.

    value="https://repo1.maven.org/maven2/org/apache/ivy/ivy" />
    ...

Step 3 Modify the ivysettings.xml file.
    vim ivysettings.xml

Replace http://repo1.maven.org in the file with https://repo1.maven.org.

value="https://repo1.maven.org/maven2/" override="false"/>

Step 4 Modify the src/contrib/build-contrib.xml file.
    vim src/contrib/build-contrib.xml

Replace http://repo2.maven.org in the file with https://repo1.maven.org.

value="https://repo1.maven.org/maven2/org/apache/ivy/ivy" />

Step 5 Modify the cloudera/maven-packaging/pom.xml file.
    vim cloudera/maven-packaging/pom.xml
1. Add the following repository configuration to the first line under the tag:
    kunpeng.repo https://mirrors.huaweicloud.com/kunpeng/maven Kunpeng Repository false
    central.repo https://repo1.maven.org/maven2 Central Repository false
2. Change all addresses https://repository.cloudera.com/content/groups/cdh-releases-rcs in the tag to https://repository.cloudera.com/content/repositories/releases/
3. Change all addresses https://repository.cloudera.com/content/repositories/snapshots in the tag to https://repository.cloudera.com/cloudera/libs-snapshot-local/
4. Add the Maven plug-in repository to the end of .
    huaweicloud-plugin http://mirrors.huaweicloud.com/repository/maven true

Step 6 Modify the cloudera-pom.xml file.
    vim cloudera-pom.xml
1. Add the following repository configuration to the first line under the tag:
    kunpeng.repo https://mirrors.huaweicloud.com/kunpeng/maven Kunpeng Repository false
2. Change all addresses https://repository.cloudera.com/content/groups/cdh-releases-rcs in the file to https://repository.cloudera.com/content/repositories/releases/
3. Change all addresses https://repository.cloudera.com/content/repositories/snapshots in the file to https://repository.cloudera.com/cloudera/libs-snapshot-local/


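Step 7 Modify the fetch_and_add function at line 483 in src/c/src/mt_adaptor.c.
    vim src/c/src/mt_adaptor.c

Modify the content from line 483 as follows. The original implementation uses x86 inline assembly (lock xaddl), which does not assemble on AArch64. Comment it out and use the GCC built-in __sync_fetch_and_add, which performs the same atomic fetch-and-add.
    int32_t fetch_and_add(volatile int32_t* operand, int incr)
    {
    #ifndef WIN32
        //int32_t result;
        //asm __volatile__(
        //    "lock xaddl %0,%1\n"
        //    : "=r"(result), "=m"(*(int *)operand)
        //    : "0"(incr)
        //    : "memory");
        //return result;
        return __sync_fetch_and_add(operand, incr);
    #else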

Step 8 Add the Maven repository source to the ivysettings.xml file.
    vim ivysettings.xml

The modification is as follows:
1. Add the following information to line 20 under the tag in the file:
2. Add the following information to line 34 under the tag in the file:
3. Add the following information to line 44 under the tag in the file:

Step 9 Create the build.properties file in the cloudera directory.
    vim cloudera/build.properties
Add the following content to the file:
    #
    #Sat Jun 06 17:27:33 CST 2020
    version=3.4.6-cdh6.3.2
    zookeeper.version=3.4.6-cdh6.3.2
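For unattended builds, the file from Step 9 can be written without an editor. The following is a minimal sketch, assuming it is run from the ZooKeeper source root and that the file belongs in the cloudera subdirectory as the step states; the function name write_build_properties is hypothetical, and the timestamp comment lines are omitted because they carry no configuration.

```shell
#!/bin/sh
# write_build_properties SRC_ROOT  (hypothetical helper, not from the guide)
# Creates SRC_ROOT/cloudera/build.properties with the two version keys from Step 9.
write_build_properties() {
    src_root=$1
    mkdir -p "$src_root/cloudera"
    # The quoted 'EOF' delimiter keeps the heredoc body literal (no expansion).
    cat > "$src_root/cloudera/build.properties" <<'EOF'
version=3.4.6-cdh6.3.2
zookeeper.version=3.4.6-cdh6.3.2
EOF
}

# Example, from the ZooKeeper source root:
# write_build_properties .
# cat cloudera/build.properties
```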

Step 10 Perform compilation.
    ant
    ant compile-native
    ant tar

NOTE

The ant compile-native and ant tar commands can be combined into a single ant tar compile-native command. However, the combined command produces different output from the two separate commands, so they are run separately here.

Step 11 View the compilation result.
The compiled binary files are stored in the build/c/build/usr directory.
The compiled .tar package is build/zookeeper-3.4.6-SNAPSHOT.tar.gz.

Step 12 Use the Kunpeng Porting Advisor to scan the .tar package generated after compilation and the binary files generated in the build/c/build/usr directory. Ensure that no x86 .so or .jar packages are contained.

NOTE

The compiled zookeeper-3.4.6-SNAPSHOT.tar.gz package and the binary files in the build/c/build/usr directory must be scanned with the Kunpeng Porting Advisor to ensure that they contain no x86 .so or .jar packages. If the compiled output contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
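As a quick sanity check before the Kunpeng Porting Advisor scan in Step 12, the ELF header of each native file can be inspected directly. The following is a minimal sketch and not a replacement for the Porting Advisor: it only looks at unpacked files, does not open .jar or .tar.gz archives, and assumes little-endian ELF files. The function names elf_machine and list_x86_elf are hypothetical. The values 3 (EM_386), 62 (EM_X86_64), and 183 (EM_AARCH64) are the e_machine codes defined by the ELF specification.

```shell
#!/bin/sh
# elf_machine FILE  (hypothetical helper, not from the guide)
# Prints the ELF e_machine value of FILE, or nothing if FILE is not an ELF file.
elf_machine() {
    file=$1
    # An ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'.
    magic=$(head -c 4 "$file" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || return 0
    # e_machine is a 16-bit field at byte offset 18; read both bytes as decimals.
    set -- $(od -An -tu1 -j18 -N2 "$file" 2>/dev/null)
    echo $(( ${1:-0} + 256 * ${2:-0} ))
}

# list_x86_elf DIR  (hypothetical helper)
# Lists regular files under DIR built for x86 (EM_386 = 3, EM_X86_64 = 62).
list_x86_elf() {
    find "$1" -type f | while IFS= read -r f; do
        case $(elf_machine "$f") in
            3|62) printf '%s\n' "$f" ;;
        esac
    done
}

# Example, from the ZooKeeper source root:
# list_x86_elf build/c/build/usr
```

An empty result from the example means that no x86 ELF file was found among the unpacked files; native libraries packed inside .jar files still need the Porting Advisor scan.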

29.5 Troubleshooting


29.5.1 Error Message "configure.ac:37: error: possibly undefined macro: AM_PATH_CPPUNIT" Is Displayed During Compilation

Symptom
The following error is reported during ZooKeeper compilation:

    [exec] libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.ac and
    [exec] libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.
    [exec] libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
    [exec] configure.ac:37: warning: macro 'AM_PATH_CPPUNIT' not found in library
    [exec] configure.ac:37: error: possibly undefined macro: AM_PATH_CPPUNIT
    [exec] If this token and others are legitimate, please use m4_pattern_allow.
    [exec] See the Autoconf documentation.
    [exec] autoreconf: /usr/bin/autoconf failed with exit status: 1

Cause Analysis
The cppunit-devel dependency package, which supplies the AM_PATH_CPPUNIT macro, was not installed before compilation.

Handling Procedure
Run the following command to install the compilation dependency package cppunit-devel-1.12.1-11.el7.aarch64:

    yum install -y cppunit-devel-1.12.1-11.el7.aarch64


30 Avro-1.8.2-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

30.1 Introduction
30.2 Environment Requirements
30.3 Configuring the Compilation Environment
30.4 Compiling Avro

30.1 Introduction
Apache Avro offers data serialization capabilities and provides data exchange services in big data-based systems and applications. It supports binary serialization to simply and rapidly process a large amount of data, and integrates dynamic languages to facilitate flexible processing of Avro data.

This document describes how to adapt Avro components in CDH to TaiShan servers. For more information about CDH, visit https://www.cloudera.com/.

30.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet


Software Requirements

Item        Version
CentOS      7.6
OS kernel   4.14.0
GCC         4.8.5
OpenJDK     1.8.0_252
Maven       3.5.4
Ant         1.7.1
Forrest     0.9

CentOS

Item        Version
CentOS      7.6
OS kernel   4.14.0
GCC         4.8.5

openEuler

Item        Version
openEuler   20.03 LTS SP1
OS kernel   4.19.90
GCC         7.3.0

30.3 Configuring the Compilation Environment

30.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.


Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC). On AArch64, GCC treats plain char as unsigned by default, whereas on x86 it is signed. The wrapper created below forces -fsigned-char so that code written for x86 behaves the same way.
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
    command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
    vi /usr/bin/gcc
Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the GCC file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:


Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the G++ executable. Generally, the path is /usr/bin/g++.
    command -v g++
2. Rename the original G++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
    vi /usr/bin/g++
Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the G++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

----End
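The rename-and-wrap procedure in Steps 6 and 7 is identical for gcc and g++, so it can be expressed as one function. The following is a minimal sketch under that assumption; the function name wrap_with_signed_char is hypothetical, and the guard against wrapping twice is an addition that the manual steps do not include.

```shell
#!/bin/sh
# wrap_with_signed_char COMPILER_PATH  (hypothetical helper, not from the guide)
# Renames COMPILER_PATH to COMPILER_PATH-impl and installs a wrapper script in
# its place that always passes -fsigned-char, mirroring Steps 6 and 7.
wrap_with_signed_char() {
    compiler=$1
    # Refuse to wrap twice: a second run would move the wrapper over the real compiler.
    if [ -e "$compiler-impl" ]; then
        echo "$compiler-impl already exists; not wrapping again" >&2
        return 1
    fi
    mv "$compiler" "$compiler-impl"
    # Unquoted heredoc: $compiler expands now, \$@ stays literal in the wrapper.
    cat > "$compiler" <<EOF
#!/bin/sh
"$compiler-impl" -fsigned-char "\$@"
EOF
    chmod +x "$compiler"
}

# Example (as root):
# wrap_with_signed_char /usr/bin/gcc
# wrap_with_signed_char /usr/bin/g++
# gcc --version
```

Because the wrapper's last command is the real compiler, the wrapper exits with the compiler's exit status, so build tools still see compilation failures.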

Installing Dependencies
Use the Yum source to install dependencies.

    yum install -y boost.aarch64 boost-devel.aarch64 make cmake wget openssl-devel zlib-devel automake libtool libstdc++-static glibc-static git snappy snappy-devel jansson-devel.aarch64 asciidoc.noarch doxygen

30.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile
Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH


Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version
The installation is successful if information similar to the following is displayed:

----End

30.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v
The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/repository/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. If you have built your own Maven repository, use its address instead; otherwise, configure the repository based on the following example:


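    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>https://mirrors.huaweicloud.com/repository/maven/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxy>
      <id>optional</id>
      <active>true</active>
      <protocol>http</protocol>
      <username>Username</username>
      <password>Password</password>
      <host>Proxy server URL</host>
      <port>Proxy server port</port>
      <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
    </proxy>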

----End

30.3.4 Installing Ant

Step 1 Download and install the package to the specified directory.
    wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.7.1-bin.tar.gz
    tar -zxf apache-ant-1.7.1-bin.tar.gz
    mv apache-ant-1.7.1 /opt/tools/installed/

Step 2 Modify environment variables.
    vim /etc/profile

Add the following code at the end of the /etc/profile file:
    export ANT_HOME=/opt/tools/installed/apache-ant-1.7.1
    export PATH=$ANT_HOME/bin:$PATH

Step 3 Run the following command for the environment variables to take effect:
    source /etc/profile

Step 4 Check whether the configuration takes effect.
    ant -version
The following output is expected:
    Apache Ant version 1.7.1 compiled on June 27 2008

----End

30.3.5 Installing Forrest

Step 1 Download and install Forrest to a specified directory, for example, /opt/tools/installed.
    wget http://archive.apache.org/dist/forrest/0.9/apache-forrest-0.9.tar.gz
    tar -zxf apache-forrest-0.9.tar.gz
    mv apache-forrest-0.9 /opt/tools/installed/

Step 2 Modify environment variables.
    vim /etc/profile

Add the following code at the end of the /etc/profile file:
    export FORREST_HOME=/opt/tools/installed/apache-forrest-0.9
    export PATH=$FORREST_HOME/bin:$PATH


Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether the environment variables take effect.
    forrest -projecthelp

----End

30.4 Compiling Avro

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites
You have downloaded and decompressed the Avro-cdh6.3.2 source package.

Procedure

Step 1 Go to the Avro-cdh6.3.2 source code directory.
    cd avro-cdh6.3.2-release

Step 2 Modify the pom.xml file.
    vim pom.xml
Add the Kunpeng Maven repository at line 63. (The Kunpeng repository must be listed first.)
    kunpengmaven kunpeng maven https://mirrors.huaweicloud.com/kunpeng/maven


Step 3 Modify the build.sh file.
    vim build.sh
1. Add the following command at line 88:
    88 mkdir -p build/${SRC_DIR}
    ...

2. Comment out lines 89 to 100 and line 115.
    89 #rm -rf build/${SRC_DIR}
    90 #if [ -d .svn ];
    ...
    100 #fi
    115 #(cd lang/py3; ./build.sh dist)

NOTE

In CentOS 7.6, the Python version is 2.7.5, so the Python 3 build step (line 115) needs to be removed before compilation.

3. Comment out lines 121 to 131.
    121 #(cd lang/csharp; ./build.sh dist)
    122
    123 #(cd lang/js; ./build.sh dist)
    124
    125 #(cd lang/ruby; ./build.sh dist)
    126
    127 #(cd lang/php; ./build.sh dist)
    128
    129 #mkdir -p dist/perl
    130 #(cd lang/perl; perl ./Makefile.PL && make dist)
    131 #cp lang/perl/Avro-$VERSION.tar.gz dist/perl/
    ...


NOTE

C#, JavaScript, Ruby, PHP, and Perl are rarely used and do not need to be compiled.

4. Modify the docs compilation procedure in line 134 and add the Forrest configuration.
    134 (cd doc; ant -Dforrest.home=/opt/tools/installed/apache-forrest-0.9)
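The commenting-out in Step 3 can be done with sed instead of by hand. The following is a minimal sketch, assuming GNU sed and that the line numbers given in the guide match your copy of build.sh; check them first with a command such as sed -n '88,134p' build.sh, because a different source revision may shift the lines. The function name comment_out_lines is hypothetical.

```shell
#!/bin/sh
# comment_out_lines FILE RANGE...  (hypothetical helper, not from the guide)
# Prefixes every non-empty line in each RANGE (a sed address such as 89,100
# or 115) with '#'. Lines that already start with '#' are left alone, so the
# function can be rerun without stacking comment markers.
comment_out_lines() {
    file=$1
    shift
    for range in "$@"; do
        sed -i -e "${range}s/^\([^#]\)/#\1/" "$file"
    done
}

# Example, from the Avro source root (keep a backup first):
# cp build.sh build.sh.orig
# comment_out_lines build.sh 89,100 115 121,131
# diff build.sh.orig build.sh
```

Commenting lines out does not change the line count, so the ranges can be applied in any order.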

Step 4 Modify the lang/py/build.xml file.
    vim lang/py/build.xml
Change http://repo2.maven.org/maven2/... to https://mirrors.huaweicloud.com/repository/maven/....

Symptom

During compilation, "java.lang.OutOfMemoryError: PermGen space" is displayed.

Procedure

Step 1 Before compilation, run the following command or add it to the end of the /etc/profile file:
    export MAVEN_OPTS="-Xmx10240m -XX:MaxMetaspaceSize=768m"

Step 2 If you added the command to /etc/profile, make the configuration take effect.
    source /etc/profile

----End


32 HBase-indexer-1.5-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

32.1 Introduction
32.2 Environment Requirements
32.3 Configuring the Compilation Environment
32.4 Compiling HBase Indexer

32.1 Introduction
Lily HBase Indexer is developed by NGDATA to store HBase data of the Lily subsystem in Solr. NGDATA hosts the source code on GitHub; the HBase Indexer project home page and code are available at https://github.com/NGDATA/hbase-indexer.

32.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet


Software Requirements

Item    Version
JDK     1.8.0_252
Maven   3.5.4

CentOS

Item        Version
CentOS      7.6
OS kernel   4.14.0
GCC         4.8.5

openEuler

Item        Version
openEuler   20.03 LTS SP1
OS kernel   4.19.90
GCC         7.3.0

32.3 Configuring the Compilation Environment

32.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the Yum repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to confirm the deletion.


Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:

    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install related software.
    yum -y install wget vim

----End

32.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version

The installation is successful if information similar to the following is displayed:

----End

32.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/


Step 2 Modify the Maven environment variables.
    vim /etc/profile

Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/repository/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. If you have built your own Maven repository, use its address instead; otherwise, configure the repository based on the following example:

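    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>https://mirrors.huaweicloud.com/repository/maven/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxy>
      <id>optional</id>
      <active>true</active>
      <protocol>http</protocol>
      <username>Username</username>
      <password>Password</password>
      <host>Proxy server URL</host>
      <port>Proxy server port</port>
      <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
    </proxy>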

----End


32.4 Compiling HBase Indexer

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the HBase-indexer-cdh6.3.2 source package.

Procedure

Step 1 Go to the HBase-indexer-cdh6.3.2 source code directory.
    cd hbase-indexer-cdh6.3.2-release

Step 2 Modify the pom.xml file.
    vim pom.xml

Add the Kunpeng Maven repository in the first line under repositories.

    kunpengmaven kunpeng maven http://mirrors.huaweicloud.com/kunpeng/maven
    huaweicloud.repo http://mirrors.huaweicloud.com/repository/maven huaweicloud Repositories
    wso2.repo http://maven.wso2.org/nexus/content/groups/wso2-public/ wso2 Repositories


In addition to the repository source, you also need to add the plug-in repository source.

huaweicloud-plugin https://mirrors.huaweicloud.com/repository/maven

Step 3 Perform compilation.
    mvn package apache-rat:check -Drat.numUnapprovedLicenses=2 -DskipTests -Dtar -Pdist

The operation is successful if the following information is displayed:

Step 4 Obtain the hbase-indexer-1.5-cdh6.3.2.tar.gz package generated in the hbase-indexer-dist/target directory.


Step 5 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled hbase-indexer-1.5-cdh6.3.2.tar.gz package must be scanned with the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the compiled output contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End


33 Hue-4.4.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

33.1 Introduction
33.2 Environment Requirements
33.3 Configuring the Compilation Environment
33.4 Compiling Hue
33.5 Troubleshooting

33.1 Introduction
Hue is an open-source Apache Hadoop UI system developed by Cloudera and later contributed to the open-source community. It is built on the Django Python web framework. You can use Hue to manage a Hadoop cluster through a browser, for example, to run put, get, and MapReduce job operations.

33.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet


Software Requirements

Item      Version
OpenJDK   1.8.0_252
Maven     3.5.4
Pip       20.1.1
Node      8.6.0
cmake     3.12.4

CentOS

Item        Version
CentOS      7.6
OS kernel   4.14.0
GCC         4.8.5

openEuler

Item        Version
openEuler   20.03 LTS SP1
OS kernel   4.19.90
GCC         7.3.0

33.3 Configuring the Compilation Environment

33.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*


NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Configure the local Yum source.
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem (by modifying GCC).
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
    command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
    vi /usr/bin/gcc
Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission to the GCC file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

Step 7 Resolve the -fsigned-char problem (by modifying G++).
1. Locate the G++ executable. Generally, the path is /usr/bin/g++.
    command -v g++
2. Rename the original G++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
    vi /usr/bin/g++
Add the following content to the file and save the file:
    #! /bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission to the G++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
– CentOS: The installation is successful if information similar to the following is displayed:

– openEuler: The installation is successful if information similar to the following is displayed:

----End

Installing Dependencies

Install dependencies using the Yum source.

    yum install -y libxml2-devel libxslt-devel mysql mysql-devel openldap-devel python python-devel python-simplejson sqlite-devel libffi-devel openssl-devel gmp-devel make unzip wget

33.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile

Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile


Step 4 Check whether OpenJDK is successfully installed.
    java -version
The installation is successful if information similar to the following is displayed:

----End

33.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v
The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.
Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/repository/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. If you have built your own Maven repository, use its address instead; otherwise, configure the repository based on the following example:

    <mirror>
      <id>huaweimaven</id>
      <name>huawei maven</name>
      <url>https://mirrors.huaweicloud.com/repository/maven/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

    <proxy>
      <id>optional</id>
      <active>true</active>
      <protocol>http</protocol>
      <username>Username</username>
      <password>Password</password>
      <host>Proxy server URL</host>
      <port>Proxy server port</port>
      <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
    </proxy>

----End

33.3.4 Installing Dependent Components

Installing Node.js

Step 1 Download Node.js.
    wget http://nodejs.org/dist/v8.6.0/node-v8.6.0-linux-arm64.tar.gz

NOTICE

If wget reports during the download that the server certificate cannot be verified, add the --no-check-certificate parameter to the end of the wget command.

Step 2 Decompress the installation package.
    tar -zxf node-v8.6.0-linux-arm64.tar.gz
    mv node-v8.6.0-linux-arm64 /opt/tools/installed

Step 3 Modify the /etc/profile file.
    vim /etc/profile
Set environment variables.
    export NODE_HOME=/opt/tools/installed/node-v8.6.0-linux-arm64
    export PATH=$NODE_HOME/bin:$PATH

Step 4 Make the environment variables take effect.
    source /etc/profile

Step 5 Check whether Node.js is installed successfully.
    node -v

----End


Installing Setuptools
Install setuptools.

    yum install -y python-setuptools.noarch

Installing pip

Step 1 Download the pip source code and decompress it.
    wget https://files.pythonhosted.org/packages/08/25/f204a6138dade2f6757b4ae99bc3994aac28a5602c97ddb2a35e0e22fbc4/pip-20.1.1.tar.gz
    tar -zxf pip-20.1.1.tar.gz

Step 2 Go to the decompressed directory.
    cd pip-20.1.1

Step 3 Install pip.
    python setup.py install

Step 4 Check the pip version.
    pip -V

----End

Installing Python Dependencies

Install Python dependencies.

pip install --upgrade setuptools
pip install cryptography==2.1.4
pip install ipython==5.2.0
pip install astroid==1.5.3

33.3.5 Installing CMake

Component compilation requires CMake 3.12 or later. This document uses CMake 3.12.4 as an example.

CentOS

The CMake version provided by CentOS is too old. To install CMake manually, perform the following steps:

Step 1 Download the CMake installation package.
wget https://cmake.org/files/v3.12/cmake-3.12.4.tar.gz

Step 2 Decompress the installation package.
tar -zxf cmake-3.12.4.tar.gz

Step 3 Compile and install CMake.
cd cmake-3.12.4
./bootstrap
make -j8
make install

----End

openEuler

yum install cmake
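Whether the installed CMake meets the 3.12 requirement can be checked with a short script. The following is a sketch only: the helper name version_ge is hypothetical, and the comparison assumes GNU sort with the -V (version sort) option, which coreutils provides.

```shell
# version_ge A B: succeeds when version A is equal to or later than version B.
# Relies on GNU sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required=3.12
if command -v cmake >/dev/null 2>&1; then
    # The first line of the output has the form "cmake version X.Y.Z".
    current=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$current" "$required"; then
        echo "CMake $current meets the $required requirement"
    else
        echo "CMake $current is earlier than $required; install a later version"
    fi
else
    echo "cmake was not found in PATH"
fi
```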

33.4 Compiling Hue

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Hue-cdh6.3.2 source package.

Procedure

Step 1 Go to the Hue-cdh6.3.2 source code directory.
cd hue-cdh6.3.2-release

Step 2 In the maven/pom.xml file in the current directory, add the Maven repository sources.
vim ./maven/pom.xml

The Huawei Kunpeng repository source must be placed first. The order of the other sources does not matter.

<repository>
  <id>mirrors.huaweicloud.com</id>
  <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
  <name>mirrors huaweicloud com</name>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>
<repository>
  <id>repository.huaweicloud.com</id>
  <url>https://mirrors.huaweicloud.com/repository/maven</url>
  <name>repository huaweicloud com</name>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>

Step 3 Modify the Makefile file.
vim Makefile

Add the following Maven parameters to the end of the mvn command:
-Dmaven.wagon.http.ssl.insecure=true -Dmaven.wagon.http.ssl.allowall=true -Dmaven.wagon.http.ssl.ignore.validity.dates=true

Step 4 Modify desktop/libs/librdbms/Makefile.
vim desktop/libs/librdbms/Makefile

Add the following Maven parameters to the mvn command:
-Dmaven.wagon.http.ssl.insecure=true -Dmaven.wagon.http.ssl.allowall=true -Dmaven.wagon.http.ssl.ignore.validity.dates=true

Step 5 Perform the compilation.
make apps

The compilation is successful if the following information is displayed:

The compiled files are stored in the build directory. The executable commands are stored in the build/env/bin directory.

Step 6 Use the Kunpeng Porting Advisor to scan the build directory generated after compilation and ensure that it does not contain x86 .so or .jar packages.

NOTE

● If the build directory contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.
● Run the following command to start Hue:
  build/env/bin/hue runserver

----End
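As a rough manual spot check before running the Kunpeng Porting Advisor, the .so files in the build directory can be classified with the standard file utility. This sketch is not the Kunpeng Porting Advisor and does not replace it: it does not look inside .jar files, and the function names classify_binary and scan_dir are hypothetical.

```shell
# classify_binary DESCRIPTION
# Maps one line of `file -b` output to a coarse architecture label.
classify_binary() {
    case "$1" in
        *x86-64*|*"Intel 80386"*) echo x86 ;;
        *aarch64*)                echo aarch64 ;;
        *)                        echo other ;;
    esac
}

# scan_dir DIR: print every .so file under DIR that `file` reports as x86.
scan_dir() {
    find "$1" -type f -name '*.so*' | while IFS= read -r path; do
        if [ "$(classify_binary "$(file -b "$path")")" = x86 ]; then
            echo "x86 binary: $path"
        fi
    done
}

# Assumption: the script is run from the Hue source root, where make apps created build/.
if [ -d build ]; then
    scan_dir build
fi
```

No output from scan_dir means that no x86 shared objects were detected by this check; the Porting Advisor scan in Step 6 is still required.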

33.5 Troubleshooting


33.5.1 An Error Reported During Compilation

Symptom

During compilation, "npm ERR: 502 Parent proxy unreachable - GET https://registry.npmjs.org/globals/-/globals-11.10.0.tgz" is displayed.

Procedure

This problem is caused by the proxy settings. Remove the proxy settings from the npm configuration as follows:

npm config rm proxy
npm config rm https-proxy
npm config --global rm proxy
npm config --global rm https-proxy

33.5.2 "cannot uninstall 'enum34'" Reported During Installation

Symptom

During the Python dependency installation, a message indicates that the installed dnspython and python-ldap versions are earlier than the required versions. The installation can proceed despite this message, but "cannot uninstall 'enum34'" is displayed when astroid is installed, and the installation fails.

Procedure

Uninstall the previous Python dependency, run the following command, and reinstall the Python dependency.

pip install --ignore-installed enum34


34 Kafka-2.11-2.2.1-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

34.1 Introduction 34.2 Environment Requirements 34.3 Configuring the Compilation Environment 34.4 Compiling Kafka

34.1 Introduction

Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. It is a high-throughput, distributed publish-subscribe messaging system that can process all action stream data of customer-related websites. Such actions (web browsing, searches, and other user actions) are key factors in many social functions on the modern web. Because of throughput requirements, this data is usually handled through log processing and log aggregation. Kafka is a feasible solution for systems such as Hadoop that need offline analysis and log data processing together with real-time processing. Kafka aims to unify online and offline message processing through the parallel loading mechanism of Hadoop, and to provide real-time messages through clusters.

For more information about Kafka, visit https://kafka.apache.org/.

34.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

OpenJDK 1.8.0_252

Maven 3.5.4

gradle 5.4.1

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

34.3 Configuring the Compilation Environment

34.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the Yum repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to confirm the deletion.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:

[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install related software.
yum -y install wget vim
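The repo file created in Step 3 can also be written without an interactive editor. The following sketch uses a here-document; the function name write_local_repo is hypothetical, and the demonstration writes to a scratch file instead of /etc/yum.repos.d/Local.repo.

```shell
# write_local_repo FILE: write the local Yum source definition from Step 3 to FILE.
write_local_repo() {
    cat > "$1" <<'EOF'
[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0
EOF
}

# Demonstration on a scratch file (assumption: pass /etc/yum.repos.d/Local.repo on a real host).
repo_demo=$(mktemp)
write_local_repo "$repo_demo"
cat "$repo_demo"
```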

----End

34.3.2 Installing OpenJDK

Step 1 Download the installation package and decompress it to a directory (for example, /opt/tools/installed/).
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure the Java environment variables.
vim /etc/profile

Add the following to the end of the file:
export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if the OpenJDK version information (1.8.0_252 in this example) is displayed.

----End

34.3.3 Installing Maven

Step 1 Download the installation package and decompress it to a directory (for example, /opt/tools/installed).
wget https://archive.apache.org/dist/maven/maven-3/3.1.0/binaries/apache-maven-3.1.0-bin.tar.gz
tar -zxf apache-maven-3.1.0-bin.tar.gz
mkdir -p /opt/tools/installed
mv apache-maven-3.1.0 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following to the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.1.0
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -version

The installation is successful if the Maven version information is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.1.0/conf/settings.xml

vim /opt/tools/installed/apache-maven-3.1.0/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Configure the remote repository. (Change the repository to your own Maven repository. If you do not have one, configure it as follows.)

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

<proxies>
  <proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
  </proxy>
</proxies>

----End

34.3.4 Installing Gradle

Step 1 Download the Gradle binary package and decompress it to a specified directory, for example, /opt/tools/installed.
wget https://downloads.gradle.org/distributions/gradle-5.4.1-bin.zip --no-check-certificate
unzip gradle-5.4.1-bin.zip
mv gradle-5.4.1 /opt/tools/installed/

Step 2 Modify the Gradle environment variables.
vim /etc/profile

Add the following to the end of the /etc/profile file:
export GRADLE_HOME=/opt/tools/installed/gradle-5.4.1
export PATH=$GRADLE_HOME/bin:$PATH

Step 3 Press Esc and run :wq to save the configuration and exit.

Step 4 Make the environment variables take effect.
source /etc/profile

Step 5 Check whether Gradle is installed successfully.
gradle -v

----End

34.4 Compiling Kafka

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Kafka-cdh6.3.2 source package.

Procedure

Step 1 Go to the Kafka-cdh6.3.2 source code directory.
cd kafka-cdh6.3.2-release/

Step 2 Modify the build.gradle file.
vim build.gradle

Add the following Maven repositories to the repositories section:
allprojects {
  repositories {
    mavenLocal()
    maven { url "https://mirrors.huaweicloud.com/kunpeng/maven" }
    maven { url "https://mirrors.huaweicloud.com/repository/maven" }
    maven { url "https://repository.cloudera.com/artifactory/libs-snapshot-local" }
    maven { url "https://repository.cloudera.com/artifactory/cloudera-repos" }
    maven { url "https://repo1.maven.org/maven2" }
    .....
  }
}

Step 3 In the gradle.properties file, change mavenUrl to https://mirrors.huaweicloud.com/repository/maven.
vim gradle.properties

Step 4 Perform the compilation.
gradle releaseTarGz --info


Check the .tgz package generated in the core/build/distributions directory.

Step 5 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain the x86 .so or .jar packages.

NOTE

The compiled kafka_2.11-2.2.1-cdh6.3.2.tgz must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see Kunpeng Porting Advisor Case Study.

----End
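The mavenUrl change in Step 3 can be scripted instead of edited by hand. The following is a sketch under stated assumptions: the property is assumed to sit on its own line in the form mavenUrl=..., the function name set_property is hypothetical, the sample file content is invented for the demonstration, and sed -i is the GNU form.

```shell
# set_property FILE KEY VALUE
# Replaces the line "KEY=..." in FILE, or appends "KEY=VALUE" if no such line exists.
# Assumption: VALUE contains neither "|" nor "&", which are special in the sed expression.
set_property() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Demonstration on a scratch file with hypothetical content.
props_demo=$(mktemp)
printf 'group=org.apache.kafka\nmavenUrl=\n' > "$props_demo"
set_property "$props_demo" mavenUrl https://mirrors.huaweicloud.com/repository/maven
cat "$props_demo"
```

In the Kafka source directory, the equivalent call would be set_property gradle.properties mavenUrl https://mirrors.huaweicloud.com/repository/maven.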


35 Lucene-solr-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

35.1 Introduction 35.2 Environment Requirements 35.3 Configuring the Compilation Environment 35.4 Compiling Solr

35.1 Introduction

Solr (pronounced "solar") is an open-source enterprise-search platform from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document handling. Providing distributed search and index replication, Solr is designed for scalability.

35.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet

Software Requirements

Item Version

JDK 1.8.0_252

Ant 1.8.4

Maven 3.5.4

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

35.3 Configuring the Compilation Environment

35.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the Yum repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to confirm the deletion.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:

[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install related software.
yum -y install wget vim

----End

35.3.2 Installing OpenJDK

Step 1 Download the installation package and decompress it to a directory (for example, /opt/tools/installed/).
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure the Java environment variables.
vim /etc/profile

Add the following to the end of the file:
export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if the OpenJDK version information (1.8.0_252 in this example) is displayed.

----End

35.3.3 Installing Maven

Step 1 Download the installation package and decompress it to a directory (for example, /opt/tools/installed).
wget https://archive.apache.org/dist/maven/maven-3/3.1.0/binaries/apache-maven-3.1.0-bin.tar.gz
tar -zxf apache-maven-3.1.0-bin.tar.gz
mkdir -p /opt/tools/installed
mv apache-maven-3.1.0 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following to the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.1.0
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -version

The installation is successful if the Maven version information is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.1.0/conf/settings.xml

vim /opt/tools/installed/apache-maven-3.1.0/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Configure the remote repository. (Change the repository to your own Maven repository. If you do not have one, configure it as follows.)

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

<proxies>
  <proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
  </proxy>
</proxies>

----End


35.3.4 Installing Ant

Step 1 Download Ant 1.8.4 and decompress it to the installation directory.
wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.8.4-bin.tar.gz
tar -zxf apache-ant-1.8.4-bin.tar.gz
mv apache-ant-1.8.4 /opt/tools/installed/

Step 2 Modify the environment variables.
vim /etc/profile

Add the Ant configuration to the end of the /etc/profile file:
export ANT_HOME=/opt/tools/installed/apache-ant-1.8.4
export PATH=$ANT_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether the configuration takes effect.
ant -version

----End

35.4 Compiling Solr

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Solr-cdh6.3.2 source package.

Procedure

Step 1 Go to the Solr-cdh6.3.2 source code directory.
cd lucene-solr-cdh6.3.2-release

Step 2 Modify the lucene/default-nested-ivy-settings.xml file.
vim lucene/default-nested-ivy-settings.xml

1. Change http://repo1.maven.org/maven2 in the file to https://repo1.maven.org/maven2.
2. Comment out the default ivy repository.
   a. Comment out all include tags.
   b. Comment out the default ivy repository in the default chain.
   c. Comment out the default ivy repository in the cloudera chain.
3. Add the central repository configuration.
   a. Add a central repository to the first line of the default chain.
   b. Add a central repository to the first line of the cloudera chain.

Step 3 Modify the lucene/common-build.xml file.
vim lucene/common-build.xml

Change http://repo1.maven.org/maven2 in the file to https://repo1.maven.org/maven2.

Step 4 Modify the dev-tools/scripts/poll-mirrors.py file.
vim dev-tools/scripts/poll-mirrors.py

Change http://repo1.maven.org/maven2 in the file to https://repo1.maven.org/maven2.

maven_url = None if args.version is None else "https://repo1.maven.org/maven2/";

Step 5 Modify the cloudera/templates/cdh.build.properties file.
vim cloudera/templates/cdh.build.properties

Modify the repository addresses in lines 4 and 5 as follows:

4 snapshots.cloudera.com=https://repository.cloudera.com/content/repositories/snapshots/
5 releases.cloudera.com=https://repository.cloudera.com/artifactory/cdh-releases-rcs/

Step 6 Perform the compilation.

1. Compile the source code.
ant ivy-bootstrap
ant compile

NOTE

The first compilation may fail. If the following error information is displayed, modify the cdh.build.properties file.

Solution:
vim cdh.build.properties

Adjust the values of the corresponding names as follows:
snapshots.cloudera.com=https://repository.cloudera.com/content/repositories/snapshots/
releases.cloudera.com=https://repository.cloudera.com/artifactory/cdh-releases-rcs/

The compilation is successful if the following information is displayed:

2. Go to the solr directory and continue the compilation.
cd solr
ant create-package

NOTE

The compilation result is stored in solr/package/solr-7.4.0-SNAPSHOT.tgz in the root directory of the source code.

The compilation is successful if the following information is displayed:

Step 7 Use the Kunpeng Porting Advisor to scan the .tgz package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled package solr-7.4.0-SNAPSHOT.tgz must be scanned by using the Kunpeng Porting Advisor to ensure that it contains no x86 .so or .jar packages. If the compiled package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
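Steps 2 to 4 apply the same http-to-https replacement to three files, so the change can be made in one pass. The following is a sketch, assuming GNU sed and that every occurrence of the URL in those files should be rewritten; the function name to_https is hypothetical and the scratch-file content is invented for the demonstration.

```shell
# to_https FILE...
# Rewrites http://repo1.maven.org/maven2 to https://repo1.maven.org/maven2 in each FILE.
to_https() {
    sed -i 's|http://repo1\.maven\.org/maven2|https://repo1.maven.org/maven2|g' "$@"
}

# Demonstration on a scratch file with hypothetical content.
url_demo=$(mktemp)
printf '%s\n' 'url=http://repo1.maven.org/maven2/' > "$url_demo"
to_https "$url_demo"
cat "$url_demo"
```

From the source root, the call would be to_https lucene/default-nested-ivy-settings.xml lucene/common-build.xml dev-tools/scripts/poll-mirrors.py. The commenting-out and central repository changes in Step 2 still have to be made by hand.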


36 Oozie-5.1.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

36.1 Introduction 36.2 Environment Requirements 36.3 Configuring the Compilation Environment 36.4 Compiling Oozie

36.1 Introduction

Oozie is a Hadoop-based workflow engine (scheduler). Scheduling processes are written in XML, and Oozie can schedule MapReduce, Pig, Hive, shell, JAR, and Spark jobs. Oozie is well suited to running a series of operations on data: instead of writing processing code to chain the operations, you define actions and connect them into a workflow that is executed automatically. This makes Oozie useful for big data analysis.

For more information about Oozie, visit https://oozie.apache.org/.

36.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet


Software Requirements

Item Version

JDK 1.8.0_252

Maven 3.5.4

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

36.3 Configuring the Compilation Environment

36.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.
mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the Yum repo files and clear the /etc/yum.repos.d/ directory.
cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to confirm the deletion.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
vi /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:

[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.
yum clean all
yum makecache

Step 5 Use the Yum source to install related software.
yum -y install wget vim

----End

36.3.2 Installing OpenJDK

Step 1 Download the installation package and decompress it to a directory (for example, /opt/tools/installed/).
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure the Java environment variables.
vim /etc/profile

Add the following to the end of the file:
export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
java -version

The installation is successful if the OpenJDK version information (1.8.0_252 in this example) is displayed.

----End

36.3.3 Installing Maven

Step 1 Download the installation package and decompress it to a directory (for example, /opt/tools/installed/).
wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
vim /etc/profile

Add the following to the end of the /etc/profile file:
export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
source /etc/profile

Step 4 Check whether Maven is successfully installed.
mvn -v

The installation is successful if the Maven version information is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the mirrors tag to configure the remote repository. (Change the repository to the Maven repository that you have built. If you do not have one, use the following example.)

<mirror>
  <id>huaweimaven</id>
  <name>huawei maven</name>
  <url>https://mirrors.huaweicloud.com/repository/maven/</url>
  <mirrorOf>central</mirrorOf>
</mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:

<proxies>
  <proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
  </proxy>
</proxies>

----End


36.4 Compiling Oozie

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Oozie-cdh6.3.2 source package.

Procedure

Step 1 Go to the Oozie-cdh6.3.2 source code directory.
cd oozie-cdh6.3.2-release

Step 2 Modify the pom.xml file.
vim pom.xml

Add the Maven repository source in line 136.

<repository>
  <id>kunpengmaven</id>
  <name>kunpeng maven</name>
  <url>https://mirrors.huaweicloud.com/kunpeng/maven</url>
</repository>

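Modify the dependency configuration in line 1592: exclude the x86 dependency JAR package and add the AArch64 dependency JAR package.

<dependency>
  <groupId>guru.nidi</groupId>
  <artifactId>graphviz-java</artifactId>
  <version>0.7.0</version>
  <exclusions>
    <exclusion>
      <groupId>com.eclipsesource.j2v8</groupId>
      <artifactId>j2v8_linux_x86_64</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>com.eclipsesource.j2v8</groupId>
  <artifactId>j2v8_linux_aarch64</artifactId>
  <version>4.6.0</version>
</dependency>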

Step 3 Modify the core/pom.xml file.
vim core/pom.xml

Add a dependency in line 548.

<dependency>
  <groupId>com.eclipsesource.j2v8</groupId>
  <artifactId>j2v8_linux_aarch64</artifactId>
  <version>4.6.0</version>
</dependency>

Step 4 Modify the fluent-job/fluent-job-api/pom.xml file.
vim fluent-job/fluent-job-api/pom.xml

Add a dependency under line 53.

<dependency>
  <groupId>com.eclipsesource.j2v8</groupId>
  <artifactId>j2v8_linux_aarch64</artifactId>
  <version>4.6.0</version>
</dependency>

Step 5 Perform the compilation.
mvn package -DskipTests assembly:single

After the compilation is complete, the following information is displayed.

Obtain the tar.gz package in the distro/target/ directory.

Step 6 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that the package does not contain x86 .so or .jar packages.

NOTE

The compiled package oozie-5.1.0-cdh6.3.2-distro.tar.gz must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled package contains x86 .so or .jar packages, the Oozie functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
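Before the Porting Advisor scan, a quick look at the file listing of the generated package can show whether the j2v8 replacement from Steps 2 to 4 took effect. This is a sketch only: the function name check_j2v8 and the sample path are hypothetical, and JAR files nested inside other archives in the package do not appear in a top-level listing, so a clean result here is not conclusive.

```shell
# check_j2v8: read a file listing on standard input and report which j2v8 variants it mentions.
# Returns non-zero if the x86 variant is present.
check_j2v8() {
    listing=$(cat)
    status=0
    if printf '%s\n' "$listing" | grep -q 'j2v8_linux_x86_64'; then
        echo "found j2v8_linux_x86_64 (x86 JAR still present)"
        status=1
    fi
    if printf '%s\n' "$listing" | grep -q 'j2v8_linux_aarch64'; then
        echo "found j2v8_linux_aarch64"
    else
        echo "j2v8_linux_aarch64 not found in the listing"
    fi
    return "$status"
}

# Demonstration with a hypothetical one-line listing.
printf '%s\n' 'lib/j2v8_linux_aarch64-4.6.0.jar' | check_j2v8
```

Against the real package, the listing could be produced with tar -tzf distro/target/oozie-5.1.0-cdh6.3.2-distro.tar.gz and piped into check_j2v8.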


37 Parquet-format-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

37.1 Introduction 37.2 Environment Requirements 37.3 Configuring the Compilation Environment 37.4 Compiling Parquet-format

37.1 Introduction

Parquet uses the columnar storage format and supports data nesting. The Parquet-format project, implemented in Java, defines all Parquet metadata objects (data types and storage formats in Parquet). Parquet metadata is serialized using Apache Thrift and stored at the end of the Parquet file.

This document describes how to adapt Parquet-format components in CDH to TaiShan servers. For more information about CDH, visit https://www.cloudera.com/.

37.2 Environment Requirements

Hardware Requirements

Item Remarks

Server TaiShan server

CPU Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor

Drive partition No requirement for drive partitions

Network Accessible to the Internet


Software Requirements

Item Version

OpenJDK 1.8.0_252

Maven 3.5.4

Thrift 0.9.3

CentOS

Item Version

CentOS 7.6

OS kernel 4.14.0

GCC 4.8.5

openEuler

Item Version

openEuler 20.03 LTS SP1

OS Kernel 4.19.90

GCC 7.3.0

37.3 Configuring the Compilation Environment

37.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Add the following content to the file to configure the local Yum source:
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem by wrapping GCC. On AArch64, GCC treats plain char as unsigned by default, whereas on x86 it is signed. The wrapper adds -fsigned-char to every invocation so that code written for x86 behaves the same way.
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
    command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
    vi /usr/bin/gcc
   Add the following content to the file and save the file:
    #!/bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission for the GCC file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
   The installation is successful if the GCC version information is displayed (4.8.5 on CentOS and 7.3.0 on openEuler, according to the software requirements).

Step 7 Resolve the -fsigned-char problem by wrapping G++.
1. Locate the G++ executable. Generally, the path is /usr/bin/g++.
    command -v g++
2. Rename the original G++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
    vi /usr/bin/g++
   Add the following content to the file and save the file:
    #!/bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission for the G++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
   The installation is successful if the G++ version information is displayed (4.8.5 on CentOS and 7.3.0 on openEuler, according to the software requirements).

----End
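Steps 6 and 7 above perform the same rename-and-wrap sequence twice by hand. The sketch below shows one way to script it. The function name install_signed_char_wrapper is hypothetical (it is not part of GCC or of this guide), and the guard against wrapping twice is an added assumption.

```shell
#!/bin/sh
# Sketch: replace a compiler binary with a wrapper that always adds -fsigned-char.
# install_signed_char_wrapper is a hypothetical helper name.
install_signed_char_wrapper() {
    compiler="$1"                # for example /usr/bin/gcc or /usr/bin/g++
    impl="${compiler}-impl"      # renamed original, as in Steps 6 and 7

    # Refuse to run twice: a second run would rename the wrapper itself.
    if [ -e "$impl" ]; then
        echo "already wrapped: $compiler" >&2
        return 1
    fi

    mv "$compiler" "$impl" || return 1
    # Write the two-line wrapper script described in the procedure.
    printf '#!/bin/sh\n"%s" -fsigned-char "$@"\n' "$impl" > "$compiler" || return 1
    chmod +x "$compiler"
}

# Usage (as root), assuming the default locations named in the procedure:
#   install_signed_char_wrapper /usr/bin/gcc
#   install_signed_char_wrapper /usr/bin/g++
```

Keeping the original binary under the -impl name means the change can be undone by moving it back over the wrapper.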

Installing Dependencies

Install dependencies using the Yum source.
    yum -y install wget make

37.3.2 Installing OpenJDK

Step 1 Download the installation package and decompress it to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile
Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version
The installation is successful if the OpenJDK version information (1.8.0_252) is displayed.

----End

37.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v
The installation is successful if the Maven version information (Apache Maven 3.5.4) is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the <localRepository> tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. If you have built your own Maven repository, use it instead. Otherwise, configure the repository based on the following example:
    <mirror>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven/</url>
        <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:
    <proxies>
        <proxy>
            <id>optional</id>
            <active>true</active>
            <protocol>http</protocol>
            <username>Username</username>
            <password>Password</password>
            <host>Proxy server URL</host>
            <port>Proxy server port</port>
            <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
        </proxy>
    </proxies>

----End

37.3.4 Installing Thrift

Step 1 Download the Thrift package.
    wget http://archive.apache.org/dist/thrift/0.9.3/thrift-0.9.3.tar.gz

Step 2 Decompress the package.
    tar -xvf thrift-0.9.3.tar.gz

Step 3 Switch to the directory generated after the Thrift package is decompressed.
    cd thrift-0.9.3

Step 4 Add the execute permission for the configure file.
    chmod +x configure

Step 5 Install Thrift.
    ./configure --disable-gen-erl --disable-gen-hs --without-ruby --without-haskell --without-erlang
    make
    make install

Step 6 Check whether Thrift has been successfully installed.
    thrift -version

----End
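With the toolchain installed, the individual version checks in 37.3.2 to 37.3.4 can be gathered into one preflight pass. The following is a minimal sketch. The helper names has_version and check_tool are hypothetical, and the check is a plain substring match against the versions in the Software Requirements table.

```shell
#!/bin/sh
# has_version (hypothetical helper): succeeds when the expected version
# string appears anywhere in the captured tool output.
has_version() {
    expected="$1"
    output="$2"
    case "$output" in
        *"$expected"*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_tool (hypothetical helper): runs a version command and compares
# its output with the expected version.
check_tool() {
    name="$1"; expected="$2"; shift 2
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$name: not found on PATH"
        return 1
    fi
    # java prints its version on stderr, so capture both streams.
    output=$("$@" 2>&1)
    if has_version "$expected" "$output"; then
        echo "$name: OK ($expected)"
    else
        echo "$name: expected $expected"
        return 1
    fi
}

# Usage, with the versions from the Software Requirements table:
#   check_tool OpenJDK 1.8.0_252 java -version
#   check_tool Maven   3.5.4     mvn -v
#   check_tool Thrift  0.9.3     thrift -version
```

A substring match is deliberately loose. It tolerates vendor prefixes and build suffixes in the output but would also accept, for example, 13.5.4 when 3.5.4 is expected.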

37.4 Compiling Parquet-format

NOTE

This section uses CDH 6.3.2 as an example. The procedure for other versions is similar.

Prerequisites

You have downloaded and decompressed the Parquet-format-cdh6.3.2 source package.

Procedure

Step 1 Go to the Parquet-format-cdh6.3.2 source code directory.
    cd parquet-format-cdh6.3.2-release

Step 2 Modify the pom.xml file by adding the Kunpeng repository.
    vim pom.xml
Add a <repositories> tag to the pom.xml file and add the following repositories inside it. The Kunpeng Maven repository must be placed first.
    <repositories>
        <repository>
            <id>Kunpengmaven</id>
            <name>Kunpeng maven</name>
            <url>https://mirrors.huaweicloud.com/kunpeng/maven/</url>
        </repository>
        <repository>
            <id>huaweicloud.repo</id>
            <name>HuaweiCloud Repositories</name>
            <url>https://mirrors.huaweicloud.com/repository/maven</url>
        </repository>
        <repository>
            <id>wso2.repo</id>
            <url>http://maven.wso2.org/nexus/content/groups/wso2-public/</url>
            <name>wso2 Repositories</name>
        </repository>
    </repositories>

Step 3 Perform the compilation.
    mvn package apache-rat:check -Drat.numUnapprovedLicenses=1 -DskipTests
The compilation is successful if BUILD SUCCESS is displayed at the end of the Maven output.

The compilation result is parquet-format-2.4.0-cdh6.3.2.jar in the target/ directory.

Step 4 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that it does not contain x86 .so or .jar packages.

NOTE

If the generated package parquet-format-2.4.0-cdh6.3.2.jar contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
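The Kunpeng Porting Advisor remains the tool this guide prescribes for Step 4. As a quick manual pre-check, the target architecture of a native library can also be read from its ELF header: bytes 18 and 19 hold the e_machine field, which is 0x3E for x86-64, 0x03 for 32-bit x86, and 0xB7 for AArch64. The sketch below assumes the JAR has already been unpacked into a scratch directory. The function names and the /tmp/jar-check path are hypothetical, and nested JARs are not inspected.

```shell
#!/bin/sh
# elf_machine prints the two e_machine bytes of an ELF file as hex
# (header offset 18). It fails with no output if the file is not ELF.
elf_machine() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || return 1
    od -An -tx1 -j18 -N2 "$1" | tr -d ' \n'
}

# list_x86_libs prints every .so file under a directory that targets x86.
# 3e00 is EM_X86_64 and 0300 is EM_386 in little-endian byte order.
# AArch64 objects carry b700 and are not reported.
list_x86_libs() {
    find "$1" -type f -name '*.so*' | while IFS= read -r lib; do
        case "$(elf_machine "$lib")" in
            3e00|0300) echo "$lib" ;;
        esac
    done
}

# Usage (hypothetical scratch directory):
#   mkdir -p /tmp/jar-check
#   unzip -q target/parquet-format-2.4.0-cdh6.3.2.jar -d /tmp/jar-check
#   list_x86_libs /tmp/jar-check
```

An empty result only means that no x86 shared objects were found at the top level of the unpacked tree. It does not replace the full scan.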


38 Parquet-mr-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

38.1 Introduction
38.2 Environment Requirements
38.3 Configuring the Compilation Environment
38.4 Compiling Parquet-mr

38.1 Introduction

Parquet is a columnar storage format that supports nested data. As a subproject of Parquet, the Parquet-MR project provides the object model converters of Parquet, which map external object models to the internal data types of Parquet.

This document describes how to port the Parquet-MR component of CDH to TaiShan servers. For more information about CDH, visit https://www.cloudera.com/.

38.2 Environment Requirements

Hardware Requirements

Item             Remarks
Server           TaiShan server
CPU              Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition  No requirement for drive partitions
Network          Accessible to the Internet


Software Requirements

Item     Version
OpenJDK  1.8.0_252
Maven    3.5.4
Thrift   0.9.3

CentOS

Item       Version
CentOS     7.6
OS kernel  4.14.0
GCC        4.8.5

openEuler

Item       Version
openEuler  20.03 LTS SP1
OS kernel  4.19.90
GCC        7.3.0

38.3 Configuring the Compilation Environment

38.3.1 Installing Basic Libraries

Installing GCC

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to delete the files.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Add the following content to the file to configure the local Yum source:
    [Local]
    name=Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install GCC-related software.
    yum -y install gcc.aarch64 gcc-c++.aarch64 gcc-gfortran.aarch64 libgcc.aarch64

Step 6 Resolve the -fsigned-char problem by wrapping GCC. On AArch64, GCC treats plain char as unsigned by default, whereas on x86 it is signed. The wrapper adds -fsigned-char to every invocation so that code written for x86 behaves the same way.
1. Locate the GCC executable. Generally, the path is /usr/bin/gcc.
    command -v gcc
2. Rename the original GCC file, for example, to gcc-impl.
    mv /usr/bin/gcc /usr/bin/gcc-impl
3. Create a new GCC file.
    vi /usr/bin/gcc
   Add the following content to the file and save the file:
    #!/bin/sh
    /usr/bin/gcc-impl -fsigned-char "$@"
4. Add the execute permission for the GCC file.
    chmod +x /usr/bin/gcc
5. Check whether GCC is available.
    gcc --version
   The installation is successful if the GCC version information is displayed (4.8.5 on CentOS and 7.3.0 on openEuler, according to the software requirements).

Step 7 Resolve the -fsigned-char problem by wrapping G++.
1. Locate the G++ executable. Generally, the path is /usr/bin/g++.
    command -v g++
2. Rename the original G++ file, for example, to g++-impl.
    mv /usr/bin/g++ /usr/bin/g++-impl
3. Create a new G++ file.
    vi /usr/bin/g++
   Add the following content to the file and save the file:
    #!/bin/sh
    /usr/bin/g++-impl -fsigned-char "$@"
4. Add the execute permission for the G++ file.
    chmod +x /usr/bin/g++
5. Check whether G++ is available.
    g++ --version
   The installation is successful if the G++ version information is displayed (4.8.5 on CentOS and 7.3.0 on openEuler, according to the software requirements).

----End

Installing Dependencies

Install dependencies using the Yum source.
    yum -y install wget make

38.3.2 Installing OpenJDK

Step 1 Download the installation package and decompress it to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile
Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version
The installation is successful if the OpenJDK version information (1.8.0_252) is displayed.

----End

38.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v
The installation is successful if the Maven version information (Apache Maven 3.5.4) is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the <localRepository> tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. If you have built your own Maven repository, use it instead. Otherwise, configure the repository based on the following example:
    <mirror>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven/</url>
        <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:
    <proxies>
        <proxy>
            <id>optional</id>
            <active>true</active>
            <protocol>http</protocol>
            <username>Username</username>
            <password>Password</password>
            <host>Proxy server URL</host>
            <port>Proxy server port</port>
            <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
        </proxy>
    </proxies>

----End

38.3.4 Installing Protobuf

CentOS

Step 1 Install Protobuf.
    yum install -y protobuf protobuf-devel

Step 2 Check whether Protobuf is installed successfully.
    protoc --version
The installation is successful if the protoc version information is displayed.

Step 3 Deploy protoc in the local Maven repository.
    mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/bin/protoc

----End

openEuler

Step 1 Download and decompress the source code.
    wget https://github.com/protocolbuffers/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
    tar -zxf protobuf-2.5.0.tar.gz

Step 2 Move the decompressed directory to the /opt/tools/installed/ directory.
    mv protobuf-2.5.0 /opt/tools/installed/

Step 3 Go to the /opt/tools/installed/ directory.
    cd /opt/tools/installed

Step 4 Download the protoc.zip package to a directory of your choice (for example, /opt/tools/installed/), decompress it to obtain the protoc.patch file, and copy the patch file to the Protobuf source tree.
    wget https://mirrors.huaweicloud.com/kunpeng/archive/kunpeng_solution/bigdata/Patch/protoc.zip
    unzip protoc.zip
    cp ./protoc/protoc.patch ./protobuf-2.5.0/src/google/protobuf/stubs/

Step 5 Go to the protobuf-2.5.0/src/google/protobuf/stubs/ directory and apply the patch.
    cd protobuf-2.5.0/src/google/protobuf/stubs/
    patch -p1 < protoc.patch

Step 6 Go back to the root directory of protobuf-2.5.0, compile Protobuf, and install it to the default directory (/usr/local).
    cd /opt/tools/installed/protobuf-2.5.0
    ./autogen.sh && ./configure CFLAGS='-fsigned-char' && make -j8 && make install

Step 7 Deploy protoc in the local Maven repository.
    mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/usr/local/bin/protoc

----End

38.3.5 Installing Thrift

Step 1 Download the Thrift package.
    wget http://archive.apache.org/dist/thrift/0.9.3/thrift-0.9.3.tar.gz

Step 2 Decompress the package.
    tar -xvf thrift-0.9.3.tar.gz

Step 3 Switch to the directory generated after the Thrift package is decompressed.
    cd thrift-0.9.3

Step 4 Add the execute permission for the configure file.
    chmod +x configure

Step 5 Install Thrift.
    ./configure --disable-gen-erl --disable-gen-hs --without-ruby --without-haskell --without-erlang
    make
    make install

Step 6 Check whether Thrift has been successfully installed.
    thrift -version

----End
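Before moving on to compilation, it can be worth confirming that the protoc deployment from 38.3.4 landed where Maven will look for it. Maven stores an installed artifact under the local repository at a path derived from its coordinates. The sketch below computes that path. The helper name maven_artifact_path is hypothetical, and ~/.m2/repository is assumed to be the local repository (the default mentioned in 38.3.3).

```shell
#!/bin/sh
# maven_artifact_path (hypothetical helper) prints where Maven stores an
# artifact in a local repository:
#   <repo>/<groupId with dots as slashes>/<artifactId>/<version>/
#       <artifactId>-<version>[-<classifier>].<packaging>
maven_artifact_path() {
    repo="$1"; group="$2"; artifact="$3"; version="$4"; classifier="$5"; packaging="$6"
    group_dir=$(printf '%s' "$group" | tr '.' '/')
    file="$artifact-$version"
    if [ -n "$classifier" ]; then
        file="$file-$classifier"
    fi
    printf '%s/%s/%s/%s/%s.%s\n' "$repo" "$group_dir" "$artifact" "$version" "$file" "$packaging"
}

# Usage, with the coordinates passed to mvn install:install-file in 38.3.4:
#   ls -l "$(maven_artifact_path "$HOME/.m2/repository" \
#       com.google.protobuf protoc 2.5.0 linux-aarch_64 exe)"
```

If the localRepository setting in settings.xml was changed, pass that directory as the first argument instead.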

38.4 Compiling Parquet-mr

NOTE

This section uses CDH 6.3.2 as an example. The procedure for other versions is similar.

Prerequisites

You have downloaded and decompressed the Parquet-mr-cdh6.3.2 source package.

Procedure

Step 1 Go to the Parquet-mr-cdh6.3.2 source code directory.
    cd parquet-mr-cdh6.3.2-release

Step 2 Modify the pom.xml file by adding the Kunpeng repository.
    vim pom.xml
Add the Kunpeng Maven repository to the <repositories> tag. The Kunpeng Maven repository must be placed first.
    <repositories>
        <repository>
            <id>Kunpengmaven</id>
            <name>Kunpeng maven</name>
            <url>https://mirrors.huaweicloud.com/kunpeng/maven/</url>
        </repository>
        <repository>
            <id>huaweicloud.repo</id>
            <name>HuaweiCloud Repositories</name>
            <url>https://mirrors.huaweicloud.com/repository/maven</url>
        </repository>
        <repository>
            <id>wso2.repo</id>
            <url>http://maven.wso2.org/nexus/content/groups/wso2-public/</url>
            <name>wso2 Repositories</name>
        </repository>
        <repository>
            <id>pentaho-repo</id>
            <name>pentaho-repo</name>
            <url>https://public.nexus.pentaho.org/content/groups/omni/</url>
        </repository>
        <repository>
            <id>bsdn-repo</id>
            <name>bsdn Repositories</name>
            <url>http://nexus.bsdn.org/content/repositories/public/</url>
        </repository>
    </repositories>

Step 3 Perform the compilation.
    mvn package apache-rat:check -Drat.numUnapprovedLicenses=1 -DskipTests
The compilation is successful if BUILD SUCCESS is displayed at the end of the Maven output.

Parquet-MR is built as a set of JAR packages. Each compiled JAR package is stored in the target directory of the corresponding component directory.

Step 4 Use the Kunpeng Porting Advisor to scan the JAR packages in the target directory of each component and ensure that they do not contain x86 .so or .jar packages.

NOTE

If a compiled JAR package contains x86 .so or .jar packages, the component functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
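Because the build output is spread over one target directory per component, collecting the list of JAR packages to scan is a small task in itself. A minimal sketch follows. The function name list_built_jars is hypothetical, and it assumes the standard Maven layout in which each module writes its JAR directly into its own target/ directory.

```shell
#!/bin/sh
# list_built_jars (hypothetical helper) prints, in sorted order, every JAR
# that sits directly inside a module's target/ directory. JARs in deeper
# subdirectories such as target/classes/ are skipped.
list_built_jars() {
    find "${1:-.}" -type f -path '*/target/*.jar' ! -path '*/target/*/*' | sort
}

# Usage, from the source code directory used in Step 1:
#   list_built_jars . > jars-to-scan.txt
```

The resulting list can then be worked through with the Kunpeng Porting Advisor as described in Step 4.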


39 Pig-0.17.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

39.1 Introduction
39.2 Environment Requirements
39.3 Configuring the Compilation Environment
39.4 Compiling Pig
39.5 Troubleshooting

39.1 Introduction

Apache Pig is an open-source project of the Apache Software Foundation. Pig provides a high-level abstraction for processing large data sets. Expressing a data processing task directly in MapReduce often requires multiple MapReduce jobs, and it can be difficult to fit the processing logic into the MapReduce pattern. Pig provides richer data structures for such processing.

Pig Latin is a relatively simple language in which each statement is an operation. A statement works on a relation, which is similar to a table in a relational database (where a tuple represents a row, and each tuple consists of fields).

For details about Pig, visit https://pig.apache.org/.

39.2 Environment Requirements

Hardware Requirements

Item             Remarks
Server           TaiShan server
CPU              Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition  No requirement for drive partitions
Network          Accessible to the Internet

Software Requirements

Item     Version
Ant      1.9.4
Maven    3.5.4
JDK      1.8.0_252
Forrest  0.9

CentOS

Item       Version
CentOS     7.6
OS kernel  4.14.0
GCC        4.8.5

openEuler

Item       Version
openEuler  20.03 LTS SP1
OS kernel  4.19.90
GCC        7.3.0

39.3 Configuring the Compilation Environment

39.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.
    mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual ISO file name.

Step 2 Back up the Yum repo files and clear the /etc/yum.repos.d/ directory.
    cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
    rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. Enter y at each rm confirmation prompt to confirm the deletion.

Step 3 Create the /etc/yum.repos.d/Local.repo file.
    vi /etc/yum.repos.d/Local.repo
Add the following content to the file to configure the local Yum source:
    [Local]
    name=CentOS-7.6 Local
    baseurl=file:///media/
    enabled=1
    gpgcheck=0

Step 4 Make the Yum source configuration take effect.
    yum clean all
    yum makecache

Step 5 Use the Yum source to install related software.
    yum -y install wget patch openssl-devel zlib-devel automake libtool make cmake libstdc++-static glibc-static git

----End

39.3.2 Installing OpenJDK

Step 1 Download the installation package and decompress it to a directory (for example, /opt/tools/installed/).
    wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
    mkdir -p /opt/tools/installed/
    mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.
    vim /etc/profile
Add the following to the end of the file:
    export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
    export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.
    java -version
The installation is successful if the OpenJDK version information (1.8.0_252) is displayed.

----End

39.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
    tar -zxf apache-maven-3.5.4-bin.tar.gz
    mv apache-maven-3.5.4 /opt/tools/installed/

Step 2 Modify the Maven environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
    export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether Maven is successfully installed.
    mvn -v
The installation is successful if the Maven version information (Apache Maven 3.5.4) is displayed.

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml

NOTE

The default local repository directory is ~/.m2/. To use a different directory, modify the <localRepository> tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. If you have built your own Maven repository, use it instead. Otherwise, configure the repository based on the following example:
    <mirror>
        <id>huaweimaven</id>
        <name>huawei maven</name>
        <url>https://mirrors.huaweicloud.com/repository/maven/</url>
        <mirrorOf>central</mirrorOf>
    </mirror>

If the compilation environment cannot access the Internet directly, add the following proxy configuration to settings.xml:
    <proxies>
        <proxy>
            <id>optional</id>
            <active>true</active>
            <protocol>http</protocol>
            <username>Username</username>
            <password>Password</password>
            <host>Proxy server URL</host>
            <port>Proxy server port</port>
            <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
        </proxy>
    </proxies>

----End

39.3.4 Installing Ant

Step 1 Download the installation package and install Ant to a directory (for example, /opt/tools/installed/).
    wget https://archive.apache.org/dist/ant/binaries/apache-ant-1.9.4-bin.tar.gz
    tar -zxf apache-ant-1.9.4-bin.tar.gz
    mv apache-ant-1.9.4 /opt/tools/installed/

Step 2 Modify the environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export ANT_HOME=/opt/tools/installed/apache-ant-1.9.4
    export PATH=$ANT_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether the configuration takes effect.
    ant -version
The configuration takes effect if the following information is displayed:
    Apache Ant(TM) version 1.9.4 compiled on April 29 2014

----End

39.3.5 Installing Forrest

Step 1 Download the installation package and install Forrest to a directory (for example, /opt/tools/installed/).
    wget http://archive.apache.org/dist/forrest/0.9/apache-forrest-0.9.tar.gz
    tar -zxf apache-forrest-0.9.tar.gz
    mv apache-forrest-0.9 /opt/tools/installed/

Step 2 Modify the environment variables.
    vim /etc/profile
Add the following at the end of the /etc/profile file:
    export FORREST_HOME=/opt/tools/installed/apache-forrest-0.9
    export PATH=$FORREST_HOME/bin:$PATH

Step 3 Make the environment variables take effect.
    source /etc/profile

Step 4 Check whether the environment variables take effect.
    forrest -projecthelp

----End
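Sections 39.3.2 to 39.3.5 each append two export lines to /etc/profile by hand. If the procedure is repeated, the same lines accumulate. The following sketch appends a line only when it is not already present. The helper name append_export is hypothetical, and the match is on the exact line text.

```shell
#!/bin/sh
# append_export (hypothetical helper) adds "export NAME=value" to a profile
# file unless that exact line is already there.
append_export() {
    profile="$1"; name="$2"; value="$3"
    line="export $name=$value"
    if grep -qxF -- "$line" "$profile" 2>/dev/null; then
        return 0        # already configured, nothing to do
    fi
    printf '%s\n' "$line" >> "$profile"
}

# Usage (as root), with the paths from this section. Single quotes keep
# $ANT_HOME and $PATH unexpanded so they are written literally:
#   append_export /etc/profile ANT_HOME /opt/tools/installed/apache-ant-1.9.4
#   append_export /etc/profile PATH '$ANT_HOME/bin:$PATH'
#   append_export /etc/profile FORREST_HOME /opt/tools/installed/apache-forrest-0.9
#   append_export /etc/profile PATH '$FORREST_HOME/bin:$PATH'
```

After the file is updated, source /etc/profile is still needed for the current shell, as in Step 3.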

39.4 Compiling Pig

NOTE

This section uses CDH 6.3.2 as an example. The procedure for other versions is similar.

Prerequisites

You have downloaded and decompressed the Pig-cdh6.3.2 source package.

Procedure

Step 1 Go to the Pig-cdh6.3.2 source code directory.
    cd pig-cdh6.3.2-release

Step 2 Modify the build.xml file.
    vim build.xml
Change the mvnrepo repository source from http://repo2.maven.org/maven2 to https://mirrors.huaweicloud.com/repository/maven.

Step 3 Modify the contrib/piggybank/java/build.xml file.
    vim contrib/piggybank/java/build.xml
Change all occurrences of http://repo2.maven.org/maven2 to https://mirrors.huaweicloud.com/repository/maven.

Step 4 Modify the ivy/ivysettings.xml file.
    vim ivy/ivysettings.xml
Add the Maven repository source as follows:
1. Add the following information before line 32 in the file:

2. Add the following information before line 54:

3. Add the following information before lines 77 and 81:

4. Add the following information before line 102:

5. Comment out line 103.

Step 5 Perform the compilation.
    ant tar -Dforrest.home=/opt/tools/installed/apache-forrest-0.9 -Dhadoopversion=3

NOTE

The first compilation may fail. If the following error information is displayed, modify the build.properties file:
    [ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-staging-local/org/slf4j/slf4j-parent/1.5.11/slf4j-parent-1.5.11.jar
    [ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-staging-local/org/slf4j/slf4j-log4j12/1.5.11/slf4j-log4j12-1.5.11-javadoc.jar
    [ivy:resolve] SERVER ERROR: notresolvable url=http://maven.jenkins.cloudera.com:8081/artifactory/cdh-staging-local/org/slf4j/slf4j-api/1.5.11/slf4j-api-1.5.11-javadoc.jar
    [ivy:resolve]
    [ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS

    BUILD FAILED
Solution:
    vim build.properties
Set the following properties to the values shown:
    repository.root=https://repo1.maven.org/maven2

    # These override the settings in ivysettings.xml
    snapshots.cloudera.com=https://repository.cloudera.com/content/repositories/snapshots
    releases.cloudera.com=https://repository.cloudera.com/content/groups/cdh-releases-rcs

NOTICE

The compilation may take a long time.

The compilation is successful if BUILD SUCCESSFUL is displayed at the end of the Ant output.

After the compilation is successful, the compiled package pig-0.17.0-cdh6.3.2.tar.gz is stored in the build directory.

Step 6 Use the Kunpeng Porting Advisor to scan the package generated after the compilation and ensure that it does not contain x86 .so or .jar packages.

NOTE

If the compiled package pig-0.17.0-cdh6.3.2.tar.gz contains x86 .so or .jar packages, the Pig functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
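The repository URL change in Steps 2 and 3 is a plain text substitution, so it can be applied with sed instead of an editor. The sketch below is one way to do that. The helper name switch_maven_repo is hypothetical, and keeping a .bak copy next to each file is an added precaution rather than part of the procedure.

```shell
#!/bin/sh
# switch_maven_repo (hypothetical helper) rewrites one file in place,
# replacing every occurrence of the old repository URL with the mirror URL.
switch_maven_repo() {
    file="$1"
    old='http://repo2.maven.org/maven2'
    new='https://mirrors.huaweicloud.com/repository/maven'
    cp "$file" "$file.bak" || return 1    # keep a backup next to the original
    # "|" is used as the sed delimiter because both URLs contain "/".
    sed "s|$old|$new|g" "$file.bak" > "$file"
}

# Usage, from the Pig source code directory:
#   switch_maven_repo build.xml
#   switch_maven_repo contrib/piggybank/java/build.xml
```

The ivy/ivysettings.xml changes in Step 4 depend on line positions and are better made by hand.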

39.5 Troubleshooting

39.5.1 Failed to Download ivy-2.2.0.jar Due to Connection Timeout

Symptom

The ivy-2.2.0.jar file fails to be downloaded because the connection times out.

Solution 1

Manually download the ivy-2.2.0.jar package to the directory indicated in the error message.

Run the following command to download the file:
    wget https://repo1.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar

After the download is complete, go to the root directory of the source code and continue the compilation.

Solution 2

This failure is caused by the network. Retry the compilation when the network connection is stable.

39.5.2 "GC overhead limit exceeded" Reported During Compilation

Symptom

During compilation, "GC overhead limit exceeded" is displayed.

Solution

Run the following command before compilation:
    export MAVEN_OPTS="-Xmx10240m -XX:MaxMetaspaceSize=768m"


40 Search-1.0.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

40.1 Introduction
40.2 Environment Requirements
40.3 Configuring the Compilation Environment
40.4 Compiling Search
40.5 Troubleshooting

40.1 Introduction

Cloudera Search is a search service released by Cloudera based on the Apache open-source project Solr. It can be used for incremental data backup and data restoration. For more information about CDH, visit https://www.cloudera.com/.

40.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet


Software Requirements

Item        Version
JDK         1.8.0_252
Maven       3.5.4

CentOS

Item        Version
CentOS      7.6
OS kernel   4.14.0
GCC         4.8.5

openEuler

Item        Version
openEuler   20.03 LTS SP1
OS kernel   4.19.90
GCC         7.3.0

40.3 Configuring the Compilation Environment

40.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.

mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual name of the ISO image file.

Step 2 Back up the Yum repo files and clear the /etc/yum.repos.d/ directory.

cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to confirm each deletion.


Step 3 Create the /etc/yum.repos.d/Local.repo file.

vi /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:

[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.

yum clean all
yum makecache

Step 5 Use the Yum source to install related software.

yum -y install wget vim
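Step 3 can also be performed without opening an editor, which is convenient when several build hosts are prepared in the same way. The following is a minimal sketch; write_local_repo is a hypothetical helper name, and the file content is the one given in Step 3.

```shell
#!/usr/bin/env bash
# Write a local Yum repo definition without opening an editor.
# write_local_repo is a hypothetical helper name used for illustration.
write_local_repo() {
  local repo_file=$1 mount_point=$2
  # mount_point is expected to be an absolute path such as /media/,
  # so that file:// plus the path yields file:///media/.
  cat > "$repo_file" <<EOF
[Local]
name=Local
baseurl=file://${mount_point}
enabled=1
gpgcheck=0
EOF
}

# Example usage, matching Steps 3 and 4:
#   write_local_repo /etc/yum.repos.d/Local.repo /media/
#   yum clean all && yum makecache
```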

----End

40.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).

wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.

vim /etc/profile

Add the following to the end of the file:

export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.

source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.

java -version

The installation is successful if information similar to the following is displayed:
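The version line can also be checked in a script. The following is a minimal sketch that extracts the quoted version string from the first line of the java -version output; parse_java_version is a hypothetical helper name, and the sketch assumes that the first output line has the form openjdk version "1.8.0_252".

```shell
#!/usr/bin/env bash
# Extract the quoted version string from the first line of `java -version`.
# parse_java_version is a hypothetical helper name used for illustration.
parse_java_version() {
  # Input is read from stdin, for example: openjdk version "1.8.0_252"
  sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# Example usage (java -version prints to stderr, hence 2>&1):
#   actual=$(java -version 2>&1 | parse_java_version)
#   if [ "$actual" = "1.8.0_252" ]; then
#     echo "JDK version matches"
#   else
#     echo "unexpected JDK version: $actual" >&2
#   fi
```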

----End

40.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).

wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mv apache-maven-3.5.4 /opt/tools/installed/


Step 2 Modify the Maven environment variables.

vim /etc/profile

Add the following at the end of the /etc/profile file:

export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.

source /etc/profile

Step 4 Check whether Maven is successfully installed.

mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. If you want to change the directory to a specified one, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. Replace the URL with that of the Maven repository you have built. If you have not built one, configure it based on the following example:

<mirror>
    <id>huaweimaven</id>
    <name>huawei maven</name>
    <url>https://mirrors.huaweicloud.com/repository/maven/</url>
    <mirrorOf>central</mirrorOf>
</mirror>

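If the compilation environment cannot access the Internet directly, add the following proxy configuration inside the <proxies> tag of settings.xml:

<proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>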

----End
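If another Maven installation is already present on the host, mvn -v may report that one instead of the version installed above. The following is a minimal sketch for checking which binary the shell resolves; tool_resolves_to is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeed only if the given tool is resolved from the expected directory.
# tool_resolves_to is a hypothetical helper name used for illustration.
tool_resolves_to() {
  local tool=$1 expected_dir=${2%/}
  local resolved
  # command -v prints the path the shell would execute for this tool.
  resolved=$(command -v "$tool") || return 1
  [ "$(dirname "$resolved")" = "$expected_dir" ]
}

# Example usage after `source /etc/profile`:
#   tool_resolves_to mvn "$MAVEN_HOME/bin" \
#     || echo "mvn is not taken from MAVEN_HOME" >&2
```

The same check can be applied to java with "$JAVA_HOME/bin" as the expected directory.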


40.4 Compiling Search

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Search-1.0.0-cdh6.3.2 source package.

Procedure

Step 1 Go to the Search-1.0.0-cdh6.3.2 source code directory.

cd search-cdh6.3.2-release

Step 2 Modify the pom.xml file to configure the Kunpeng repository.

vim pom.xml

Add the Kunpeng Maven repository to the repositories tag. The Kunpeng Maven repository must be placed first.

<repository>
    <id>Kunpeng.repo</id>
    <url>https://mirrors.huaweicloud.com/kunpeng/maven/</url>
    <name>Kunpeng Repositories</name>
</repository>
<repository>
    <id>huaweicloud.repo</id>
    <name>HuaweiCloud Repositories</name>
    <url>https://mirrors.huaweicloud.com/repository/maven</url>
</repository>
<repository>
    <id>wso2.repo</id>
    <url>http://maven.wso2.org/nexus/content/groups/wso2-public/</url>
    <name>wso2 Repositories</name>
</repository>

Step 3 Perform compilation.

mvn package -DskipTests -Dtar -Ddist

The operation is successful if the following information is displayed:


The compilation result is cloudera-search-1.0.0-cdh6.3.2-search-dist.tar.gz in search-dist/target/.

Step 4 Use the Kunpeng Porting Advisor to scan the .tar package generated after compilation and ensure that the .tar package has no x86 .so or .jar package.

NOTE

The compiled cloudera-search-1.0.0-cdh6.3.2-search-dist.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled package contains x86 .so or .jar packages, the Search functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
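To illustrate what "x86 .so" means in practice, the architecture of a native library can be read from its ELF header. The following is a rough sketch, not a replacement for the Kunpeng Porting Advisor: elf_machine is a hypothetical helper name, it only inspects the low byte of the e_machine field at offset 18, and it assumes a little-endian ELF file.

```shell
#!/usr/bin/env bash
# Report the target architecture recorded in an ELF file header.
# elf_machine is a hypothetical helper name used for illustration.
elf_machine() {
  local file=$1 magic code
  # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
  magic=$(od -An -tx1 -N4 "$file" | tr -d ' \n')
  if [ "$magic" != "7f454c46" ]; then
    echo "not-elf"
    return 0
  fi
  # Low byte of e_machine: 3 is x86, 62 is x86-64, 183 is AArch64.
  code=$(od -An -tu1 -j18 -N1 "$file" | tr -d ' \n')
  case "$code" in
    3)   echo "x86" ;;
    62)  echo "x86_64" ;;
    183) echo "aarch64" ;;
    *)   echo "other($code)" ;;
  esac
}

# Example usage on an extracted package directory (the directory name
# "extracted" is a placeholder):
#   find extracted -name '*.so*' -type f | while read -r lib; do
#     printf '%s\t%s\n' "$(elf_machine "$lib")" "$lib"
#   done
```

Native libraries embedded inside .jar files are not covered by this sketch, because a .jar file would have to be unpacked first.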

40.5 Troubleshooting

40.5.1 "java.lang.OutOfMemoryError: PermGen space" Reported During Compilation

Symptom During compilation, "java.lang.OutOfMemoryError: PermGen space" is displayed.

Procedure Before compilation, run the following command or add the following command to the end of the /etc/profile file:

export MAVEN_OPTS="-Xmx10240m -XX:MaxMetaspaceSize=768m"
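When the export line is added to /etc/profile by a script rather than by hand, it is worth making sure that repeated runs do not append it more than once. The following is a minimal sketch; append_once is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper name used for illustration.
append_once() {
  local file=$1 line=$2
  # -q quiet, -x whole-line match, -F fixed string (no regex interpretation).
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example usage:
#   append_once /etc/profile 'export MAVEN_OPTS="-Xmx10240m -XX:MaxMetaspaceSize=768m"'
#   source /etc/profile
```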


41 Sentry-2.1.0-cdh6.3.2 Porting Guide (CentOS 7.6 & openEuler 20.03)

41.1 Introduction
41.2 Environment Requirements
41.3 Configuring the Compilation Environment
41.4 Compiling Sentry

41.1 Introduction

Sentry is a fine-grained role-based authorization component in Hadoop. Sentry provides access control for authenticated users and applications in a Hadoop cluster. This document describes how to adapt Sentry components in CDH to TaiShan servers. For more information about CDH, visit https://www.cloudera.com/.

41.2 Environment Requirements

Hardware Requirements

Item              Remarks
Server            TaiShan server
CPU               Huawei Kunpeng 920 processor or Huawei Kunpeng 916 processor
Drive partition   No requirement for drive partitions
Network           Accessible to the Internet


Software Requirements

Item        Version
JDK         1.8.0_252
Maven       3.5.4

CentOS

Item        Version
CentOS      7.6
OS kernel   4.14.0
GCC         4.8.5

openEuler

Item        Version
openEuler   20.03 LTS SP1
OS kernel   4.19.90
GCC         7.3.0

41.3 Configuring the Compilation Environment

41.3.1 Configuring the Local Yum Source

Step 1 Mount the OS image.

mount YOUR_OS.iso /media -o loop

NOTE

Replace YOUR_OS.iso with the actual name of the ISO image file.

Step 2 Back up the Yum repo files and clear the /etc/yum.repos.d/ directory.

cp -r /etc/yum.repos.d /etc/yum.repos.d-bak
rm /etc/yum.repos.d/*

NOTICE

Ensure that all repo files have been backed up. When rm prompts for confirmation, enter y to confirm each deletion.


Step 3 Create the /etc/yum.repos.d/Local.repo file.

vi /etc/yum.repos.d/Local.repo

Add the following content to the file to configure the local Yum source:

[Local]
name=Local
baseurl=file:///media/
enabled=1
gpgcheck=0

Step 4 Make the Yum source configuration take effect.

yum clean all
yum makecache

Step 5 Use the Yum source to install related software.

yum -y install wget vim
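If yum makecache or yum install fails, a common cause is that the image from Step 1 is not mounted where the baseurl points. The following is a minimal sketch of such a check. repo_root_ok is a hypothetical helper name, and the sketch assumes that the mounted image keeps its repository metadata in a top-level repodata/ directory.

```shell
#!/usr/bin/env bash
# Check that a directory looks like a usable Yum repository root.
# repo_root_ok is a hypothetical helper name used for illustration.
# Assumption: the repository metadata index is <dir>/repodata/repomd.xml.
repo_root_ok() {
  local dir=${1%/}
  [ -f "$dir/repodata/repomd.xml" ]
}

# Example usage:
#   repo_root_ok /media \
#     || echo "/media has no repodata/repomd.xml; is the ISO mounted?" >&2
```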

----End

41.3.2 Installing OpenJDK

Step 1 Download and decompress the installation package to a directory (for example, /opt/tools/installed/).

wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u252-b09/OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
tar -zxf OpenJDK8U-jdk_aarch64_linux_hotspot_8u252b09.tar.gz
mkdir -p /opt/tools/installed/
mv jdk8u252-b09 /opt/tools/installed/

Step 2 Configure Java environment variables.

vim /etc/profile

Add the following to the end of the file:

export JAVA_HOME=/opt/tools/installed/jdk8u252-b09
export PATH=$JAVA_HOME/bin:$PATH

Step 3 Make the environment variables take effect.

source /etc/profile

Step 4 Check whether OpenJDK is successfully installed.

java -version

The installation is successful if information similar to the following is displayed:

----End

41.3.3 Installing Maven

Step 1 Download the installation package and install Maven to a directory (for example, /opt/tools/installed/).

wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar -zxf apache-maven-3.5.4-bin.tar.gz
mv apache-maven-3.5.4 /opt/tools/installed/


Step 2 Modify the Maven environment variables.

vim /etc/profile

Add the following at the end of the /etc/profile file:

export MAVEN_HOME=/opt/tools/installed/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH

Step 3 Make the environment variables take effect.

source /etc/profile

Step 4 Check whether Maven is successfully installed.

mvn -v

The installation is successful if information similar to the following is displayed:

Step 5 Modify the local repository path and remote repository in the Maven configuration file.

Configuration file path: /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml.

NOTE

The default local repository directory is ~/.m2/. If you want to change the directory to a specified one, modify the localRepository tag. You do not need to modify this parameter unless otherwise specified.

Add the following content to the <mirrors> tag to configure the remote repository. Replace the URL with that of the Maven repository you have built. If you have not built one, configure it based on the following example:

<mirror>
    <id>huaweimaven</id>
    <name>huawei maven</name>
    <url>https://mirrors.huaweicloud.com/repository/maven/</url>
    <mirrorOf>central</mirrorOf>
</mirror>

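If the compilation environment cannot access the Internet directly, add the following proxy configuration inside the <proxies> tag of settings.xml:

<proxy>
    <id>optional</id>
    <active>true</active>
    <protocol>http</protocol>
    <username>Username</username>
    <password>Password</password>
    <host>Proxy server URL</host>
    <port>Proxy server port</port>
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>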

----End
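After editing settings.xml in Step 5, a simple text search can confirm that the mirror URL was saved to the file that Maven reads. The following is a minimal sketch; settings_has_url is a hypothetical helper name, and the check does not detect whether the matching element sits inside an XML comment.

```shell
#!/usr/bin/env bash
# Succeed if the settings file contains a <url> element with the given value.
# settings_has_url is a hypothetical helper name used for illustration.
# Limitation: a match inside an XML comment is also reported as present.
settings_has_url() {
  local settings=$1 url=$2
  grep -qF -- "<url>${url}</url>" "$settings"
}

# Example usage:
#   settings_has_url /opt/tools/installed/apache-maven-3.5.4/conf/settings.xml \
#     https://mirrors.huaweicloud.com/repository/maven/ \
#     || echo "mirror URL not found in settings.xml" >&2
```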


41.4 Compiling Sentry

NOTE

This section explains the compilation for CDH 6.3.2. Refer to this section when compiling for another version.

Prerequisites

You have downloaded and decompressed the Sentry-2.1.0-cdh6.3.2 source package.

Procedure

Step 1 Go to the Sentry-2.1.0-cdh6.3.2 source code directory.

cd sentry-cdh6.3.2-release

Step 2 Configure the Kunpeng repository and modify the pom.xml file.

vim pom.xml

1. Add the Kunpeng Maven repository to the repositories tag. The Kunpeng Maven repository must be placed first.

<repository>
    <id>Kunpeng.repo</id>
    <url>https://mirrors.huaweicloud.com/kunpeng/maven/</url>
    <name>Kunpeng Repositories</name>
</repository>
<repository>
    <id>huaweicloud.repo</id>
    <name>HuaweiCloud Repositories</name>
    <url>https://mirrors.huaweicloud.com/repository/maven</url>
</repository>
<repository>
    <id>wso2.repo</id>
    <url>http://maven.wso2.org/nexus/content/groups/wso2-public/</url>
    <name>wso2 Repositories</name>
</repository>

2. Change the repository whose ID is apache to the Cloudera repository.

<repository>
    <id>cloudera.repo</id>
    <url>https://repository.cloudera.com/cloudera/cloudera-repos/</url>
</repository>

3. Comment out a repository that cannot be accessed.


Step 3 Modify the sentry-tests/pom.xml file.

vim sentry-tests/pom.xml

Change the value of activeByDefault in the profile whose ID is hive-authz1 to false so that the profile is disabled by default.

<activeByDefault>false</activeByDefault>

Step 4 Perform compilation.

mvn package -DskipTests

The operation is successful if the following information is displayed:


The compilation result is apache-sentry-2.1.0-cdh6.3.2-bin.tar.gz in sentry-dist/target/.

Step 5 Use the Kunpeng Porting Advisor to scan the .tar package generated after compilation and ensure that the .tar package has no x86 .so or .jar package.


NOTE

The compiled apache-sentry-2.1.0-cdh6.3.2-bin.tar.gz package must be scanned by using the Kunpeng Porting Advisor to ensure that no x86 .so or .jar packages are contained. If the compiled package contains x86 .so or .jar packages, the Sentry functions may be affected. For details about how to use the Kunpeng Porting Advisor, see the Kunpeng Porting Advisor Case Study.

----End
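The pom.xml edits in Steps 2 and 3 can be reviewed from the command line before a long build is started. The following is a minimal sketch with two hypothetical helpers: list_pom_urls prints every <url> value in a POM (including any that do not belong to repositories), and profile_active_by_default prints the activeByDefault value of a profile, assuming that this element appears within 10 lines after the profile's <id> element.

```shell
#!/usr/bin/env bash
# Both function names are hypothetical helpers used for illustration.

# Print every <url> value found in a POM, one per line.
list_pom_urls() {
  grep -o '<url>[^<]*</url>' "$1" | sed -e 's/<url>//' -e 's/<\/url>//'
}

# Print the activeByDefault value that follows a given profile id.
# Assumption: <activeByDefault> appears within 10 lines after the <id> line.
profile_active_by_default() {
  local pom=$1 profile_id=$2
  grep -A10 -F "<id>${profile_id}</id>" "$pom" \
    | sed -n 's/.*<activeByDefault>\([^<]*\)<\/activeByDefault>.*/\1/p' \
    | head -n1
}

# Example usage from the source code directory:
#   for u in $(list_pom_urls pom.xml); do
#     wget --spider -q -T 10 -t 1 "$u" || echo "probe failed: $u"
#   done
#   profile_active_by_default sentry-tests/pom.xml hive-authz1
```

A failed probe is only a hint, because some repository servers reject requests for the repository root even when artifacts can be downloaded.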


A Change History

Date Description

2021-07-13 This issue is the eighth official release. Added the adaptation to openEuler 20.03 in the porting guides of CDH 5.13.3, CDH 6.3.0, and CDH 6.3.2.

2020-12-30 This issue is the seventh official release. Changed "Kunpeng Code Scanner" to "Kunpeng Porting Advisor".

2020-09-30 This issue is the sixth official release. Added the porting guides of CDH 6.3.2 components.

2020-08-24 This issue is the fifth official release. Deleted "Compiling Dependency Libraries".

2020-06-23 This issue is the fourth official release. Added the porting guides of CDH 5.12.1 and CDH 6.3.2 components.

2020-06-20 This issue is the third official release. Modified the procedure for installing Scala in the Spark-1.6.0-cdh5.13.3 Porting Guide (CentOS 7.6).



2020-05-23 This issue is the second official release.
● Modified step 4 in 1.3.3 Installing Maven in the Avro-1.8.2-cdh6.3.0 Porting Guide (CentOS 7.6).
● Deleted "Reference" from the Avro-1.8.2-cdh6.3.0 Porting Guide (CentOS 7.6) and added the related information to 1.1 Introduction.
● Deleted "Compiling Dependency Libraries" from the Avro-1.8.2-cdh6.3.0 Porting Guide (CentOS 7.6).
● Deleted "Reference" from the Flume-ng-cdh6.3.0 Porting Guide (CentOS 7.6) and added the related information to 2.1 Introduction.
● Deleted "Compiling Dependency Libraries" from the Flume-ng-cdh6.3.0 Porting Guide (CentOS 7.6).
● Modified step 4 in 7.3.3 Installing Maven in the Hive-1.1.0-cdh-5.13.3 Porting Guide (CentOS 7.6).
● Deleted "Reference" from the Hive-1.1.0-cdh-5.13.3 Porting Guide (CentOS 7.6) and added the related information to 7.1 Introduction.
● Deleted "Reference" from the Parquet-format-2.4.0-cdh6.3.0 Porting Guide (CentOS 7.6) and added the related information to 3.1 Introduction.
● Deleted "Compiling Dependency Libraries" from the Parquet-format-2.4.0-cdh6.3.0 Porting Guide (CentOS 7.6).
● Deleted "Reference" from the Parquet-mr-cdh6.3.0 Porting Guide (CentOS 7.6) and added the related information to 4.1 Introduction.
● Deleted "Compiling Dependency Libraries" from the Parquet-mr-cdh6.3.0 Porting Guide (CentOS 7.6).
● Deleted "Reference" from the Sentry-2.1.0-cdh6.3.0 Porting Guide (CentOS 7.6) and added the related information to 5.1 Introduction.
● Deleted "Reference" from the Solr-7.4.0-cdh6.3.0 Porting Guide (CentOS 7.6) and added the related information to 6.1 Introduction.
● Deleted "Compiling Dependency Libraries" from the Solr-7.4.0-cdh6.3.0 Porting Guide (CentOS 7.6).
● Modified step 2 in "Installing Scala" in the Spark-1.6.0-cdh5.13.3 Porting Guide (CentOS 7.6).
● Deleted "Reference" from the Spark-1.6.0-cdh5.13.3 Porting Guide (CentOS 7.6) and added the related information to 8.1 Introduction.

2020-03-20 This issue is the first official release.
