Urika®-GX Analytic Applications Guide
Total Page:16
File Type:pdf, Size:1020Kb
Urika®-GX Analytic Applications Guide (2.2.UP00) S-3015 Contents Contents 1 About the Urika®-GX Analytic Applications Guide ................................................................................................. 4 2 Analytic Software Stack Components.....................................................................................................................6 3 Urika-GX Service Modes........................................................................................................................................ 7 4 Access Urika-GX Applications.............................................................................................................................. 10 4.1 Disable Framing on Urika Applications Interface (UAI)........................................................................... 11 5 Authentication Mechanisms..................................................................................................................................13 6 Apache Hadoop Support...................................................................................................................................... 15 6.1 Load Data into the Hadoop Distributed File System (HDFS).................................................................. 16 6.2 Run a Simple Hadoop Job.......................................................................................................................17 6.3 Run a Simple Word Count Application Using Hadoop.............................................................................18 6.4 Monitor Hadoop Applications...................................................................................................................19 6.5 Use Tiered Storage on Urika-GX.............................................................................................................20 6.6 Assign the HDFS/ptmp Directory to Use SSDs for Block Storage.......................................................... 22 6.7 Change the Default HDFS Storage Policy...............................................................................................22 7 Apache Spark Support..........................................................................................................................................24 7.1 Monitor Spark Applications......................................................................................................................26 7.2 Remove Temporary Spark Files from SSDs............................................................................................28 7.3 Obtain Additional Temporary Space for Running Spark Jobs................................................................. 28 7.4 Enable Anaconda Python and the Conda Environment Manager........................................................... 29 7.5 Provide Kerberos Credentials to Spark................................................................................................... 30 7.6 Redirect a Spark Job to a Specific Directory........................................................................................... 31 7.7 Modify the Default Number of Maximum Spark Cores............................................................................ 31 7.8 Execute Spark Jobs on Kubernetes........................................................................................................ 33 7.9 Multi-tenant Spark Thrift Server on Urika-GX..........................................................................................35 8 Use Apache Mesos on Urika-GX .........................................................................................................................38 8.1 Access the Apache Mesos Web UI......................................................................................................... 40 8.2 Use mrun to Retrieve Information About Marathon and Mesos Frameworks..........................................40 8.3 Clean Up Residual mrun Jobs.................................................................................................................44 8.4 Launch an HPC Job Using mrun............................................................................................................. 45 8.5 Manage Resources on Urika-GX.............................................................................................................46 8.6 Manage Long Running Services Using Marathon................................................................................... 48 8.7 Flex up a YARN sub-cluster on Urika-GX................................................................................................51 9 Access the Jupyter Notebook UI.......................................................................................................................... 53 9.1 Create a Jupyter Notebook......................................................................................................................54 9.2 Share or Upload a Jupyter Notebook...................................................................................................... 56 S3015 2 Contents 9.3 Create a Custom Python Based Kernel for JupyterHub.......................................................................... 59 10 Get Started with Using Grafana..........................................................................................................................60 10.1 Urika-GX Performance Analysis Tools.................................................................................................. 62 10.2 Update the InfluxDB Data Retention Policy...........................................................................................62 11 Use Docker on Urika-GX.................................................................................................................................... 64 11.1 Image Management with Docker and Kubernetes................................................................................ 65 11.2 Run the Native Docker Engine on Marathon......................................................................................... 66 12 Start Individual Kafka Brokers............................................................................................................................ 68 13 Overview of the Cray Application Management UI............................................................................................. 69 14 Update the InfluxDB Data Retention Policy........................................................................................................ 71 15 Manage the Spark Thrift Server as a Non-Admin User...................................................................................... 73 16 Use Tableau® with Urika-GX...............................................................................................................................74 16.1 Connect Tableau to HiveServer2 Using LDAP...................................................................................... 74 16.2 Connect Tableau to HiveServer2 Securely............................................................................................78 16.3 Connect Tableau to the Spark Thrift Server ......................................................................................... 81 16.4 Connect Tableau to the Spark Thrift Server Securely........................................................................... 85 16.5 Connect Tableau to Apache Spark Thrift Server on a VM.....................................................................89 16.6 Enable SSL for Spark Thrift Server of a Tenant.................................................................................... 92 17 File Systems....................................................................................................................................................... 94 18 Check the Current Service Mode........................................................................................................................95 19 Fault Tolerance on Urika-GX.............................................................................................................................. 96 20 Default Urika-GX Configurations........................................................................................................................ 97 20.1 Default Grafana Dashboards.................................................................................................................99 20.2 Performance Metrics Collected on Urika-GX.......................................................................................110 20.3 Default Log Settings............................................................................................................................ 114 20.4 Tunable Hadoop and Spark Configuration Parameters.......................................................................115 20.5 Node Types......................................................................................................................................... 117 20.6 Service to Node Mapping.................................................................................................................... 118 20.7 Port Assignments................................................................................................................................ 121 20.8 Major Software Components Versions................................................................................................ 124 21 Troubleshooting...............................................................................................................................................