Progress DataDirect® for JDBC for Apache Hive™ Driver Quick Start

Release 6.0.1

Quick Start: Progress DataDirect for JDBC for Apache Hive Driver

This quick start provides basic information that allows you to install, test, and tune your driver. To take full advantage of the features and functionality available for your driver, refer to the Progress DataDirect Documentation Library. If you are already familiar with DataDirect drivers, you can begin using the driver immediately with the following information:

• Driver and data source classes
• Connection URL format

Note: OEM CUSTOMERS: Refer to the Progress DataDirect for JDBC Drivers Distribution Guide for information on installing, branding, unlocking, and distributing your branded drivers.

This quick start covers the following topics:

• Before You Start
• Requirements and Support
• Downloading the Driver
• Installing the JDBC Driver
• Setting the Classpath
• Data Source and Driver Classes
• Using Connection Properties
• Connecting to a DataSource
• Tuning for Performance
• Troubleshooting Setup/Connection Issues
• Additional Resources

Before You Start

Before you get started, you need the following:

• Appropriate user permissions to modify your environment and to read, write, and execute various files in the DataDirect for JDBC installation directory.
• Connection information:
  • Database Name: The name of the Apache Hive database to which you want to connect by default.
  • Server Name: The IP address or the server name (if your network supports named servers) of the primary database server.
  • Port Number: The port number of the server listener.
  • User Name and Password: If required by your environment, the user name and password that are used to connect to the Apache Hive instance.
  Additional information required for HTTP connections:
  • HTTP Path: The path of the HTTP/HTTPS endpoint used for connections.
• For licensed installations, you will also need the following information that was provided by Progress DataDirect:
  • IPE Key (control number)
  • Serial Number

Requirements and Support

Software Requirements:

• Java SE 6 or higher

Supported Data Sources:

• Apache Hive version 1.0, 2.0, and higher against the following distributions:
  • Amazon Elastic MapReduce (Amazon EMR), version 4.0 and higher
  • Cloudera's Distribution Including Apache Hadoop (CDH), version 5.4 and higher
  • Hortonworks Data Platform (HDP), version 2.3 and higher
  • IBM BigInsights, version 4.1 and higher
  • MapR Distribution for Apache Hadoop, version 5.2 and higher
  • Pivotal HD Enterprise (PHD), version 2.0 and higher

For the latest information on supported data sources, visit the Progress DataDirect Supported Configurations page.

Downloading the Driver

To download the JDBC Apache Hive driver:

1. Visit the Progress DataDirect Connectors Download page.
2. Select your Hive data source from the list.
3. Select JDBC for the interface.
4. When prompted, select your operating system and architecture.
5. Fill in the registration form with your contact information.
6. Review the End User License Agreement. If you agree, select the corresponding box; then, click Download.

The installer program has been downloaded. See "Installing the JDBC Driver" for detailed instructions.

See also Installing the JDBC Driver

Installing the JDBC Driver

This section provides instructions for installing your downloaded files using the GUI installer.

Note: OEM CUSTOMERS: Refer to the Progress DataDirect for JDBC Drivers Distribution Guide for information on installing, branding, unlocking, and distributing your branded drivers.

Note: Make sure that the Java Virtual Machine (JVM) is defined on your path. Java SE 6 or higher is required to use the drivers.

1. Unzip the files to a temporary directory, maintaining the directory structure of the zip file. After extracting the files, the temporary directory should have the following structure:

   Windows:
   PROGRESS_DATADIRECT_JDBC_INSTALL.exe
   PROGRESS_DATADIRECT_JDBC_COMMON_n.n.n_INSTALL.iam.zip
   PROGRESS_DATADIRECT_JDBC_DOCUMENTATION_n.n.n_INSTALL.iam.zip
   PROGRESS_DATADIRECT_JDBC_HIVE_n.n.n_INSTALL.iam.zip

   Non-Windows:
   PROGRESS_DATADIRECT_JDBC_INSTALL.jar
   PROGRESS_DATADIRECT_JDBC_COMMON_n.n.n_INSTALL.iam.zip
   PROGRESS_DATADIRECT_JDBC_DOCUMENTATION_n.n.n_INSTALL.iam.zip
   PROGRESS_DATADIRECT_JDBC_HIVE_n.n.n_INSTALL.iam.zip

2. From the installer directory, run the appropriate installer file to start the installer.

• Windows: PROGRESS_DATADIRECT_JDBC_INSTALL.exe
• Non-Windows: PROGRESS_DATADIRECT_JDBC_INSTALL.jar

Important: The Java installer can be run on most platforms, including Windows; however, if you run the Java installer on Windows, turn off User Account Controls or select a non-system directory as the installation directory. The Windows installer allows you to install the driver in the Program Files system directory on Windows without turning off User Account Controls.

3. The Introduction window appears. Click Next.

4. The License Agreement window appears. Make sure that you read and understand the license agreement. To continue with the installation, select the I accept the terms in the License Agreement option; then, click Next.

5. The Install Directory window appears. In the Where Would You Like to Install? field, type the path, including the drive letter, of the product installation directory or click the Choose button to browse to and select an installation directory. Verify the installation directory. Click Next to continue.

6. Choose the type of installation to perform. Select one of the following options.

• Evaluation installation (will expire in 15 days). Select this option to install evaluation versions of all available drivers. Click Next to continue with the installation. Skip to Step 10.
• OEM or Licensed installation. Select this option if you have purchased a licensed version of one or multiple drivers. Click Next. If you are updating a currently installed driver, skip to Step 8; otherwise, proceed to the next step.

7. Type the IPE key (also known as the Control Number) that was provided by Progress DataDirect in the IPE Key field, and click the Validate button. You can add multiple keys consecutively. A tree menu of drivers with valid licenses appears in the selection box. For example, the following image demonstrates an installation.

8. From the tree menu, select the drivers that you want to install. Click Next to continue. Drivers that are already installed will be listed under Drivers (Installed) and cannot be deselected. To remove installed drivers, you must uninstall the product. If you are installing a new version of a currently installed driver, the installer will overwrite the installed driver files with the newer version. To revert to an earlier version of the driver, you will need to uninstall the product and reinstall the desired version. For information on uninstalling drivers, see Uninstalling the Product.

9. Enter name, company, and serial number in the fields provided. Click Next to continue.
   a) Type your name and company name into the corresponding fields.
   b) Type the serial number that was provided by Progress DataDirect.
   c) Verify that the I want to participate box is selected to allow the installer program to gather data for the Installer Customer Experience Improvement Program. Information collected for the program is used to improve our products by identifying trends or issues that impact the user experience. For details, refer to the Progress Privacy Policy.

10. The Pre-Installation Summary window appears. Review the installation information. Click Previous to revise selections; or click Install to begin the installation.

11. When the installation finishes, the Install Complete window appears. Click Done to exit the installer program.

Setting the Classpath

The driver must be defined on your CLASSPATH before you can connect. The CLASSPATH is the search string your Java Virtual Machine (JVM) uses to locate JDBC drivers on your computer. If the driver is not defined on your CLASSPATH, you will receive a class not found exception when trying to load the driver. Set your system CLASSPATH to include the hive.jar file as shown, where install_dir is the path to your product installation directory.

install_dir/lib/hive.jar

Windows Example

CLASSPATH=.;C:\Program Files\Progress\DataDirect\JDBC_60\lib\hive.jar

UNIX Example

CLASSPATH=.:/opt/Progress/DataDirect/JDBC_60/lib/hive.jar
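One way to confirm the CLASSPATH setting before attempting a connection is to ask the JVM whether it can locate the driver class. The following is a minimal sketch, not part of the product; the ClasspathCheck class and its isOnClasspath method are hypothetical names introduced for illustration. The driver class name is the one listed in "Data Source and Driver Classes."

```java
// Hypothetical helper, not part of the driver: checks whether a class is visible on the CLASSPATH.
class ClasspathCheck {

    // Driver class name as listed in "Data Source and Driver Classes".
    static final String HIVE_DRIVER = "com.ddtek.jdbc.hive.HiveDriver";

    // Returns true if the JVM can locate the named class, false otherwise.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (isOnClasspath(HIVE_DRIVER)) {
            System.out.println("Driver class found on the CLASSPATH.");
        } else {
            System.out.println("Driver class not found; add install_dir/lib/hive.jar to the CLASSPATH.");
        }
    }
}
```

If the check reports that the class is not found, the same condition would surface as a class not found exception when the application tries to load the driver.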

Data Source and Driver Classes

The following are the data source and driver classes used by the driver:

Driver class: com.ddtek.jdbc.hive.HiveDriver

Data source class: com.ddtek.jdbcx.hive.HiveDataSource

Using Connection Properties

You can use connection properties to customize the driver for your environment. You can use these connection properties with either the JDBC Driver Manager or a JDBC data source. For a Driver Manager connection, a property is expressed as a key-value pair and takes the form property=value. For a data source connection, a property is expressed as a JDBC method and takes the form setProperty(value). For a complete list of supported properties, refer to "Connection Property Descriptions" in the Progress DataDirect for JDBC for Apache Hive Driver User's Guide.

The following table summarizes the minimum connection properties required to connect to a database. The first section describes properties required for both binary (TCP) and HTTP modes, while the second documents additional properties required to establish an HTTP connection. For a list of properties that affect performance, see "Tuning for Performance."

Note: All connection property names are case-insensitive. For example, Password is the same as password. Required properties are noted as such.

Note: The data type listed for each connection property is the Java data type used for the property value in a JDBC data source.

Table 1: Required Properties

Property Characteristic

Required properties for all connections

DatabaseName Specifies the name of the database. The database must exist, or the connection attempt will fail.

PortNumber The TCP port of the primary database server that is listening for connections to the database. The default is 10000 for binary connections and 10001 for HTTP connections.

ServerName Specifies either the IP address or the server name (if your network supports named servers) of the primary database server.

Additional properties required for enabling HTTP mode

HTTPPath Specifies the path of the HTTP/HTTPS endpoint used for connections when HTTP mode is enabled (TransportMode=http). The default is cliservice.

TransportMode Specifies whether binary (TCP) mode or HTTP mode is used to access Apache Hive data sources. If set to binary, Thrift RPC requests are sent directly to data sources using a binary connection (TCP mode). If set to http, Thrift RPC requests are sent using HTTP transport (HTTP mode). HTTP mode is typically used when connecting to a reverse-proxy server, such as a gateway, for improved security, or a load balancer. The default is binary.

Note: To configure the driver to use HTTPS endpoints, set TransportMode=http and EncryptionMethod=SSL.

See also
Connecting Using the DriverManager
Connecting Using Data Sources
Tuning for Performance

Connecting to a DataSource

Once the driver is installed and configured, you can connect from your application to your database in either of the following ways.

• Using the JDBC DriverManager, by specifying the connection URL in the DriverManager.getConnection() method.
• Creating a JDBC DataSource that can be accessed through the Java Naming and Directory Interface (JNDI).

Connecting Using the DriverManager

One way to connect to a Hive database is through the JDBC DriverManager using the DriverManager.getConnection() method. As the following example shows, this method specifies a string containing a connection URL.

Connection conn = DriverManager.getConnection ("jdbc:datadirect:hive://MyServer:10000;User=test;Password=secret;DatabaseName=MyDB");

Passing the Connection URL

After setting the CLASSPATH, pass the required connection information in the form of a connection URL. The form of the connection URL differs depending on whether you are using a binary or HTTP connection.

For binary connections (the default):

Connection conn = DriverManager.getConnection ("jdbc:datadirect:hive://servername:port;DatabaseName=database;[property=value[;...]]");

For HTTP connections (TransportMode=http):

Connection conn = DriverManager.getConnection ("jdbc:datadirect:hive://servername:port;DatabaseName=database;HTTPPath=path;TransportMode=http;[property=value[;...]]");

where:

servername

specifies the name or the IP address of the server to which you want to connect.

port

specifies the TCP port of the primary database server that is listening for connections to the Apache Hive database. The default is 10000 for binary connections and 10001 for HTTP.

database

specifies the name of the Apache Hive database to which you want to connect.

path

specifies the path of the HTTP/HTTPS endpoint used for connections. The default is cliservice.

property=value

specifies connection property settings. Multiple properties are separated by a semi-colon. For more information on connection properties, see "Using Connection Properties."

The following examples show how to establish a connection to an Apache Hive database.

For binary connections:

Connection conn = DriverManager.getConnection ("jdbc:datadirect:hive://MyServer:10000;DatabaseName=MyDB");

For HTTP connections:

Connection conn = DriverManager.getConnection ("jdbc:datadirect:hive://myserver:10001;DatabaseName=MyDB;HTTPPath=cliservice;TransportMode=http");
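When the server, port, and database come from configuration rather than a hard-coded string, the two URL forms can be assembled in code. The following is a minimal sketch, assuming the values contain no characters that need escaping; the HiveUrlBuilder class and its method names are hypothetical and not part of the driver.

```java
// Hypothetical helper, not part of the driver: assembles connection URLs in the
// binary and HTTP forms described in "Passing the Connection URL".
class HiveUrlBuilder {

    // Binary (default) form: jdbc:datadirect:hive://servername:port;DatabaseName=database
    static String binaryUrl(String server, int port, String database) {
        return "jdbc:datadirect:hive://" + server + ":" + port
                + ";DatabaseName=" + database;
    }

    // HTTP form: the binary form plus HTTPPath and TransportMode=http.
    static String httpUrl(String server, int port, String database, String httpPath) {
        return binaryUrl(server, port, database)
                + ";HTTPPath=" + httpPath
                + ";TransportMode=http";
    }

    public static void main(String[] args) {
        // Placeholder values taken from the examples in this quick start.
        System.out.println(binaryUrl("MyServer", 10000, "MyDB"));
        System.out.println(httpUrl("myserver", 10001, "MyDB", "cliservice"));
    }
}
```

The resulting string can be passed to DriverManager.getConnection() in place of a literal URL.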

See also Using Connection Properties

Testing a DriverManager Connection

You can use DataDirect Test™ to establish and test a DriverManager connection. The screen shots in this section were taken on a Windows system. Take the following steps to establish a connection.

1. Navigate to the installation directory. The default location is:

   • Windows systems: Program Files\Progress\DataDirect\JDBC_60\testforjdbc
   • UNIX and Linux systems: /opt/Progress/DataDirect/JDBC_60/testforjdbc

   Note: For UNIX/Linux, if you do not have access to /opt, your home directory will be used in its place.

2. From the testforjdbc folder, run the platform-specific tool:

   • testforjdbc.bat (on Windows systems)
   • testforjdbc.sh (on UNIX and Linux systems)

   The Test for JDBC Tool window appears.

3. Click Press Here to Continue. The main dialog appears.

4. From the menu bar, select Connection > Connect to DB. The Select A Database dialog appears.

5. Select the appropriate database template from the Defined Databases field.

6. In the Database field, specify the ServerName, PortNumber, and DatabaseName for your Apache Hive data source. For example:

   jdbc:datadirect:hive://MyServer:10000;DatabaseName=MyDB

7. If you are using user ID/password authentication, enter your user ID and password in the corresponding fields.

8. Click Connect.

If the connection information is entered correctly, the JDBC/Database Output window reports that a connection has been established. (If a connection is not established, the window reports an error.)
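A comparable check can be scripted from application code. The following is a minimal sketch, not part of DataDirect Test; the ConnectionCheck class and tryConnect method are hypothetical names, the URL and credentials are the placeholder values used elsewhere in this quick start, and the try-with-resources syntax assumes Java SE 7 or higher.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper, not part of the driver: reports whether a connection can be opened.
class ConnectionCheck {

    // Returns "connected" on success, or "failed: <message>" if the attempt raises an SQLException.
    static String tryConnect(String url, String user, String password) {
        // try-with-resources closes the connection as soon as the check completes.
        try (Connection conn = DriverManager.getConnection(url, user, password)) {
            return "connected";
        } catch (SQLException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Placeholder URL and credentials; replace with values for your environment.
        String url = "jdbc:datadirect:hive://MyServer:10000;DatabaseName=MyDB";
        System.out.println(tryConnect(url, "test", "secret"));
    }
}
```

The message text after "failed:" comes from the exception and varies with the cause, for example a missing driver, an unreachable server, or rejected credentials.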

Connecting Using Data Sources

A JDBC data source is a Java object, specifically a DataSource object, that defines connection information required for a JDBC driver to connect to the database. Each JDBC driver vendor provides its own data source implementation for this purpose. A Progress DataDirect data source is Progress DataDirect’s implementation of a DataSource object that provides the connection information needed for the driver to connect to a database.

Because data sources work with the Java Naming and Directory Interface (JNDI) naming service, data sources can be created and managed separately from the applications that use them. Because the connection information is defined outside of the application, the effort to reconfigure your infrastructure when a change is made is minimized. For example, if the database is moved to another database server, the administrator need only change the relevant properties of the DataSource object. The applications using the database do not need to change because they only refer to the name of the data source.

How Data Sources Are Implemented

Data sources are implemented through a data source class. A data source class implements the following interfaces:

• javax.sql.DataSource
• javax.sql.ConnectionPoolDataSource (allows applications to use connection pooling)

The data source class for the driver for Apache Hive is com.ddtek.jdbcx.hive.HiveDataSource.

Creating Data Sources

The following example files provide details on creating and using Progress DataDirect data sources with the Java Naming and Directory Interface (JNDI), where install_dir is the product installation directory.

• install_dir/Examples/JNDI/JNDI_LDAP_Example.java can be used to create a JDBC data source and save it in your LDAP directory using the JNDI Provider for LDAP.
• install_dir/Examples/JNDI/JNDI_FILESYSTEM_Example.java can be used to create a JDBC data source and save it in your local file system using the File System JNDI Provider.

See "Example Data Source" for an example data source definition for the example files.

To connect using a JNDI data source, the driver needs to access a JNDI data store to persist the data source information. For a JNDI file system implementation, you must download the File System Service Provider from the Oracle Technology Network Java SE Support downloads page, unzip the files to an appropriate location, and add the fscontext.jar and providerutil.jar files to your CLASSPATH. These steps are not required for LDAP implementations because the LDAP Service Provider has been included with Java SE since Java 2 SDK, v1.3.
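For the file system case, the JNDI environment names the File System Service Provider's context factory and a directory in which bindings are stored. The following is a minimal sketch of that environment only, not a copy of the shipped example files; the JndiFileSystemEnv class, the file:/tmp/jndi location, and the jdbc/HiveDB name are hypothetical placeholders. The factory class name is the one provided by fscontext.jar.

```java
import java.util.Hashtable;
import javax.naming.Context;

// Hypothetical helper, not part of the shipped examples: builds a JNDI environment
// for the File System Service Provider (fscontext.jar and providerutil.jar).
class JndiFileSystemEnv {

    static Hashtable<String, String> fileSystemEnvironment(String providerUrl) {
        Hashtable<String, String> env = new Hashtable<>();
        // Context factory class supplied by the File System Service Provider.
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.fscontext.RefFSContextFactory");
        // Directory, expressed as a file: URL, where the bindings are persisted.
        env.put(Context.PROVIDER_URL, providerUrl);
        return env;
    }

    public static void main(String[] args) {
        // "file:/tmp/jndi" is a placeholder location.
        Hashtable<String, String> env = fileSystemEnvironment("file:/tmp/jndi");
        System.out.println(env);
        // With the provider jars and hive.jar on the CLASSPATH, the environment could be used as:
        //   Context ctx = new InitialContext(env);
        //   ctx.bind("jdbc/HiveDB", dataSource);   // "jdbc/HiveDB" is a placeholder name
    }
}
```

Building the environment does not require the provider jars; they are needed only when an InitialContext is created from it.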

Example Data Source

To configure a data source using the example files, you will need to create a data source definition. The content required to create a data source definition is divided into three sections. First, you will need to import the data source class. For example:

import com.ddtek.jdbcx.hive.HiveDataSource;

Next, you will need to set the values and define the data source. For example, the following definition contains the minimum properties required for a binary connection:

HiveDataSource mds = new HiveDataSource();
mds.setDescription("My Hive Server");
mds.setServerName("MyServer");
mds.setPortNumber(10000);
mds.setDatabaseName("myDB");

The following example contains the minimum properties for a connection in HTTP mode:

HiveDataSource mds = new HiveDataSource();
mds.setDescription("My Hive Server");
mds.setServerName("MyServer");
mds.setPortNumber(10001);
mds.setDatabaseName("myDB");
mds.setTransportMode("http");

Finally, you will need to configure the example application to print out the data source attributes. Note that this code is specific to the driver and should only be used in the example application. For example, you would add the following section for a binary connection using only the minimum properties:

if (ds instanceof HiveDataSource) {
    HiveDataSource jmds = (HiveDataSource) ds;
    System.out.println("description=" + jmds.getDescription());
    System.out.println("serverName=" + jmds.getServerName());
    System.out.println("portNumber=" + jmds.getPortNumber());
    System.out.println("databaseName=" + jmds.getDatabaseName());
    System.out.println();
}

Calling a Data Source in an Application

Applications can call a Progress DataDirect data source using a logical name to retrieve the javax.sql.DataSource object. This object loads the specified driver and can be used to establish a connection to the database. Once the data source has been registered with JNDI, it can be used by your JDBC application as shown in the following code example.

Context ctx = new InitialContext();
DataSource ds = (DataSource)ctx.lookup("EmployeeDB");
Connection con = ds.getConnection("domino", "spark");

In this example, the JNDI environment is first initialized. Next, the initial naming context is used to find the logical name of the data source (EmployeeDB). The Context.lookup() method returns a reference to a Java object, which is narrowed to a javax.sql.DataSource object. Then, the DataSource.getConnection() method is called to establish a connection.

Testing a DataSource Connection

You can use DataDirect Test™ to establish and test a DataSource connection. The screen shots in this section were taken on a Windows system. Take the following steps to establish a connection.

1. Navigate to the installation directory. The default location is:

   • Windows systems: Program Files\Progress\DataDirect\JDBC_60\testforjdbc
   • UNIX and Linux systems: /opt/Progress/DataDirect/JDBC_60/testforjdbc

   Note: For UNIX/Linux, if you do not have access to /opt, your home directory will be used in its place.

2. From the testforjdbc folder, run the platform-specific tool:

   • testforjdbc.bat (on Windows systems)
   • testforjdbc.sh (on UNIX and Linux systems)

   The Test for JDBC Tool window appears.

3. Click Press Here to Continue. The main dialog appears.

4. From the menu bar, select Connection > Connect to DB via Data Source. The Select A Database dialog appears.

5. Select a datasource template from the Defined Datasources field.

6. Provide the following information:
   a) In the Initial Context Factory, specify the location of the initial context provider for your application.
   b) In the Context Provider URL, specify the location of the context provider for your application.
   c) In the Datasource field, specify the name of your datasource.

7. If you are using user ID/password authentication, enter your user ID and password in the corresponding fields.

8. Click Connect.

If the connection information is entered correctly, the JDBC/Database Output window reports that a connection has been established. If a connection is not established, the window reports an error.

Tuning for Performance

The connection properties described in this section directly affect the performance of your driver. To tune for performance, configure your driver according to the recommended settings and your environment.

ArrayFetchSize

Purpose: Determines the number of fields the driver retrieves from a server for a fetch. When executing a fetch, the driver divides the value specified by the number of columns in a particular table to determine the number of rows to retrieve.

Performance Impact: To improve throughput, increase the value of ArrayFetchSize. By increasing the value specified, you increase the number of rows the driver will retrieve from the server for a fetch. In turn, increasing the number of rows that the driver can retrieve reduces the number, and expense, of network round trips. Note that improved throughput does come at the expense of increased demands on memory and slower response time. Furthermore, if the fetch size exceeds the available buffer memory of the server, an out of memory error is returned when attempting to execute a fetch. If you receive this error, decrease the value specified until fetches are successfully executed.

Recommended Settings: Tune this setting to reflect the typical fetch size of your application. Smaller fetch sizes can improve the initial response time of the query. Larger fetch sizes improve overall fetch times at the cost of additional memory.
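The division described under Purpose can be worked through with a small calculation. The following is a minimal sketch; the ArrayFetchSizeEstimate class is hypothetical, the value 20000 is an illustrative setting rather than a documented default, and the use of integer division (rounding down) is an assumption about how the driver rounds.

```java
// Hypothetical helper, not part of the driver: estimates rows per fetch
// as ArrayFetchSize divided by the number of columns.
class ArrayFetchSizeEstimate {

    // Integer division is an assumption; the driver's exact rounding is not stated here.
    static int rowsPerFetch(int arrayFetchSize, int columnCount) {
        if (arrayFetchSize <= 0 || columnCount <= 0) {
            throw new IllegalArgumentException("Both values must be positive.");
        }
        return arrayFetchSize / columnCount;
    }

    public static void main(String[] args) {
        // Illustrative setting of 20000 fields per fetch.
        System.out.println(rowsPerFetch(20000, 10)); // 10-column table
        System.out.println(rowsPerFetch(20000, 40)); // 40-column table
    }
}
```

Under these assumptions, a 10-column table yields 2000 rows per fetch and a 40-column table yields 500, which shows why wider tables need a larger ArrayFetchSize to keep the same number of round trips.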

BatchMechanism

Purpose: Determines the mechanism that is used to execute batch operations.

Performance Impact: Unlike the native batch mechanism, the multi-row insert mechanism only returns the total number of update counts for batch inserts. Therefore, setting BatchMechanism to multiRowInsert offers substantial performance gains when performing batch inserts.

Recommended Settings: If your application does not require individual update counts for each statement or parameter set in the batch, set to multiRowInsert for improved performance when executing batch inserts.

BinaryDescribeType

Purpose: Specifies how columns of the Binary type are described.

Performance Impact: When BinaryDescribeType is set to longvarbinary, the driver not only maps Binary to Longvarbinary, but also allocates more space to cache the long data. Because more space is allocated for the long data, your application will incur a performance penalty.

Recommended Settings: If your application does not use the getClob() method, set to varbinary for improved performance.

CatalogMode

Purpose: Determines whether the driver uses native catalog functions to retrieve information returned by DatabaseMetaData functions.

Performance Impact: Apache Hive’s native catalog functions return incorrect information in certain scenarios. To address this issue, by default, the driver uses a combination of driver-discovered information and native functions to retrieve more accurate catalog information than native functions alone. While using driver-discovered information improves accuracy, it does so at the expense of performance. If accurate catalog information is not required, you can improve performance by setting the CatalogMode connection property to native.

Recommended Settings: If accurate catalog information is not required, set to native for the best performance. If your application requires accurate catalog information, set to mixed for the optimal balance of performance and accuracy.

EnableCookieAuthentication

Purpose: Determines whether the driver attempts to use cookie-based authentication for requests to an HTTP endpoint after the initial authentication to the server.

Performance Impact: Cookie-based authentication improves response time by eliminating the need to re-authenticate with the server for each request.

Recommended Settings: If your environment is configured to use cookies, set to true.

EncryptionMethod

Purpose: Determines the method the driver uses to encrypt data sent between the driver and the database server.

Performance Impact: Data encryption may adversely affect performance because of the additional overhead (mainly CPU usage) that is required to encrypt and decrypt data.

Recommended Settings: If data encryption is not required, set to noEncryption for improved performance.

InsensitiveResultSetBufferSize

Purpose: Determines the amount of memory that is used by the driver to cache insensitive result set data.

Performance Impact: To improve performance when using scroll-insensitive result sets, the driver can cache the result set data in memory instead of writing it to disk. By default, the driver caches 2 MB of insensitive result set data in memory and writes any remaining result set data to disk. Performance can be improved by increasing the amount of memory used by the driver before writing data to disk or by forcing the driver to never write insensitive result set data to disk. The maximum cache size setting is 2 GB.

Recommended Settings: Specify a value in KB that is a power of 2 for improved performance. This value should not exceed the amount of memory available. The maximum value for this property is 2 GB.
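A candidate value can be checked against the recommendation before it is placed in a URL or data source. The following is a minimal sketch that covers only positive sizes in KB; the BufferSizeCheck class is hypothetical, and any special values the property may accept are outside its scope.

```java
// Hypothetical helper, not part of the driver: checks a candidate
// InsensitiveResultSetBufferSize value (in KB) against the recommendation.
class BufferSizeCheck {

    // 2 GB expressed in KB, the maximum stated for this property.
    static final long MAX_KB = 2L * 1024 * 1024;

    // True if kb is a positive power of 2 that does not exceed 2 GB.
    static boolean isRecommendedSize(long kb) {
        // A positive power of 2 has exactly one bit set, so kb & (kb - 1) is 0.
        boolean powerOfTwo = kb > 0 && (kb & (kb - 1)) == 0;
        return powerOfTwo && kb <= MAX_KB;
    }

    public static void main(String[] args) {
        System.out.println(isRecommendedSize(2048)); // 2 MB, the default cache size
        System.out.println(isRecommendedSize(3000)); // not a power of 2
    }
}
```

Whether a value that passes this check is appropriate still depends on the memory available to the application.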

MaxPooledStatements

Purpose: Specifies the maximum number of prepared statements to be pooled for each connection and enables the driver’s internal prepared statement pooling when set to an integer greater than zero (0).

Performance Impact: The driver’s internal prepared statement pooling provides performance benefits when the driver is not running from within an application server or another application that provides its own statement pooling.

Recommended Settings: For better performance, specify a value that is greater than the number of prepared statements used by the application. Note that this performance benefit comes at the expense of greater memory consumption.

StringDescribeType

Purpose: Specifies whether String columns are described as VARCHAR columns. This property affects ResultSetMetaData calls; it does not affect getTypeInfo() calls.

Performance Impact: To obtain data from String columns with the getClob() method, the StringDescribeType connection property must be set to longvarchar. (Otherwise, calling getClob() results in an "unsupported data conversion" exception.) When StringDescribeType is set to longvarchar, the driver not only maps String to Longvarchar but also allocates more space to cache the long data. Because more space is allocated for the long data, your application will incur a performance penalty.

Recommended Settings: If your application does not use the getClob() method, set to varchar for improved performance.

UseCurrentSchema

Purpose: Specifies whether results are restricted to the tables and views in the current schema if a call is made without specifying a schema or if the schema is specified as the wildcard character %.

Performance Impact: Restricting results to the tables and views in the current schema improves performance of calls that do not specify a schema.

Recommended Settings: If your application needs to access tables and views owned only by the current user, performance of your application can be improved by setting this property to true.
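Tuning properties are supplied like any other connection property, as property=value pairs separated by semicolons. The following is a minimal sketch that appends a set of the recommended values from this section to a base URL; the TunedHiveUrl class and its methods are hypothetical, and the particular combination of settings is illustrative rather than a recommendation for every application.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the driver: appends property=value pairs to a connection URL.
class TunedHiveUrl {

    // Appends each entry as ";name=value", preserving the map's iteration order.
    static String withProperties(String baseUrl, Map<String, String> props) {
        StringBuilder url = new StringBuilder(baseUrl);
        for (Map.Entry<String, String> e : props.entrySet()) {
            url.append(';').append(e.getKey()).append('=').append(e.getValue());
        }
        return url.toString();
    }

    // Illustrative settings drawn from the Recommended Settings in this section.
    static Map<String, String> exampleTuning() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("BatchMechanism", "multiRowInsert");
        props.put("CatalogMode", "native");
        props.put("StringDescribeType", "varchar");
        props.put("UseCurrentSchema", "true");
        return props;
    }

    public static void main(String[] args) {
        String base = "jdbc:datadirect:hive://MyServer:10000;DatabaseName=MyDB";
        System.out.println(withProperties(base, exampleTuning()));
    }
}
```

Each setting carries the trade-off described in its entry above, so the set should be trimmed to what the application actually tolerates.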

Troubleshooting Setup/Connection Issues

This section describes common setup/connection issues you may encounter while trying to establish a database connection with the driver as well as some potential reasons for these issues. If you are experiencing a problem not described in this section, comprehensive troubleshooting resources are available in the "Troubleshooting" section of the Progress DataDirect for JDBC for Apache Hive Driver User's Guide.

Common Setup/Connection Issues

You are experiencing a setup/connection issue if you are encountering an error or hang while you are trying to make a database connection with the JDBC driver or are trying to configure the JDBC driver. Some common errors that are returned by the driver if you are experiencing a setup/connection issue include:

• class not found
• Specified driver could not be loaded.
• Data source name not found and no default driver specified.
• Unable to connect to destination.
• Invalid username/password; logon denied.

Troubleshooting the Issue

Some common reasons that setup/connection issues occur are:

• The database and/or listener are not started.
• The driver jar file, hive.jar, is not defined on your CLASSPATH. If the driver is not defined on your CLASSPATH, you will receive a class not found exception when trying to load the driver. See "Setting the Classpath" for details.
• The JDBC driver’s connection properties are not set correctly in the connection URL or data source. For example, the host name or port number is not correctly configured. See "Configuring a Data Source" for more information.
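To rule out the CLASSPATH cause quickly, you can try to load the driver class before attempting a connection. The following is a minimal diagnostic sketch. The driver class name com.ddtek.jdbc.hive.HiveDriver is an assumption here and should be confirmed against "Data Source and Driver Classes"; the class and method names of the sketch itself are hypothetical.

```java
public class DriverClasspathCheck {

    // Assumed driver class name; verify it in "Data Source and Driver Classes".
    static final String DRIVER_CLASS = "com.ddtek.jdbc.hive.HiveDriver";

    // Returns true if the named class can be loaded from the current classpath.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (isOnClasspath(DRIVER_CLASS)) {
            System.out.println("Driver class found: " + DRIVER_CLASS);
        } else {
            System.out.println("Driver class not found; check that hive.jar is on the CLASSPATH.");
            // Print the effective classpath to help locate the missing entry.
            System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        }
    }
}
```

If the class loads but the connection still fails, the cause is more likely a server that is not started or an incorrect host name, port number, or other connection property.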

Additional Resources

In addition to this quick start, the following resources enable you to take full advantage of the features and support offered for your driver.

• Product Documentation Library contains a comprehensive set of product documentation, including the following guides:
  • Progress DataDirect for JDBC Drivers Installation Guide details requirements and procedures for installing the product.
  • Progress DataDirect for JDBC for Apache Hive Driver User's Guide guides you through using and configuring the driver, provides detailed reference information, and explains the tools used to troubleshoot common problems.
• Progress Support Knowledgebase provides answers to questions, access to technical documentation, release notes, product alerts, and other support information.
• Progress Community allows you to contribute, share, and network with other Progress users and employees.
• Technical Support provides technical support services, including maintenance services and opening a support case.

Contacting Technical Support

Progress DataDirect offers a variety of options to meet your support needs. Please visit our Web site for more details and for contact information:

https://www.progress.com/support

The Progress DataDirect Web site provides the latest support information through our global service network. The SupportLink program provides access to support contact details, tools, patches, and valuable information, including a list of FAQs for each product. In addition, you can search our Knowledgebase for technical bulletins and other information.

When you contact us for assistance, please provide the following information:

• Your number or the serial number that corresponds to the product for which you are seeking support, or a case number if you have been provided one for your issue. If you do not have a SupportLink contract, the SupportLink representative assisting you will connect you with our Sales team.
• Your name, phone number, email address, and organization. For a first-time call, you may be asked for full information, including location.
• The Progress DataDirect product and the version that you are using.
• The type and version of the operating system where you have installed your product.
• Any database, database version, third-party software, or other environment information required to understand the problem.
• A brief description of the problem, including, but not limited to, any error messages you have received, what steps you followed prior to the initial occurrence of the problem, any trace logs capturing the issue, and so on. Depending on the complexity of the problem, you may be asked to submit an example or reproducible application so that the issue can be re-created.
• A description of what you have attempted to resolve the issue. If you have researched your issue on Web search engines or our Knowledgebase, or have tested additional configurations, applications, or other vendor products, you will want to carefully note everything you have already attempted.
• A simple assessment of how the severity of the issue is impacting your organization.

Copyright

© 2018 Progress Software Corporation and/or its subsidiaries or affiliates. All rights reserved.

These materials and all Progress® software products are copyrighted and all rights are reserved by Progress Software Corporation. The information in these materials is subject to change without notice, and Progress Software Corporation assumes no responsibility for any errors that may appear therein. The references in these materials to specific platforms supported are subject to change. Corticon, DataDirect (and design), DataDirect Cloud, DataDirect Connect, DataDirect Connect64, DataDirect XML Converters, DataDirect XQuery, DataRPM, Deliver More Than Expected, Icenium, Kendo UI, NativeScript, OpenEdge, Powered by Progress, Progress, Progress Software Developers Network, Rollbase, SequeLink, Sitefinity (and Design), SpeedScript, Stylus Studio, TeamPulse, Telerik, Telerik (and Design), Test Studio, and WebSpeed are registered trademarks of Progress Software Corporation or one of its affiliates or subsidiaries in the U.S. and/or other countries. Analytics360, AppServer, BusinessEdge, DataDirect Spy, SupportLink, DevCraft, Fiddler, JustAssembly, JustDecompile, JustMock, Kinvey, NativeScript Sidekick, OpenAccess, ProDataSet, Progress Results, Progress Software, ProVision, PSE Pro, Sitefinity, SmartBrowser, SmartComponent, SmartDataBrowser, SmartDataObjects, SmartDataView, SmartDialog, SmartFolder, SmartFrame, SmartObjects, SmartPanel, SmartQuery, SmartViewer, SmartWindow, and WebClient are trademarks or service marks of Progress Software Corporation and/or its subsidiaries or affiliates in the U.S. and other countries. Java is a registered trademark of Oracle and/or its affiliates. Any other marks contained herein may be trademarks of their respective owners. Please refer to the readme applicable to the particular Progress product release for any third-party acknowledgements required to be provided in the documentation associated with the Progress product.
