Progress DataDirect for ODBC for Apache Cassandra User's Guide and Reference

Release 8.0.0

Copyright

© 2020 Progress Software Corporation and/or one of its subsidiaries or affiliates. All rights reserved. These materials and all Progress® software products are copyrighted and all rights are reserved by Progress Software Corporation. The information in these materials is subject to change without notice, and Progress Software Corporation assumes no responsibility for any errors that may appear therein. The references in these materials to specific platforms supported are subject to change. Corticon, DataDirect (and design), DataDirect Cloud, DataDirect Connect, DataDirect Connect64, DataDirect XML Converters, DataDirect XQuery, DataRPM, Defrag This, Deliver More Than Expected, Icenium, Ipswitch, iMacros, Kendo UI, Kinvey, MessageWay, MOVEit, NativeChat, NativeScript, OpenEdge, Powered by Progress, Progress, Progress Software Developers Network, SequeLink, Sitefinity (and Design), Sitefinity, SpeedScript, Stylus Studio, TeamPulse, Telerik, Telerik (and Design), Test Studio, WebSpeed, WhatsConfigured, WhatsConnected, WhatsUp, and WS_FTP are registered trademarks of Progress Software Corporation or one of its affiliates or subsidiaries in the U.S. and/or other countries. Analytics360, AppServer, BusinessEdge, DataDirect Autonomous REST Connector, DataDirect Spy, SupportLink, DevCraft, Fiddler, iMail, JustAssembly, JustDecompile, JustMock, NativeScript Sidekick, OpenAccess, ProDataSet, Progress Results, Progress Software, ProVision, PSE Pro, SmartBrowser, SmartComponent, SmartDataBrowser, SmartDataObjects, SmartDataView, SmartDialog, SmartFolder, SmartFrame, SmartObjects, SmartPanel, SmartQuery, SmartViewer, SmartWindow, and WebClient are trademarks or service marks of Progress Software Corporation and/or its subsidiaries or affiliates in the U.S. and other countries. Java is a registered trademark of Oracle and/or its affiliates. Any other marks contained herein may be trademarks of their respective owners.

Updated: 2020/10/08



Table of Contents

Preface...... 11
    Welcome to the Progress DataDirect for ODBC for Apache Cassandra™ Driver...... 11
    What's New in This Release?...... 12
    Conventions Used in This Guide...... 13
    About the Product Documentation...... 15
    Contacting Technical Support...... 16

Getting Started...... 17
    Configuring and Connecting on Windows...... 18
        Setting the Library Path Environment Variable (Windows)...... 18
        Configuring a Data Source...... 18
        Testing the Connection...... 19
    Configuring and Connecting on UNIX and Linux...... 20
        Environment Configuration...... 20
        Test Loading the Driver...... 20
        Setting the Library Path Environment Variable (UNIX and Linux)...... 21
        Configuring a Data Source in the System Information File...... 21
        Testing the Connection...... 22
    Accessing Data With Third-Party Applications...... 22

What Is ODBC?...... 23
    How Does It Work?...... 24
    Why Do Application Developers Need ODBC?...... 24

About the Apache Cassandra Driver...... 25
    Driver Requirements...... 26
    Support for Multiple Environments...... 26
        Support for Windows Environments...... 26
        Support for UNIX and Linux Environments...... 28
    ODBC Compliance...... 33
    Version String Information...... 34
        getFileVersionString Function...... 35
    Data Types...... 35
        Retrieving Data Type Information...... 37
    Complex Type Normalization...... 38
        Collection Types...... 38
        Tuple and User-Defined Types...... 42


        Nested Complex Types...... 44
    Isolation and Lock Levels Supported...... 46
    Binding Parameter Markers...... 46

Supported Features...... 47
    Unicode Support...... 47
    Using IP Addresses...... 48
    Parameter Metadata Support...... 48
        Insert and Update Statements...... 48
        Select Statements...... 49
    SQL Support...... 50
    Number of Connections and Statements Supported...... 50

Using the Driver...... 51
    Configuring and Connecting to Data Sources...... 51
        Configuring the Product on UNIX/Linux...... 52
        Data Source Configuration Using a GUI...... 61
        Using a Connection String...... 74
        Using a Logon Dialog Box...... 75
    Performance Considerations...... 76
    Using the SQL Engine Server...... 77
        Configuring the SQL Engine Server on Windows...... 77
        Configuring the SQL Engine Server on UNIX/Linux...... 80
        Configuring Java Logging for the SQL Engine Server...... 82
    Using Failover in a Cluster...... 82
    Using Identifiers...... 83

Troubleshooting...... 85
    Diagnostic Tools...... 85
        ODBC Trace...... 85
        The Test Loading Tool...... 88
        ODBC Test...... 89
        Logging...... 89
        Configuring Logging...... 91
        The demoodbc Application...... 92
        The Example Application...... 93
        Other Tools...... 93
    Error Messages...... 93
    Troubleshooting...... 94
        Setup/Connection Issues...... 95
        Interoperability Issues...... 96
        Performance Issues...... 97
        Out-of-Memory Issues...... 97


        Operation Timeouts...... 98

Connection Option Descriptions...... 101
    Application Using Threads...... 104
    Ascii Size...... 105
    Authentication Method...... 106
    Cluster Nodes...... 106
    Config Options...... 107
    Create Map...... 108
    Data Source Name...... 109
    Decimal Precision...... 110
    Decimal Scale...... 110
    Description...... 111
    Fetch Size...... 112
    Host Name...... 113
    IANAAppCodePage...... 114
    Initialization String...... 115
    JVM Arguments...... 115
    JVM Classpath...... 116
    Keyspace Name...... 117
    Log Config File...... 118
    Login Timeout...... 119
    Native Fetch Size...... 119
    Password...... 120
    Port Number...... 121
    Query Timeout...... 121
    Read Consistency...... 122
    Read Only...... 123
    Report Codepage Conversion Errors...... 124
    Result Memory Size...... 124
    Schema Map...... 126
    Server Port Number...... 127
    SQL Engine Mode...... 128
    Transaction Mode...... 129
    User Name...... 129
    Varchar Size...... 130
    Varint Precision...... 130
    Write Consistency...... 131

Supported SQL Functionality...... 133
    (DDL)...... 133
    Native and Refresh Escape Clauses...... 134
    Delete...... 134


    Insert...... 135
    Refresh Map (EXT)...... 137
    Select...... 137
        Select Clause...... 139
    Update...... 148
    SQL Expressions...... 150
        Column Names...... 151
        Literals...... 151
        Operators...... 153
        Functions...... 157
        Conditions...... 157
    Subqueries...... 158
        IN Predicate...... 158
        EXISTS Predicate...... 159
        UNIQUE Predicate...... 159
        Correlated Subqueries...... 159

Part I: Reference...... 161

Code Page Values...... 163
    IANAAppCodePage Values...... 163

ODBC API and Scalar Functions...... 169
    API Functions...... 169
    Scalar Functions...... 172
        String Functions...... 173
        Numeric Functions...... 175
        Date and Time Functions...... 176
        System Functions...... 178

Internationalization, Localization, and Unicode...... 179
    Internationalization and Localization...... 179
        Locale...... 180
        Language...... 180
        Country...... 180
        Variant...... 181
    Unicode Character Encoding...... 181
        Background...... 181
    Unicode Support in ...... 182
    Unicode Support in ODBC...... 182
        Unicode and Non-Unicode ODBC Drivers...... 183
        Function Calls...... 183


        Data...... 185
    Default Unicode Mapping...... 186
    Driver Manager and Unicode Encoding on UNIX/Linux...... 186
    References...... 187

Designing ODBC applications for performance optimization...... 189
    Using catalog functions...... 190
        Caching information to minimize the use of catalog functions...... 190
        Avoiding search patterns...... 191
        Using a dummy query to determine table characteristics...... 191
    Retrieving data...... 192
        Retrieving long data...... 192
        Reducing the size of data retrieved...... 192
        Using bound columns...... 193
        Using SQLExtendedFetch instead of SQLFetch...... 193
        Choosing the right data type...... 194
    Selecting ODBC functions...... 194
        Using SQLPrepare/SQLExecute and SQLExecDirect...... 194
        Using arrays of parameters...... 194
        Using the cursor library...... 195
    Managing connections and updates...... 196
        Managing connections...... 196
        Managing commits in transactions...... 196
        Choosing the right transaction model...... 197
        Using positioned updates and deletes...... 197
        Using SQLSpecialColumns...... 197

Using Indexes...... 199
    Introduction...... 199
    Improving Row Selection Performance...... 200
    Indexing Multiple Fields...... 200
    Deciding Which Indexes to Create...... 201
    Improving Join Performance...... 202

Locking and Isolation Levels...... 203
    Locking...... 203
    Isolation Levels...... 204
    Locking Modes and Levels...... 205

WorkAround Options...... 207


Threading...... 211

Index...... 213

Preface

For details, see the following topics:

· Welcome to the Progress DataDirect for ODBC for Apache Cassandra™ Driver
· What's New in This Release?
· Conventions Used in This Guide
· About the Product Documentation
· Contacting Technical Support

Welcome to the Progress DataDirect for ODBC for Apache Cassandra™ Driver

This book is your user's guide and reference for the Progress DataDirect® for ODBC for Apache Cassandra™ driver. The content of this book assumes that you are familiar with your operating system and its commands. It contains the following information:

· Getting Started on page 17 explains the basics for quickly configuring and testing the drivers.
· What Is ODBC? on page 23 provides an explanation of ODBC.
· About the Apache Cassandra Driver on page 25 explains the driver, supported environments, and driver requirements.
· Supported Features on page 47 explains features supported by the driver.
· Using the Driver on page 51 guides you through configuring the driver. It also explains how to use the functionality supported by the driver, such as authentication and SSL encryption.


· Troubleshooting on page 85 explains the tools used to solve common problems and documents error messages.
· Connection Option Descriptions on page 101 contains detailed descriptions of the connection options supported by the driver.
· Supported SQL Functionality on page 133 describes supported SQL statements and extensions that are specific to the data source.
· Reference on page 161 includes reference information about APIs, code page values, and performance tuning.

If you are writing programs to access ODBC drivers, you need to obtain a copy of the ODBC Programmer's Reference for the Microsoft Open Database Connectivity Software Development Kit, available from Microsoft Corporation.

For the latest information about your driver, refer to the readme file in your software package.

Note: This book refers the reader to Web pages using URLs for more information about specific topics, including Web URLs not maintained by Progress DataDirect. Because it is the nature of Web content to change frequently, Progress DataDirect can guarantee only that the URLs referenced in this book were correct at the time of publication.

What's New in This Release?

Changes Since the 8.0.0 Release

· Certifications:
  · Certified with Red Hat Enterprise Linux 7.3 (driver version 08.00.0074 (B0238, U0156))
  · Certified with Windows Server 2016 (driver version 08.00.0074 (B0238, U0156))

· Enhancements
  · The driver has been enhanced to support connection failover and client load balancing when connecting to a cluster. You can enable this new functionality by specifying a comma-separated list of member nodes using the new Cluster Nodes (ClusterNodes) connection option. See Using Failover in a Cluster on page 82 and Cluster Nodes on page 106 for details.
  · The driver has been enhanced to include timestamps in the internal packet logs by default. To disable timestamp logging in packet logs, set PacketLoggingOptions=1. Note that internal packet logging itself is not enabled by default; to enable it, set EnablePacketLogging=1.
  · The driver has been enhanced to support the Duration data type, which maps to the SQL_VARCHAR ODBC type. See Data Types on page 35 for details.
  · The driver has been enhanced to support all the data consistency levels for read and write operations that are supported by Apache Cassandra data stores. Data consistency levels are configured using the Read Consistency and Write Consistency connection options. For additional information, see Read Consistency on page 122 and Write Consistency on page 131.

· Changed Behavior
  · Java SE 7 has reached the end of its product life cycle and will no longer receive generally available security updates. As a result, the drivers no longer support JVMs that are version Java SE 7 or earlier. Support for distributed versions of Java SE 7 and earlier, including the IBM SDK (Java Edition), has also ended.


  · The following Windows platforms have reached the end of their product life cycle and are no longer supported by the driver:
    · Windows 8.0 (versions 8.1 and higher are still supported)
    · Windows Vista (all versions)
    · Windows XP (all versions)
    · Windows Server 2003 (all versions)

Highlights of the 8.0.0 Release

· Certified against Apache Cassandra versions 2.0, 2.1, 2.2, and 3.7.
· Certified against DataStax Enterprise 4.7, 4.8, and 5.0.
· Supports SQL read-write access to Apache Cassandra data sources. See Supported SQL Functionality on page 133 for details.
· Supports the CQL Binary Protocol.
· Supports the core SQL-92 grammar.
· Supports all ODBC Core and Level 1 functions and some Level 2 functions. See ODBC Compliance on page 33 for details.
· Supports user ID and password authentication. See Authentication Method on page 106 for details.
· Supports Cassandra data types, including the complex types Tuple, user-defined types, Map, List, and Set. See Data Types on page 35 for details.
· Generates a relational view of Cassandra data. Tuple and user-defined types are flattened into a relational parent table, while collection types are mapped as relational child tables. See Complex Type Normalization on page 38 for details.
· Supports Native and Refresh escape clauses to embed CQL commands in SQL-92 statements. See Native and Refresh Escape Clauses on page 134 for details.
· Supports Cassandra's tunable consistency functionality with the Read Consistency on page 122 and Write Consistency on page 131 connection options.
· Supports the handling of large result sets with the Fetch Size on page 112, Native Fetch Size on page 119, and Result Memory Size on page 124 connection options.
· Supports applications that do not support unbounded Cassandra data types through the Ascii Size on page 105, Decimal Precision on page 110, Decimal Scale on page 110, Varchar Size on page 130, and Varint Precision on page 130 connection options.

Conventions Used in This Guide

The following sections describe the conventions used to identify information that applies to specific operating systems, as well as the typographical conventions used in this guide.

Operating System Symbols The drivers are supported in the Windows, UNIX, and Linux environments. When the information provided is not applicable to all supported environments, the following symbols are used to identify that information:


The Windows symbol signifies text that is applicable only to Windows.

The UNIX symbol signifies text that is applicable only to UNIX and Linux.

Typography This guide uses the following typographical conventions:

Convention Explanation

italics Introduces new terms with which you may not be familiar, and is used occasionally for emphasis.

bold Emphasizes important information. Also indicates button, menu, and icon names on which you can act. For example, click Next.

BOLD UPPERCASE Indicates keys or key combinations that you can use. For example, press the ENTER key.

UPPERCASE Indicates SQL reserved words.

monospace Indicates syntax examples, values that you specify, or results that you receive.

monospaced italics Indicates names that are placeholders for values that you specify. For example, filename.

> Separates menus and their associated commands. For example, Select File > Copy means that you should select Copy from the File menu.

/ The slash separates directory levels when specifying locations under UNIX.

vertical rule | Indicates an "OR" separator used to delineate items.

brackets [ ] Indicates optional items. For example, in the following statement: SELECT [DISTINCT], DISTINCT is an optional keyword. Also indicates sections of the Windows Registry.

braces { } Indicates that you must select one item. For example, {yes | no} means that you must specify either yes or no.

ellipsis . . . Indicates that the immediately preceding item can be repeated any number of times in succession. An ellipsis following a closing bracket indicates that all information in that unit can be repeated.


About the Product Documentation

This guide provides specific information about your Progress DataDirect for ODBC driver. You can view this documentation in the HTML format installed with the product. The documentation is also available in PDF format. You can download the PDF version of the documentation at: https://www.progress.com/documentation/datadirect-connectors

This guide is installed with the product as an HTML-based help system. This help system is located in the help subdirectory of the product installation directory. You can use the help system with any of the following browsers:

· Microsoft Edge on Windows 10
· Internet Explorer 7.x and higher
· Firefox 3.x and higher
· Safari 5.x
· Google Chrome 44.x and earlier

On all platforms, you can access the entire Help system by opening the following file from within your browser:

install_dir/help/CassandraHelp/index.html

where install_dir is the path to the product installation directory. Or, from a command-line environment, at a command prompt, enter:

browser_exe install_dir/help/CassandraHelp/index.html

where browser_exe is the name of your browser executable and install_dir is the path to the product installation directory.

After the browser opens, the left pane displays the Table of Contents, Index, and Search tabs for the entire documentation library. When you have opened the main screen of the Help system in your browser, you can bookmark it in the browser for quick access later.

Note: Security features set in your browser can prevent the Help system from launching, in which case a security warning message is displayed. Often, the warning message provides instructions for unblocking the Help system for the current session. To allow the Help system to launch without encountering a security warning message, you can modify the security settings in your browser. Check with your system administrator before disabling any security features.

Help is also available from the setup dialog box for each driver. When you click Help, your browser opens to the correct topic without opening the help Table of Contents. A grey toolbar appears at the top of the browser window. This toolbar contains previous and next navigation buttons. If, after viewing the help topic, you want to see the entire library, click the button on the left side of the toolbar, which opens the left pane and displays the Table of Contents, Index, and Search tabs.

Contacting Technical Support

Progress DataDirect offers a variety of options to meet your support needs. Please visit our Web site for more details and for contact information: https://www.progress.com/support

The Progress DataDirect Web site provides the latest support information through our global service network. The SupportLink program provides access to support contact details, tools, patches, and valuable information, including a list of FAQs for each product. In addition, you can search our Knowledgebase for technical bulletins and other information.

When you contact us for assistance, please provide the following information:

· Your contract number or the serial number that corresponds to the product for which you are seeking support, or a case number if you have been provided one for your issue. If you do not have a SupportLink contract, the SupportLink representative assisting you will connect you with our Sales team.
· Your name, phone number, email address, and organization. For a first-time call, you may be asked for full information, including location.
· The Progress DataDirect product and the version that you are using.
· The type and version of the operating system where you have installed your product.
· Any database, database version, third-party software, or other environment information required to understand the problem.
· A brief description of the problem, including, but not limited to, any error messages you have received, what steps you followed prior to the initial occurrence of the problem, any trace logs capturing the issue, and so on. Depending on the complexity of the problem, you may be asked to submit an example or reproducible application so that the issue can be re-created.
· A description of what you have attempted to resolve the issue. If you have researched your issue on Web search engines, our Knowledgebase, or have tested additional configurations, applications, or other vendor products, you will want to carefully note everything you have already attempted.
· A simple assessment of how the severity of the issue is impacting your organization.

October 2020, Release 8.0.0 for the Progress DataDirect for ODBC for Apache Cassandra Driver, Version 0001


Getting Started

This chapter provides basic information about configuring your driver immediately after installation and testing your connection. To take full advantage of the features of the driver, read "About the Apache Cassandra Driver" and "Using the Driver."

Information that the driver needs to connect to a database is stored in a data source. The ODBC specification describes three types of data sources: user data sources, system data sources (not a valid type on UNIX/Linux), and file data sources. On Windows, user and system data sources are stored in the registry of the local computer. The difference is that only a specific user can access user data sources, whereas any user of the machine can access system data sources. On Windows, UNIX, and Linux, file data sources, which are simply text files, can be stored locally or on a network computer, and are accessible to other machines.

When you define and configure a data source, you store default connection values for the driver that are used each time you connect to a particular database. You can change these defaults by modifying the data source.
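Since a file data source is simply a text file, a minimal sketch of one is shown below. This is an illustration only, not taken from an installed product: the driver name follows the sample data source definition used later in this guide, and the host, port, and keyspace values are placeholders.

```
[ODBC]
DRIVER=DataDirect 8.0 Apache Cassandra Driver
HostName=CassandraServer
PortNumber=9042
KeyspaceName=CassandraKeyspace2
```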

For details, see the following topics:

· Configuring and Connecting on Windows · Configuring and Connecting on UNIX and Linux · Accessing Data With Third-Party Applications


Configuring and Connecting on Windows

The following basic information enables you to configure a data source and test connect with a driver immediately after installation. On Windows, you can configure and modify data sources through the ODBC Administrator using a driver Setup dialog box. Default connection values are specified through the options on the tabs of the Setup dialog box and are stored either as a user or system data source in the Windows Registry, or as a file data source in a specified location.

Setting the Library Path Environment Variable (Windows)

Before you can use your driver, you must set the PATH environment variable to include the path of the jvm.dll file of your Java™ Virtual Machine (JVM).

Note: The installer program sets the PATH environment variable to include the path of the jvm.dll file by default.
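If you ever need to set the variable yourself, appending the directory that contains jvm.dll to PATH from a Command Prompt might look like the following sketch. The JVM path shown is an assumption; substitute the actual location of jvm.dll for your installed JVM.

```
set PATH=%PATH%;C:\Program Files\Java\jre1.8.0\bin\server
```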

Configuring a Data Source

To configure a data source:

1. From the Progress DataDirect program group, start the ODBC Administrator and click either the User DSN, System DSN, or File DSN tab to display a list of data sources.

· User DSN: If you installed a default DataDirect ODBC user data source as part of the installation, select the appropriate data source name and click Configure to display the driver Setup dialog box.

  If you are configuring a new user data source, click Add to display a list of installed drivers. Select the appropriate driver and click Finish to display the driver Setup dialog box.

· System DSN: To configure a new system data source, click Add to display a list of installed drivers. Select the appropriate driver and click Finish to display the driver Setup dialog box.

· File DSN: To configure a new file data source, click Add to display a list of installed drivers. Select the driver and click Advanced to specify attributes; otherwise, click Next to proceed. Specify a name for the data source and click Next. Verify the data source information; then, click Finish to display the driver Setup dialog box.

The General tab of the Setup dialog box appears by default.

Note: The General tab displays only fields that are required for creating a data source. The fields on all other tabs are optional, unless noted otherwise in this book.

2. On the General tab, provide the following information; then, select the Schema Map tab.

   Data Source Name: Type a string that identifies this data source configuration, such as Accounting.

   Host Name: Type the URL of the interface to which you want to connect.

   Port Number: Type the port number of the server listener. The default is 9042.


Keyspace Name: Type the name of the keyspace to which you want to connect. The default is system.

3. On the Schema Map tab, provide the following information; then, click Apply:

   Schema Map: Type the name and location of the configuration file where the relational map of native data is written. The driver either creates or looks for this file when connecting to the database. For example: C:\Users\Default\AppData\Local\Progress\DataDirect\Cassandra_Schema\MainServer.config

   The default value is: application_data_folder\Local\Progress\DataDirect\Cassandra_Schema\host_name.config

   where:

application_data_folder

is the name and location of the application data folder.

host_name

is the value specified for the Host Name connection option.

See also Schema Map on page 126

Testing the Connection

To test the connection:

1. After you have configured the data source, you can click Test Connect on the Setup dialog box to attempt to connect to the data source using the connection options specified in the dialog box. The driver returns a message indicating success or failure. A logon dialog box appears as described in "Using a Logon Dialog Box."

2. Supply the requested information in the logon dialog box and click OK. Note that the information you enter in the logon dialog box during a test connect is not saved.

   · If the driver can connect, it releases the connection and displays a Connection Established message. Click OK.
   · If the driver cannot connect because of an incorrect environment or connection value, it displays an appropriate error message. Click OK.

3. On the driver Setup dialog box, click OK. The values you have specified are saved and are the defaults used when you connect to the data source. You can change these defaults by using the previously described procedure to modify your data source. You can override these defaults by connecting to the data source using a connection string with alternate values. See "Using a Connection String" for information about using connection strings.
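As a sketch of such an override, a connection string for this driver might look like the following. The keywords shown correspond to the driver's connection options; the DSN, host, keyspace, and credential values are placeholders, not values from a real installation.

```
DSN=Accounting;HostName=CassandraServer;PortNumber=9042;KeyspaceName=CassandraKeyspace2;UID=jsmith;PWD=secret
```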

See also Using a Logon Dialog Box on page 75 Using a Connection String on page 74


Configuring and Connecting on UNIX and Linux

The following basic information enables you to configure a data source and test connect with a driver immediately after installation. See "Configuring and Connecting to Data Sources" for detailed information about configuring the UNIX and Linux environments and data sources.

Note: In the following examples, xx in a driver filename represents the driver level number.

See also Configuring and Connecting to Data Sources on page 51

Environment Configuration

To configure the environment:

1. Check your permissions: You must log in as a user with full r/w/x permissions recursively on the entire product installation directory.

2. From your login shell, determine which shell you are running by executing:

   echo $SHELL

3. Run one of the following product setup scripts from the installation directory to set variables: odbc.sh or odbc.csh. For Korn, Bourne, and equivalent shells, execute odbc.sh. For a C shell, execute odbc.csh. After running the setup script, execute:

   env

   to verify that the installation_directory/lib directory has been added to your shared library path.

4. Set the ODBCINI environment variable. The variable must point to the path from the root directory to the system information file where your data source resides. The system information file can have any name, but the product is installed with a default file called odbc.ini in the product installation directory. For example, if you use an installation directory of /opt/odbc and the default system information file, from the Korn or Bourne shell, you would enter:

   ODBCINI=/opt/odbc/odbc.ini; export ODBCINI

   From the C shell, you would enter:

   setenv ODBCINI /opt/odbc/odbc.ini
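Putting steps 3 and 4 together, a Bourne-shell session might look like the following sketch. The install directory is an assumption, and the sourcing of odbc.sh is shown commented out because that script exists only inside the product installation directory.

```shell
# Assumed install directory; adjust to your environment.
ODBCHOME=/opt/odbc

# Source the product setup script (Korn/Bourne shells; use odbc.csh from a C shell).
# Commented out here because the script ships inside the product directory:
# . "$ODBCHOME/odbc.sh"

# Point ODBCINI at the system information file:
ODBCINI="$ODBCHOME/odbc.ini"
export ODBCINI
echo "ODBCINI=$ODBCINI"
```

After this, `env` should show ODBCINI among your exported variables.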

Test Loading the Driver

The ivtestlib (32-bit drivers) and ddtestlib (64-bit drivers) test loading tools are provided to test load drivers and help diagnose configuration problems in the UNIX and Linux environments, such as environment variables not correctly set or missing database client components. This tool is installed in the /bin subdirectory in the product installation directory. It attempts to load a specified ODBC driver and prints out all available error information if the load fails.


For example, if the drivers are installed in /opt/odbc/lib, the following command attempts to load the 32-bit driver on Solaris, where xx represents the version number of the driver:

ivtestlib /opt/odbc/lib/ivcsndrxx.so

Note: On Solaris, AIX, and Linux, the full path to the driver does not have to be specified for the tool. The HP-UX version, however, requires the full path.

If the load is successful, the tool returns a success message along with the version string of the driver. If the driver cannot be loaded, the tool returns an error message explaining why.

Setting the Library Path Environment Variable (UNIX and Linux)

Before you can use the driver for Apache Cassandra, you must set the library path environment variable for your UNIX/Linux operating system to include the directory containing your JVM's libjvm.so [sl | a] file, and that directory's parent directory.

Note for HP-UX: You also must set the LD_PRELOAD environment variable to the fully qualified path of the libjvm.so.

32-bit Driver: Library Path Environment Variable

Set the library path environment variable to include the directory containing your 32-bit JVM's libjvm.so [sl | a] file, and that directory's parent directory:

· LD_LIBRARY_PATH on Solaris, Linux, and HP-UX (Itanium)
· SHLIB_PATH on HP PA-RISC
· LIBPATH on AIX

64-bit Driver: Library Path Environment Variable

Set the library path environment variable to include the directory containing your 64-bit JVM's libjvm.so [sl | a] file, and that directory's parent directory:

· LD_LIBRARY_PATH on Solaris, HP-UX (Itanium), and Linux
· LIBPATH on AIX

Configuring a Data Source in the System Information File

The default odbc.ini file installed in the installation directory is a template in which you create data source definitions on UNIX and Linux platforms. You enter your site-specific database connection information using a text editor. Each data source definition must include the Driver= keyword, whose value is the full path to the driver. The following examples show the minimum connection string options that must be set to complete a test connection, where xx represents iv for 32-bit or dd for 64-bit drivers, and yy represents the extension. The values for the options are samples and are not necessarily the ones you would use.

[ODBC Data Sources]
Apache Cassandra=DataDirect 8.0 Apache Cassandra Driver

[Apache Cassandra]
Driver=ODBCHOME/lib/xxcsndr28.yy
HostName=CassandraServer
KeyspaceName=CassandraKeyspace2


PortNumber=9042
SchemaMap=/home/users/jsmith/Progress/DataDirect/Cassandra_Schema/CassandraServer.config

Connection Option Descriptions:

HostName: Either the name or the IP address of the server to which you want to connect.

KeyspaceName: The name of the keyspace to which you want to connect. The default is system.

PortNumber: The port number of the server listener. The default is 9042.

SchemaMap: The name and location of the configuration file where the relational map of native data is written. The driver either creates or looks for this file when connecting to the database. The default is users_home_directory/Progress/DataDirect/Cassandra_Schema/host_name.config, where:

users_home_directory

is the user's home directory.

host_name

is the value specified for the HostName connection option.

See "Schema Map" for details.

See also Schema Map on page 126

Testing the Connection

The driver installation includes an ODBC application called example that can be used to connect to a data source and execute SQL. The application is located in the installation_directory/samples/example directory. To run the program after setting up a data source in the odbc.ini file, enter example and follow the prompts to enter your data source name, user name, and password. If the connection is successful, a SQL> prompt appears and you can type SQL statements such as SELECT * FROM table. If example is unable to connect, the appropriate error message is returned.

Accessing Data With Third-Party Applications

For procedures related to accessing data with common third-party applications, such as Tableau and Excel, refer to the Quick Start that corresponds to your data source and platform at https://www.progress.com/documentation/datadirect-connectors.


What Is ODBC?

The Open Database Connectivity (ODBC) interface from Microsoft allows applications to access data in database management systems (DBMSs) using SQL as a standard for accessing the data. ODBC permits maximum interoperability, which means a single application can access different DBMSs. Application end users can then add ODBC database drivers to link the application to their choice of DBMS. The ODBC interface defines:

· A library of ODBC function calls of two types:
· Extended functions that support additional functionality, including scrollable cursors
· Core functions that are based on the X/Open and SQL Access Group Call Level Interface specification

· SQL syntax based on the X/Open and SQL Access Group SQL CAE specification (1992)
· A standard set of error codes
· A standard way to connect and log on to a DBMS
· A standard representation for data types

The ODBC solution for accessing data led to ODBC database drivers, which are dynamic-link libraries on Windows and shared objects on UNIX and Linux. These drivers allow an application to gain access to one or more data sources. ODBC provides a standard interface to allow application developers and vendors of database drivers to exchange data between applications and data sources.

For details, see the following topics:

· How Does It Work?
· Why Do Application Developers Need ODBC?


How Does It Work?

The ODBC architecture has four components:

· An application, which processes and calls ODBC functions to submit SQL statements and retrieve results
· A Driver Manager, which loads drivers for the application
· A driver, which processes ODBC function calls, submits SQL requests to a specific data source, and returns results to the application
· A data source, which consists of the data to access and its associated operating system, DBMS, and network platform (if any) used to access the DBMS

The following figure shows the relationship among the four components:

Why Do Application Developers Need ODBC?

Using ODBC, you, as an application developer, can develop, compile, and ship an application without targeting a specific DBMS. In this scenario, you do not need to use embedded SQL; therefore, you do not need to recompile the application for each new environment.


About the Apache Cassandra Driver

The Progress DataDirect for ODBC for Apache Cassandra driver supports read-write access to Apache Cassandra versions 2.0 and higher. To support SQL access to Cassandra, the driver creates a relational map of native Cassandra data and translates SQL statements to CQL. The Cassandra complex types Map, List, Set, Tuple, and user-defined types are supported alongside primitive CQL types. When executing joins, the driver leverages data relationships among Cassandra objects to minimize the amount of data that needs to be fetched over the network. Relationships among objects can be reported with the following metadata methods: SQLColumns, SQLForeignKeys, SQLGetTypeInfo, SQLPrimaryKeys, SQLSpecialColumns, SQLStatistics, and SQLTables.

For details, see the following topics:

· Driver Requirements
· Support for Multiple Environments
· ODBC Compliance
· Version String Information
· Data Types
· Complex Type Normalization
· Isolation and Lock Levels Supported
· Binding Parameter Markers


Driver Requirements

The driver requires a Java Virtual Machine (JVM): J2SE 6 or higher.

Support for Multiple Environments

Your Progress DataDirect driver is ODBC-compliant for Windows, UNIX, and Linux operating systems. This section explains the environment-specific differences when using the database drivers in your operating environment. The sections "Support for Windows Environments" and "Support for UNIX and Linux Environments" contain information specific to your operating environment. The following sections refer to threading models. See "Threading" for an explanation of threading.

Note: Support for operating environments and database versions is continually being added. For the latest information about supported platforms and databases, refer to the Progress DataDirect certification matrices page at https://www.progress.com/matrices/datadirect.

See also
Support for Windows Environments on page 26
Support for UNIX and Linux Environments on page 28
Threading on page 211

Support for Windows Environments

The following are requirements for the 32- and 64-bit drivers on Windows operating systems.

32-Bit Driver

· All required network software that is supplied by your database system vendors must be 32-bit compliant.
· If your application was built with 32-bit system libraries, you must use the 32-bit driver. If your application was built with 64-bit system libraries, you must use the 64-bit driver (see "64-Bit Driver"). The database to which you are connecting can be either 32-bit or 64-bit enabled.
· The following processors are supported:
· x86: Intel
· x64: Intel and AMD

· The following operating systems are supported for your Progress DataDirect for ODBC driver. All editions are supported unless otherwise noted.

· Windows Server 2016
· Windows Server 2012
· Windows Server 2008
· Windows 10
· Windows 8.1
· Windows 7

· A 32-bit Java Virtual Machine (JVM), J2SE 6 or higher, is required. Also, you must set the PATH environment variable to the directory containing your 32-bit JVM's jvm.dll file, and that directory's parent directory.
· An application that is compatible with components that were built using the Microsoft Visual Studio 2010 compiler and the standard Win32 threading model.
· You must have ODBC header files to compile your application. For example, Microsoft Visual Studio includes these files.

See also 64-Bit Driver on page 27

64-Bit Driver

· All required network software that is supplied by your database system vendors must be 64-bit compliant.
· The following processors are supported:
· Intel
· AMD

· The following operating systems are supported for your 64-bit driver. All editions are supported unless otherwise noted.

· Windows Server 2016
· Windows Server 2012
· Windows Server 2008
· Windows 10
· Windows 8.1
· Windows 7

· An application that is compatible with components that were built using Microsoft C/C++ Optimizing Compiler Version 14.00.40310.41 and the standard Windows 64 threading model.
· A 64-bit JVM, J2SE 6 or higher, is required. Also, you must set the PATH environment variable to the directory containing your 64-bit JVM's jvm.dll file, and that directory's parent directory.
· You must have ODBC header files to compile your application. For example, Microsoft Visual Studio includes these files.


Setup of the Driver

The driver must be configured before it can be used. See "Getting Started" for information about using the Windows ODBC Administrator. See "Configuring and Connecting to Data Sources" for details about driver configuration.

See also
Getting Started on page 17
Configuring and Connecting to Data Sources on page 51

Driver File Names for Windows

The prefix for all 32-bit driver file names is iv. The prefix for all 64-bit driver file names is dd. The file extension is .dll, which indicates dynamic-link libraries. For example, the 32-bit Apache Cassandra driver file name is ivcsndrnn.dll, where nn is the revision number of the driver.

For the 8.0 version of the 32-bit driver, the file name is: ivcsndr28.dll

For the 8.0 version of the 64-bit driver, the file name is: ddcsndr28.dll

Refer to the readme file shipped with the product for a complete list of installed files.

Support for UNIX and Linux Environments

The following are requirements for the 32- and 64-bit drivers on UNIX/Linux operating systems.

32-Bit Driver

· All required network software that is supplied by your database system vendors must be 32-bit compliant.
· If your application was built with 32-bit system libraries, you must use 32-bit drivers. If your application was built with 64-bit system libraries, you must use 64-bit drivers (see 64-Bit Drivers on page 30). The database to which you are connecting can be either 32-bit or 64-bit enabled.
· For the driver for Apache Cassandra: A 32-bit Java Virtual Machine (JVM), J2SE 6 or higher, is required. Also, you must set the library path environment variable of your operating system to the directory containing your JVM's libjvm.so [sl | a] file and that directory's parent directory. The library path environment variable is:

· LD_LIBRARY_PATH on Linux, HP-UX Itanium, and Oracle Solaris
· SHLIB_PATH on HP-UX PA-RISC
· LIBPATH on AIX

AIX
· IBM POWER processor
· AIX 5L operating system, version 5.3 fixpack 5 and higher, 6.1, and 7.1


· An application compatible with components that were built using Visual Age C++ 6.0.0.0 and the AIX native threading model
· Before you can use the driver, you must set the LIBPATH environment variable to include the paths containing the libjvm.so library and the libnio.so library, which are installed in a subdirectory of your Java Development Kit (JDK). For example, you would add the following paths for Java 6 installed in the /usr directory:

:/usr/java6/jre/lib/ppc/classic:/usr/java6/jre/lib/ppc

In this example, /usr/java6/jre/lib/ppc/classic is the location of libjvm.so, while /usr/java6/jre/lib/ppc is the location of libnio.so.

Note: The driver is compiled using the -brtl loader option. Your application must support runtime linking functionality.

HP-UX
· The following processors are supported:
· PA-RISC
· Intel Itanium II (IPF)

· The following operating systems are supported:
· For PA-RISC: HP-UX 11i Versions 2 and 3 (B.11.23 and B.11.3x)
· For IPF: HP-UX IPF 11i Versions 2 and 3 (B.11.23 and B.11.3x)

· For PA-RISC: An application compatible with components that were built using HP aC++ 3.60 and the HP-UX 11 native (kernel) threading model (posix draft 10 threads). All of the standard 32-bit UNIX drivers are supported on HP PA-RISC.

· For IPF: An application compatible with components that were built using HP aC++ 5.36 and the HP-UX 11 native (kernel) threading model (posix draft 10 threads)

Note: For PA-RISC users: Set the LD_PRELOAD environment variable to the libjvm.sl from your JVM installation.

Note: For Itanium users:
· Do not link with the -lc linker option.
· Set the LD_PRELOAD environment variable to the libjvm.so from your JVM installation.

Linux
· The following processors are supported:
· x86: Intel
· x64: Intel and AMD

· The following operating systems are supported:
· CentOS Linux 4.x, 5.x, 6.x, and 7.x


· Debian Linux 7.11 and 8.5
· Oracle Linux 4.x, 5.x, 6.x, and 7.x
· Red Hat Enterprise Linux AS, ES, and WS version 4.x, 5.x, 6.x, and 7.x
· SUSE Linux Enterprise Server 10.x and 11.x
· Ubuntu Linux 14.04 and 16.04

· An application compatible with components that were built using g++ GNU project C++ Compiler version 3.4.6 and the Linux native pthread threading model (Linuxthreads).

Oracle Solaris
· The following processors are supported:
· Oracle SPARC
· x86: Intel
· x64: Intel and AMD

· The following operating systems are supported:
· For Oracle SPARC: Oracle Solaris 8, 9, 10, and 11.x
· For x86/x64: Oracle Solaris 10 and Oracle Solaris 11.x

· For Oracle SPARC: An application compatible with components that were built using Sun Studio 11, C++ compiler version 5.8 and the Solaris native (kernel) threading model
· For x86/x64: An application compatible with components that were built using Oracle C++ 5.8 and the Solaris native (kernel) threading model

See also 64-Bit Drivers on page 30

64-Bit Drivers

All required network software that is supplied by your database system vendors must be 64-bit compliant.

· For the driver for Apache Cassandra: A 64-bit Java Virtual Machine (JVM), J2SE 6 or higher, is required. Also, you must set the library path environment variable of your operating system to the directory containing your JVM's libjvm.so [sl | a] file and that directory's parent directory. The library path environment variable is:
· LD_LIBRARY_PATH on Linux, HP-UX Itanium, and Oracle Solaris
· LIBPATH on AIX

AIX
· IBM POWER processor
· AIX 5L operating system, version 5.3 fixpack 5 and higher, 6.1, and 7.1
· An application compatible with components that were built using Visual Age C++ 6.0.0.0 and the AIX native threading model


· Before you can use the driver, you must set the LIBPATH environment variable to include the paths containing the libjvm.so library and the libnio.so library, which are installed in a subdirectory of your Java Development Kit (JDK). For example, you would add the following paths for Java 6 installed in the /usr directory:

:/usr/java6_64/jre/lib/ppc64/classic:/usr/java6_64/jre/lib/ppc64

In this example, /usr/java6_64/jre/lib/ppc64/classic is the location of libjvm.so, while /usr/java6_64/jre/lib/ppc64 is the location of libnio.so.

Note: The driver is compiled using the -brtl loader option. Your application must support runtime linking functionality.

HP-UX
· HP-UX IPF 11i operating system, Versions 2 and 3 (B.11.23 and B.11.31)
· HP aC++ v. 5.36 and the HP-UX 11 native (kernel) threading model (posix draft 10 threads)

Note: Do not link with the -lc linker option.

Note: Set the LD_PRELOAD environment variable to the libjvm.so of your JVM installation.

Linux
· The following processors are supported:
· x64: Intel and AMD

· The following operating systems are supported:
· CentOS Linux 4.x, 5.x, 6.x, and 7.x
· Debian Linux 7.11 and 8.5
· Oracle Linux 4.x, 5.x, 6.x, and 7.x
· Red Hat Enterprise Linux AS, ES, and WS version 4.x, 5.x, 6.x, and 7.x
· SUSE Linux Enterprise Server 10.x, 11.x, and 12
· Ubuntu Linux 14.04 and 16.04

· An application compatible with components that were built using g++ GNU project C++ Compiler version 3.4 and the Linux native pthread threading model (Linuxthreads)

Oracle Solaris
· The following processors are supported:
· Oracle SPARC
· x64: Intel and AMD

· The following operating systems are supported:
· For Oracle SPARC: Oracle Solaris 8, 9, 10, and 11.x


· For x64: Oracle Solaris 10 and Oracle Solaris 11.x Express

· For Oracle SPARC: An application compatible with components that were built using Sun Studio 11, C++ compiler version 5.8 and the Solaris native (kernel) threading model
· For x64: An application compatible with components that were built using Oracle C++ Compiler version 5.8 and the Solaris native (kernel) threading model

AIX

If you are building 64-bit binaries, you must pass the define ODBC64. The example application provides a demonstration of this; see the installed file example.txt for details. You must also include the correct compiler switches if you are building 64-bit binaries. For instance, to build example, you would use:

xlC_r -DODBC64 -q64 -qlonglong -qlongdouble -qvftable -o example -I../include example.c -L../lib -lc_r -lC_r -lodbc

HP-UX 11 aCC

The ODBC drivers require certain runtime library patches. The patch numbers are listed in the readme file for your product. HP-UX patches are publicly available from the HP Web site http://www.hp.com. HP updates the patch database regularly; therefore, the patch numbers in the readme file may be superseded by newer versions. If you search for the specified patch on an HP site and receive a message that the patch has been superseded, download and install the replacement patch. If you are building 64-bit binaries, you must pass the define ODBC64. The example application provides a demonstration of this; see the installed file example.txt for details. You must also include the +DD64 compiler switch if you are building 64-bit binaries. For instance, to build example, you would use:

aCC -Wl,+s +DD64 -DODBC64 -o example -I../include example.c -L../lib -lodbc

Linux

If you are building 64-bit binaries, you must pass the define ODBC64. The example application provides a demonstration of this; see the installed file example.txt for details. You must also include the correct compiler switches if you are building 64-bit binaries. For instance, to build example, you would use:

g++ -o example -DODBC64 -I../include example.c -L../lib -lodbc -lodbcinst -lc

Oracle Solaris

If you are building 64-bit binaries, you must pass the define ODBC64. The example application provides a demonstration of this; see the installed file example.txt for details. You must also include the -xarch=v9 compiler switch if you are building 64-bit binaries. For instance, to build example, you would use:

CC -mt -DODBC64 -xarch=v9 -o example -I../include example.c -L../lib -lodbc -lCrun


Setup of the Environment and the Drivers

On UNIX and Linux, several environment variables and the system information file must be configured before the drivers can be used. See the following topics for additional information:

· Configuring and Connecting on UNIX and Linux contains a brief description of these variables.
· Configuring and Connecting to Data Sources on page 51 provides details about driver configuration.
· Configuring the Product on UNIX/Linux provides complete information about using the drivers on UNIX and Linux.

Driver Names for UNIX and Linux

The drivers are ODBC API-compliant dynamic link libraries, referred to in UNIX and Linux as shared objects. The prefix for all 32-bit driver file names is iv. The prefix for all 64-bit driver file names is dd. The driver file names are lowercase and the extension is .so, the standard form for a shared object. For example, the 32-bit driver file name is ivcsndrnn.so, where nn is the revision number of the driver. For the driver on HP-UX PA-RISC only, the extension is .sl, for example, ivcsndrnn.sl.

For the 8.0 version of the 32-bit driver, the file name is: ivcsndr28.so

For the 8.0 version of the 64-bit driver, the file name is: ddcsndr28.so

Refer to the readme file shipped with the product for a complete list of installed files.

ODBC Compliance

The Progress DataDirect for ODBC for Apache Cassandra driver is compliant with the Open Database Connectivity (ODBC) 3.52 specification. The driver is ODBC core compliant and supports some Level 1 and Level 2 features. See "ODBC API and Scalar Functions" for a list of the Core, Level 1, and Level 2 functions supported by the driver. The driver supports only the following Level 2 functions:

· SQLColumnPrivileges
· SQLDescribeParam
· SQLForeignKeys
· SQLPrimaryKeys
· SQLProcedures
· SQLTablePrivileges

See also
ODBC API and Scalar Functions on page 169


Version String Information

The driver for Apache Cassandra has a version string of the format:

XX.YY.ZZZZ(BAAAA, UBBBB, JDDDDDD)

The Driver Manager on UNIX and Linux has a version string of the format:

XX.YY.ZZZZ(UBBBB)

The component for the Unicode conversion tables (ICU) has a version string of the format:

XX.YY.ZZZZ

where:

XX is the major version of the product.
YY is the minor version of the product.
ZZZZ is the build number of the driver or ICU component.
AAAA is the build number of the driver's bas component.
BBBB is the build number of the driver's utl component.
DDDDDD is the version of the Java components used by the driver.

For example:

08.00.0017 (B0254, U0180, J000109)

Here, 08.00.0017 is the driver version, B0254 is the build number of the bas component, U0180 is the build number of the utl component, and J000109 is the version of the Java components.

On Windows, you can check the version string through the properties of the driver DLL. Right-click the driver DLL and select Properties. The Properties dialog box appears. On the Details tab, the File Version will be listed with the other file properties. You can always check the version string of a driver on Windows by looking at the About tab of the driver's Setup dialog.

On UNIX and Linux, you can check the version string by using the test loading tool shipped with the product. This tool, ivtestlib for 32-bit drivers or ddtestlib for 64-bit drivers, is located in install_directory/bin. The syntax for the tool is:

ivtestlib shared_object

or

ddtestlib shared_object

For example, for the 32-bit driver on Oracle Solaris:

ivtestlib ivcsndr28.so


returns:

08.00.0001 (B0002, U0001, J000003)

For example, for the Driver Manager on Solaris:

ivtestlib libodbc.so

returns:

08.00.0001 (U0001)

For example, for the 64-bit Driver Manager on Solaris:

ddtestlib libodbc.so

returns:

08.00.0001 (U0001)

For example, for the 32-bit ICU component on Solaris:

ivtestlib libivicu28.so

returns:

08.00.0001

Note: On AIX, Linux, and Solaris, the full path to the driver does not have to be specified for the test loading tool. The HP-UX version of the tool, however, requires the full path.

getFileVersionString Function

Version string information can also be obtained programmatically through the function getFileVersionString. This function can be used when the application is not directly calling ODBC functions. This function is defined as follows and is located in the driver's shared object:

const unsigned char* getFileVersionString();

This function is prototyped in the qesqlext.h file shipped with the product.

Data Types

The following table lists the Apache Cassandra data types and their default mapping for ODBC.

Table 1: Default Mapping for Apache Cassandra Data Types

Apache Cassandra Data Type ODBC Data Type

ASCII1 SQL_VARCHAR (12)

Bigint SQL_BIGINT (-5)

Blob SQL_LONGVARBINARY (-4)

1 ASCII precision is set to 4000 by default, but you can use the AsciiSize connection option to configure Ascii precision. See "Ascii Size" for details.



Boolean SQL_BIT (-7)

Counter2 SQL_BIGINT (-5)

Date SQL_TYPE_DATE (91)

Decimal3 SQL_DECIMAL (3)

Double SQL_DOUBLE (8)

Duration4 SQL_VARCHAR (12)

Float SQL_REAL (7)

Inet SQL_VARCHAR (12)

Int SQL_INTEGER (4)

List SQL_WLONGVARCHAR (-10)

Map SQL_WLONGVARCHAR (-10)

Set SQL_WLONGVARCHAR (-10)

Smallint SQL_SMALLINT (5)

Time SQL_TYPE_TIME (92)

Timestamp SQL_TYPE_TIMESTAMP (93)

TimeUUID SQL_CHAR (1)

TinyInt SQL_TINYINT (-6)

Tuple SQL_WLONGVARCHAR (-10)

Usertype SQL_WLONGVARCHAR (-10)

UUID SQL_CHAR (1)

Varchar5 SQL_WVARCHAR (9)

Varint6 SQL_DECIMAL (3)

2 Update is supported for Counter columns when all the other columns in the row comprise that row's primary key. See "Update" for details.
3 By default, Decimal precision is set to 38 and scale is set to 10; however, you can use the DecimalPrecision and DecimalScale connection options to configure Decimal precision and scale. See "Decimal Precision" and "Decimal Scale" for details.
4 Currently, the Duration type is supported only in simple columns, and not in Collection types.
5 Varchar precision is set to 4000 by default, but you can use the VarcharSize connection option to configure Varchar precision. See "Varchar Size" for details.
6 Varint precision is set to 38 by default, but you can use the VarintPrecision connection option to configure Varint precision. See "Varint Precision" for details.


See also
Ascii Size on page 105
Update on page 148
Decimal Precision on page 110
Decimal Scale on page 110
Varchar Size on page 130
Varint Precision on page 130

Retrieving Data Type Information

At times, you might need to get information about the data types that are supported by the data source, for example, precision and scale. You can use the ODBC function SQLGetTypeInfo to do this. On Windows, you can use ODBC Test to call SQLGetTypeInfo against the ODBC data source to return the data type information. See "Diagnostic Tools" for details about ODBC Test. On UNIX, Linux, or Windows, an application can call SQLGetTypeInfo directly. Here is an example of a C function that calls SQLGetTypeInfo and retrieves the information in the form of a SQL result set.

void ODBC_GetTypeInfo(SQLHANDLE hstmt, SQLSMALLINT dataType)
{
    RETCODE rc;

    // There are 19 columns returned by SQLGetTypeInfo.
    // This example displays the first 3.
    // Check the ODBC 3.x specification for more information.

    // Variables to hold the data from each column
    char        typeName[30];
    short       sqlDataType;
    SQLUINTEGER columnSize;

    // Length/indicator variables must be SQLLEN for 64-bit portability.
    SQLLEN strlenTypeName, strlenSqlDataType, strlenColumnSize;

    rc = SQLGetTypeInfo(hstmt, dataType);
    if (rc == SQL_SUCCESS) {
        // Bind the columns returned by the SQLGetTypeInfo result set.
        rc = SQLBindCol(hstmt, 1, SQL_C_CHAR, typeName,
                        sizeof(typeName), &strlenTypeName);
        rc = SQLBindCol(hstmt, 2, SQL_C_SHORT, &sqlDataType,
                        sizeof(sqlDataType), &strlenSqlDataType);
        rc = SQLBindCol(hstmt, 3, SQL_C_LONG, &columnSize,
                        sizeof(columnSize), &strlenColumnSize);

        // Print column headings
        printf("TypeName                       DataType   ColumnSize\n");
        printf("------------------------------ ---------- ----------\n");

        do {
            // Fetch the results from executing SQLGetTypeInfo
            rc = SQLFetch(hstmt);
            if (rc == SQL_ERROR) {
                // Procedure to retrieve errors from the SQLGetTypeInfo function
                ODBC_GetDiagRec(SQL_HANDLE_STMT, hstmt);
                break;
            }

            // Print the results
            if ((rc == SQL_SUCCESS) || (rc == SQL_SUCCESS_WITH_INFO)) {
                printf("%-30s %10i %10u\n",
                       typeName, sqlDataType, (unsigned int) columnSize);
            }
        } while (rc != SQL_NO_DATA);
    }
}

See "Data Types" for information about how a database's data types map to the standard ODBC data types.

See also
Diagnostic Tools on page 85
Data Types on page 35

Complex Type Normalization

To support SQL access to Apache Cassandra, the driver maps the Cassandra data model to a relational schema. This process involves the normalization of complex types. You may need to be familiar with the normalization of complex types to formulate SQL queries correctly. The driver handles the normalization of complex types in the following manner:

· If collection types (Map, List, and Set) are discovered, the driver normalizes the Cassandra table into a set of parent-child tables. Primitive types are mapped to a parent table, while each collection type is mapped to a child table that has a foreign key relationship to the parent table.
· Non-nested Tuple and user-defined types (also referred to as Usertype) are flattened into a parent table alongside primitive types.
· Any nested complex types (Tuple, user-defined types, Map, List, and Set) are exposed as JSON-style strings in the parent table.

The normalization of complex types is described in greater detail in the following topics.

Collection Types

Cassandra collection types include the Map, List, and Set types. If collection types are discovered, the driver normalizes the native data into a set of parent-child tables. Primitive types are normalized in a parent table, while each collection type is normalized in a child table that has a foreign key relationship to the parent table. Take for example the following Cassandra table:

CREATE TABLE employee (
  empid int PRIMARY KEY,
  phone map<varchar, varint>,
  client list<varchar>,
  review set<date>);

The following employee table is a tabular representation of the native Cassandra table with data included. In this example, four distinct relational tables are created. A parent table is created based on the empid column, and a child table is created for each of the three collection types (Map, List, and Set).


Table 2: employee (native)

empid           phone                   client     review
(primary key)
int             map                     list       set
103             home: 2855551122        Li         2013-12-07
                mobile: 2855552347      Kumar      2015-01-22
                office: 2855555566      Jones      2016-01-10
                spouse: 2855556782
105             home: 2855555678        Yanev      2015-01-22
                mobile: 2855553335      Bishop     2016-01-12
                office: 2855555462      Bogdanov

The Parent Table

The parent table comprises the primitive integer type column empid and takes its name from the native table. A SQL statement would identify the column as employee.empid.

Table 3: employee (relational parent)

empid (primary key)

int

103

105

A SQL insert on the employee parent table would take the form:

INSERT INTO employee (empid) VALUES (107)

The Map Child Table

The Map collection is normalized into a three-column child table called employee_phone. The name of the table is formulated by concatenating the name of the native table and the name of the Map column. A foreign key relationship to the parent table is maintained via the employee_empid column, and the Map's key-value pairs are resolved into separate keycol and valuecol columns. In a SQL statement, these columns would be identified as employee_phone.employee_empid, employee_phone.keycol, and employee_phone.valuecol, respectively.


Table 4: employee_phone (relational child of the map column)

employee_empid    keycol     valuecol
(foreign key)
int               varchar    varint

103 home 2855551122

103 mobile 2855552347

103 office 2855555566

103 spouse 2855556782

105 home 2855555678

105 mobile 2855553335

105 office 2855555462

A SQL insert on the employee_phone child table would take the form:

INSERT INTO employee_phone (employee_empid, keycol, valuecol) VALUES (107, 'mobile', 2855552391)

The List Child Table

The List collection is normalized into a three-column child table called employee_client. The name of the table is formulated by concatenating the name of the native table and the name of the List column. A foreign key relationship to the parent table is maintained via the employee_empid column; the order of the elements in the List is maintained via the current_list_index column; and the elements themselves are contained in the client column. SQL statements would identify these columns as employee_client.employee_empid, employee_client.current_list_index, and employee_client.client, respectively.

Table 5: employee_client (relational child of the list column)

employee_empid    current_list_index    client
(foreign key)
int               int                   varchar

103 0 Li

103 1 Kumar

103 2 Jones

Note: The driver supports an insert on a child table prior to an insert on a parent table, circumventing referential integrity constraints associated with traditional RDBMSs. To maintain integrity between parent and child tables, it is recommended that an insert first be performed on the parent table for each foreign key value added to the child. If such an insert is not first performed, the driver automatically inserts a row into the parent table that contains only the primary key values and NULL values for all non-primary key columns.

40 The Progress DataDirect for ODBC for Apache Cassandra™ Driver: User©s Guide and Reference: Version 8.0.0 Complex Type Normalization


105 0 Yanev

105 1 Bishop

105 2 Bogdanov

A SQL insert on the employee_client child table would take the form:

INSERT INTO employee_client (employee_empid, client) VALUES (107, 'Nelson')

The Set Child Table

The Set collection is normalized into a two-column child table called employee_review. The name of the table is formulated by concatenating the name of the native table and the name of the Set column. A foreign key relationship to the parent table is maintained via the employee_empid column, while the elements of the Set are given in natural order in the review column. In this child table, SQL statements would identify these columns as employee_review.employee_empid and employee_review.review, respectively.

Table 6: employee_review (relational child of the set column)

employee_empid    review
(foreign key)
int               date

103 2013-12-07

103 2015-01-22

103 2016-01-10

105 2015-01-22

105 2016-01-12

A SQL insert on the employee_review child table would take the form:

INSERT INTO employee_review (employee_empid, review) VALUES (107, '2015-01-20')

Update Support

Update is supported for primitive types, non-nested Tuple types, and non-nested user-defined types. Update is also supported for value columns (valuecol) in non-nested Map types. The driver does not support updates on List types, Set types, or key columns (keycol) in Map types because the values in each are part of the primary key of their respective child tables, and primary key columns cannot be updated. If an Update is attempted where not allowed, the driver issues the following error message:


[DataDirect][Cassandra ODBC Driver][Cassandra]syntax error or access rule violation: UPDATE not permitted for column: column_name

Tuple and User-Defined Types

The driver supports Tuple and user-defined complex types, which were introduced with Apache Cassandra 2.1. As long as there are no complex types nested in either the Tuple or user-defined types, the driver normalizes Tuple and user-defined types by flattening them into a relational version of the native Cassandra table. Take for example the following Cassandra table:

CREATE TABLE agents1 (
  agentid int PRIMARY KEY,
  email varchar,
  contact tuple<varchar, varchar, varchar>);

The following agents1 table is a tabular representation of the native Cassandra table with data included.

Table 7: agents1 (native)

agentid          email      contact
(primary key)
int              varchar    tuple

272 [email protected] tv newspaper blog

564 [email protected] radio tv magazine

For the relational version of agents1, all fields are retained as separate columns, and columns with primitive types (agentid and email) correspond directly to columns in the native table. In turn, tuple fields are flattened into columns using a column-name_position naming pattern, for example, contact_1. The driver normalizes agents1 in the following manner.

Table 8: agents1 (relational)

agentid          email      contact_1   contact_2   contact_3
(primary key)
int              varchar    varchar     varchar     varchar

272 [email protected] tv newspaper blog

564 [email protected] radio tv magazine


A SQL command would take the following form:

INSERT INTO agents1 (agentid,email,contact_1,contact_2,contact_3)
VALUES (839,'[email protected]','radio','tv','magazine')

The driver also flattens user-defined types when normalizing native Cassandra tables. In the following example, the native Cassandra agents2 table incorporates the user-defined address type.

CREATE TYPE address (
  street varchar,
  city varchar,
  state varchar,
  zip int);

CREATE TABLE agents2 (
  agentid int PRIMARY KEY,
  email varchar,
  location frozen<address>
);

The following agents2 table is a tabular representation of the native Cassandra table with data included.

Table 9: agents2 (native)

agentid          email      location
(primary key)
int              varchar    address

095              [email protected]     street: 1551 Main Street
                                       city: Pittsburgh
                                       state: PA
                                       zip: 15237
237              [email protected]     street: 422 First Street
                                       city: Richmond
                                       state: VA
                                       zip: 23235

As with the previous example, all fields are retained as separate columns in the relational version of the table, and columns with primitive types (agentid and email) correspond directly to columns in the native table. Here a column-name_field-name naming pattern is used to flatten the fields of the user-defined address type into columns, for example, location_street. The driver normalizes agents2 in the following manner.


Table 10: agents2 (relational)

agentid          email               location_street    location_city   location_state   location_zip
(primary key)
int              varchar             varchar            varchar         varchar          int
095              [email protected]   1551 Main Street   Pittsburgh      PA               15237
237              [email protected]   422 First Street   Richmond        VA               23235

A SQL command would take the following form:

INSERT INTO agents2 (agentid,email,location_street,location_city,location_state,location_zip)
VALUES (839,'[email protected]','9 Fifth Street', 'Morrisville', 'NC', 27566)
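The two flattening patterns above, positional for Tuple fields and field-named for user-defined types, can be sketched in C. These helper functions are illustrative only (not part of the driver API); they show how the column names in the relational tables are formed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Tuple fields are flattened by 1-based position:
   column name + "_" + index (contact -> contact_1, contact_2, ...). */
static void tuple_column_name(const char *column, int index,
                              char *out, size_t outlen) {
    snprintf(out, outlen, "%s_%d", column, index);
}

/* User-defined-type fields are flattened by field name:
   column name + "_" + field name (location -> location_street, ...). */
static void udt_column_name(const char *column, const char *field,
                            char *out, size_t outlen) {
    snprintf(out, outlen, "%s_%s", column, field);
}
```

Applied to the examples above, these helpers reproduce the contact_1 through contact_3 columns of agents1 and the location_street through location_zip columns of agents2.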

Nested Complex Types

The nesting of complex types within Tuple and user-defined types is permitted in CQL. The driver does not normalize such nested types, but rather the data is passed as a JSON-style string. For example, consider the table contacts which contains the columns id and contact. While id is a primitive int column, contact is a user-defined info column which contains name, email, and location fields. The location field itself is a nested user-defined address column which contains street, city, state, and zip fields. In CQL, the structure of this table would take the following form:

CREATE TYPE address (
  street varchar,
  city varchar,
  state varchar,
  zip int);

CREATE TYPE info (
  name varchar,
  email varchar,
  location frozen<address>
);

CREATE TABLE contacts (
  id int PRIMARY KEY,
  contact frozen<info>);

The following tabular representation of the contacts table shows how the driver returns data when complex types are nested in other complex types. Because the complex user-defined type address is embedded in the complex user-defined type info, the entire contact column is returned by the driver as a JSON string.

Note: You can retrieve this string data by calling the SQLColumns catalog function.


Table 11: contacts (relational)

id               contact
(primary key)
int              frozen<info>

034 {name: 'Jude', email: '[email protected]', location: {street: '101 Main Street', city: 'Albany', state: 'NY', zip: 12210}}

056 {name: 'Karen', email: '[email protected]', location: {street: '150 First Street', city: 'Portland', state: 'OR', zip: 97214}}

When executing SQL commands involving nested complex types, the data must be passed as a JSON string. Furthermore, the syntax you use to connote the JSON string depends on whether you are passing the string directly in a SQL command or binding the JSON string as a parameter to a variable in the application.

Note: Hints for parsing JSON-style strings are provided in the Remarks column of the SQLColumns result.

Connoting the JSON-Style String in a SQL Statement

When passing the string directly in a SQL command, you must use the correct SQL syntax and escapes to maintain the structure of the data. To begin, the entire JSON string must be passed in single quotation marks ('). Furthermore, if the JSON string contains nested strings, two single quotation marks are used to indicate string values. The first quotation mark is an escape connoting the second embedded quotation mark. The following command inserts a new row into the contacts table.

Note: In accordance with typical programming language syntax, the Insert statement is placed in double quotation marks. However, in the JSON string, two single quotation marks are used to indicate string values. The first quotation mark is an escape connoting the second embedded quotation mark.

SQLExecDirect(pstmt,
    "INSERT INTO contacts (id, contact) VALUES (075, '{name: ''Albert'', "
    "email: ''[email protected]'', location: {street: ''12 North Street'', "
    "city: ''Durham'', state: ''NC'', zip: 27704}}')",
    SQL_NTS);

After the insert has been executed, the Select command SELECT contact FROM contacts WHERE id = 75 returns:

{name: 'Albert', email: '[email protected]', location: {street: '12 North Street', city: 'Durham', state: 'NC', zip: 27704}}


Connoting the JSON-Style String as a Parameter Variable

When binding the JSON string as a parameter to a variable in the application, you must follow your programming language syntax by placing the JSON string in double quotation marks. Escapes are not used to connote embedded single quotation marks. For example:

STRING string_variable = "{name: 'Albert', email: '[email protected]', location: {street: '12 North Street', city: 'Durham', state: 'NC', zip: 27704}}"
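The quote-doubling rule for the direct-SQL case described above can be sketched as a small C helper. This function is illustrative only, not part of the driver; it doubles each embedded single quote so a JSON-style string can be placed between the SQL single quotes of an INSERT statement. When the string is bound as a parameter instead, no such escaping is applied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Double each embedded single quote so a JSON-style string can be
   embedded in SQL single quotes (name: 'Albert' -> name: ''Albert'').
   Returns the number of characters written, excluding the terminator. */
static size_t escape_for_sql(const char *in, char *out, size_t outlen) {
    size_t n = 0;
    for (; *in && n + 2 < outlen; in++) {
        if (*in == '\'')
            out[n++] = '\'';   /* the escape quote */
        out[n++] = *in;        /* the original character */
    }
    out[n] = '\0';
    return n;
}
```

Running the Albert example through this helper yields the doubled-quote form shown in the SQLExecDirect call above.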

Isolation and Lock Levels Supported

The driver supports isolation level 0 (Read Uncommitted).

See also
Locking and Isolation Levels on page 203
Transaction Mode on page 129

Binding Parameter Markers

An ODBC application can prepare a query that contains dynamic parameters. Each parameter in a SQL statement must be associated, or bound, to a variable in the application before the statement is executed. When the application binds a variable to a parameter, it describes that variable and that parameter to the driver. Therefore, the application must supply the following information:

· The data type of the variable that the application maps to the dynamic parameter
· The SQL data type of the dynamic parameter (the data type that the database system assigned to the parameter marker)

The two data types are identified separately using the SQLBindParameter function. You can also use descriptor APIs as described in the Descriptor section of the ODBC specification (version 3.0 and higher). The driver relies on the binding of parameters to know how to send information to the database system in its native format. If an application furnishes incorrect parameter binding information to the driver, the results will be unpredictable. For example, the statement might not be executed correctly. To ensure interoperability, the driver uses only the parameter binding information that is provided by the application.


Supported Features

This section describes some of the supported features that will allow you to take full advantage of the Progress DataDirect for ODBC for Apache Cassandra driver.

For details, see the following topics:

· Unicode Support
· Using IP Addresses
· Parameter Metadata Support
· SQL Support
· Number of Connections and Statements Supported

Unicode Support

The driver is fully Unicode enabled. On UNIX and Linux platforms, the driver supports both UTF-8 and UTF-16. On Windows platforms, the driver supports UCS-2/UTF-16 only. The driver supports the Unicode ODBC W (Wide) function calls, such as SQLConnectW. This allows the Driver Manager to transmit these calls directly to the driver. Otherwise, the Driver Manager would incur the additional overhead of converting the W calls to ANSI function calls, and vice versa. See "UTF-16 Applications on UNIX and Linux" for related details. Also, see "Internationalization, Localization, and Unicode" for a more detailed explanation of Unicode.


See also
UTF-16 Applications on UNIX and Linux on page 60
Internationalization, Localization, and Unicode on page 179

Using IP Addresses

The driver supports Internet Protocol (IP) addresses in the IPv4 and IPv6 formats. If your network supports named servers, the server name specified in the data source can resolve to an IPv4 or IPv6 address. In the following connection string example, the IP address for the Apache Cassandra server is specified in IPv6 format:

DRIVER=DataDirect Apache Cassandra Driver;
HostName=2001:DB8:0000:0000:8:800:200C:417A;PORT=9042;
KN=CASSANDRAKEYSPACE2;UID=JOHN;PWD=XYZZYYou;
SchemaMap=C:\Users\Default\AppData\Local\Progress\DataDirect\
Cassandra_Schema\MainServer.config

In addition to the normal IPv6 format, the driver supports IPv6 alternative formats for compressed and IPv4/IPv6 combination addresses. For example, the following connection string specifies the server using IPv6 format, but uses the compressed syntax for strings of zero bits:

DRIVER=DataDirect Apache Cassandra Driver;
HostName=2001:DB8:0:0:8:800:200C:417A;PORT=9042;
KN=CASSANDRAKEYSPACE2;UID=JOHN;PWD=XYZZYYou;
SchemaMap=C:\Users\Default\AppData\Local\Progress\DataDirect\
Cassandra_Schema\MainServer.config

Similarly, the following connection string specifies the server using a combination of IPv4 and IPv6:

DRIVER=DataDirect Apache Cassandra Driver;
HostName=2001:DB8:0:0:8:800:123.45.67.89;PORT=9042;
KN=CASSANDRAKEYSPACE2;UID=JOHN;PWD=XYZZYYou;
SchemaMap=C:\Users\Default\AppData\Local\Progress\DataDirect\
Cassandra_Schema\MainServer.config

For complete information about IPv6 formats, go to the following URL: http://tools.ietf.org/html/rfc4291#section-2.2
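The equivalence of the full and compressed IPv6 forms shown above can be checked with the standard POSIX inet_pton function, which parses any valid textual form into the same 16-byte binary address. This is a sketch assuming a POSIX system; it is not part of the driver, but it illustrates why HostName can accept either spelling.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>

/* Parse an IPv6 address string into its 16-byte binary form.
   Returns 1 on success, 0 if the text is not a valid IPv6 address.
   Equivalent textual forms (full, zero-suppressed, or "::" compressed)
   yield identical bytes. */
static int parse_ipv6(const char *text, unsigned char out[16]) {
    return inet_pton(AF_INET6, text, out);
}
```

For example, "2001:DB8:0000:0000:8:800:200C:417A", "2001:DB8:0:0:8:800:200C:417A", and "2001:DB8::8:800:200C:417A" all parse to the same address.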

Parameter Metadata Support

The driver supports returning parameter metadata as described in this section.

Insert and Update Statements

The driver supports returning parameter metadata for the following forms of Insert and Update statements:

· INSERT INTO foo VALUES(?, ?, ?)
· INSERT INTO foo (col1, col2, col3) VALUES(?, ?, ?)


· UPDATE foo SET col1=?, col2=?, col3=? WHERE col1 operator ? [{AND | OR} col2 operator ?]

where:

operator

is any of the following SQL operators: =, <, >, <=, >=, and <>.

Select Statements

The driver supports returning parameter metadata for Select statements that contain parameters in ANSI SQL 92 entry-level predicates, such as COMPARISON, BETWEEN, IN, LIKE, and EXISTS predicate constructs. Refer to the ANSI SQL reference for detailed syntax. Parameter metadata can be returned for a Select statement if one of the following conditions is true:

· The statement contains a predicate value expression that can be targeted against the source tables in the associated FROM clause. For example:

SELECT * FROM foo WHERE bar > ?

In this case, the value expression "bar" can be targeted against the table "foo" to determine the appropriate metadata for the parameter.

· The statement contains a predicate value expression part that is a nested query. The nested query's metadata must describe a single column. For example:

SELECT * FROM foo WHERE (SELECT x FROM y WHERE z = 1) < ?

The following Select statements show further examples for which parameter metadata can be returned:

SELECT col1, col2 FROM foo WHERE col1 = ? AND col2 > ?
SELECT ... WHERE colname = (SELECT col2 FROM t2 WHERE col3 = ?)
SELECT ... WHERE colname LIKE ?
SELECT ... WHERE colname BETWEEN ? AND ?
SELECT ... WHERE colname IN (?, ?, ?)
SELECT ... WHERE EXISTS(SELECT ... FROM T2 WHERE col1 < ?)

ANSI SQL 92 entry-level predicates in a WHERE clause containing GROUP BY, HAVING, or ORDER BY clauses are supported. For example:

SELECT * FROM t1 WHERE col = ? ORDER BY 1

Joins are supported. For example:

SELECT * FROM t1,t2 WHERE t1.col1 = ?

Fully qualified names and aliases are supported. For example:

SELECT a, b, c, d FROM T1 AS A, T2 AS B WHERE A.a = ? AND B.b = ?


SQL Support

The Apache Cassandra driver provides support for standard SQL (primarily SQL 92), and a set of SQL extensions.

See also
Supported SQL Functionality on page 133

Number of Connections and Statements Supported

The driver supports multiple connections and multiple statements per connection.


Using the Driver

This section guides you through configuring and connecting to data sources. In addition, it explains how to use the functionality supported by your driver.

For details, see the following topics:

· Configuring and Connecting to Data Sources
· Performance Considerations
· Using the SQL Engine Server
· Using Failover in a Cluster
· Using Identifiers

Configuring and Connecting to Data Sources

After you install the driver, you configure data sources to connect to the database. See "Getting Started" for an explanation of different types of data sources. The data source contains connection options that allow you to tune the driver for specific performance needs. If you want to use a data source but need to change some of its values, you can either modify the data source or override its values at connection time through a connection string. If you choose to use a connection string, you must use specific connection string attributes. See "Using a Connection String" for an alphabetical list of driver connection string attributes and their initial default values.


See also
Getting Started on page 17
Using a Connection String on page 74

Configuring the Product on UNIX/Linux

This chapter contains specific information about using your driver in the UNIX and Linux environments. See "Environment Variables" for additional platform information.

See also
Environment Variables on page 52

Environment Variables

The first step in setting up and configuring the driver for use is to set several environment variables. The following procedures require that you have the appropriate permissions to modify your environment and to read, write, and execute various files. You must log in as a user with full r/w/x permissions recursively on the entire Progress DataDirect for ODBC installation directory.

Library Search Path

The library search path variable can be set by executing the appropriate shell script located in the ODBC home directory. From your login shell, determine which shell you are running by executing:

echo $SHELL

C shell login (and related shell) users must execute the following command before attempting to use ODBC-enabled applications:

source ./odbc.csh

Bourne shell login (and related shell) users must initialize their environment as follows:

. ./odbc.sh

Executing these scripts sets the appropriate library search path environment variable:

· LD_LIBRARY_PATH on HP-UX IPF, Linux, and Oracle Solaris
· LIBPATH on AIX
· SHLIB_PATH on HP-UX PA-RISC

The library search path environment variable must be set so that the ODBC core components and drivers can be located at the time of execution. After running the setup script, execute:

env

to verify that the installation_directory/lib directory has been added to your shared library path.


ODBCINI

Setup installs in the product installation directory a default system information file, named odbc.ini, that contains data sources. See "Data Source Configuration on UNIX/Linux" for an explanation of the odbc.ini file. The system administrator can choose to rename the file and/or move it to another location. In either case, the environment variable ODBCINI must be set to point to the fully qualified path name of the odbc.ini file. For example, to point to the location of the file for an installation on /opt/odbc in the C shell, you would set this variable as follows:

setenv ODBCINI /opt/odbc/odbc.ini

In the Bourne or Korn shell, you would set it as:

ODBCINI=/opt/odbc/odbc.ini;export ODBCINI

As an alternative, you can choose to make the odbc.ini file a hidden file and not set the ODBCINI variable. In this case, you would need to rename the file to .odbc.ini (to make it a hidden file) and move it to the user's $HOME directory. The driver searches for the location of the odbc.ini file as follows:

1. The driver checks the ODBCINI variable.
2. The driver checks $HOME for .odbc.ini.

If the driver does not locate the system information file, it returns an error.
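The two-step lookup described above can be sketched in C. The function below is illustrative only, not the driver's actual implementation; the two string arguments stand in for the ODBCINI and HOME environment variables.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Resolve the system information file path the way the lookup order above
   describes: use ODBCINI if it is set and non-empty; otherwise fall back
   to $HOME/.odbc.ini. */
static void resolve_odbc_ini(const char *odbcini, const char *home,
                             char *out, size_t outlen) {
    if (odbcini && *odbcini)
        snprintf(out, outlen, "%s", odbcini);        /* step 1: ODBCINI */
    else
        snprintf(out, outlen, "%s/.odbc.ini", home); /* step 2: hidden file */
}
```

For example, with ODBCINI unset and HOME=/home/john, the resolved path would be /home/john/.odbc.ini.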

See also
Data Source Configuration on UNIX/Linux on page 55

ODBCINST

Setup installs in the product installation directory a default file, named odbcinst.ini, for use with DSN-less connections. See "DSN-less Connections" for an explanation of the odbcinst.ini file. The system administrator can choose to rename the file or move it to another location. In either case, the environment variable ODBCINST must be set to point to the fully qualified path name of the odbcinst.ini file. For example, to point to the location of the file for an installation on /opt/odbc in the C shell, you would set this variable as follows:

setenv ODBCINST /opt/odbc/odbcinst.ini

In the Bourne or Korn shell, you would set it as:

ODBCINST=/opt/odbc/odbcinst.ini;export ODBCINST

As an alternative, you can choose to make the odbcinst.ini file a hidden file and not set the ODBCINST variable. In this case, you would need to rename the file to .odbcinst.ini (to make it a hidden file) and move it to the user's $HOME directory. The driver searches for the location of the odbcinst.ini file as follows:

1. The driver checks the ODBCINST variable.
2. The driver checks $HOME for .odbcinst.ini.

If the driver does not locate the odbcinst.ini file, it returns an error.


See also
DSN-less Connections on page 58

DD_INSTALLDIR

This variable provides the driver with the location of the product installation directory so that it can access support files. DD_INSTALLDIR must be set to point to the fully qualified path name of the installation directory. For example, to point to the location of the directory for an installation on /opt/odbc in the C shell, you would set this variable as follows:

setenv DD_INSTALLDIR /opt/odbc

In the Bourne or Korn shell, you would set it as:

DD_INSTALLDIR=/opt/odbc;export DD_INSTALLDIR

The driver searches for the location of the installation directory as follows:

1. The driver checks the DD_INSTALLDIR variable.
2. The driver checks the odbc.ini or the odbcinst.ini file for the InstallDir keyword (see "Configuration Through the System Information (odbc.ini) File" for a description of the InstallDir keyword).

If the driver does not locate the installation directory, it returns an error. The next step is to test load the driver.

See also
Configuration Through the System Information (odbc.ini) File on page 55

The Test Loading Tool

The second step in preparing to use a driver is to test load it. The ivtestlib (32-bit driver) and ddtestlib (64-bit driver) test loading tools are provided to test load drivers and help diagnose configuration problems in the UNIX and Linux environments, such as environment variables not correctly set or missing database client components. These tools are installed in the /bin subdirectory in the product installation directory. They attempt to load a specified ODBC driver and print out all available error information if the load fails. For example, if the driver is installed in /opt/odbc/lib, the following command attempts to load the 32-bit driver on Solaris, where xx represents the version number of the driver:

ivtestlib /opt/odbc/lib/ivcsndrxx.so

Note: On Solaris, AIX, and Linux, the full path to the driver does not have to be specified for the tool. The HP-UX version, however, requires the full path.

If the load is successful, the tool returns a success message along with the version string of the driver. If the driver cannot be loaded, the tool returns an error message explaining why.


See "Version String Information" for details about version strings. The next step is to configure a data source through the system information file.

See also
Version String Information on page 34

Data Source Configuration on UNIX/Linux

In the UNIX and Linux environments, a system information file is used to store data source information. Setup installs a default version of this file, called odbc.ini, in the product installation directory. This is a plain text file that contains data source definitions.

Configuration Through the System Information (odbc.ini) File

To configure a data source manually, you edit the odbc.ini file with a text editor. The content of this file is divided into three sections. At the beginning of the file is a section named [ODBC Data Sources] containing data_source_name=installed-driver pairs, for example:

Apache Cassandra=DataDirect 8.0 Apache Cassandra Driver

The driver uses this section to match a data source to the appropriate installed driver. The [ODBC Data Sources] section also includes data source definitions. The default odbc.ini contains a data source definition for the driver. Each data source definition begins with a data source name in square brackets, for example, [Apache Cassandra]. The data source definitions contain connection string attribute=value pairs with default values. You can modify these values as appropriate for your system. "Connection Option Descriptions" describes these attributes. See "Sample Default odbc.ini File" for sample data sources. The second section of the file is named [ODBC File DSN] and includes one keyword:

[ODBC File DSN] DefaultDSNDir=

This keyword defines the path of the default location for file data sources (see "File Data Sources").

Note: This section is not included in the default odbc.ini file that is installed by the product installer. You must add this section manually.

The third section of the file is named [ODBC] and includes several keywords, for example:

[ODBC]
IANAAppCodePage=4
InstallDir=/opt/odbc
Trace=0
TraceFile=odbctrace.out
TraceDll=/opt/odbc/lib/ivtrc28.so
ODBCTraceMaxFileSize=102400
ODBCTraceMaxNumFiles=10

The IANAAppCodePage keyword defines the default value that the UNIX/Linux driver uses if individual data sources have not specified a different value. See "IANAAppCodePage" in "Connection Option Descriptions" and "Code Page Values" for details. The default value is 4.


The InstallDir keyword must be included in this section. The value of this keyword is the path to the installation directory under which the /lib and /locale directories are contained. The installation process automatically writes your installation directory to the default odbc.ini file. For example, if you choose an installation location of /opt/odbc, then the following line is written to the [ODBC] section of the default odbc.ini:

InstallDir=/opt/odbc

Note: If you are using only DSN-less connections through an odbcinst.ini file and do not have an odbc.ini file, then you must provide this information in the [ODBC] section of the odbcinst.ini file. The driver and Driver Manager always check first in the [ODBC] section of an odbc.ini file. If no odbc.ini file exists, or if the odbc.ini file does not contain an [ODBC] section, they check for an [ODBC] section in the odbcinst.ini file. See "DSN-less Connections" for details.

ODBC tracing allows you to trace calls to the ODBC driver and create a log of the traces for troubleshooting purposes. The following keywords all control tracing: Trace, TraceFile, TraceDll, ODBCTraceMaxFileSize, and ODBCTraceMaxNumFiles. For a complete description of these keywords and a discussion of tracing, see "ODBC Trace."
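For example, an [ODBC] section with tracing enabled might look like the following (the trace file location and driver installation path are illustrative; adjust them for your system):

```ini
[ODBC]
Trace=1
TraceFile=/tmp/odbctrace.out
TraceDll=/opt/odbc/lib/ivtrc28.so
ODBCTraceMaxFileSize=102400
ODBCTraceMaxNumFiles=10
```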

See also
Connection Option Descriptions on page 101
Sample Default odbc.ini File on page 56
File Data Sources on page 59
IANAAppCodePage on page 114
Code Page Values on page 163
DSN-less Connections on page 58
ODBC Trace on page 85

Sample Default odbc.ini File

The following is a sample odbc.ini file that the installer program installs in the installation directory. All occurrences of ODBCHOME are replaced with your installation directory path during installation of the file. Values that you must supply are enclosed by angle brackets (< >). If you are using the installed odbc.ini file, you must supply the values and remove the angle brackets before that data source section will operate properly. Commented lines are denoted by the # symbol. This sample shows a 32-bit driver with the driver file name beginning with iv. A 64-bit driver file would be identical except that the driver name would begin with dd and the list of data sources would include only the 64-bit drivers.

[ODBC Data Sources]
Apache Cassandra=DataDirect 8.0 Apache Cassandra

[Apache Cassandra]
Driver=ODBCHOME/lib/ivcsndr28.so
Description=DataDirect 8.0 Apache Cassandra
ApplicationUsingThreads=1
AsciiSize=4000
AuthenticationMethod=0
ClusterNodes=
ConfigOptions=
CreateMap=2
DataSourceName=
DecimalPrecision=38
DecimalScale=10
FetchSize=100
HostName=


InitializationString=
JVMArgs=-Xmx256m
JVMClasspath=
KeySpaceName=
LogConfigFile=
LoginTimeout=15
LogonID=
NativeFetchSize=10000
Password=
PortNumber=9042
ReadConsistency=4
ReadOnly=1
ReportCodepageConversionErrors=0
ResultMemorySize=-1
SchemaMap=
ServerPortNumber=19934
SQLEngineMode=0
TransactionMode=0
VarcharSize=4000
VarintPrecision=38
WriteConsistency=4

[ODBC]
IANAAppCodePage=4
InstallDir=ODBCHOME
Trace=0
TraceFile=odbctrace.out
TraceDll=ODBCHOME/lib/ivtrc28.so
ODBCTraceMaxFileSize=102400
ODBCTraceMaxNumFiles=10

[ODBC File DSN]
DefaultDSNDir=
UseCursorLib=0

To modify or create data sources in the odbc.ini file, use the following procedures.

· To modify a data source:
a) Using a text editor, open the odbc.ini file.
b) Modify the default attributes in the data source definitions as necessary based on your system specifics, for example, enter the host name and port number of your system in the appropriate location. Consult the "Apache Cassandra Attribute Names" table in "Connection Option Descriptions" for other specific attribute values.

c) After making all modifications, save the odbc.ini file and close the text editor.

Important: The "Apache Cassandra Attribute Names" table in "Connection Option Descriptions" lists both the long and short names of the attribute. When entering attribute names into odbc.ini, you must use the long name of the attribute. The short name is not valid in the odbc.ini file.

· To create a new data source:
a) Using a text editor, open the odbc.ini file.
b) Copy an appropriate existing default data source definition and paste it to another location in the file.
c) Change the data source name in the copied data source definition to a new name. The data source name is between square brackets at the beginning of the definition, for example, [Apache Cassandra].
d) Modify the attributes in the new definition as necessary based on your system specifics, for example, enter the host name and port number of your system in the appropriate location.


Consult the "Apache Cassandra Attribute Names" table in "Connection Option Descriptions" for other specific attribute values.

e) In the [ODBC Data Sources] section at the beginning of the file, add a new data_source_name=installed-driver pair containing the new data source name and the appropriate installed driver name.
f) After making all modifications, save the odbc.ini file and close the text editor.

Important: The "Apache Cassandra Attribute Names" table in "Connection Option Descriptions" lists both the long and short name of the attribute. When entering attribute names into odbc.ini, you must use the long name of the attribute. The short name is not valid in the odbc.ini file.

See also Connection Option Descriptions on page 101

The Example Application

Progress DataDirect ships an application, named example, that is installed in the /samples/example subdirectory of the product installation directory. Once you have configured your environment and data source, use the example application to test passing SQL statements. To run the application, enter example and follow the prompts to enter your data source name, user name, and password.

If successful, a SQL> prompt appears and you can type in SQL statements, such as SELECT * FROM table_name. If example is unable to connect to the database, an appropriate error message appears.

Refer to the example.txt file in the example subdirectory for an explanation of how to build and use this application. For more information, see The Example Application in Progress DataDirect for ODBC Drivers Reference.

DSN-less Connections

Connections to a data source can be made via a connection string without referring to a data source name (DSN-less connections). This is done by specifying the DRIVER= keyword instead of the DSN= keyword in a connection string, as outlined in the ODBC specification. A file named odbcinst.ini must exist when the driver encounters DRIVER= in a connection string.

Setup installs a default version of this file in the product installation directory (see "ODBCINST" for details about relocating and renaming this file). This is a plain text file that contains default DSN-less connection information. You should not normally need to edit this file. The content of this file is divided into several sections.

At the beginning of the file is a section named [ODBC Drivers] that lists installed drivers, for example:

DataDirect 8.0 Apache Cassandra=Installed

This section also includes additional information for each driver.

The next section of the file is named [Administrator]. The keyword in this section, AdminHelpRootDirectory, is required for the Linux ODBC Administrator to locate its help system. The installation process automatically provides the correct value for this keyword.

The final section of the file is named [ODBC]. The [ODBC] section in the odbcinst.ini file fulfills the same purpose in DSN-less connections as the [ODBC] section in the odbc.ini file does for data source connections. See "Configuration Through the System Information (odbc.ini) File" for a description of the other keywords in this section.
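A DSN-less connection string for this driver might therefore look like the following (the host and keyspace values are placeholders; the attribute names are the long names shown in the sample odbc.ini):

```
DRIVER={DataDirect 8.0 Apache Cassandra};HostName=MyServer;PortNumber=9042;KeyspaceName=mykeyspace
```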


Note: The odbcinst.ini file and the odbc.ini file include an [ODBC] section. If the information in these two sections is not the same, the values in the odbc.ini [ODBC] section override those of the odbcinst.ini [ODBC] section.

See also
ODBCINST on page 53
Configuration Through the System Information (odbc.ini) File on page 55

Sample odbcinst.ini File

The following is a sample odbcinst.ini. All occurrences of ODBCHOME are replaced with your installation directory path during installation of the file. Commented lines are denoted by the # symbol. This sample shows a 32-bit driver with the driver file name beginning with iv; a 64-bit driver file would be identical except that driver names would begin with dd.

[ODBC Drivers]
DataDirect 8.0 Apache Cassandra=Installed

[DataDirect 8.0 Apache Cassandra]
Driver=ODBCHOME/lib/ivcsndr28.so
JarFile=ODBCHOME/java/lib/cassandra.jar
APILevel=1
ConnectFunctions=YYY
CPTimeout=60
DriverODBCVer=3.52
FileUsage=0
HelpRootDirectory=ODBCHOME/CassandraHelp
Setup=
SQLLevel=0
UsageCount=1

[ODBC]
#This section must contain values for DSN-less connections
#if no odbc.ini file exists. If an odbc.ini file exists,
#the values from that [ODBC] section are used.

IANAAppCodePage=4
InstallDir=ODBCHOME
Trace=0
TraceFile=odbctrace.out
TraceDll=ODBCHOME/lib/ivtrc28.so
ODBCTraceMaxFileSize=102400
ODBCTraceMaxNumFiles=10

File Data Sources

The Driver Manager on UNIX and Linux supports file data sources. The advantage of a file data source is that it can be stored on a server and accessed by other machines, whether Windows, UNIX, or Linux. See "Getting Started" for a general description of ODBC data sources on both Windows and UNIX.

A file data source is simply a text file that contains connection information. It can be created with a text editor. The file normally has an extension of .dsn. For example, a file data source for the driver would be similar to the following:

[ODBC]
Driver=DataDirect 8.0 Apache Cassandra
Port=9042
HostName=Cassandra2


KeyspaceName=CassandraKeyspace2
SchemaMap=/home/users/jsmith/Progress/DataDirect/Cassandra_Schema/Cassandra2.config
LogonID=jsmith

It must contain all basic connection information plus any optional attributes. Because it uses the DRIVER= keyword, an odbcinst.ini file containing the driver location must exist (see "DSN-less Connections"). The file data source is accessed by specifying the FILEDSN= keyword instead of the DSN= keyword in a connection string, as outlined in the ODBC specification. The complete path to the file data source can be specified in the syntax that is normal for the machine on which the file is located. For example, on Windows:

FILEDSN=C:\Program Files\Common Files\ODBC\DataSources\Cassandra2.dsn

or, on UNIX and Linux:

FILEDSN=/home/users/jsmith/filedsn/Cassandra2.dsn

If no path is specified for the file data source, the Driver Manager uses the DefaultDSNDir property, which is defined in the [ODBC File DSN] section of the odbc.ini file, to locate file data sources (see "Data Source Configuration on UNIX/Linux" for details). If the [ODBC File DSN] section is not defined, the Driver Manager uses the InstallDir setting in the [ODBC] section of the odbc.ini file. The Driver Manager does not support the SQLReadFileDSN and SQLWriteFileDSN functions.

As with any connection string, you can specify attributes to override the default values in the data source:

FILEDSN=/home/users/jsmith/filedsn/Cassandra2.dsn;UID=jsmith;PWD=test01
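The lookup order described above can be summarized in a short sketch (illustrative only; the real Driver Manager reads DefaultDSNDir from the [ODBC File DSN] section and InstallDir from the [ODBC] section of odbc.ini, and the function name here is hypothetical):

```python
import os

def resolve_file_dsn(filedsn, default_dsn_dir=None, install_dir=None):
    """Mimic the documented search order for a FILEDSN value.

    1. If the FILEDSN value includes a path, use it as given.
    2. Otherwise, look in DefaultDSNDir ([ODBC File DSN] section).
    3. Otherwise, fall back to InstallDir ([ODBC] section).
    """
    if os.path.dirname(filedsn):            # an explicit path was given
        return filedsn
    base = default_dsn_dir or install_dir   # DefaultDSNDir, else InstallDir
    return os.path.join(base, filedsn)
```

For example, resolve_file_dsn("Cassandra2.dsn", "/home/users/jsmith/filedsn", "/opt/odbc") locates the file under the DefaultDSNDir directory.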

See also
Getting Started on page 17
DSN-less Connections on page 58
Data Source Configuration on UNIX/Linux on page 55

UTF-16 Applications on UNIX and Linux

Because the DataDirect Driver Manager allows applications to use either UTF-8 or UTF-16 Unicode encoding, applications written in UTF-16 for Windows platforms can also be used on UNIX and Linux platforms. The Driver Manager assumes a default of UTF-8 applications; therefore, two things must occur for it to determine that the application is UTF-16:

· The definition of SQLWCHAR in the ODBC header files must be switched from "char *" to "short *". To do this, the application uses #define SQLWCHARSHORT.
· The application must set the encoding for the environment or connection using one of the following attributes. If your application passes UTF-8 encoded strings to some connections and UTF-16 encoded strings to other connections in the same environment, encoding should be set for the connection only; otherwise, either method can be used.

· To configure the encoding for the environment, set the ODBC environment attribute SQL_ATTR_APP_UNICODE_TYPE to a value of SQL_DD_CP_UTF16, for example:

rc = SQLSetEnvAttr(*henv, SQL_ATTR_APP_UNICODE_TYPE,
     (SQLPOINTER)SQL_DD_CP_UTF16, SQL_IS_INTEGER);

· To configure the encoding for the connection only, set the ODBC connection attribute SQL_ATTR_APP_UNICODE_TYPE to a value of SQL_DD_CP_UTF16, for example:

rc = SQLSetConnectAttr(hdbc, SQL_ATTR_APP_UNICODE_TYPE,
     (SQLPOINTER)SQL_DD_CP_UTF16, SQL_IS_INTEGER);


Data Source Configuration Using a GUI

On Windows, data sources are stored in the Windows Registry. You can configure and modify data sources through the ODBC Administrator using a driver Setup dialog box, as described in this section.

When the driver is first installed, the values of its connection options are set by default. These values appear on the driver Setup dialog box tabs when you create a new data source. You can change these default values by modifying the data source. In the following procedure, the description of each tab is followed by a table that lists the connection options for that tab and their initial default values. This table links you to a complete description of the options and their connection string attribute equivalents. The connection string attributes are used to override the default values of the data source if you want to change these values at connection time.

To configure an Apache Cassandra data source:

1. Start the ODBC Administrator by selecting its icon from the Progress DataDirect for ODBC program group.
2. Select a tab:

· User DSN: If you are configuring an existing user data source, select the data source name and click Configure to display the driver Setup dialog box. If you are configuring a new user data source, click Add to display a list of installed drivers. Select the driver and click Finish to display the driver Setup dialog box.
· System DSN: If you are configuring an existing system data source, select the data source name and click Configure to display the driver Setup dialog box. If you are configuring a new system data source, click Add to display a list of installed drivers. Select the driver and click Finish to display the driver Setup dialog box.
· File DSN: If you are configuring an existing file data source, select the data source file and click Configure to display the driver Setup dialog box. If you are configuring a new file data source, click Add to display a list of installed drivers; then, select a driver. Click Advanced if you want to specify attributes; otherwise, click Next to proceed. Specify a name for the data source and click Next. Verify the data source information; then, click Finish to display the driver Setup dialog box.


3. The General tab of the Setup dialog box appears by default.
Figure 1: General tab

On this tab, provide values for the options in the following table; then, click Apply. The table provides links to descriptions of the connection options. The General tab displays fields that are required for creating a data source. The fields on all other tabs are optional, unless noted otherwise.

Connection Options: General | Description

Data Source Name on page 109
Specifies the name of a data source in your Windows Registry or odbc.ini file.
Default: None

Description on page 111
Specifies an optional long description of a data source. This description is not used as a runtime connection attribute, but does appear in the ODBC.INI section of the Registry and in the odbc.ini file.
Default: None

Host Name on page 113
The name or the IP address of the server to which you want to connect.
Default: None

Port Number on page 121
Specifies the port number of the server listener.
Default: 9042



Keyspace Name on page 117
Specifies the name of the keyspace to which you want to connect.
Default: system

Cluster Nodes on page 106
A comma-separated list of member nodes in your cluster to which the driver attempts to connect. Specifying a value for this option enables connect time failover and load balancing.
Default: None

4. At any point during the configuration process, you can click Test Connect to attempt to connect to the data source using the connection options specified in the driver Setup dialog box. A logon dialog box appears (see "Using a Logon Dialog Box" for details). Note that the information you enter in the logon dialog box during a test connect is not saved.
5. To further configure your driver, click on the following tabs. The corresponding sections provide details on the fields specific to each configuration tab:

· SQL Engine tab allows you to configure the SQL Engine's behavior.
· Advanced tab allows you to configure advanced behavior.
· Schema Map tab allows you to configure mapping behavior.
· Security tab allows you to configure security settings.

6. Click OK. When you click OK, the values you have specified become the defaults when you connect to the data source. You can change these defaults by using this procedure to reconfigure your data source. You can override these defaults by connecting to the data source using a connection string with alternate values.

See also Using a Logon Dialog Box on page 75


SQL Engine Tab

The SQL Engine tab allows you to specify additional data source settings. The fields are optional unless otherwise noted. On this tab, provide values for the options in the following tables; then, click Apply.

Figure 2: SQL Engine tab

The SQL Engine can be run in one of two modes: direct mode or server mode. When set to direct mode, both the driver and its SQL engine run in the ODBC application's address space. Some applications may experience problems loading the JVM because the process exceeds the available heap space. To avoid this issue, you can configure the driver to operate in server mode. Server mode allows the driver to connect to an SQL engine JVM running as a separate service.

By default, the driver is set to 0 - Auto. In this setting, the SQL engine attempts to run in server mode first, but fails over to direct mode if server mode is unavailable. If you prefer that the SQL engine run exclusively in a particular mode, set the SQL Engine Mode option to 1 - Server to run only in server mode or 2 - Direct to run only in direct mode.


Table 12: SQL Engine Tab Connection Options

Connection Options: SQL Engine | Description

SQL Engine Mode on page 128
If set to 0 - Auto, the SQL engine attempts to run in server mode first; however, if server mode is unavailable, it runs in direct mode.
If set to 1 - Server, the SQL engine runs in server mode. The SQL engine operates in a separate process from the driver within its own JVM. If the SQL engine is unavailable, the connection will fail.
If set to 2 - Direct, the SQL engine runs in direct mode. The driver and its SQL engine run in a single process within the same JVM.

Important: When the SQL engine is configured to run in server mode (0 - Auto or 1 - Server), you must start the SQL Engine service before using the driver (see "Starting the SQL Engine Server" for more information). Multiple drivers on different clients can use the same service.

Important: Changes you make to the server mode configuration affect all DSNs sharing the service.

Default: 0 - Auto

JVM Arguments on page 115
A string that contains the arguments that are passed to the JVM that the driver is starting. The location of the JVM must be specified on the driver library path. Values that include special characters or spaces must be enclosed in curly braces { } when used in a connection string.
Default: For the 32-bit driver when the SQL Engine Mode is set to 2 - Direct: -Xmx256m. For all other configurations: -Xmx1024m

JVM Classpath on page 116
Specifies the CLASSPATH for the Java Virtual Machine (JVM) used by the driver. The CLASSPATH is the search string the JVM uses to locate the Java jar files the driver needs. Separate multiple jar files by a semi-colon on Windows platforms and by a colon on all other platforms. CLASSPATH values with multiple jar files must be enclosed in curly braces { } when used in a connection string.

Note: If no value is specified, the driver automatically detects the CLASSPATHs for all ODBC drivers installed on your machine.

Default: Empty String

When set to 0 - Auto or 1 - Server, additional configuration settings that are specific to server mode are exposed in the Setup dialog. The settings for server mode are read-only in the driver Setup dialog. For a description of these settings, see the table below. To define the settings for server mode, click Edit Server Settings on the SQL Engine tab. The SQL Engine Service Setup dialog box appears.


Caution: Modifying the Server Settings will affect all DSNs using this service.

Note: You must be an administrator to modify the server mode settings. Otherwise, the Edit Server Settings button does not appear on the SQL Engine tab.

You use the SQL Engine Service Setup dialog box to configure server mode and to start or stop the service. See "Configuring Server Mode" for detailed information.

Table 13: Server Mode Configuration Options

Configuration Options: SQL Engine Service | Description

Server Port Number on page 127
Specifies a valid port on which the SQL engine listens for requests from the driver.
Default: For the 32-bit driver: 19934. For the 64-bit driver: 19933

Java Path
Specifies the fully qualified path to the JVM executable that you want to use to run the SQL Engine Server. The path must not contain double quotation marks.
Default: The fully qualified path to the Java SE 6 or higher JVM executable (java.exe)

If you finished configuring your driver, proceed to Step 6 on page 63 in "Data Source Configuration Using a GUI." Optionally, you can further configure your driver by clicking on the following tabs. The following sections provide details for the fields specific to each configuration tab:

· General tab allows you to configure options that are required for creating a data source.
· Advanced tab allows you to configure advanced behavior.
· Schema Map tab allows you to configure mapping behavior.
· Security tab allows you to configure security settings.

See also Data Source Configuration Using a GUI on page 61


Advanced Tab

The Advanced tab allows you to specify additional data source settings. The fields are optional unless otherwise noted. On this tab, provide values for the options in the following table; then, click Apply.

Figure 3: Advanced tab

Connection Options: Advanced | Description

Create Map on page 108
Determines whether the driver creates the internal files required for a relational view of the native data when establishing a connection.
If set to 0 - No, the driver uses the current group of internal files specified by the Schema Map connection option. If the files do not exist, the connection fails.
If set to 1 - ForceNew, the driver deletes the current group of files specified by the Schema Map connection option and creates new files in the same location.
If set to 2 - NotExist, the driver uses the current group of files specified by the Schema Map connection option. If the files do not exist, the driver creates them.
Default: 2 - NotExist



Transaction Mode on page 129
Specifies how the driver handles manual transactions.
If set to 0 - No Transactions, the data source and the driver do not support transactions. Metadata indicates that the driver does not support transactions.
If set to 1 - Ignore, the data source does not support transactions and the driver always operates in auto-commit mode.
Default: 0 - No Transactions

Read Consistency on page 122
Specifies how many replicas must respond to a read request before returning data to the client application.
If set to 1 - one, data is returned from the closest replica. This setting provides the highest availability, but increases the likelihood of stale data being read.
If set to 4 - quorum, data is returned after a quorum of replicas has responded from any data center.
If set to 5 - all, data is returned to the application after all replicas have responded. This setting provides the highest consistency and lowest availability.
Default: 4 - quorum

Write Consistency on page 131
Determines the number of replicas on which the write must succeed before returning an acknowledgment to the client application.
If set to 1 - one, a write must succeed on at least one replica node.
If set to 4 - quorum, a write must succeed on a quorum of replica nodes.
If set to 5 - all, a write must succeed on all replica nodes in the cluster for that partition key. This setting provides the highest consistency and lowest availability.
Default: 4 - quorum

Application Using Threads on page 104
Determines whether the driver works with applications using multiple ODBC threads.
If disabled, the driver does not work with multi-threaded applications. If using the driver with single-threaded applications, this value avoids additional processing required for ODBC thread-safety standards.
If set to enabled, the driver works with single-threaded and multi-threaded applications.
Default: Enabled

Read Only on page 123
Specifies whether the connection has read-only access to the data source.
If enabled, the connection has read-only access. If disabled, the connection is opened for read/write access, and you can use all commands supported by the product.
Default: Disabled



Login Timeout on page 119
The number of seconds the driver waits for a connection to be established before returning control to the application and generating a timeout error.
Default: 15

Fetch Size on page 112
Specifies the number of rows that the driver processes before returning data to the application when executing a Select.
If set to 0, the driver fetches and processes all of the rows of the result before returning control to the application.
If set to x, the driver limits the number of rows that may be processed and returned to the application for a single fetch request.
Default: 100

Native Fetch Size on page 119
Specifies the number of rows of data the driver attempts to fetch from the native data source on each request submitted to the server.
If set to 0, the driver requests that the server return all rows for each request submitted to the server. Block fetching is not used.
If set to x, the driver attempts to fetch up to a maximum of the specified number of rows on each request submitted to the server.
Default: 10000

Result Memory Size on page 124
Specifies the maximum size, in megabytes, of an intermediate result set that the driver holds in memory.
If set to -1, the maximum size of an intermediate result set that the driver holds in memory is determined by a percentage of the max Java heap size. When this threshold is reached, the driver writes a portion of the result set to disk.
If set to 0, the driver holds intermediate results in memory regardless of size. Setting Result Memory Size to 0 can increase performance for any result set that can easily fit within the JVM's free heap space, but can decrease performance for any result set that can barely fit within the JVM's free heap space.
If set to x, the driver holds intermediate results in memory that are no larger than the size specified. When this threshold is reached, the driver writes a portion of the result set to disk.
Default: -1



Report Codepage Conversion Errors on page 124
Specifies how the driver handles code page conversion errors that occur when a character cannot be converted from one character set to another.
If set to 0 - Ignore Errors, the driver substitutes 0x1A for each character that cannot be converted and does not return a warning or error.
If set to 1 - Return Error, the driver returns an error instead of substituting 0x1A for unconverted characters.
If set to 2 - Return Warning, the driver substitutes 0x1A for each character that cannot be converted and returns a warning.
Default: 0 - Ignore Errors

Log Config File on page 118
Specifies the filename of the configuration file used to initialize the driver logging mechanism.
Default: ddlogging.properties

Initialization String on page 115
One or more SQL commands to be executed by the driver after it has established the connection to the database and has performed all initialization for the connection. If the execution of a SQL command fails, the connection attempt also fails and the driver returns an error indicating which SQL command or commands failed.
Default: None

Extended Options
Type a semi-colon separated list of connection options and their values. Use this configuration option to set the value of undocumented connection options that are provided by Progress DataDirect Customer Support. You can include any valid connection option in the Extended Options string, for example:

KeyspaceName=mykeyspace;UndocumentedOption1=value [;UndocumentedOption2=value;]

If the Extended Options string contains option values that are also set in the setup dialog or data source, the values of the options specified in the Extended Options string take precedence. However, connection options that are specified on a connection string override any option value specified in the Extended Options string.

If you finished configuring your driver, proceed to Step 6 on page 63 in "Data Source Configuration Using a GUI." Optionally, you can further configure your driver by clicking on the following tabs. The following sections provide details on the fields specific to each configuration tab:

· General tab allows you to configure options that are required for creating a data source.
· SQL Engine tab allows you to configure the SQL Engine's behavior.
· Schema Map tab allows you to configure mapping behavior.
· Security tab allows you to configure security settings.

See also Data Source Configuration Using a GUI on page 61


Schema Map Tab

The Schema Map tab allows you to configure the options that control the relational mapping of your data. The fields are optional unless otherwise noted. On this tab, provide values for the options in the following table; then, click Apply.

Figure 4: Schema Map tab

Connection Options: Description Advanced

Schema Map on page 126 Specifies the name and location of the configuration file where the relational map of native data is written. The driver looks for this file when connecting to a server. If the file does not exist, the driver creates one. Default: application_data_folder\Local\Progress\DataDirect\Cassandra_Schema\host_name.config

Ascii Size on page 105 Specifies the precision reported for ASCII columns in column and result-set metadata. Default: 4000


Connection Options: Description Advanced

Varchar Size on page 130 Specifies the precision reported for Varchar columns in column and result-set metadata. Default: 4000

Decimal Precision on page 110 Specifies the precision reported for Decimal columns in column and result-set metadata. Default: 38

Decimal Scale on page 110 Specifies the maximum scale reported for Decimal columns in column and result-set metadata. Default: 10

Varint Precision on page 130 Specifies the precision reported for Varint columns in column and result-set metadata. Default: 38

Config Options on page 107 Note: The Config Options connection option is not currently supported by the driver.

Determines how the mapping of the native data model to the relational data model is configured, customized, and updated. Default: None

If you have finished configuring your driver, proceed to Step 6 on page 63 in "Data Source Configuration on Windows." Optionally, you can further configure your driver by clicking the following tabs. The following sections provide details on the fields specific to each configuration tab:

· General tab allows you to configure options that are required for creating a data source.
· SQL Engine tab allows you to configure the SQL Engine's behavior.
· Advanced tab allows you to configure advanced behavior.
· Security tab allows you to configure security settings.

See also Data Source Configuration Using a GUI on page 61


Security Tab The Security tab allows you to specify your security settings. The fields are optional unless otherwise noted. On this tab, provide values for the options in the following table; then, click Apply. Figure 5: Security tab

Connection Options: Description Advanced

User Name on page 129 The default user ID that is used to connect to your database. Default: None

Authentication Method on page 106 Specifies the method the driver uses to authenticate the user to the server when a connection is established. If the specified authentication method is not supported by the database server, the connection fails and the driver generates an error. If set to -1 - No Authentication, the driver sends the user ID and password in clear text to the server for authentication. If set to 0 - User ID/Password, the driver sends the user ID in clear text and an encrypted password to the server for authentication. Default: 0 - User ID/Password


If you have finished configuring your driver, proceed to Step 6 on page 63 in "Data Source Configuration on Windows." Optionally, you can further configure your driver by clicking the following tabs. The following sections provide details on the fields specific to each configuration tab:

· General tab allows you to configure options that are required for creating a data source.
· SQL Engine tab allows you to configure the SQL Engine's behavior.
· Advanced tab allows you to configure advanced behavior.
· Schema Map tab allows you to configure mapping behavior.

See also Data Source Configuration Using a GUI on page 61

Using a Connection String

If you want to use a connection string for connecting to a database, or if your application requires it, you must specify either a DSN (data source name), a File DSN, or a DSN-less connection in the string. The difference is whether you use the DSN=, FILEDSN=, or the DRIVER= keyword in the connection string, as described in the ODBC specification. A DSN or FILEDSN connection string tells the driver where to find the default connection information. Optionally, you may specify attribute=value pairs in the connection string to override the default values stored in the data source. The DSN connection string has the form:

DSN=data_source_name[;attribute=value[;attribute=value]...]

The FILEDSN connection string has the form:

FILEDSN=filename.dsn[;attribute=value[;attribute=value]...]

The DSN-less connection string specifies a driver instead of a data source. All connection information must be entered in the connection string because the information is not stored in a data source. The DSN-less connection string has the form:

DRIVER=[{]driver_name[}][;attribute=value[;attribute=value]...]

"Connection Option Descriptions" lists the long and short names for each attribute, as well as the initial default value when the driver is first installed. You can specify either long or short names in the connection string. An example of a DSN connection string with overriding attribute values for Apache Cassandra for Linux/UNIX/Windows is:

DSN=Apache Cassandra;UID=JOHN;PWD=XYZZY

A FILEDSN connection string is similar except for the initial keyword:

FILEDSN=Apache Cassandra;UID=JOHN;PWD=XYZZY

A DSN-less connection string must provide all necessary connection information:

DRIVER=DataDirect Apache Cassandra;UID=JOHN;PWD=XYZZY;HOST=CassandraServer;PORT=9042;DB=Cassandra1;SM=/home/users/jsmith/Progress/DataDirect/Cassandra_Schema/CassandraServer.config
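An application can assemble these attribute=value pairs programmatically before calling the ODBC connect function. The sketch below builds a DSN-less string like the one above; the `build_connection_string` helper is illustrative, not part of any driver API, and the attribute values are the example values from this section.

```python
def build_connection_string(driver, **attrs):
    """Assemble a DSN-less ODBC connection string:
    DRIVER={name};attribute=value;attribute=value;..."""
    parts = ["DRIVER={%s}" % driver]
    parts += ["%s=%s" % (key, value) for key, value in attrs.items()]
    return ";".join(parts)

conn_str = build_connection_string(
    "DataDirect Apache Cassandra",
    UID="JOHN", PWD="XYZZY",
    HOST="CassandraServer", PORT="9042", DB="Cassandra1")
print(conn_str)
```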


See also Connection Option Descriptions on page 101

Using a Logon Dialog Box

Some ODBC applications display a logon dialog box when you are connecting to a data source. In these cases, the host name has already been specified. Figure 6: Logon to Apache Cassandra dialog box

In this dialog box, provide the following information:

1. In the Host Name field, type the name or the IP address of the server to which you want to connect.
2. In the Port Number field, type the port number of the server listener.
3. In the Schema Map field, type the name and location of the configuration file where the relational map of native data is written. See "Schema Map" for details.
4. In the Keyspace Name field, type the name of the keyspace to which you want to connect.
5. Type your logon ID in the User Name field.
6. Type your password in the Password field.
7. From the Authentication Method drop-down box, select one of the following:

· For no authentication, select -1 - No Authentication.
· To authenticate by passing the user ID in clear text and an encrypted password, select 0 - User ID/Password.

8. Click OK to complete the logon.

See also Schema Map on page 126


Performance Considerations

Application Using Threads (ApplicationUsingThreads): The driver coordinates concurrent database operations (operations from different threads) by acquiring locks. Although locking prevents errors in the driver, it also decreases performance. If your application does not make ODBC calls from different threads, the driver has no reason to coordinate operations. In this case, the ApplicationUsingThreads attribute should be disabled (set to 0).

Note: If you are using a multi-threaded application, you must enable the Application Using Threads option.

Fetch Size (FetchSize) and Native Fetch Size (NativeFetchSize): The connection options Fetch Size and Native Fetch Size can be used to adjust the trade-off between throughput and response time. In general, setting larger values for Fetch Size and Native Fetch Size improves throughput but can increase response time. For example, if an application attempts to fetch 100,000 rows from the native data source and Native Fetch Size is set to 500, the driver must make 200 round trips across the network to get the 100,000 rows. If, however, Native Fetch Size is set to 10000, the driver only needs to make 10 round trips to retrieve 100,000 rows. Network round trips are expensive, so minimizing these round trips generally increases throughput. For many applications, throughput is the primary performance measure, but for interactive applications, such as Web applications, response time (how fast the first set of data is returned) is more important than throughput. For example, suppose that you have a Web application that displays 50 rows of data per page and that, on average, you view three or four pages. Response time can be improved by setting Fetch Size to 50 (the number of rows displayed on a page) and Native Fetch Size to 200. With these settings, the driver fetches all of the rows from the native data source that you would typically view in a single session and only processes the rows needed to display the first page.

Note: Fetch Size provides a suggestion to the driver as to the number of rows it should internally process before returning control to the application. The driver may fetch fewer rows to conserve memory when processing exceptionally wide rows.
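The round-trip arithmetic above is simple to verify: the number of network round trips is the row count divided by Native Fetch Size, rounded up. This sketch only illustrates the arithmetic from the example; it is not driver code.

```python
import math

def round_trips(total_rows, native_fetch_size):
    """Each network round trip retrieves at most native_fetch_size rows,
    so the driver needs ceil(total_rows / native_fetch_size) trips."""
    return math.ceil(total_rows / native_fetch_size)

print(round_trips(100_000, 500))     # small fetch size: many round trips
print(round_trips(100_000, 10_000))  # larger fetch size: far fewer round trips
```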

JVM Arguments (JVMArgs): Used in conjunction with the Result Memory Size connection option, you can address memory and performance concerns by adjusting the max Java heap size using the JVM Arguments connection option. By increasing the max Java heap size, you increase the amount of data the driver accumulates in memory. This can reduce the likelihood of out-of-memory errors and improve performance by ensuring that result sets fit easily within the JVM's free heap space. In addition, when a limit is imposed by setting Result Memory Size to -1, increasing the max Java heap size can improve performance by reducing the need to write to disk, or removing it altogether.

Read Consistency (ReadConsistency): The Read Consistency connection option manages the trade-off between the consistency and availability of your data. By setting Read Consistency to all, data is returned to the application only after all replicas have responded. With this setting, the data returned is highly consistent. However, it may take longer for data to be returned to the application, and, in some scenarios, operation timeouts can occur. In contrast, setting Read Consistency to quorum (default) or one reduces the number of replicas required to respond to a read request. While the data may not be as consistent, results are returned more quickly to the application.

Result Memory Size (ResultMemorySize): Result Memory Size can affect performance in two main ways. First, if the size of the result set is larger than the value specified for Result Memory Size, the driver writes a portion of the result set to disk. Since writing to disk is an expensive operation, performance losses will be incurred. Second, when you remove any limit on the size of an intermediate result set by setting Result Memory Size to 0, you can realize performance gains for result sets that easily fit within the JVM's free heap space. However, the same setting can diminish performance for result sets that barely fit within the JVM's free heap space.


Write Consistency (WriteConsistency): The Write Consistency connection option determines the number of replicas on which the write must succeed before returning an acknowledgment to the client application. By setting this option to all, a write must succeed on all replica nodes in the cluster for that partition key. This setting provides high consistency. However, it may take longer for the successful execution of a write operation. In contrast, setting Write Consistency to quorum (default) or one reduces the number of replicas that must acknowledge the completion of a write command. While the data may not be as consistent across clusters, the write operation is completed more quickly.
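The consistency levels above differ in how many replicas must respond before an operation completes. As a rough model (assuming Cassandra's usual majority rule, where quorum means a majority of replicas, RF//2 + 1), the replica counts can be sketched as follows; the helper function is illustrative, not a driver API.

```python
def replicas_required(consistency, replication_factor):
    """Replica responses needed per consistency level (sketch).
    'quorum' uses Cassandra's majority rule: RF // 2 + 1."""
    if consistency == "one":
        return 1
    if consistency == "quorum":
        return replication_factor // 2 + 1
    if consistency == "all":
        return replication_factor
    raise ValueError("unknown consistency level: %s" % consistency)

# With a replication factor of 3, 'one' waits for 1 replica,
# 'quorum' for 2, and 'all' for 3.
for level in ("one", "quorum", "all"):
    print(level, replicas_required(level, 3))
```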

See also Application Using Threads on page 104 Fetch Size on page 112 Native Fetch Size on page 119 JVM Arguments on page 115 Read Consistency on page 122 Result Memory Size on page 124 Write Consistency on page 131

Using the SQL Engine Server

Some applications may experience problems loading the JVM required for the SQL engine because the process exceeds the available heap space. If your application experiences problems loading the JVM, you can configure the driver to operate in server mode. In direct mode, the driver operates with the SQL engine and JVM running in a single process. In server mode, the driver's SQL engine runs in a separate process with its own JVM instead of loading the SQL engine and JVM in the same process used by the driver. For Windows, the driver is configured to attempt to run in server mode first by default. However, if server mode is unavailable, the SQL engine fails over to run in direct mode. For non-Windows platforms, the driver operates in direct mode by default.

Note: You must be an administrator to start or stop the service, or to configure any settings for the service.

See the following sections for details on configuring the SQL Engine Server on your platform.

Configuring the SQL Engine Server on Windows

The following sections describe how to configure, start, and stop the SQL Engine Server on Windows platforms. On Windows, the driver is configured to run in Auto mode by default. This means that the driver attempts to run in server mode first; however, if server mode is unavailable, the SQL engine fails over to run in direct mode.

Configuring Server Mode on Windows

1. Set the SQL Engine Mode connection option to a value of 0 - Auto or 1 - Server. All fields on the SQL Engine tab become read-only, and the Edit Server Settings button appears.


Note: Server mode is enabled when the SQL Engine Mode connection option is set to 0 - Auto or 1 - Server. When set to 0 - Auto, the SQL engine attempts to run in server mode first, but fails over to direct mode if server mode is unavailable. When set to 1 - Server, the SQL engine runs exclusively in server mode.

2. Click Edit Server Settings to display the ODBC Cassandra SQL Engine Service Setup dialog box. Use this dialog box to define settings for server mode and to start and stop the Progress DataDirect Cassandra SQL Engine service.

· JVM Arguments: A string that contains the arguments that are passed to the JVM that the driver is starting. The location of the JVM must be specified on your PATH. See "JVM Arguments."
· JVM Class Path: Specifies the CLASSPATH for the JVM used by the driver. See "JVM Classpath."
· Server Port Number: Specifies a valid port on which the SQL engine listens for requests from the driver. By default, the server listens on port 19933 for 64-bit installations and 19934 for 32-bit installations. See "Server Port Number" for more information.
· Java Path: Specifies the fully qualified path to the Java SE 6 or higher JVM executable that you want to use to run the SQL Engine Server. The path must not contain double quotation marks.
· Services: Shows the Cassandra ODBC SQL engine service that runs as a separate process instead of being loaded within the process of an ODBC application.
· Start (Stop): Starts or stops the Cassandra service. A message window is displayed, confirming that the Cassandra service was started or stopped.
· Apply: Applies the changes.

3. When you complete your changes, click Apply.
4. Click OK to save the changes and return to the SQL Engine tab, or click Cancel.


See also JVM Arguments on page 115 JVM Classpath on page 116 Server Port Number on page 127

Starting the SQL Engine Server on Windows

In server mode, you must start the SQL engine server before using the driver. Before starting the SQL engine server, choose a directory to store the local database files. Make sure that you have the correct permissions to write to this directory. By default, the JVM Classpath is set to the cassandra.jar file in the installation directory. To start the SQL engine server:

1. Start the ODBC Administrator by selecting its icon from the Progress DataDirect for ODBC program group.
2. Select a tab:

· User DSN: If you are configuring an existing user data source, select the data source name and click Configure to display the driver Setup dialog box. If you are configuring a new user data source, click Add to display a list of installed drivers. Select the driver and click Finish to display the driver Setup dialog box.

· System DSN: If you are configuring an existing system data source, select the data source name and click Configure to display the driver Setup dialog box. If you are configuring a new system data source, click Add to display a list of installed drivers. Select the driver and click Finish to display the driver Setup dialog box.

· File DSN: If you are configuring an existing file data source, select the data source file and click Configure to display the driver Setup dialog box. If you are configuring a new file data source, click Add to display a list of installed drivers; then, select a driver. Click Advanced if you want to specify attributes; otherwise, click Next to proceed. Specify a name for the data source and click Next. Verify the data source information; then, click Finish to display the driver Setup dialog box.

3. On the ODBC Cassandra Driver Setup dialog box, select the SQL Engine tab; then, select 0 - Auto or 1 - Server from the SQL Engine Mode drop-down list.

Note: Server mode is enabled when the SQL Engine Mode connection option is set to 0 - Auto or 1 - Server. When set to 0 - Auto, the SQL engine attempts to run in server mode first, but fails over to direct mode if server mode is unavailable. When set to 1 - Server, the SQL engine runs exclusively in server mode.

4. Click Edit Server Settings.
5. When you complete your changes, click Apply.
6. Verify that Progress DataDirect Cassandra SQL Engine is selected in the Services drop-down list, and then, click Start to start the service. A message window appears to confirm that the service is running. Click OK.
7. Click OK to close the ODBC Cassandra SQL Engine Service Setup dialog box.

Note: If you made changes after starting the service, a message window is displayed:


If you want the service to run with the new settings, click No. Then, click Stop to stop the service, and then click Start to restart the service. Then, click OK to close the ODBC Cassandra SQL Engine Service Setup dialog box.

Stopping the SQL Engine Server on Windows

To stop the SQL engine server:

1. Open the ODBC Cassandra Driver Setup dialog box and select the SQL Engine tab.
2. Select 0 - Auto or 1 - Server from the SQL Engine Mode drop-down list. Then, click Edit Server Settings.

Note: Server mode is enabled when the SQL Engine Mode connection option is set to 0 - Auto or 1 - Server. When set to 0 - Auto, the SQL engine attempts to run in server mode first, but fails over to direct mode if server mode is unavailable. When set to 1 - Server, the SQL engine runs exclusively in server mode.

3. Click Stop to stop the service. A message window appears to confirm that the service is stopped. Click OK.
4. Click OK to close the ODBC Cassandra SQL Engine Service Setup dialog box.

Configuring the SQL Engine Server on UNIX/Linux

The following sections describe how to configure, start, and stop the SQL Engine Server on UNIX and Linux platforms. By default, the driver operates in direct mode on UNIX and Linux platforms.

Configuring and Starting the SQL Engine Server on UNIX/Linux

In server mode, you must start the SQL engine server before using the driver. Be aware that you must have permission to write to the directory specified by the SchemaMap option to start the SQL engine server. To configure the SQL engine server, specify values for the Java options in the following JVM argument:

java -Xmxheap_sizem -cp "classpath" com.ddtek.cassandracloud.sql.Server -port port_number -Dhttp.proxyHost=proxy_hostname -Dhttp.proxyPort=proxy_port -Dhttp.proxyUser=user_name -Dhttp.proxyPassword=password

See the "SQL Engine Server Java Options" table for a description of these options. For example:

java -Xmx1024m -cp "/opt/Progress/DataDirect/ODBC_80_64bit/java/lib/cassandra.jar" com.ddtek.cassandracloud.sql.Server -port 19933 -Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=12345 -Dhttp.proxyUser=JohnQPublic -Dhttp.proxyPassword=secret

To start the SQL engine service, execute the JVM Argument after configuring the Java options. A confirmation message is returned once the server is online.


Table 14: SQL Engine Server Java Options

Java Option Description

Required Java Options

-cp Specifies the CLASSPATH for the Java Virtual Machine (JVM) used by the driver. The CLASSPATH is the search string the JVM uses to locate the Java jar files the driver needs. The Cassandra driver's jar file is located on the following path: install_dir/java/lib/cassandra.jar

-port Specifies a valid port on which the SQL engine listens for requests from the driver. We recommend specifying one of the following values:

· 19934 (32-bit drivers)
· 19933 (64-bit drivers)

Optional Java Options

-Xmx Specifies the maximum memory heap size, in megabytes, for the JVM. The default size is determined by your JVM. We recommend specifying a size no smaller than 1024.

Note: Although this option is not required to start the SQL engine server, we highly recommend specifying a value.

-Dhttp.proxyHost Specifies the Hostname of the Proxy Server. The value specified can be a host name, a fully qualified domain name, or an IPv4 or IPv6 address.

-Dhttp.proxyPort Specifies the port number where the Proxy Server is listening for HTTP and/or HTTPS requests.

-Dhttp.proxyUser Specifies the user name needed to connect to the Proxy Server.

-Dhttp.proxyPassword Specifies the password needed to connect to the Proxy Server.

Stopping the SQL Engine Server on UNIX/Linux

To stop the SQL engine server, choose one of the following:

· Using an application, execute SHUTDOWN SQL.
· From a command line, press Ctrl + C.

If successful, a message is returned to confirm that the service has stopped.


Configuring Java Logging for the SQL Engine Server

Java logging can be configured by placing a logging configuration file named ddlog.properties in the directory specified by the SchemaMap option (see "Schema Map" for details). The simplest way to create this file is to copy the sample ddlog.properties file, which is located in the install_dir/Sample/Example subdirectory of your driver installation directory. For more information on logging in Cassandra, see "Configuring Logging."

See also Schema Map on page 126 Configuring Logging on page 91

Using Failover in a Cluster

To ensure availability to Cassandra data sources, the driver supports connection failover and client load balancing when connecting to clusters:

· Connection failover provides protection for new connections to a cluster. If a member node of the cluster is unavailable, for example, because of hardware failure or traffic overload, the driver fails over to another member node to attempt the connection. If a connection to a node is dropped, the driver does not fail over the connection.
· Client load balancing helps distribute new connections in your environment so that no one server is overwhelmed with connection requests. When client load balancing is enabled, the order in which member nodes are tried is random.

You can enable connection failover and load balancing by specifying a list of the member nodes for your cluster using the Cluster Nodes (ClusterNodes) option. The list should be a comma-separated list of server names or IP addresses for your member nodes and their corresponding port numbers. Note that IPv6 values must be wrapped in double quotation marks. For example:

ClusterNodes=server1:9042,255.125.1.11:9043,"2001:DB8:0000:0000:8:800:200C:417A":9044

Note: Specifying a value for the Cluster Nodes option can be used in place of or in addition to the values of the Host Name (HostName) and Port Number (PortNumber) options. If values are specified for all of these options, the node specified by the Host Name and Port Number options is included in the list of nodes to which the driver randomly attempts to connect.

When connection failover and load balancing are enabled, the driver randomly selects a node from the list to first attempt a connection. If that connection fails, the driver again randomly selects another node from this list until all the nodes in the list have been tried or a connection is successfully established.
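The behavior described above (a node list with quoted IPv6 entries, tried in random order) can be modeled with a short parsing sketch. This is an illustrative model, not driver code; the `parse_cluster_nodes` and `connection_order` helpers are hypothetical.

```python
import random

def parse_cluster_nodes(value):
    """Parse a ClusterNodes-style list into (host, port) pairs.
    IPv6 addresses are wrapped in double quotation marks."""
    nodes = []
    for entry in value.split(","):
        entry = entry.strip()
        if entry.startswith('"'):                    # quoted IPv6 address
            host, _, port = entry[1:].partition('":')
        else:                                        # host:port or IPv4:port
            host, _, port = entry.rpartition(":")
        nodes.append((host, int(port)))
    return nodes

def connection_order(nodes):
    """Random order in which connection attempts could be made
    when client load balancing is enabled."""
    order = list(nodes)
    random.shuffle(order)
    return order

nodes = parse_cluster_nodes(
    'server1:9042,255.125.1.11:9043,"2001:DB8:0000:0000:8:800:200C:417A":9044')
print(nodes)
```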

See also Cluster Nodes on page 106


Using Identifiers

Identifiers are used to refer to objects exposed by the driver, such as tables and columns. The driver supports both quoted and unquoted identifiers for naming objects. The maximum length of both quoted and unquoted identifiers is 48 characters for table names and 128 characters for column names. Quoted identifiers must be enclosed in double quotation marks (""). The characters supported in quoted identifiers depend on the version of Cassandra being used. For details on valid characters, refer to the Cassandra documentation for your database version. Naming conflicts can arise from restrictions imposed by third-party applications, from the normalization of native data, or from the truncation of object names. The driver avoids naming conflicts by appending an underscore separator and integer (for example, _1) to identifiers with the same name. For example, if a third-party application restricts naming such that three columns would each retain the name address, the driver would expose the columns in the following manner:

· address
· address_1
· address_2
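The conflict-avoidance scheme above is a simple deduplication pass. The following sketch mirrors the documented _1, _2 suffix behavior; the `deduplicate` helper is illustrative only, not part of the driver.

```python
def deduplicate(names):
    """Append _1, _2, ... to later occurrences of a repeated identifier,
    mirroring the documented underscore-and-integer conflict scheme."""
    seen = {}
    out = []
    for name in names:
        count = seen.get(name, 0)
        out.append(name if count == 0 else "%s_%d" % (name, count))
        seen[name] = count + 1
    return out

print(deduplicate(["address", "address", "address"]))
```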



Troubleshooting

This part guides you through troubleshooting your Progress DataDirect for ODBC for Apache Cassandra driver. It provides you with solutions to common problems and documents error messages that you may receive.

For details, see the following topics:

· Diagnostic Tools
· Error Messages
· Troubleshooting

Diagnostic Tools

This chapter discusses the diagnostic tools you use when configuring and troubleshooting your ODBC environment.

ODBC Trace

ODBC tracing allows you to trace calls to ODBC drivers and create a log of the traces.

Creating a Trace Log

Creating a trace log is particularly useful when you are troubleshooting an issue. To create a trace log:


1. Enable tracing (see "Enabling Tracing" for more information).
2. Start the ODBC application and reproduce the issue.
3. Stop the application and turn off tracing.
4. Open the log file in a text editor and review the output to help you debug the problem.

For a complete explanation of tracing, refer to the following Progress DataDirect Knowledgebase document: http://knowledgebase.progress.com/articles/Article/3049

See also Enabling Tracing on page 86

Enabling Tracing

Progress DataDirect provides a tracing library that is enhanced to operate more efficiently, especially in production environments, where log files can rapidly grow in size. The DataDirect tracing library allows you to control the size and number of log files. On Windows, you can enable tracing through the Tracing tab of the ODBC Data Source Administrator. On UNIX and Linux, you can enable tracing by directly modifying the [ODBC] section in the system information (odbc.ini) file.

Windows ODBC Administrator

On Windows, open the ODBC Data Source Administrator and select the Tracing tab. To specify the path and name of the trace log file, type the path and name in the Log File Path field or click Browse to select a log file. If no location is specified, the trace log resides in the working directory of the application you are using. Click Select DLL in the Custom Trace DLL pane to select the DataDirect enhanced tracing library, xxtrcyy.dll, where xx represents either iv (32-bit version) or dd (64-bit version), and yy represents the driver level number, for example, ivtrc28.dll. The library is installed in the \Windows\System32 directory. After making changes on the Tracing tab, click Apply for them to take effect. Enable tracing by clicking Start Tracing Now. Tracing continues until you disable it by clicking Stop Tracing Now. Be sure to turn off tracing when you are finished reproducing the issue because tracing decreases the performance of your ODBC application. When tracing is enabled, information is written to the following trace log files:

· Trace log file (trace_filename.log) in the specified directory.
· Trace information log file (trace_filenameINFO.log). This file is created in the same directory as the trace log file and logs the following SQLGetInfo information:

· SQL_DBMS_NAME
· SQL_DBMS_VER
· SQL_DRIVER_NAME
· SQL_DRIVER_VER
· SQL_DEFAULT_TXN_ISOLATION


The DataDirect enhanced tracing library allows you to control the size and number of log files. The file size limit of the log file (in KB) is specified by the Windows Registry key ODBCTraceMaxFileSize. Once the size limit is reached, a new log file is created and logging continues in the new file until it reaches its file size limit, after which another log file is created, and so on. The maximum number of files that can be created is specified by the Registry key ODBCTraceMaxNumFiles. Once the maximum number of log files is created, tracing reopens the first file in the sequence, deletes the content, and continues logging in that file until the file size limit is reached, after which it repeats the process with the next file in the sequence. Subsequent files are named by appending sequential numbers, starting at 1 and incrementing by 1, to the end of the original file name, for example, SQL1.LOG, SQL2.LOG, and so on. The default values of ODBCTraceMaxFileSize and ODBCTraceMaxNumFiles are 102400 KB and 10, respectively. To change these values, add or modify the keys in the following Windows Registry section: [HKEY_CURRENT_USER\SOFTWARE\ODBC\ODBC.INI\ODBC]

Warning: Do not edit the Registry unless you are an experienced user. Consult your system administrator if you have not edited the Registry before.

Edit each key using your values and close the Registry.

System Information (odbc.ini) File

The [ODBC] section of the system information file includes several keywords that control tracing:

Trace=[0 | 1]
TraceFile=trace_filename
TraceDll=ODBCHOME/lib/xxtrcyy.zz
ODBCTraceMaxFileSize=file_size
ODBCTraceMaxNumFiles=file_number
TraceOptions=0

where:

Trace=[0 | 1]

Allows you to enable tracing by setting the value of Trace to 1. Disable tracing by setting the value to 0 (the default). Tracing continues until you disable it. Be sure to turn off tracing when you are finished reproducing the issue because tracing decreases the performance of your ODBC application.

TraceFile=trace_filename

Specifies the path and name of the trace log file. If no path is specified, the trace log resides in the working directory of the application you are using.

TraceDll=ODBCHOME/lib/xxtrcyy.zz

Specifies the library to use for tracing. The driver installation includes a DataDirect enhanced library to perform tracing, xxtrcyy.zz, where xx represents either iv (32-bit version) or dd (64-bit version), yy represents the driver level number, and zz represents either so or sl. For example, ivtrc28.so is the 32-bit version of the library. To use a custom shared library instead, enter the path and name of the library as the value for the TraceDll keyword. The DataDirect enhanced tracing library allows you to control the size and number of log files with the ODBCTraceMaxFileSize and ODBCTraceMaxNumFiles keywords.


ODBCTraceMaxFileSize=file_size

The ODBCTraceMaxFileSize keyword specifies the file size limit (in KB) of the log file. Once this file size limit is reached, a new log file is created and logging continues in the new file until it reaches the file size limit, after which another log file is created, and so on. The default is 102400.

ODBCTraceMaxNumFiles=file_number

The ODBCTraceMaxNumFiles keyword specifies the maximum number of log files that can be created. The default is 10. Once the maximum number of log files is created, tracing reopens the first file in the sequence, deletes the content, and continues logging in that file until the file size limit is reached, after which it repeats the process with the next file in the sequence. Subsequent files are named by appending sequential numbers, starting at 1 and incrementing by 1, to the end of the original file name, for example, odbctrace1.out, odbctrace2.out, and so on.

TraceOptions=[0 | 1 | 2 | 3]

The TraceOptions keyword specifies whether to print the current timestamp, parent process ID, process ID, and thread ID for all ODBC functions to the output file. The default is 0.

· If set to 0, the driver uses standard ODBC tracing.
· If set to 1, the log file includes a timestamp on ENTRY and EXIT of each ODBC function.
· If set to 2, the log file prints a header on every line. By default, the header includes the parent process ID and process ID.
· If set to 3, both TraceOptions=1 and TraceOptions=2 are enabled. The header includes a timestamp as well as a parent process ID and process ID.

Example

In the following example of trace settings, tracing has been enabled, the name of the log file is odbctrace.out, the library for tracing is ivtrc28.so, the maximum size of the log file is 51200 KB, and the maximum number of log files is 8. Timestamps and other information are included in odbctrace.out.

Trace=1
TraceFile=ODBCHOME/lib/odbctrace.out
TraceDll=ODBCHOME/lib/ivtrc28.so
ODBCTraceMaxFileSize=51200
ODBCTraceMaxNumFiles=8
TraceOptions=3

The Test Loading Tool

Before using the test loading tool, be sure that your environment variables are set correctly. See "Environment Variables" for details about environment variables. The ivtestlib (32-bit drivers) and ddtestlib (64-bit drivers) test loading tools are provided to test load drivers and help diagnose configuration problems in the UNIX and Linux environments, such as environment variables not correctly set or missing database client components. This tool is installed in the /bin subdirectory in the product installation directory. It attempts to load a specified ODBC driver and prints out all available error information if the load fails. For example, if the drivers are installed in /opt/odbc/lib, the following command attempts to load the 32-bit driver on Solaris, where xx represents the version number of the driver:

ivtestlib /opt/odbc/lib/ivcsndrxx.so


Note: On Solaris, AIX, and Linux, the full path to the driver does not have to be specified for the tool. The HP-UX version, however, requires the full path.

If the load is successful, the tool returns a success message along with the version string of the driver. If the driver cannot be loaded, the tool returns an error message explaining why. See "Version String Information" for details about version strings.

See also
Environment Variables on page 52
Version String Information on page 34

ODBC Test

On Windows, Microsoft® ships with its ODBC SDK an ODBC-enabled application, named ODBC Test, that you can use to test ODBC drivers and the ODBC Driver Manager. ODBC 3.52 includes both ANSI and Unicode-enabled versions of ODBC Test. To use ODBC Test, you must understand the ODBC API, the C language, and SQL. For more information about ODBC Test, refer to the Microsoft ODBC SDK Guide.

Logging

The driver for Apache Cassandra provides a flexible and comprehensive logging mechanism for its Java components that allows logging to be incorporated seamlessly with the logging of your application or enabled and configured independently from the application. The logging mechanism can be instrumental in investigating and diagnosing issues. It also provides valuable insight into the type and number of operations requested by the application from the driver and requested by the driver from the remote data source. This information can help you tune and optimize your application.

Logging Components

The driver uses the Java Logging API to configure and control the loggers (individual logging components) used by the driver. The Java Logging API is built into the JVM. The Java Logging API allows applications or components to define one or more named loggers. Messages written to the loggers can be given different levels of importance. For example, warnings that occur in the driver can be written to a logger at the WARNING level, while progress or flow information can be written to a logger at the INFO or FINER level. Each logger used by the driver can be configured independently. The configuration for a logger includes what level of log messages are written, the location to which they are written, and the format of the log message. The Java Logging API defines the following levels:

· SEVERE
· WARNING
· INFO
· CONFIG
· FINE
· FINER
· FINEST

Note: Log messages logged by the driver only use the CONFIG, FINE, FINER, and FINEST logging levels.

Setting the log threshold of a logger to a particular level causes the logger to write log messages of that level and higher to the log. For example, if the threshold is set to FINE, the logger writes messages of levels FINE, CONFIG, INFO, WARNING, and SEVERE to its log. Messages of level FINER or FINEST are not written to the log.

The driver exposes loggers for the following functional areas:

· Driver to SQL Communication
· SQL Engine
· Wire protocol adapter
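As an illustration (not driver code), the threshold rule can be modeled with the numeric values that java.util.logging.Level assigns to each level:

```python
# Numeric values defined by java.util.logging.Level
LEVELS = {
    "SEVERE": 1000, "WARNING": 900, "INFO": 800,
    "CONFIG": 700, "FINE": 500, "FINER": 400, "FINEST": 300,
}

def is_written(threshold, message_level):
    """A message is written when its level is at or above the logger's threshold."""
    return LEVELS[message_level] >= LEVELS[threshold]

print(is_written("FINE", "CONFIG"))  # True: CONFIG is higher than FINE
print(is_written("FINE", "FINER"))   # False: FINER is below the threshold
```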

Driver to SQL Communication Logger

Name datadirect.cloud.drivercommunication

Description Logs all calls made by the driver to the SQL Engine and the responses from the SQL Engine back to the driver.

Message Levels

CONFIG - Errors and Warnings encountered by the communication protocol are logged at this level.

FINER - The message type and arguments for requests and responses sent between the driver and SQL Engine are logged at this level. Data transferred between the driver and SQL Engine is not logged.

FINEST - Data transferred between the driver and SQL Engine is logged at this level.

Default OFF

SQL Engine Logger

Name datadirect.cloud.sql.level

Description Logs the operations that the SQL engine performs while executing a query. Operations include preparing a statement to be executed, executing the statement, and fetching the data, if needed. These are internal operations that do not necessarily directly correlate with Web service calls made to the remote data source.


Message Levels

CONFIG - Any errors or warnings detected by the SQL engine are written at this level.

FINE - In addition to the same information logged by the CONFIG level, SQL engine operations are logged at this level. In particular, the SQL statement that is being executed is written at this level.

FINER - In addition to the same information logged by the CONFIG and FINE levels, data sent or received in the process of performing an operation is written at this level.

Wire Protocol Adapter Logger

Name datadirect.cloud.adapter.level

Description Logs the calls the driver makes to the remote data source and the responses it receives from the remote data source.

Message Levels

CONFIG - Any errors or warnings detected by the wire protocol adapter are written at this level.

FINE - In addition to the information logged by the CONFIG level, information about calls made by the wire protocol adapter and responses received by the wire protocol adapter is written at this level. In particular, the calls made to execute the query and the calls to fetch or send the data are logged. The log entries for the calls to execute the query include the Apache Cassandra specific query being executed. The actual data sent or fetched is not written at this level.

FINER - In addition to the information logged by the CONFIG and FINE levels, this level provides additional information.

FINEST - In addition to the information logged by the CONFIG, FINE, and FINER levels, data associated with the calls made by the wire protocol adapter is written.

Configuring Logging

You can configure logging using a standard Java properties file in either of the following ways:

· Using the properties file that is shipped with your JVM. See Using the JVM on page 91 for details.
· Using the driver. See Using the Driver on page 92 for details.

Using the JVM

If you want to configure logging using the properties file that is shipped with your JVM, use a text editor to modify the properties file in your JVM. Typically, this file is named logging.properties and is located in the JRE/lib subdirectory of your JVM. The JRE looks for this file when it is loading. You can also specify which properties file to use by setting the java.util.logging.config.file system property. At a command prompt, enter:

java -Djava.util.logging.config.file=properties_file

where properties_file is the name of the properties file you want to load.


Using the Driver

If you want to configure logging using the driver, you can use either of the following approaches:

· Use a single properties file for all Apache Cassandra connections.
· Use a different properties file for each schema map. For example, if you have two definitions (johnsmith.xxx and pattijohnson.xxx), you can load one properties file for the johnsmith.xxx database and load another properties file for the pattijohnson.xxx database.

Note: By default, the name of the schema map is the user ID specified for the connection. You can specify the name of the schema map using the SchemaMap attribute. See "Connection Option Descriptions" for details on using LogConfigFile and other connection options.

By default, the driver looks for the file named ddlogging.properties in the current working directory to load for all Apache Cassandra connections. If the SQLEngineMode connection option is set to Server, the driver uses the ddlogging.properties file that is specified by the SchemaMap connection option. If a properties file is specified for the LogConfigFile connection option, the driver uses the following process to determine which file to load:

1. The driver looks for the file specified by the LogConfigFile connection option.
2. If the driver cannot find the file in Step 1, it looks for a properties file named database_name.logging.properties in the directory containing the embedded database for the connection, where database_name is the name of the embedded database.
3. If the driver cannot find the file in Step 2, it looks for a properties file named ddlogging.properties in the current working directory.
4. If the driver cannot find the file in Step 3, it abandons its attempt to load a properties file.

If any of these files exist, but the logging initialization fails for some reason while using that file, the driver writes a warning to the standard output (System.out), specifying the name of the properties file being used. A sample properties file named ddlogging.properties is installed in the samples subdirectory of your product installation directory. For example, you can find the ddlogging.properties file in install_dir\Samples\Bulkstrm, install_dir\Samples\Bulk, and install_dir\Samples\Example, where install_dir is your product installation directory. You can copy this file to the current working directory of your application or embedded database directory, and modify it using a text editor for your needs.
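A minimal properties file along these lines might look as follows. The handler settings are assumptions based on standard java.util.logging configuration properties, and the property keys assume a .level suffix is appended to each documented logger name (the SQL Engine and adapter names above already end in .level); verify against the sample ddlogging.properties shipped with the product:

```properties
# Send log output to a file (standard java.util.logging handler settings)
handlers=java.util.logging.FileHandler
java.util.logging.FileHandler.pattern=cassandra-driver-%g.log
java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter

# Driver loggers documented in "Logging Components"
datadirect.cloud.drivercommunication.level=FINER
datadirect.cloud.sql.level=FINE
datadirect.cloud.adapter.level=CONFIG
```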

See also Connection Option Descriptions on page 101

The demoodbc Application

DataDirect provides a simple C application, named demoodbc, that is useful for:

· Executing SELECT * FROM emp, where emp is a database table. The scripts for building the emp database tables (one for each supported database) are in the demo subdirectory in the product installation directory.
· Testing database connections.
· Creating reproducibles.
· Persisting data to an XML data file.


The demoodbc application is installed in the /samples/demo subdirectory in the product installation directory. Refer to demoodbc.txt or demoodbc64.txt in the demo directory for an explanation of how to build and use this application.

The Example Application

Progress DataDirect provides a simple C application, named example, that is useful for:

· Executing any type of SQL statement
· Testing database connections
· Testing SQL statements
· Verifying your database environment

The example application is installed in the /samples/example subdirectory in the product installation directory. Refer to example.txt or example64.txt in the example directory for an explanation of how to build and use this application.

Other Tools

The Progress DataDirect Support Web site provides other diagnostic tools that you can download to assist you with troubleshooting. These tools are not shipped with the product. Refer to the Progress DataDirect Web page:

https://www.progress.com/support/evaluation/download-resources/download-tools

Progress DataDirect also provides a knowledgebase that is useful in troubleshooting problems. Refer to the Progress DataDirect Knowledgebase page:

http://progresscustomersupport-survey.force.com/ConnectKB

Error Messages

Error messages can be generated from:

· ODBC driver
· Database system
· ODBC driver manager

An error reported on an ODBC driver has the following format:

[vendor] [ODBC_component] message

where ODBC_component is the component in which the error occurred. For example, an error message from the Progress DataDirect for ODBC for Apache Cassandra driver would look like this:

[DataDirect] [ODBC Apache Cassandra Driver] Invalid precision specified.

If you receive this type of error, check the last ODBC call made by your application for possible problems or contact your ODBC application vendor.


An error that occurs in the data source includes the data store name, in the following format:

[vendor] [ODBC_component] [data_store] message

With this type of message, ODBC_component is the component that received the error specified by the data store. For example, you may receive the following message from an Apache Cassandra database:

[DataDirect] [ODBC Apache Cassandra Driver] [Apache Cassandra] Specified length too long for CHAR column

This type of error is generated by the database system. Check your Apache Cassandra system documentation for more information or consult your database administrator.

On Windows, the Microsoft Driver Manager is a DLL that establishes connections with drivers, submits requests to drivers, and returns results to applications. An error that occurs in the Driver Manager has the following format:

[vendor] [ODBC XXX] message

For example, an error from the Microsoft Driver Manager might look like this:

[Microsoft] [ODBC Driver Manager] Driver does not support this function

If you receive this type of error, consult the Programmer's Reference for the Microsoft ODBC Software Development Kit available from Microsoft.

On UNIX and Linux, the Driver Manager is provided by Progress DataDirect. For example, an error from the DataDirect Driver Manager might look like this:

[DataDirect][ODBC lib] String data code page conversion failed.
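The bracketed prefixes in these messages identify where an error originated. A small illustrative sketch (not part of the product) that splits the prefixes from the message text:

```python
import re

def parse_odbc_error(text):
    """Split an ODBC error string into its bracketed prefixes and message."""
    match = re.match(r"\s*(?:\[[^\]]*\]\s*)+", text)
    if not match:
        return [], text.strip()
    prefixes = re.findall(r"\[([^\]]*)\]", match.group(0))
    return prefixes, text[match.end():].strip()

prefixes, message = parse_odbc_error(
    "[DataDirect] [ODBC Apache Cassandra Driver] Invalid precision specified.")
print(prefixes)  # ['DataDirect', 'ODBC Apache Cassandra Driver']
print(message)   # Invalid precision specified.
```

Two prefixes indicate a driver error, three indicate an error from the data store, and a prefix containing "ODBC Driver Manager" or "ODBC lib" indicates a Driver Manager error.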

UNIX and Linux error handling follows the X/Open XPG3 messaging catalog system. Localized error messages are stored in the subdirectory:

locale/localized_territory_directory/LC_MESSAGES

where localized_territory_directory depends on your language. For instance, German localization files are stored in locale/de/LC_MESSAGES, where de is the locale for German. If localized error messages are not available for your locale, error messages contain message numbers instead of text. For example:

[DataDirect] [ODBC 20101 driver] 30040

Troubleshooting

If you are having an issue while using your driver, first determine the type of issue that you are encountering:


· Setup/connection
· Performance
· Interoperability (ODBC application, ODBC driver, ODBC Driver Manager, or data source)
· Out-of-memory

This chapter describes these four types of issues, provides some typical causes of the issues, lists some diagnostic tools that are useful to troubleshoot the issues, and, in some cases, explains possible actions you can take to resolve the issues.

Setup/Connection Issues

You are experiencing a setup/connection issue if you are encountering an error or hang while you are trying to make a database connection with the ODBC driver or are trying to configure the ODBC driver. Some common errors that are returned by the ODBC driver if you are experiencing a setup/connection issue include:

· Specified driver could not be loaded.
· Data source name not found and no default driver specified.
· Cannot open shared library: libodbc.so.
· Unable to connect to destination.
· Invalid username/password; logon denied.

Troubleshooting the Issue

Some common reasons that setup/connection issues occur are:

· The library path environment variable is not set correctly.

Note: The 32-bit and 64-bit drivers for Apache Cassandra require that you set the library path environment variable for your operating system to the directory containing your JVM's libjvm.so [sl | a] file, and to that directory's parent directory, before using the driver.

HP-UX ONLY:

· When setting the library path environment variable on HP-UX operating systems, specifying the parent directory is not required.
· You also must set the LD_PRELOAD environment variable to the fully qualified path of the libjvm.so [sl].

The library path environment variable is:

32-bit Drivers

· PATH on Windows
· LD_LIBRARY_PATH on Solaris, Linux, and HP-UX Itanium
· SHLIB_PATH on HP-UX PA_RISC
· LIBPATH on AIX


64-bit Drivers

· PATH on Windows
· LD_LIBRARY_PATH on Solaris, HP-UX Itanium, and Linux
· LIBPATH on AIX

· The database and/or listener are not started.
· The ODBCINI environment variable is not set correctly for the ODBC drivers on UNIX and Linux.
· The ODBC driver's connection attributes are not set correctly in the system information file on UNIX and Linux. See "Data Source Configuration on UNIX/Linux" for more information. For example, the host name or port number are not correctly configured. See "Connection Option Descriptions" for a list of connection string attributes that are required for each driver to connect properly to the underlying database.

For UNIX and Linux users: See "Configuring the Product on UNIX/Linux" for more information. See also "The Test Loading Tool" for information about a helpful diagnostic tool.

See also
Data Source Configuration on UNIX/Linux on page 55
Connection Option Descriptions on page 101
Configuring the Product on UNIX/Linux on page 52
The Test Loading Tool on page 88

Interoperability Issues

Interoperability issues can occur with a working ODBC application in any of the following ODBC components: ODBC application, ODBC driver, ODBC Driver Manager, and/or data source. See "What Is ODBC?" for more information about ODBC components. For example, any of the following problems may occur because of an interoperability issue:

· SQL statements may fail to execute.
· Data may be returned/updated/deleted/inserted incorrectly.
· A hang or core dump may occur.

See also What Is ODBC? on page 23

Troubleshooting the Issue

Isolate the component in which the issue is occurring. Is it an ODBC application, an ODBC driver, an ODBC Driver Manager, or a data source issue? To troubleshoot the issue:

1. Test to see if your ODBC application is the source of the problem. To do this, replace your working ODBC application with a simpler application. If you can reproduce the issue, you know your ODBC application is not the cause.


On Windows, you can use ODBC Test, which is part of the Microsoft ODBC SDK, or the example application that is shipped with your driver. See "ODBC Test" and "The Example Application" for details.

On UNIX and Linux, you can use the example application that is shipped with your driver. See "The Example Application" for details.

2. Test to see if the data source is the source of the problem. To do this, use the native database tools that are provided by your database vendor.
3. If neither the ODBC application nor the data source is the source of your problem, troubleshoot the ODBC driver and the ODBC Driver Manager. In this case, we recommend that you create an ODBC trace log to provide to Technical Support. See "ODBC Trace" for details.

See also
ODBC Test on page 89
The Example Application on page 93
ODBC Trace on page 85

Performance Issues

Developing performance-oriented ODBC applications is not an easy task. You must be willing to change your application and test it to see if your changes helped performance. Microsoft's ODBC Programmer's Reference does not provide information about system performance. In addition, ODBC drivers and the ODBC Driver Manager do not return warnings when applications run inefficiently. Some general guidelines for developing performance-oriented ODBC applications include:

· Use catalog functions appropriately.
· Retrieve only required data.
· Select functions that optimize performance.
· Manage connections and updates.

See "Designing ODBC Applications for Performance Optimization" for complete information.

See also Designing ODBC applications for performance optimization on page 189

Out-of-Memory Issues

When processing large sets of data, out-of-memory errors can occur when the size of an intermediate result exceeds the available memory allocated to the JVM. If you are encountering these errors, you can tune the Fetch Size, Result Memory Size, and JVM Arguments connection options to fit your environment:

· Reduce Fetch Size to reduce demands on the driver's internal memory. By lowering the maximum number of rows as specified by Fetch Size, you lower the number of rows the driver is required to process before returning data to the application. Thus, you reduce demands on the driver's internal memory, and, in turn, decrease the likelihood of out-of-memory errors.


· To tune Result Memory Size, decrease the value specified until results are successfully returned. Intermediate results larger than the specified setting will be written to disk as opposed to held in memory. When configured correctly, this avoids memory limitations by not relying on memory to process larger intermediate results. Be aware that while writing to disk reduces the risk of out-of-memory errors, it also negatively impacts performance. For optimal performance, decrease this value only to a size necessary to avoid errors.

Note: By default, Result Memory Size is set to -1, which sets the maximum size of intermediate results held in memory to a percentage of the max Java heap size. If you received errors using the default configuration, use the max Java heap size divided by 4 as a starting point when tuning this option.

· Increase the JVM heap size using the JVM Arguments connection option. By increasing the max Java heap size, you increase the amount of data the driver can accumulate in memory and avoid out-of-memory errors.

See "Fetch Size", "Result Memory Size", and "JVM Arguments" for additional information.
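On UNIX/Linux, these options could be tuned in the data source's odbc.ini entry. The sketch below is illustrative only: the DSN name is hypothetical, other required attributes (such as HostName) are omitted, and the values are starting points rather than recommendations (the defaults are FetchSize=100 and JVMArgs=-Xmx1024m):

```ini
[Cassandra]
# Lower Fetch Size from its default of 100 rows
FetchSize=50
# Raise the max Java heap from the default -Xmx1024m
JVMArgs=-Xmx2048m
```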

See also
Fetch Size on page 112
Result Memory Size on page 124
JVM Arguments on page 115

Operation Timeouts

Cassandra imposes timeouts on read and write operations to prevent a given operation from negatively impacting the performance of the cluster. If you encounter an operation timeout, you can take the following actions to promote operation success.

· Adjust the ReadConsistency connection property. You can speed up a query by reducing the number of replicas required to respond to a read request. Therefore, you can reduce the likelihood of a timeout by setting ReadConsistency to a value that requires fewer replicas to respond.
· Adjust the WriteConsistency connection property. You can speed up a write operation by reducing the number of replicas required to acknowledge success. Therefore, you can reduce the likelihood of a timeout by setting WriteConsistency to a value that requires fewer replicas to acknowledge the execution of the write operation.
· Decrease the value of the NativeFetchSize connection property. By decreasing NativeFetchSize, you reduce the amount of data that must be transmitted between the driver and the native data source. For read operations, the smaller the chunks of data requested, the faster the cluster can assemble results for transmission to the driver. For write operations, smaller chunks of data allow the driver to communicate more efficiently with the native data source and thus expedite write operations.

Note: Setting NativeFetchSize too low negatively impacts performance by requiring unnecessary round trips across the network.

· Optimize your query by taking one or more of the following actions:

1. Limit the number of results returned.
2. Add indexes to Cassandra tables and base operations on indexes as appropriate.
3. Use Where clause filtering that can be pushed down to Cassandra, allowing operations to be evaluated and handled quickly. Refer to the following DataStax Web pages for more information about Where clause functionality and limitations.


· A deep look at the CQL WHERE clause
· Filtering data using WHERE

· Adjust Cassandra network timeout settings in the cassandra.yaml configuration file. These settings can be adjusted to promote read operation success by increasing the size of the timeout window. Refer to your Apache Cassandra documentation for details.
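For example, a reduced fetch size can be set as a connection attribute in the data source definition. The DSN name below is hypothetical and other required attributes are omitted; consistency-level values are described under "Read Consistency" and "Write Consistency":

```ini
[Cassandra]
# Illustrative: reduce the rows per native fetch from the 10000-row default
NativeFetchSize=5000
```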

See also
Read Consistency on page 122
Write Consistency on page 131
Native Fetch Size on page 119
Where Clause on page 143



Connection Option Descriptions

The following connection option descriptions are listed alphabetically by the GUI name that appears on the driver Setup dialog box. The connection string attribute name, along with its short name, is listed immediately underneath the GUI name. For example:

Application Using Threads
Attribute ApplicationUsingThreads (AUT)

In most cases, the GUI name and the attribute name are the same; however, some exceptions exist. If you need to look up an option by its connection string attribute name, refer to the alphabetical table of connection string attribute names. Also, a few connection string attributes, for example, Password, do not have equivalent options that appear on the GUI. They are included in the list of descriptions alphabetically by their attribute names. The following table lists the connection string attributes supported by the driver for Apache Cassandra.

Table 15: Apache Cassandra Attribute Names

Attribute (Short Name) Default

ApplicationUsingThreads (AUT) 1 (Enabled)

AsciiSize (ASZ) 4000

AuthenticationMethod (AM) 0 (User Id/Password)

ClusterNodes (CN) None


ConfigOptions (CO) None

Note: The ConfigOptions connection option is not currently supported by the driver.

CreateMap (CM) 2 (NotExist)

DataSourceName (DSN) None

DecimalPrecision (DP) 38

DecimalScale (DS) 10

Description (n/a) None

FetchSize (FS) 100 (rows)

HostName (HOST) None

IANAAppCodePage (IACP) (UNIX and Linux only) 4 (ISO 8859-1 Latin-1)

InitializationString (IS) None

JVMArgs (JVMA)
For the 32-bit driver when the SQL Engine Mode is set to 2 (Direct): -Xmx256m
For all other configurations: -Xmx1024m

JVMClasspath (JVMC) The default is an empty string, which means that the driver automatically detects the CLASSPATHs for all ODBC drivers installed on your machine and specifies them when launching the JVM.

KeyspaceName (KN) system

LogConfigFile (LCF) None

LoginTimeout (LT) 15

LogonID (UID) None

NativeFetchSize (NFS) 10000 (rows)

Password (PWD) None

PortNumber (PORT) 9042


ReadConsistency (RC) 4 (quorum)

ReadOnly (RO) 1 (Enabled)

ReportCodepageConversionErrors (RCCE) 0 (Ignore Errors)

ResultMemorySize (RMS) -1

SchemaMap (SM)
For Windows: application_data_folder\Local\Progress\DataDirect\Cassandra_Schema\host_name.config
For UNIX/Linux: users_home_directory/Progress/DataDirect/Cassandra_Schema/host_name.config

ServerPortNumber (SPN)
For the 32-bit driver: 19936
For the 64-bit driver: 19935

SQLEngineMode (SEM)
For Windows: 0 (Auto)
For UNIX/Linux: 2 (Direct)

TransactionMode (TM) 0 (No Transactions)

VarcharSize (VCS) 4000

VarintPrecision (VP) 38

WriteConsistency (WC) 4 (quorum)

For details, see the following topics:

· Application Using Threads
· Ascii Size
· Authentication Method
· Cluster Nodes
· Config Options
· Create Map
· Data Source Name
· Decimal Precision


· Decimal Scale
· Description
· Fetch Size
· Host Name
· IANAAppCodePage
· Initialization String
· JVM Arguments
· JVM Classpath
· Keyspace Name
· Log Config File
· Login Timeout
· Native Fetch Size
· Password
· Port Number
· Query Timeout
· Read Consistency
· Read Only
· Report Codepage Conversion Errors
· Result Memory Size
· Schema Map
· Server Port Number
· SQL Engine Mode
· Transaction Mode
· User Name
· Varchar Size
· Varint Precision
· Write Consistency

Application Using Threads

Attribute ApplicationUsingThreads (AUT)


Purpose Determines whether the driver works with applications using multiple ODBC threads.

Valid Values 0 | 1

Behavior If set to 1 (Enabled), the driver works with single-threaded and multi-threaded applications. If set to 0 (Disabled), the driver does not work with multi-threaded applications. If you are using the driver with single-threaded applications, this setting avoids the additional processing required to comply with ODBC thread-safety standards.

Notes · This connection option can affect performance.

Default 1 (Enabled)

GUI Tab Advanced tab

See also Performance Considerations on page 76

Ascii Size

Attribute AsciiSize (ASZ)

Purpose Specifies the precision reported for ASCII columns in column and result-set metadata. This option allows you to set the precision for ASCII columns when using an application that does not support unbounded data types.

Valid Values x where:

x

is an integer greater than 0 (zero).

Default 4000


Notes · In most scenarios, an error is returned if the size of an ASCII value specified in a statement exceeds the precision determined by this option. However, when executing a Select statement, the driver will return data containing values that are larger than the specified precision.

GUI Tab Schema Map tab

Authentication Method

Attribute AuthenticationMethod (AM)

Purpose Specifies the method the driver uses to authenticate the user to the server when a connection is established. If the specified authentication method is not supported by the database server, the connection fails and the driver generates an error.

Valid Values -1 | 0

Behavior If set to -1 (No Authentication), the driver does not attempt to authenticate with the server. If set to 0 (User ID/Password), the driver sends the user ID in clear text and an encrypted password to the server for authentication.

Default 0 (User ID/Password)

GUI Tab Security tab

Cluster Nodes

Attribute ClusterNodes (CN)

Purpose A comma-separated list of member nodes in your cluster to which the driver attempts to connect. Specifying a value for this option enables connect time failover and load balancing. For details, see "Using Failover in a Cluster."


Valid Values host_name | IP_address:port_number [, ...] where:

host_name

is the name of a server for a member node in the replica-set cluster to which you want to connect.

IP_address

is the IP address of the server for a member node in the replica-set cluster to which you want to connect. The IP address can be specified in IPv4 or IPv6 format. Note that IPv6 values must be wrapped in double-quotation marks. See "Using IP Addresses" for details about these formats.

port_number

is the port number of the server listener for the corresponding member node.

For example:

server1:9042,255.125.1.11:9043,"2001:DB8:0000:0000:8:800:200C:417A":9044

Notes · When a value is specified for this option, the driver randomly selects a server node from the list for its first connection attempt. If that connection fails, the driver randomly selects another server node from the list until a connection is successfully established or all servers in the list have been tried.
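The connect-time failover behavior described in the Notes can be sketched in a few lines of Python. This is an illustrative simulation only; the function and node names are hypothetical and are not part of the driver:

```python
import random

def connect_with_failover(nodes, try_connect):
    """Try cluster nodes in random order until one accepts the
    connection, mirroring the driver's ClusterNodes behavior."""
    remaining = list(nodes)
    random.shuffle(remaining)            # the driver picks nodes at random
    for node in remaining:
        if try_connect(node):            # first successful node wins
            return node
    raise ConnectionError("all cluster nodes failed")

# Simulation: only server2 is reachable, so regardless of the random
# order the loop keeps trying until it lands on server2.
reachable = {"server2:9042"}
chosen = connect_with_failover(
    ["server1:9042", "server2:9042", "255.125.1.11:9043"],
    lambda node: node in reachable,
)
```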

Default None

GUI Tab General tab

See Also · Using Failover in a Cluster on page 82 · Using IP Addresses on page 48

Config Options

Attribute ConfigOptions (CO)


Purpose

Note: The Config Options connection option is not currently supported by the driver. When configuring the driver, you should not specify a value for this option.

Determines how the mapping of the native data model to the relational data model is configured, customized, and updated.

Notes · This option is primarily used for initial configuration of the driver for a particular user. It is not intended for use with every connection. By default, the driver configures itself and this option is normally not needed. If Config Options is specified on a connection after the initial configuration, the values specified for Config Options must match the values specified for the initial configuration.

Valid Values { key = value [; key = value ]} where:

key

is the attribute name of a supported configuration option.

value

specifies the setting for this configuration option.

Default None

GUI Tab Schema Map tab

Create Map

Attribute CreateMap (CM)

Purpose Determines whether the driver creates the internal files required for a relational view of the native data when establishing a connection.

Valid Values 0 | 1 | 2


Behavior If set to 0 (No), the driver uses the current group of internal files specified by the Schema Map connection option. If the files do not exist, the connection fails. If set to 1 (ForceNew), the driver deletes the current group of files specified by the Schema Map connection option and creates new files in the same location. If set to 2 (NotExist), the driver uses the current group of files specified by the Schema Map connection option. If the files do not exist, the driver creates them.
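The three CreateMap modes can be illustrated with a small file-based sketch. The helper below is hypothetical and merely stands in for the driver's internal file handling:

```python
import os
import tempfile

def resolve_schema_map(path, create_map):
    """Sketch of CreateMap semantics: 0 = No, 1 = ForceNew, 2 = NotExist."""
    exists = os.path.exists(path)
    if create_map == 0:                  # No: files must already exist
        if not exists:
            raise FileNotFoundError(path)
        return "reused"
    if create_map == 1:                  # ForceNew: delete and recreate
        if exists:
            os.remove(path)
        open(path, "w").close()
        return "recreated"
    if create_map == 2:                  # NotExist: create only if missing
        if not exists:
            open(path, "w").close()
            return "created"
        return "reused"
    raise ValueError("CreateMap must be 0, 1, or 2")

with tempfile.TemporaryDirectory() as d:
    cfg = os.path.join(d, "MyServer.config")
    first = resolve_schema_map(cfg, 2)    # missing -> created
    second = resolve_schema_map(cfg, 2)   # present -> reused
```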

Notes · The internal files share the same directory as the schema map's configuration file. This directory is specified by the value you enter for the Schema Map connection option.

Default 2 (NotExist)

GUI Tab Advanced tab

Data Source Name

Attribute DataSourceName (DSN)

Purpose Specifies the name of a data source in your Windows Registry or odbc.ini file.

Valid Values string where:

string

is the name of a data source.

Default None

GUI Tab General tab


Decimal Precision

Attribute DecimalPrecision (DP)

Purpose Specifies the precision reported for Decimal columns in column and result-set metadata. This option allows you to set the precision for Decimal columns when using an application that does not support unbounded data types.

Valid Values x where:

x

is an integer greater than 0 (zero).

Notes
· The value specified for this option must be greater than or equal to the setting of the Decimal Scale connection option; otherwise, an error is returned when attempting to establish a connection.
· In most scenarios, an error is returned if the size of a Decimal value specified in a statement exceeds the precision determined by this option. However, when executing a Select statement, the driver will return data containing values that are larger than the specified precision.

Default 38

GUI Tab Schema Map tab

See also · Decimal Scale on page 110

Decimal Scale

Attribute DecimalScale (DS)


Purpose Specifies the maximum scale reported for Decimal columns in column and result-set metadata. This option allows you to set the scale for Decimal columns when using an application that does not support unbounded data types.

Valid Values x where:

x

is an integer greater than 0 (zero).

Default 10

Notes
· The value specified for this option cannot exceed the setting of the Decimal Precision connection option; otherwise, an error is returned when attempting to establish a connection.
· In most scenarios, an error is returned if the scale of a Decimal value specified in a statement exceeds the scale determined by this option. However, when executing a Select statement, the driver will return data containing values that have a scale larger than that specified by this option.

GUI Tab Schema Map tab

See Also · Decimal Precision on page 110

Description

Attribute Description (n/a)

Purpose Specifies an optional long description of a data source. This description is not used as a runtime connection attribute, but does appear in the ODBC.INI section of the Registry and in the odbc.ini file.

Valid Values string where:

string

is a description of a data source.


Default None

GUI Tab General tab

Fetch Size

Attribute FetchSize (FS)

Purpose Specifies the number of rows that the driver processes before returning data to the application when executing a Select. This value provides a suggestion to the driver as to the number of rows it should internally process before returning control to the application. The driver may fetch fewer rows to conserve memory when processing exceptionally wide rows.

Valid Values 0 | x where:

x

is a positive integer indicating the number of rows that should be processed.

Behavior If set to 0, the driver processes all the rows of the result before returning control to the application. When large data sets are being processed, setting FetchSize to 0 can diminish performance and increase the likelihood of out-of-memory errors. If set to x, the driver limits the number of rows that may be processed for each fetch request before returning control to the application.

Notes
· To optimize throughput and conserve memory, the driver uses an internal algorithm to determine how many rows should be processed based on the width of rows in the result set. Therefore, the driver may process fewer rows than specified by FetchSize when the result set contains exceptionally wide rows. Alternatively, the driver processes the number of rows specified by FetchSize when the result set contains rows of unexceptional width.
· FetchSize can be used to adjust the trade-off between throughput and response time. Smaller fetch sizes can improve the initial response time of the query. Larger fetch sizes can improve overall response times at the cost of additional memory.
· You can use FetchSize to reduce demands on memory and decrease the likelihood of out-of-memory errors. Simply decrease FetchSize to reduce the number of rows the driver is required to process before returning data to the application.


· The ResultMemorySize connection property can also be used to reduce demands on memory and decrease the likelihood of out-of-memory errors.
· FetchSize is related to, but different from, NativeFetchSize. NativeFetchSize specifies the number of rows of raw data that the driver fetches from the native data source, while FetchSize specifies how many of these rows the driver processes before returning control to the application. Processing the data includes converting native data types to SQL data types used by the application. If FetchSize is greater than NativeFetchSize, the driver may make multiple round trips to the data source to get the requested number of rows before returning control to the application.
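The relationship between FetchSize and NativeFetchSize can be made concrete with a rough round-trip estimate. This is an illustrative calculation under simplifying assumptions, and the helper name is hypothetical:

```python
import math

def round_trips(fetch_size, native_fetch_size):
    """Rough estimate of server round trips needed before the driver can
    return fetch_size processed rows to the application."""
    if native_fetch_size == 0:           # 0 means the server returns all rows
        return 1
    return math.ceil(fetch_size / native_fetch_size)

# With the defaults (FetchSize=100, NativeFetchSize=10000), a single
# native fetch covers many application fetches; a much larger FetchSize
# forces several round trips.
trips_default = round_trips(100, 10000)
trips_large = round_trips(50000, 10000)
```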

Default 100 (rows)

GUI Tab Advanced tab

Host Name

Attribute HostName (HOST)

Purpose The host name or the IP address of the server to which you want to connect.

Valid Values host_name | IP_address where:

host_name

is the name of the server to which you want to connect.

IP_address

is the IP address of the server to which you want to connect.

The IP address can be specified in either IPv4 or IPv6 format, or a combination of the two. See "Using IP Addresses" for details about these formats.

Default None

GUI Tab General tab

See Also Using IP Addresses on page 48


IANAAppCodePage

Attribute IANAAppCodePage (IACP)

Purpose An Internet Assigned Numbers Authority (IANA) value. You must specify a value for this option if your application is not Unicode enabled or if your database character set is not Unicode. See "Internationalization, Localization, and Unicode" for details. The driver uses the specified IANA code page to convert "W" (wide) functions to ANSI. The driver and Driver Manager both check for the value of IANAAppCodePage in the following order:
· In the connection string
· In the Data Source section of the system information file (odbc.ini)
· In the ODBC section of the system information file (odbc.ini)
If the driver does not find an IANAAppCodePage value, the driver uses the default value of 4 (ISO 8859-1 Latin-1).
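The lookup order above amounts to a simple first-match resolution, sketched below. The helper is hypothetical; the real resolution happens inside the driver and Driver Manager:

```python
def resolve_iana_codepage(conn_str=None, dsn_section=None, odbc_section=None):
    """First-match resolution of IANAAppCodePage: connection string,
    then the data source section of odbc.ini, then the [ODBC] section,
    falling back to the default of 4 (ISO 8859-1 Latin-1)."""
    for value in (conn_str, dsn_section, odbc_section):
        if value is not None:
            return value
    return 4

cp = resolve_iana_codepage(dsn_section=106)   # 106 is the IANA value for UTF-8
default_cp = resolve_iana_codepage()
```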

Valid Values IANA_code_page where:

IANA_code_page

is one of the valid values listed in "Values for the Attribute IANAAppCodePage" in "IANAAppCodePage Values". The value must match the database character encoding and the system locale.

Default 4 (ISO 8859-1 Latin-1)

GUI Tab NA

See Also · Internationalization, Localization, and Unicode on page 179 · IANAAppCodePage Values on page 163


Initialization String

Attribute InitializationString (IS)

Purpose One or multiple SQL commands to be executed by the driver after it has established the connection to the database and has performed all initialization for the connection. If the execution of a SQL command fails, the connection attempt also fails and the driver returns an error indicating which SQL command or commands failed.

Valid Values string where:

string

is one or multiple SQL commands.

Multiple commands must be separated by semicolons. In addition, if this option is specified in a connection URL, the entire value must be enclosed in parentheses when multiple commands are specified.

Example Because fetching metadata and generating mapping files can significantly increase the time it takes to connect to Apache Cassandra, the driver caches this information on the client the first time the driver connects on behalf of each user. The cached metadata is used in subsequent connections made by the user instead of re-fetching the metadata from Apache Cassandra. To force the driver to re-fetch the metadata information for a connection, use the InitializationString property to pass the REFRESH MAP CASSANDRA command in the connection URL. For example:

DSN=Cassandra;UID={CassandraID};PWD=secret;InitializationString=(REFRESH MAP CASSANDRA)
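A helper that assembles such a connection string, wrapping the InitializationString value in parentheses as in the example above, might look like this. The function name is hypothetical and the sketch is illustrative only:

```python
def build_connection_string(dsn, uid, pwd, init_commands):
    """Assemble a connection string; InitializationString commands are
    semicolon-separated and the whole value is wrapped in parentheses,
    as shown in the documentation example."""
    init = "(" + ";".join(init_commands) + ")"
    return f"DSN={dsn};UID={uid};PWD={pwd};InitializationString={init}"

conn_str = build_connection_string(
    "Cassandra", "CassandraID", "secret", ["REFRESH MAP CASSANDRA"]
)
```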

Default None

GUI Tab Advanced tab

JVM Arguments

Attribute JVMArgs (JVMA)


Purpose A string that contains the arguments that are passed to the JVM that the driver is starting. The location of the JVM must be specified on the driver library path. For information on setting the location of the JVM in your environment, see:
· Setting the Library Path Environment Variable (Windows) on page 18
· Setting the Library Path Environment Variable (UNIX and Linux) on page 21
When specifying the heap size for the JVM, note that the JVM tries to allocate the heap memory as a single contiguous range of addresses in the application's memory address space. If the application's address space is fragmented so that there is no contiguous range of addresses big enough for the amount of memory specified for the JVM, the driver fails to load, because the JVM cannot allocate its heap. This situation is typically encountered only with 32-bit applications, which have a much smaller application address space. If you encounter problems with loading the driver in an application, try reducing the amount of memory requested for the JVM heap. If possible, switch to a 64-bit version of the application.

Valid Values string where:

string

contains arguments that are defined by the JVM. Values that include special characters or spaces must be enclosed in curly braces { } when used in a connection string.

Example To set the heap size used by the JVM to 256 MB and the http proxy information, specify:

{-Xmx256m -Dhttp.proxyHost=johndoe -Dhttp.proxyPort=808}

To set the heap size to 256 MB and configure the JVM for remote debugging, specify:

{-Xmx256m -Xrunjdwp:transport=dt_socket,address=9003,server=y,suspend=n -Xdebug}

Default -Xmx256m

GUI Tab SQL Engine tab

JVM Classpath

Attribute JVMClasspath (JVMC)

Purpose Specifies the CLASSPATH for the Java Virtual Machine (JVM) used by the driver. The CLASSPATH is the search string the JVM uses to locate the Java jar files the driver needs.


If you do not specify a value, the driver automatically detects the CLASSPATHs for all ODBC drivers installed on your machine and specifies them when launching the JVM.

Valid Values string where:

string

specifies the CLASSPATH. Separate multiple jar files by a semi-colon on Windows platforms and by a colon on Linux and UNIX platforms. CLASSPATH values with multiple jar files must be enclosed in curly braces { } when used in a connection string.

If your process employs multiple drivers that use a JVM, the value of the JVM Classpath for all affected drivers must include an absolute path to all the jar files for drivers used in that process. The value specified for this option must be identical for all drivers. For example, if you are using the Salesforce and Cassandra drivers on Windows, you would specify a value of {c:\install_dir\java\lib\cassandra.jar; c:\install_dir\java\lib\sforce.jar} for both drivers. If the value for any of the affected drivers differs from that of the other drivers, the drivers will return an error at connection that the JVM is already running.

Example On Windows:

{.;c:\install_dir\java\lib\cassandra.jar}

On UNIX:

{.:/home/user1/install_dir/java/lib/cassandra.jar}
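Building the brace-wrapped value with the correct platform separator can be sketched as follows. The helper is hypothetical; the jar paths are the examples from above:

```python
def jvm_classpath(jar_paths, windows):
    """Join jar paths with the platform separator (';' on Windows,
    ':' on UNIX/Linux) and wrap the result in curly braces for use
    in a connection string."""
    sep = ";" if windows else ":"
    return "{" + sep.join(jar_paths) + "}"

win = jvm_classpath([".", r"c:\install_dir\java\lib\cassandra.jar"], windows=True)
nix = jvm_classpath([".", "/home/user1/install_dir/java/lib/cassandra.jar"], windows=False)
```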

Default The default is an empty string, which means that the driver automatically detects the CLASSPATHs for all ODBC drivers installed on your machine and specifies them when launching the JVM.

GUI Tab SQL Engine tab

Keyspace Name

Attribute KeyspaceName (KSN)

Purpose Specifies the name of the keyspace to which you want to connect. This value is also used as the default qualifier for unqualified table names in queries.


Valid Values keyspace_name where:

keyspace_name

is the name of a valid keyspace. If the driver cannot find the specified keyspace, the connection fails; however, if you do not specify a value, the default is the keyspace defined by the system administrator for each user.

Notes · If KeyspaceName is not specified, Cassandra's internal keyspace named system is used.

Default system

GUI Tab General tab

Log Config File

Attribute LogConfigFile (LCF)

Purpose Specifies the filename of the configuration file used to initialize the driver logging mechanism. If the driver cannot locate the specified file when establishing the connection, the connection fails and the driver returns an error.

Valid Values string where:

string

is the relative or fully qualified path of the configuration file used to initialize the driver logging mechanism. If the specified file does not exist, the driver continues searching for an appropriate configuration file as described in "Using the Driver".

Default ddlogging.properties

GUI Tab Advanced tab


See Also Using the Driver on page 92

Login Timeout

Attribute LoginTimeout (LT)

Purpose The number of seconds the driver waits for a connection to be established before returning control to the application and generating a timeout error. To override the value that is set by this connection option for an individual connection, set a different value in the SQL_ATTR_LOGIN_TIMEOUT connection attribute using the SQLSetConnectAttr() function.

Valid Values -1 | 0 | x where:

x

is a positive integer that represents a number of seconds.

Behavior If set to -1, the connection request does not time out. The driver silently ignores the SQL_ATTR_LOGIN_TIMEOUT attribute. If set to 0, the connection request does not time out, but the driver responds to the SQL_ATTR_LOGIN_TIMEOUT attribute. If set to x, the connection request times out after the specified number of seconds unless the application overrides this setting with the SQL_ATTR_LOGIN_TIMEOUT attribute.

Default 15

GUI Tab Advanced tab

Native Fetch Size

Attribute NativeFetchSize (NFS)


Purpose Specifies the number of rows of data the driver attempts to fetch from the native data source on each request submitted to the server.

Valid Values 0 | x where:

x

is a positive integer that defines the number of rows.

Behavior If set to 0, the driver requests that the server return all rows for each request submitted to the server. Block fetching is not used. If set to x, the driver attempts to fetch up to a maximum of the specified number of rows on each request submitted to the server.

Notes Native Fetch Size is related to, but different from, Fetch Size. Native Fetch Size specifies the number of rows of raw data that the driver fetches from the native data source, while Fetch Size specifies how many of these rows the driver processes before returning control to the application. Processing the data includes converting native data types to SQL data types used by the application. If Fetch Size is greater than Native Fetch Size, the driver may make multiple round trips to the data source to get the requested number of rows before returning control to the application.

Default 10000 (rows)

GUI Tab Advanced tab

Password

Attribute Password (PWD)

Purpose Specifies the password to use to connect to your database. Contact your system administrator to obtain your password.

Important: Setting the password using a data source is not recommended. The data source persists all options, including the Password option, in clear text.


Valid Values password where:

password

is a valid password. The password is case-sensitive.

Default None

GUI Tab Logon Dialog

Port Number

Attribute PortNumber (PORT)

Purpose Specifies the port number of the server listener.

Valid Values port_number where:

port_number

is the port number of the server listener. Check with your database administrator for the correct number.

Default 9042

GUI Tab General tab

Query Timeout

Attribute QueryTimeout (QT)


Description The number of seconds for the default query timeout for all statements that are created by a connection. To override the value set by this connection option for an individual statement, set a different value in the SQL_ATTR_QUERY_TIMEOUT statement attribute on the SQLSetStmtAttr() function.

Valid Values -1 | 0 | x where:

x

is a number of seconds.

Behavior If set to -1, the query does not time out. The driver silently ignores the SQL_ATTR_QUERY_TIMEOUT attribute. If set to 0, the query does not time out, but the driver responds to the SQL_ATTR_QUERY_TIMEOUT attribute. If set to x, all queries time out after the specified number of seconds unless the application overrides this value by setting the SQL_ATTR_QUERY_TIMEOUT attribute.

Default 0

GUI Tab Advanced tab

Read Consistency

Attribute ReadConsistency (RC)

Purpose Specifies how many replicas must respond to a read request before returning data to the client application.

Valid Values 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10

Behavior
If set to 1 (one), data is returned from the closest replica. This setting provides the highest availability, but increases the likelihood of stale data being read.
If set to 2 (two), data is returned from two of the closest replicas.
If set to 3 (three), data is returned from three of the closest replicas.
If set to 4 (quorum), data is returned after a quorum of replicas has responded from any data center.


If set to 5 (all), data is returned to the application after all replicas have responded. This setting provides the highest consistency and lowest availability.
If set to 6 (local_quorum), data is returned after a quorum of replicas in the same data center as the coordinator node has responded. This setting avoids the latency of inter-data center communication.
If set to 7 (each_quorum) (Cassandra 2.x only), data is returned after a quorum of replicas in each data center of the cluster has responded. This setting is not supported for reads.
If set to 8 (serial), the data is read without proposing a new addition or update. Uncommitted transactions are committed as part of the read.
If set to 9 (local_serial), the data within a data center is read without proposing a new addition or update. Uncommitted transactions within the data center are committed as part of the read.
If set to 10 (local_one), data is returned from the closest replica in the local data center.
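The option values map directly onto Cassandra consistency-level names, which a lookup table makes explicit. This is an illustrative sketch that follows the Behavior list above; the helper name is hypothetical:

```python
# Option value -> Cassandra consistency level, per the Behavior list.
READ_CONSISTENCY = {
    1: "one", 2: "two", 3: "three", 4: "quorum", 5: "all",
    6: "local_quorum", 7: "each_quorum", 8: "serial",
    9: "local_serial", 10: "local_one",
}

def consistency_name(setting):
    """Translate a ReadConsistency value; invalid values raise, which
    parallels the driver's consistency level error."""
    if setting not in READ_CONSISTENCY:
        raise ValueError(f"invalid ReadConsistency value: {setting}")
    return READ_CONSISTENCY[setting]

default_level = consistency_name(4)   # the driver default
```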

Notes · If the server does not support the Read Consistency value specified, the connection attempt fails and the driver returns a consistency level error. · Refer to Apache Cassandra documentation for more information about configuring consistency levels, including usage scenarios.

Default 4 (quorum)

GUI Tab Advanced tab

Read Only

Attribute ReadOnly (RO)

Purpose Specifies whether the connection has read-only access to the data source.

Valid Values 0 | 1

Behavior If set to 1 (Enabled), the connection has read-only access. The following commands are the only commands that you can use when a connection is read-only: The driver returns an error if any other command is executed. If set to 0 (Disabled), the connection is opened for read/write access, and you can use all commands supported by the product.


Default 1 (Enabled)

GUI Tab Advanced tab

Report Codepage Conversion Errors

Attribute ReportCodepageConversionErrors (RCCE)

Purpose Specifies how the driver handles code page conversion errors that occur when a character cannot be converted from one character set to another. An error message or warning can occur if an ODBC call causes a conversion error, or if an error occurs during code page conversions to and from the database or to and from the application. The error or warning generated is Code page conversion error encountered. In the case of parameter data conversion errors, the driver adds the following sentence: Error in parameter x, where x is the parameter number. The standard rules for returning specific row and column errors for bulk operations apply.

Valid Values 0 | 1 | 2

Behavior
If set to 0 (Ignore Errors), the driver substitutes 0x1A for each character that cannot be converted and does not return a warning or error.
If set to 1 (Return Error), the driver returns an error instead of substituting 0x1A for unconverted characters.
If set to 2 (Return Warning), the driver substitutes 0x1A for each character that cannot be converted and returns a warning.
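The 0x1A substitution performed under Ignore Errors can be approximated in Python with a custom codec error handler. This is a rough analogy, not the driver's actual conversion code:

```python
import codecs

def _substitute_sub(err):
    # Replace every unconvertible character with 0x1A (SUB),
    # as the driver does when Ignore Errors is in effect.
    n = err.end - err.start
    return ("\x1a" * n, err.end)

codecs.register_error("sub_char", _substitute_sub)

# "é" has no ASCII representation, so it is replaced by 0x1A.
converted = "café".encode("ascii", errors="sub_char")
```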

Default 0 (Ignore Errors)

GUI Tab Advanced tab

Result Memory Size

Attribute ResultMemorySize (RMS)


Purpose Specifies the maximum size, in megabytes, of an intermediate result set that the driver holds in memory. When this threshold is reached, the driver writes a portion of the result set to disk in temporary files.

Valid Values -1 | 0 | x where:

x

is a positive integer.

Behavior If set to -1, the maximum size of an intermediate result set that the driver holds in memory is determined by a percentage of the max Java heap size. When this threshold is reached, the driver writes a portion of the result set to disk. If set to 0, the SQL Engine holds intermediate results in memory regardless of size. Setting Result Memory Size to 0 can increase performance for any result set that can easily fit within the JVM's free heap space, but can decrease performance for any result set that can barely fit within the JVM's free heap space. If set to x, the SQL Engine holds intermediate results in memory that are no larger than the size specified. When this threshold is reached, the driver writes a portion of the result set to disk.

Notes
· By default, Result Memory Size is set to -1. When set to -1, the maximum size of an intermediate result that the driver holds in memory is a percentage of the max Java heap size. When processing large sets of data, out-of-memory errors can occur when the size of the result set exceeds the available memory allocated to the JVM. In this scenario, you can tune Result Memory Size to suit your environment. To begin, set Result Memory Size equal to the max Java heap size divided by 4. Proceed by decreasing this value until out-of-memory errors are eliminated. As a result, the maximum size of an intermediate result set the driver holds in memory will be reduced, and some portion of the result set will be written to disk. Be aware that while writing to disk reduces the risk of out-of-memory errors, it also negatively impacts performance. For optimal performance, decrease this value only to a size necessary to avoid errors.
· You can also address memory and performance concerns by adjusting the max Java heap size using the JVM Arguments connection option. By increasing the max Java heap size, you increase the amount of data the driver accumulates in memory. This can reduce the likelihood of out-of-memory errors and improve performance by ensuring that result sets fit easily within the JVM's free heap space. In addition, when a limit is imposed by setting Result Memory Size to -1, increasing the max Java heap size can improve performance by reducing the need to write to disk, or removing it altogether.
· The Fetch Size connection option can also be used to reduce demands on the driver's internal memory and decrease the likelihood of out-of-memory errors.
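The tuning starting point described in the first note (max Java heap size divided by 4) can be computed from the JVMArgs value. The helper below is hypothetical and for illustration only:

```python
import re

def suggested_result_memory_size(jvm_args):
    """Parse the -Xmx heap setting out of a JVMArgs string and return
    max heap / 4 in MB, the documented tuning starting point."""
    match = re.search(r"-Xmx(\d+)([mMgG])", jvm_args)
    if match is None:
        raise ValueError("no -Xmx setting found in JVMArgs")
    size_mb = int(match.group(1)) * (1024 if match.group(2) in "gG" else 1)
    return size_mb // 4

rms = suggested_result_memory_size("-Xmx1024m")
```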

Default -1

GUI Tab Advanced tab


See also · Performance Considerations on page 76 · Fetch Size on page 112 · JVM Arguments on page 115

Schema Map

Attribute SchemaMap (SMP)

Purpose Specifies the name and location of the configuration file used to create the relational map of native data. The driver looks for this file when connecting to the server. If the file does not exist, the driver creates one.

Valid Values string where:

string

is the absolute path and filename of the configuration file, including the .config extension. For example, if SchemaMap is set to a value of C:\Users\Default\AppData\Local\Progress\DataDirect\Cassandra_Schema\MyServer.config, the driver either creates or looks for the configuration file MyServer.config in the directory C:\Users\Default\AppData\Local\Progress\DataDirect\Cassandra_Schema\.

Notes
· When connecting to a server, the driver looks for the SchemaMap configuration file. If the configuration file does not exist, the driver creates a SchemaMap configuration file using the name and location you have provided. If you do not provide a name and location for a SchemaMap configuration file, the driver creates it using default values.
· On UNIX/Linux, if no value is specified for the SchemaMap or LogonID options, the driver creates a file named USER.config at connection.
· The driver uses the path specified in this connection option to store additional internal files.
· You can refresh the internal files related to an existing relational view of your data by using the SQL extension Refresh Map. Refresh Map runs a discovery against your native data and updates your internal files accordingly.
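For illustration, the option can also be set in a connection string through its short attribute name, SMP. The sketch below is a hypothetical DSN-based connection string; the DSN name and user ID are invented, while SMP and UID (LogonID) are the attributes documented in this chapter:

```
DSN=CassandraDSN;UID=jsmith;SMP=C:\Users\jsmith\AppData\Local\Progress\DataDirect\Cassandra_Schema\MyServer.config
```

At connection time the driver either loads MyServer.config from the specified directory or, if the file does not yet exist, creates it there.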

Default
· For Windows XP and Windows Server 2003: user_profile\Application Data\Local\Progress\DataDirect\Cassandra_Schema\host_name.config

· For other Windows platforms

· User data source: user_profile\AppData\Local\Progress\DataDirect\Cassandra_Schema\host_name.config · System data source: C:\Users\Default\AppData\Local\Progress\DataDirect\Cassandra_Schema\host_name.config

· For UNIX/Linux: users_home_directory/Progress/DataDirect/Cassandra_Schema/host_name.config

user_profile

is the user name of your OS profile.

host_name

is the value specified for the Host Name connection option.

users_home_directory

is the user's home directory.

GUI Tab Schema Map tab

Server Port Number

Attribute ServerPortNumber (SPN)

Purpose Specifies a valid port on which the SQL engine listens for requests from the driver.

Valid Values port_name where:

port_name

is the port number of the server listener. Check with your system administrator for the correct number.

Notes · This option is ignored unless SQL Engine Mode is set to 0 (Auto) or 1 (Server).

Default For the 32-bit driver: 19934

For the 64-bit driver: 19933

GUI Tab SQL Engine tab

SQL Engine Mode

Attribute SQLEngineMode (SEM)

Purpose Specifies whether the driver's SQL engine runs in the same process as the driver (direct mode) or runs in a process that is separate from the driver (server mode). You must be an administrator to modify the server mode configuration values, and to start or stop the SQL engine service.

Valid Values 0 | 1 | 2

Behavior
If set to 0 (Auto), the SQL engine attempts to run in server mode first; however, if server mode is unavailable, it runs in direct mode. To use server mode with this value, you must start the SQL Engine service before using the driver (see "Starting the SQL Engine Server" for more information).
If set to 1 (Server), the SQL engine runs in server mode. The SQL engine operates in a separate process from the driver within its own JVM. You must start the SQL Engine service before using the driver (see "Starting the SQL Engine Server" for more information).
If set to 2 (Direct), the SQL engine runs in direct mode. The driver and its SQL engine run in a single process within the same JVM.

Notes · Important: Changes you make to the server mode configuration affect all DSNs sharing the service.

Default For Windows: 0 (Auto) For UNIX/Linux: 2 (Direct)

GUI Tab SQL Engine tab

See Also · Starting the SQL Engine Server on Windows on page 79
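For illustration, the mode can be forced in a connection string through the SEM attribute; the DSN name below is hypothetical, while SEM and SPN (Server Port Number) are the attributes documented in this chapter:

```
DSN=CassandraDSN;SEM=2
DSN=CassandraDSN;SEM=1;SPN=19933
```

The first string requests direct mode, so the driver and SQL engine share one process. The second requests server mode against a SQL engine service listening on port 19933 (the 64-bit default); remember that the service must be started before connecting with SEM=1.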

Transaction Mode

Attribute TransactionMode (TM)

Purpose Specifies how the driver handles manual transactions.

Valid Values 0 | 1

Behavior
If set to 1 (Ignore), the data source does not support transactions and the driver always operates in auto-commit mode. Calls to set the driver to manual commit mode and to commit transactions are ignored. Calls to roll back a transaction cause the driver to return an error indicating that no transaction is started. Metadata indicates that the driver supports transactions and the ReadUncommitted transaction isolation level.
If set to 0 (No Transactions), the data source and the driver do not support transactions. Metadata indicates that the driver does not support transactions.

Default 0 (No Transactions)

GUI Tab Advanced tab

User Name

Attribute LogonID (UID)

Purpose The default user ID that is used to connect to your database. Your ODBC application may override this value, or you may override it in the logon dialog box or connection string.

Valid Values userid where:

userid

is a valid user ID with permissions to access the database.

GUI Tab Security tab

Varchar Size

Attribute VarcharSize (VCS)

Purpose Specifies the precision reported for Varchar columns in column and result-set metadata. This option allows you to set the precision for Varchar columns when using an application that does not support unbounded data types.

Valid Values x where:

x

is an integer greater than 0 (zero).

Default 4000

Notes · In most scenarios, an error is returned if the size of a Varchar value specified in a statement exceeds the precision determined by this option. However, when executing a Select statement, the driver will return data containing values that are larger than the specified precision.

GUI Tab Schema Map tab

Varint Precision

Attribute VarintPrecision (VP)

Purpose Specifies the precision reported for Varint columns in column and result-set metadata. This option allows you to set the precision for Varint columns when using an application that does not support unbounded data types.

Valid Values x where:

x

is an integer greater than 0 (zero).

Default 38

Notes · In most scenarios, an error is returned if the size of a Varint value specified in a statement exceeds the precision determined by this option. However, when executing a Select statement, the driver returns data containing values that are larger than the specified precision.

GUI Tab Schema Map tab

Write Consistency

Attribute WriteConsistency (WC)

Purpose Determines the number of replicas on which the write must succeed before returning an acknowledgment to the client application.

Valid Values 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10

Behavior
If set to 0 (any), a write must succeed on at least one node. Even if all replica nodes for the given partition key are down, the write can succeed after a hinted handoff has been written. This setting provides the lowest consistency and highest availability.
If set to 1 (one), a write must succeed on at least one replica node.
If set to 2 (two), a write must succeed on at least two replica nodes.
If set to 3 (three), a write must succeed on at least three replica nodes.
If set to 4 (quorum), a write must succeed on a quorum of replica nodes.
If set to 5 (all), a write must succeed on all replica nodes in the cluster for that partition key. This setting provides the highest consistency and lowest availability.
If set to 6 (local_quorum), a write must succeed on a quorum of replica nodes in the same data center as the coordinator node. This setting avoids the latency of inter-data center communication.

If set to 7 (each_quorum), a write must succeed on a quorum of replica nodes in each data center. (Cassandra 2.x only)
If set to 8 (serial), the driver prevents unconditional updates to achieve linearizable consistency for lightweight transactions. (Cassandra 2.x only)
If set to 9 (local_serial), the driver prevents unconditional updates to achieve linearizable consistency for lightweight transactions within the data center.
If set to 10 (local_one), a write must succeed on at least one replica node in the local data center.

Notes · An update operation can result in a consistency level error if the server does not support the WriteConsistency value specified. · Refer to Apache Cassandra documentation for more information about configuring consistency levels, including usage scenarios.

Default 4 (quorum)

GUI Tab Advanced tab
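For illustration, the level can be set in a connection string through the WC attribute; the DSN name below is hypothetical, and 6 is the local_quorum value described above:

```
DSN=CassandraDSN;WC=6
```

As noted above, an update can fail with a consistency-level error if the server's replication configuration cannot satisfy the requested level.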

Supported SQL Functionality

The driver provides support for SQL statements and extensions described in this section. SQL extensions are denoted by an (EXT) in the topic title.

For details, see the following topics:

· Data Definition Language (DDL) · Delete · Insert · Refresh Map (EXT) · Select · Update · SQL Expressions · Subqueries

Data Definition Language (DDL)

The driver supports data store-specific DDL through the Native and Refresh escape clauses. See "Native and Refresh Escape Clauses" for details.

Native and Refresh Escape Clauses

The driver supports ODBC-native-escape and ODBC-refresh-schema-escape clauses to embed data store-specific commands in SQL-92 statements. The ODBC-native-escape clause allows you to execute native commands directly through the client application, while the ODBC-refresh-schema-escape clause is used to incorporate any changes introduced by the ODBC-native-escape clause into the driver's relational map of the data.

Note: The native and refresh escape clauses are mainly intended for the execution of DDL commands, such as ALTER, CREATE, and DROP. A returning clause for the native escape is not currently supported by the driver. Therefore, results cannot be retrieved using the native escape clause.

The ODBC-native-escape clause is defined as follows:
ODBC-native-escape ::= ODBC-esc-initiator native (command_text) ODBC-esc-terminator
where:

command_text

is a data store-specific command.

The ODBC-refresh-schema-escape clause is defined as follows:
ODBC-refresh-schema-escape ::= ODBC-esc-initiator refresh ODBC-esc-terminator
The following example shows the execution of two data store-specific commands with a refresh of the driver's relational map of the data. Note that each Native escape clause must have its own execute. The Refresh escape, however, can be used in the same execute statement as the Native escape.

SQLExecDirect( pstmt, "{native (CREATE TABLE emp (empid int primary key, title varchar))}", SQL_NTS);
SQLExecDirect( pstmt, "{native (CREATE TABLE dept (deptid int primary key, city varchar))}{refresh}", SQL_NTS);

See also Supported SQL Functionality on page 133

Delete

Purpose The Delete statement is used to delete rows from a table.

Syntax DELETE FROM table_name [WHERE search_condition] where:

table_name

specifies the name of the table from which you want to delete rows.

search_condition

is an expression that identifies which rows to delete from the table.

Notes · The Where clause determines which rows are to be deleted. Without a Where clause, all rows of the table are deleted, but the table is left intact. See "Where Clause" for information about the syntax of Where clauses. Where clauses can contain subqueries.

Example A This example shows a Delete statement on the emp table.
DELETE FROM emp WHERE emp_id = 'E10001'
Each Delete statement removes every record that meets the conditions in the Where clause. In this case, every record having the employee ID E10001 is deleted. Because employee IDs are unique in the employee table, at most one record is deleted.

Example B This example shows using a subquery in a Delete clause.
DELETE FROM emp WHERE dept_id = (SELECT dept_id FROM dept WHERE dept_name = 'Marketing')
The records of all employees who belong to the department named Marketing are deleted.

Notes · Delete is supported for primitive types and non-nested complex types. See "Complex Type Normalization" for details. · To enable Insert, Update, and Delete, set the ReadOnly connection property to false. · A Where clause can be used to restrict which rows are deleted.

See also · Where Clause on page 143 · Complex Type Normalization on page 38

Insert

Purpose The Insert statement is used to add new rows to a local table. You can specify either of the following options:
· List of values to be inserted as a new row
· Select statement that copies data from another table to be inserted as a set of new rows

In Cassandra, Inserts are in effect Upserts. When an Insert is performed on a row that already exists, the row will be updated.

Syntax INSERT INTO table_name [(column_name[,column_name]...)] {VALUES (expression [,expression]...) | select_statement}

table_name

is the name of the table in which you want to insert rows.

column_name

is optional and specifies an existing column. Multiple column names (a column list) must be separated by commas. A column list provides the name and order of the columns, the values of which are specified in the Values clause. If you omit a column_name or a column list, the value expressions must provide values for all columns defined in the table and must be in the same order that the columns are defined for the table. Table columns that do not appear in the column list are populated with the default value, or with NULL if no default value is specified.

expression

is the list of expressions that provides the values for the columns of the new record. Typically, the expressions are constant values for the columns. Character string values must be enclosed in single quotation marks ('). See "Literals" for more information.

select_statement

is a query that returns values for each column_name value specified in the column list. Using a Select statement instead of a list of value expressions lets you select a set of rows from one table and insert it into another table using a single Insert statement. The Select statement is evaluated before any values are inserted. This query cannot be made on the table into which values are inserted. See "Select" for information about Select statements.
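For example, assuming hypothetical emp and emp_archive tables (the column names are illustrative only), the first statement inserts a Values list and the second inserts the result of a query:

```sql
-- Insert one row, naming the columns explicitly
INSERT INTO emp (emp_id, last_name, dept_id) VALUES ('E10005', 'Smith', 'D101')

-- Insert a set of rows selected from another table; note that the query
-- cannot reference the target table itself
INSERT INTO emp_archive (emp_id, last_name, dept_id)
  SELECT emp_id, last_name, dept_id FROM emp
```

Because Inserts in Cassandra are in effect Upserts, re-running the first statement with the same primary key updates the existing row rather than failing.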

Notes · Insert is supported for primitive types and non-nested complex types. See "Complex Type Normalization" for details. · The driver supports an insert on a child table prior to an insert on a parent table, circumventing referential integrity constraints associated with traditional RDBMS. To maintain integrity between parent and child tables, it is recommended that an insert first be performed on the parent table for each foreign key value added to the child. If such an insert is not first performed, the driver automatically inserts a row into the parent table that contains only the primary key values and NULL values for all non-primary key columns. See "Complex Type Normalization" for details. · To enable Insert, Update, and Delete, set the ReadOnly connection property to false.

See also · Literals on page 151 · Select on page 137 · Complex Type Normalization on page 38

Refresh Map (EXT)

Purpose The REFRESH MAP statement adds newly discovered objects to your relational view of native data. It also incorporates any configuration changes made to your relational view by reloading the schema map configuration file.

Syntax REFRESH MAP

Notes REFRESH MAP is an expensive query since it involves the discovery of native data.
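For example, after a native command has created an object outside the driver's relational view, a Refresh Map statement brings the view up to date. The table name below is hypothetical; each statement is executed separately:

```sql
-- A native (CQL) command creates a table the relational map does not yet know about
{native (CREATE TABLE sensor_log (id int primary key, reading varchar))}

-- Rediscover the native data so sensor_log appears in the relational view
REFRESH MAP
```

Because of the discovery cost noted above, issue REFRESH MAP once after a batch of native DDL rather than after every statement.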

Select

Purpose Use the Select statement to fetch results from one or more tables.

Syntax

SELECT select_clause from_clause [where_clause] [groupby_clause] [having_clause] [{UNION [ALL | DISTINCT] | {MINUS [DISTINCT] | EXCEPT [DISTINCT]} | INTERSECT [DISTINCT]} select_statement] [orderby_clause] [limit_clause]

where:

select_clause

specifies the columns from which results are to be returned by the query. See "Select Clause" for a complete explanation.

from_clause

specifies one or more tables on which the other clauses in the query operate. See "From Clause" for a complete explanation.

where_clause

is optional and restricts the results that are returned by the query. See "Where Clause" for a complete explanation.

groupby_clause

is optional and allows query results to be aggregated in terms of groups. See "Group By Clause" for a complete explanation.

having_clause

is optional and specifies conditions for groups of rows (for example, display only the departments that have salaries totaling more than $200,000). See "Having Clause" for a complete explanation.

UNION

is an optional operator that combines the results of the left and right Select statements into a single result. See "Union Operator" for a complete explanation.

INTERSECT

is an optional operator that returns a single result by keeping any distinct values from the results of the left and right Select statements. See "Intersect Operator" for a complete explanation.

EXCEPT | MINUS

are synonymous optional operators that return a single result by taking the results of the left Select statement and removing the results of the right Select statement. See "Except and Minus Operators" for a complete explanation.

orderby_clause

is optional and sorts the results that are returned by the query. See "Order By Clause" for a complete explanation.

limit_clause

is optional and places an upper bound on the number of rows returned in the result. See "Limit Clause" for a complete explanation.

See also Select Clause on page 139 From Clause on page 141 Where Clause on page 143 Group By Clause on page 143 Having Clause on page 144 Union Operator on page 145 Intersect Operator on page 145 Except and Minus Operators on page 146 Order By Clause on page 147 Limit Clause on page 148

Select Clause

Purpose Use the Select clause to specify a list of column expressions that identify the columns whose values you want to retrieve, or an asterisk (*) to retrieve the values of all columns.

Syntax

SELECT [{LIMIT offset number | TOP number}] [ALL | DISTINCT] {* | column_expression [[AS] column_alias] [,column_expression [[AS] column_alias], ...]}

where:

LIMIT offset number

creates the result set for the Select statement first and then discards the first number of rows specified by offset and returns the number of remaining rows specified by number. To not discard any of the rows, specify 0 for offset, for example, LIMIT 0 number. To discard the first offset number of rows and return all the remaining rows, specify 0 for number, for example, LIMIT offset 0.

TOP number

is equivalent to LIMIT 0 number.

column_expression

can be simply a column name (for example, last_name). More complex expressions may include mathematical operations or string manipulation (for example, salary * 1.05). See "SQL Expressions" for details. column_expression can also include aggregate functions. See "Aggregate Functions" for details.

column_alias

can be used to give the column a descriptive name. For example, to assign the alias department to the column dep:

SELECT dep AS department FROM emp

DISTINCT

eliminates duplicate rows from the result of a query. This operator can precede the first column expression. For example:

SELECT DISTINCT dep FROM emp

Notes
· Separate multiple column expressions with commas (for example, SELECT last_name, first_name, hire_date).
· Column names can be prefixed with the table name or table alias. For example, SELECT emp.last_name or e.last_name, where e is the alias for the table emp.
· NULL values are not treated as distinct from each other. The default behavior, which can be made explicit with the keyword ALL, is that all result rows are returned.
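The LIMIT and TOP forms above can be sketched against the hypothetical emp table used elsewhere in this chapter. Per the syntax, the keyword appears immediately after SELECT:

```sql
-- Discard the first 10 rows of the result set, then return the next 5
SELECT LIMIT 10 5 last_name, first_name FROM emp

-- TOP 5 is equivalent to LIMIT 0 5: return the first 5 rows
SELECT TOP 5 last_name, first_name FROM emp
```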

See also SQL Expressions on page 150 Aggregate Functions on page 140

Aggregate Functions Aggregate functions can also be a part of a Select clause. Aggregate functions return a single value from a set of rows. An aggregate can be used with a column name (for example, AVG(salary)) or in combination with a more complex column expression (for example, AVG(salary * 1.07)). The column expression can be preceded by the DISTINCT operator. The DISTINCT operator eliminates duplicate values from an aggregate expression. The following table lists supported aggregate functions.

Table 16: Aggregate Functions

Aggregate Returns

AVG The average of the values in a numeric column expression. For example, AVG(salary) returns the average of all salary column values.

COUNT The number of values in any field expression. For example, COUNT(name) returns the number of name values. When using COUNT with a field name, COUNT returns the number of non-NULL column values. A special example is COUNT(*), which returns the number of rows in the set, including rows with NULL values.

MAX The maximum value in any column expression. For example, MAX(salary) returns the maximum salary column value.

MIN The minimum value in any column expression. For example, MIN(salary) returns the minimum salary column value.

SUM The total of the values in a numeric column expression. For example, SUM(salary) returns the sum of all salary column values.

Example A In the following example, only distinct last name values are counted. The default behavior, which can be made explicit with ALL, is that all values, including duplicates, are counted.
COUNT (DISTINCT last_name)

Example B The following example uses the COUNT, MAX, and AVG aggregate functions:

SELECT COUNT(amount) AS numOpportunities, MAX(amount) AS maxAmount, AVG(amount) AS avgAmount
FROM opportunity o INNER JOIN user u ON o.ownerId = u.id
WHERE o.isClosed = 'false' AND u.name = 'MyName'

From Clause

Purpose The From clause indicates the tables to be used in the Select statement.

Syntax FROM table_name [table_alias] [,...] where:

table_name

is the name of a table or a subquery. Multiple tables define an implicit inner join among those tables. Multiple table names must be separated by a comma. For example:

SELECT * FROM emp, dep

Subqueries can be used instead of table names. Subqueries must be enclosed in parentheses. See "Subquery in a From Clause" for an example.

table_alias

is a name used to refer to a table in the rest of the Select statement. When you specify an alias for a table, you can prefix all column names of that table with the table alias.

Example The following example specifies two table aliases, e for emp and d for dep:

SELECT e.name, d.deptName FROM emp e, dep d WHERE e.deptId = d.id

For example, given the table specification:

FROM emp E

you may refer to the last_name field as E.last_name. Table aliases must be used if the Select statement joins a table to itself. For example:

SELECT * FROM emp E, emp F WHERE E.mgr_id = F.emp_id

The equal sign (=) includes only matching rows in the results.

See also Subquery in a From Clause on page 142

Outer Join Escape Sequences

Purpose The SQL-92 left, right, and full outer join syntax is supported.

Syntax

{oj outer-join}

where outer-join is

table-reference {LEFT | RIGHT | FULL} OUTER JOIN {table-reference | outer-join} ON search-condition

where table-reference is a database table name, and search-condition is the join condition you want to use for the tables.

Example:
SELECT Customers.CustID, Customers.Name, Orders.OrderID, Orders.Status
FROM {oj Customers LEFT OUTER JOIN Orders ON Customers.CustID=Orders.CustID}
WHERE Orders.Status='OPEN'

The following outer join escape sequences are supported: · Left outer joins · Right outer joins · Full outer joins · Nested outer joins

Join in a From Clause

Purpose You can use a Join as a way to associate multiple tables within a Select statement. Joins may be either explicit or implicit. For example, the following is the example from the previous section restated as an explicit inner join:

SELECT * FROM emp INNER JOIN dep ON id=empId
SELECT e.name, d.deptName FROM emp e INNER JOIN dep d ON e.deptId = d.id

whereas the following is the same statement as an implicit inner join:

SELECT * FROM emp, dep WHERE emp.deptID=dep.id

Syntax

FROM table_name {RIGHT OUTER | INNER | LEFT OUTER | CROSS | FULL OUTER} JOIN table_name ON search-condition

Example In this example, two tables are joined using LEFT OUTER JOIN. T1, the first table named, includes nonmatching rows.

SELECT * FROM T1 LEFT OUTER JOIN T2 ON T1.key = T2.key

If you use a CROSS JOIN, no ON expression is allowed for the join.
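For instance, a cross join returns the Cartesian product of the two tables, so there is no ON clause (emp and dep as in the earlier examples):

```sql
-- Every row of emp is paired with every row of dep
SELECT * FROM emp CROSS JOIN dep
```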

Subquery in a From Clause Subqueries can be used in the From clause in place of table references (table_name).

Example SELECT * FROM (SELECT * FROM emp WHERE sal > 10000) new_emp, dept WHERE new_emp.deptno = dept.deptno

See also For more information about subqueries, see Subqueries on page 158.

Where Clause

Purpose Specifies the conditions that rows must meet to be retrieved.

Syntax WHERE expr1 rel_operator expr2 where:

expr1

is either a column name, literal, or expression.

expr2

is either a column name, literal, expression, or subquery. Subqueries must be enclosed in parentheses.

rel_operator

is the relational operator that links the two expressions.

Example The following Select statement retrieves the first and last names of employees that make at least $20,000. SELECT last_name, first_name FROM emp WHERE salary >= 20000

See also · SQL Expressions on page 150 · Subqueries on page 158 · Operation Timeouts on page 98

Group By Clause

Purpose Specifies the names of one or more columns by which the returned values are grouped. This clause is used to return a set of aggregate values.

Syntax GROUP BY column_expression [,...] where:

column_expression

is either a column name or a SQL expression. Multiple values must be separated by a comma. If column_expression is a column name, it must match one of the column names specified in the Select clause. Also, the Group By clause must include all non-aggregate columns specified in the Select list.

Example The following example totals the salaries in each department: SELECT dept_id, sum(salary) FROM emp GROUP BY dept_id This statement returns one row for each distinct department ID. Each row contains the department ID and the sum of the salaries of the employees in the department.

See also · SQL Expressions on page 150 · Subqueries on page 158

Having Clause

Purpose Specifies conditions for groups of rows (for example, display only the departments that have salaries totaling more than $200,000). This clause is valid only if you have already defined a Group By clause.

Syntax HAVING expr1 rel_operator expr2 where:

expr1 | expr2

can be column names, constant values, or expressions. These expressions do not have to match a column expression in the Select clause. See "SQL Expressions" for details regarding SQL expressions.

rel_operator

is the relational operator that links the two expressions.

Example The following example returns only the departments that have salaries totaling more than $200,000: SELECT dept_id, sum(salary) FROM emp GROUP BY dept_id HAVING sum(salary) > 200000

See also · SQL Expressions on page 150 · Subqueries on page 158

Union Operator

Purpose Combines the results of two Select statements into a single result. The single result is all the returned rows from both Select statements. By default, duplicate rows are not returned. To return duplicate rows, use the All keyword (UNION ALL).

Syntax select_statement UNION [ALL | DISTINCT] select_statement

Notes · When using the Union operator, the Select lists for each Select statement must have the same number of column expressions with the same data types and must be specified in the same order.

Example A The following example has the same number of column expressions, and each column expression, in order, has the same data type.

SELECT last_name, salary, hire_date FROM emp UNION SELECT name, pay, birth_date FROM person

Example B The following example is not valid because the data types of the column expressions are different (salary FROM emp has a different data type than last_name FROM raises). This example does have the same number of column expressions in each Select statement but the expressions are not in the same order by data type.

SELECT last_name, salary FROM emp UNION SELECT salary, last_name FROM raises

Intersect Operator

Purpose Intersect operator returns a single result set. The result set contains rows that are returned by both Select statements. Duplicates are returned unless the Distinct operator is added.

Syntax select_statement INTERSECT [DISTINCT] select_statement

where:

DISTINCT

eliminates duplicate rows from the results.

Notes · When using the Intersect operator, the Select lists for each Select statement must have the same number of column expressions with the same data types and must be specified in the same order.

Example A The following example has the same number of column expressions, and each column expression, in order, has the same data type.

SELECT last_name, salary, hire_date FROM emp INTERSECT [DISTINCT] SELECT name, pay, birth_date FROM person

Example B The following example is not valid because the data types of the column expressions are different (salary FROM emp has a different data type than last_name FROM raises). This example does have the same number of column expressions in each Select statement but the expressions are not in the same order by data type.

SELECT last_name, salary FROM emp INTERSECT SELECT salary, last_name FROM raises

Except and Minus Operators

Purpose Returns the rows from the left Select statement that are not included in the result of the right Select statement.

Syntax select_statement {EXCEPT [DISTINCT] | MINUS [DISTINCT]} select_statement

where:

DISTINCT

eliminates duplicate rows from the results.

Notes · When using one of these operators, the Select lists for each Select statement must have the same number of column expressions with the same data types and must be specified in the same order.


Example A The following example has the same number of column expressions, and each column expression, in order, has the same data type.

SELECT last_name, salary, hire_date FROM emp EXCEPT SELECT name, pay, birth_date FROM person

Example B The following example is not valid because the data types of the column expressions are different (salary FROM emp has a different data type than last_name FROM raises). This example does have the same number of column expressions in each Select statement but the expressions are not in the same order by data type.

SELECT last_name, salary FROM emp EXCEPT SELECT salary, last_name FROM raises
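The Except behavior can also be sketched with Python's sqlite3 module as a stand-in engine (the data below is invented): only left-hand rows absent from the right-hand result survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT, pay INTEGER);
    CREATE TABLE person (name TEXT, pay INTEGER);
    INSERT INTO emp VALUES ('Smith', 30000), ('Adams', 42000);
    INSERT INTO person VALUES ('Smith', 30000);
""")

# Rows from the left Select that do not appear in the right Select's result.
rows = conn.execute(
    "SELECT name, pay FROM emp EXCEPT SELECT name, pay FROM person"
).fetchall()
print(rows)
```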

Order By Clause

Purpose The Order By clause specifies how the rows are to be sorted.

Syntax ORDER BY sort_expression [DESC | ASC] [,...]

where:

sort_expression

is either the name of a column, a column alias, a SQL expression, or the position of the column or expression in the select list.

The default is to perform an ascending (ASC) sort.

Example To sort by last_name and then by first_name, you could use either of the following Select statements:

SELECT emp_id, last_name, first_name FROM emp ORDER BY last_name, first_name

or

SELECT emp_id, last_name, first_name FROM emp ORDER BY 2,3

In the second example, last_name is the second item in the Select list, so ORDER BY 2,3 sorts by last_name and then by first_name.
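The equivalence of name-based and position-based sorting can be verified with a short sketch (Python's sqlite3 module as a stand-in engine; the rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_id TEXT, last_name TEXT, first_name TEXT);
    INSERT INTO emp VALUES ('E2', 'Smith', 'Bob'),
                           ('E1', 'Adams', 'Ann'),
                           ('E3', 'Smith', 'Al');
""")

by_name = conn.execute(
    "SELECT emp_id, last_name, first_name FROM emp "
    "ORDER BY last_name, first_name"
).fetchall()

# Positions 2 and 3 in the select list name the same sort keys.
by_position = conn.execute(
    "SELECT emp_id, last_name, first_name FROM emp ORDER BY 2, 3"
).fetchall()

print(by_name == by_position)
```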

See also SQL Expressions on page 150


Limit Clause

Purpose Places an upper bound on the number of rows returned in the result.

Syntax LIMIT number_of_rows [OFFSET offset_number]

where:

number_of_rows

specifies a maximum number of rows in the result. A negative number indicates no upper bound.

OFFSET

specifies how many rows to skip at the beginning of the result set. offset_number is the number of rows to skip.

Notes · In a compound query, the Limit clause can appear only on the final Select statement. The limit is applied to the entire query, not to the individual Select statement to which it is attached.

Example The following example returns a maximum of 20 rows:

SELECT last_name, first_name FROM emp WHERE salary > 20000 ORDER BY dept_id LIMIT 20
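The interaction of LIMIT and OFFSET can be seen in a runnable sketch (Python's sqlite3 module as a stand-in engine; the rows are generated for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (last_name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [("name%d" % i, 21000 + i) for i in range(50)])

# Skip the first 5 qualifying rows, then return at most 20.
rows = conn.execute(
    "SELECT last_name, salary FROM emp WHERE salary > 20000 "
    "ORDER BY salary LIMIT 20 OFFSET 5"
).fetchall()
print(len(rows), rows[0])
```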

Update

Purpose An Update statement changes the value of columns in the selected rows of a table. In Cassandra, Updates are in effect upserts: when an Update is performed on a row that does not exist, the row is inserted.

Syntax UPDATE table_name SET column_name = expression [, column_name = expression] [WHERE conditions]

table_name

is the name of the table for which you want to update values.


column_name

is the name of a column, the value of which is to be changed. Multiple column values can be changed in a single statement.

expression

is the new value for the column. The expression can be a constant value or a subquery that returns a single value. Subqueries must be enclosed in parentheses.

Example A The following example changes every record that meets the conditions in the Where clause. In this case, the salary and exempt status are changed for all employees having the employee ID E10001. Because employee IDs are unique in the emp table, only one record is updated.

UPDATE emp SET salary=32000, exempt=1 WHERE emp_id = 'E10001'

Example B The following example uses a subquery. In this example, the salary is changed to the average salary in the company for the employee having employee ID E10001.

UPDATE emp SET salary = (SELECT avg(salary) FROM emp) WHERE emp_id = 'E10001'
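Example B's subquery-in-SET pattern is standard SQL and can be run as-is against Python's sqlite3 module (a stand-in engine with invented data; note that, unlike Cassandra, SQLite's Update is not an upsert, so a nonexistent row would simply be left unchanged):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_id TEXT PRIMARY KEY, salary INTEGER);
    INSERT INTO emp VALUES ('E10001', 20000), ('E10002', 40000);
""")

# The subquery computes the company-wide average salary; the Where clause
# restricts the change to one employee.
conn.execute(
    "UPDATE emp SET salary = (SELECT avg(salary) FROM emp) "
    "WHERE emp_id = 'E10001'"
)
new_salary = conn.execute(
    "SELECT salary FROM emp WHERE emp_id = 'E10001'"
).fetchone()[0]
print(new_salary)
```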

Notes · Update is supported for primitive types, non-nested Tuple types, and non-nested user-defined types. Update is also supported for values in non-nested Map types. The driver does not support updates on List types, Set types, or keys in Map types because the values in each are part of the primary key of their respective child tables, and primary key columns cannot be updated. If an Update is attempted when not allowed, the driver issues the following error message:

[DataDirect][Cassandra ODBC Driver][Cassandra]syntax error or access rule violation: UPDATE not permitted for column: column_name

See "Complex Type Normalization" for details.

· Update is supported for Counter columns when all the other columns in the row comprise that row's primary key. The Counter column itself is the only updatable field in the row. When updating a Counter column on an existing row, the Counter column is updated according to the increment (or decrement) specified in the SQL statement. When updating a Counter column for which there is no existing row, the values of the columns that comprise the row's primary key are inserted into the table alongside the value of the Counter column. For example, consider the following table.

CREATE TABLE page_view_counts (counter_value counter, url_name varchar, page_name varchar, PRIMARY KEY (url_name, page_name));

The following Update can be performed on the page_view_counts table.

UPDATE page_view_counts SET counter_value=counter_value + 1 WHERE url_name = 'www.progress.com' AND page_name = 'home'


This Update would provide the following output.

url_name         | page_name | counter_value
-----------------+-----------+--------------
www.progress.com | home      | 1

Note: Cassandra initially assigns a value of 0 (zero) to Counter columns. An increment or decrement can be specified in the SQL statement.

· A Where clause can be used to restrict which rows are updated.
· To enable Insert, Update, and Delete, set the ReadOnly connection property to false.

See also

· Where Clause on page 143
· Subqueries on page 158
· Complex Type Normalization on page 38
· Native and Refresh Escape Clauses on page 134

SQL Expressions

An expression is a combination of one or more values, operators, and SQL functions that evaluates to a value. You can use expressions in the Where and Having clauses of Select statements, and in the Set clauses of Update statements. Expressions enable you to use mathematical operations as well as character string manipulation operators to form complex queries.

The driver supports both unquoted and quoted identifiers. An unquoted identifier must start with an ASCII alpha character and can be followed by zero or more ASCII alphanumeric characters. Unquoted identifiers are converted to uppercase before being used. Quoted identifiers must be enclosed in double quotation marks (""). A quoted identifier can contain any Unicode character, including the space character. The driver recognizes the Unicode escape sequence \uxxxx as a Unicode character. You can specify a double quotation mark in a quoted identifier by escaping it with a double quotation mark. The maximum length of both quoted and unquoted identifiers is 128 characters.

Valid expression elements are:

· Column names
· Literals
· Operators
· Functions


Column Names

The most common expression is a simple column name. You can combine a column name with other expression elements.

Literals

Literals are fixed data values. For example, in the expression PRICE * 1.05, the value 1.05 is a constant. Literals are classified into types, including the following:

· Binary
· Character string
· Date
· Floating point
· Integer
· Numeric
· Time
· Timestamp

The following table describes the literal format for supported SQL data types.

Table 17: Literal Syntax Examples

SQL Type Literal Syntax Example

BIGINT n 12 or -34 or 0 where n is any valid integer value in the range of the BIGINT data type

BOOLEAN n 0 or 1 where n is 0 (minimum value) or 1 (maximum value)

DATE DATE'date' '2010-05-21'

DATETIME TIMESTAMP'ts' '2010-05-21 18:33:05.025'

DECIMAL n.f 0.25 or 3.1415 or -7.48 where n is the integral part and f is the fractional part


SQL Type Literal Syntax Example

DOUBLE n.fEx 1.2E0 or 2.5E40 or -3.45E2 or 5.67E-4 where n is the integral part, f is the fractional part, and x is the exponent

INTEGER n 12 or -34 or 0 where n is a valid integer value in the range of the INTEGER data type

LONGVARBINARY X'hex_value' X'000482ff'

LONGVARCHAR 'value' 'This is a string literal'

TIME TIME'time' '18:33:05'

VARCHAR 'value' 'This is a string literal'

Character String Literals Text specifies a character string literal. A character string literal must be enclosed in single quotation marks. To represent one single quotation mark within a literal, you must enter two single quotation marks. When the data in the fields is returned to the client, trailing blanks are stripped. A character string literal can have a maximum length of 32 KB, that is, (32*1024) bytes.

Example

'Hello'
'Jim''s friend is Joe'
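The doubled-quotation-mark convention is standard SQL and can be checked directly (Python's sqlite3 module as a stand-in engine):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two single quotation marks inside the literal yield one in the value.
value = conn.execute("SELECT 'Jim''s friend is Joe'").fetchone()[0]
print(value)
```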

Numeric Literals Unquoted numeric values are treated as numeric literals. If the unquoted numeric value contains a decimal point or exponent, it is treated as a real literal; otherwise, it is treated as an integer literal.

Example +1894.1204

Binary Literals Binary literals are represented with single quotation marks. The valid characters in a binary literal are 0-9, a-f, and A-F.


Example '00af123d'

Date/Time Literals Date and time literal values are enclosed in single quotation marks ('value').

· The format for a Date literal is DATE'date'.
· The format for a Time literal is TIME'time'.
· The format for a Timestamp literal is TIMESTAMP'ts'.

Integer Literals Integer literals are represented by a string of numbers that are not enclosed in quotation marks and do not contain decimal points.

Notes · Integer constants must be whole numbers; they cannot contain decimals. · Integer literals can start with sign characters (+/-).

Example 1994 or -2

Operators

This section describes the operators that can be used in SQL expressions.

Unary Operator A unary operator operates on only one operand. operator operand

Binary Operator A binary operator operates on two operands: operand1 operator operand2. If an operator is given a null operand, the result is always null. The only operator that does not follow this rule is concatenation (||).

Arithmetic Operators You can use an arithmetic operator in an expression to negate, add, subtract, multiply, and divide numeric values. The result of this operation is also a numeric value. The + and - operators are also supported in date/time fields to allow date arithmetic. The following table lists the supported arithmetic operators.


Table 18: Arithmetic Operators

Operator Purpose Example

+ - Denotes a positive or negative expression. These are unary operators. SELECT * FROM emp WHERE comm = -1

* / Multiplies, divides. These are binary operators. UPDATE emp SET sal = sal + sal * 0.10

+ - Adds, subtracts. These are binary operators. SELECT sal + comm FROM emp WHERE empno > 100

Concatenation Operator The concatenation operator manipulates character strings. The following table lists the only supported concatenation operator.

Table 19: Concatenation Operator

Operator Purpose Example

|| Concatenates character strings. SELECT 'Name is' || ename FROM emp

The result of concatenating two character strings is the data type VARCHAR.

Comparison Operators Comparison operators compare one expression to another. The result of such a comparison can be TRUE, FALSE, or UNKNOWN (if one of the operands is NULL). The driver considers the UNKNOWN result as FALSE. The following table lists the supported comparison operators.

Table 20: Comparison Operators

Operator Purpose Example

= Equality test. SELECT * FROM emp WHERE sal = 1500

!=, <> Inequality test. SELECT * FROM emp WHERE sal != 1500

>, < "Greater than" and "less than" tests. SELECT * FROM emp WHERE sal > 1500 SELECT * FROM emp WHERE sal < 1500

>=, <= "Greater than or equal to" and "less than or equal to" tests. SELECT * FROM emp WHERE sal >= 1500 SELECT * FROM emp WHERE sal <= 1500


Operator Purpose Example

ESCAPE clause in LIKE operator The Escape clause is supported in the LIKE predicate to indicate the escape character: LIKE 'pattern string' ESCAPE 'c'. Escape characters are used in the pattern string to indicate that any wildcard character that is after the escape character in the pattern string should be treated as a regular character. The default escape character is backslash (\).

SELECT * FROM emp WHERE ENAME LIKE 'J%\_%' ESCAPE '\'

This matches all records with names that start with the letter 'J' and have the '_' character in them.

SELECT * FROM emp WHERE ENAME LIKE 'JOE\_JOHN' ESCAPE '\'

This matches only records with the name 'JOE_JOHN'.

[NOT] IN "Equal to any member of" test. SELECT * FROM emp WHERE job IN ('CLERK','ANALYST') SELECT * FROM emp WHERE sal IN (SELECT sal FROM emp WHERE deptno = 30)

[NOT] BETWEEN x AND y "Greater than or equal to x" and "less than or equal to y" test. SELECT * FROM emp WHERE sal BETWEEN 2000 AND 3000

EXISTS Tests for existence of rows in a subquery. SELECT empno, ename, deptno FROM emp e WHERE EXISTS (SELECT deptno FROM dept WHERE e.deptno = dept.deptno)

IS [NOT] NULL Tests whether the value of the column or expression is NULL. SELECT * FROM emp WHERE ename IS NOT NULL SELECT * FROM emp WHERE ename IS NULL
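The Escape clause described in the table above is also standard SQL, so its effect can be demonstrated with Python's sqlite3 module as a stand-in engine (the names below are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (ename TEXT);
    INSERT INTO emp VALUES ('JOE_JOHN'), ('JOSEPHINE'), ('JACK');
""")

# Unescaped, '_' matches any single character; escaped with '\',
# it matches only a literal underscore.
names = [r[0] for r in conn.execute(
    r"SELECT ename FROM emp WHERE ename LIKE 'J%\_%' ESCAPE '\'"
)]
print(names)
```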

Logical Operators A logical operator combines the results of two component conditions to produce a single result or to invert the result of a single condition. The following table lists the supported logical operators.


Table 21: Logical Operators

Operator Purpose Example

NOT Returns TRUE if the following condition is FALSE. Returns FALSE if it is TRUE. If it is UNKNOWN, it remains UNKNOWN. SELECT * FROM emp WHERE NOT (job IS NULL) SELECT * FROM emp WHERE NOT (sal BETWEEN 1000 AND 2000)

AND Returns TRUE if both component conditions are TRUE. Returns FALSE if either is FALSE; otherwise, returns UNKNOWN. SELECT * FROM emp WHERE job = 'CLERK' AND deptno = 10

OR Returns TRUE if either component condition is TRUE. Returns FALSE if both are FALSE; otherwise, returns UNKNOWN. SELECT * FROM emp WHERE job = 'CLERK' OR deptno = 10

Example In the Where clause of the following Select statement, the AND logical operator is used to ensure that managers earning more than $1000 a month are returned in the result:

SELECT * FROM emp WHERE jobtitle = 'manager' AND sal > 1000

Operator Precedence As expressions become more complex, the order in which the expressions are evaluated becomes important. The following table shows the order in which the operators are evaluated. The operators in the first line are evaluated first, then those in the second line, and so on. Operators in the same line are evaluated left to right in the expression. You can change the order of precedence by using parentheses. Enclosing expressions in parentheses forces them to be evaluated together.

Table 22: Operator Precedence

Precedence Operator

1 + (Positive), - (Negative)

2 * (Multiply), / (Division)

3 + (Add), - (Subtract)

4 || (Concatenate)

5 =, >, <, >=, <=, <>, != (Comparison operators)

6 NOT, IN, LIKE

7 AND

8 OR


Example A The query in the following example returns employee records for which the department number is 1 or 2 and the salary is greater than $1000:

SELECT * FROM emp WHERE (deptno = 1 OR deptno = 2) AND sal > 1000

Because parenthetical expressions are forced to be evaluated first, the OR operation takes precedence over AND.

Example B In the following example, the query returns records for all the employees in department 1, but only employees whose salary is greater than $1000 in department 2.

SELECT * FROM emp WHERE deptno = 1 OR deptno = 2 AND sal > 1000

The AND operator takes precedence over OR, so that the search condition in the example is equivalent to the expression deptno = 1 OR (deptno = 2 AND sal > 1000).
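This precedence behavior can be confirmed with a small runnable sketch (Python's sqlite3 module as a stand-in engine; the rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (deptno INTEGER, sal INTEGER);
    INSERT INTO emp VALUES (1, 800), (1, 1200), (2, 800), (2, 1200);
""")

unparenthesized = conn.execute(
    "SELECT deptno, sal FROM emp "
    "WHERE deptno = 1 OR deptno = 2 AND sal > 1000"
).fetchall()

# AND binds tighter than OR, so the condition above is equivalent to:
explicit = conn.execute(
    "SELECT deptno, sal FROM emp "
    "WHERE deptno = 1 OR (deptno = 2 AND sal > 1000)"
).fetchall()

print(unparenthesized == explicit, len(unparenthesized))
```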

Functions

The driver supports a number of functions that you can use in expressions, as listed and described in "Scalar Functions."

See also Scalar Functions on page 172

Conditions

A condition specifies a combination of one or more expressions and logical operators that evaluates to TRUE, FALSE, or UNKNOWN. You can use a condition in the Where clause of the Delete, Select, and Update statements, and in the Having clause of Select statements. The following table describes the supported conditions.

Table 23: Conditions

Condition Description

Simple comparison Specifies a comparison with expressions or subquery results.

=, !=, <>, <, >, <=, >=

Group comparison Specifies a comparison with any or all members in a list or subquery.

[=, !=, <>, <, >, <=, >=] [ANY, ALL, SOME]

Membership Tests for membership in a list or subquery.

[NOT] IN


Condition Description

Range Tests for inclusion in a range.

[NOT] BETWEEN

NULL Tests for nulls.

IS NULL, IS NOT NULL

EXISTS Tests for existence of rows in a subquery.

[NOT] EXISTS

LIKE Specifies a test involving pattern matching.

[NOT] LIKE

Compound Specifies a combination of other conditions.

CONDITION [AND/OR] CONDITION

Subqueries

A query is an operation that retrieves data from one or more tables or views. In this reference, a top-level query is called a Select statement, and a query nested within a Select statement is called a subquery. A subquery is a query expression that appears in the body of another expression, such as a Select, Update, or Delete statement. In the following example, the second Select statement is a subquery:

SELECT * FROM emp WHERE deptno IN (SELECT deptno FROM dept)

IN Predicate

Purpose The In predicate specifies a set of values against which to compare a result set. If the values are compared against a subquery, the subquery must return a single-column result set.

Syntax value [NOT] IN (value1, value2,...)

or

value [NOT] IN (subquery)

Example SELECT * FROM emp WHERE deptno IN (SELECT deptno FROM dept WHERE dname <> 'Sales')


EXISTS Predicate

Purpose The Exists predicate is true only if the cardinality of the subquery is greater than 0; otherwise, it is false.

Syntax EXISTS (subquery)

Example

SELECT empno, ename, deptno FROM emp e WHERE EXISTS (SELECT deptno FROM dept WHERE e.deptno = dept.deptno)

UNIQUE Predicate

Purpose The Unique predicate is used to determine whether duplicate rows exist in a virtual table (one returned from a subquery).

Syntax UNIQUE (subquery)

Example

SELECT * FROM dept d WHERE UNIQUE (SELECT deptno FROM emp e WHERE e.deptno = d.deptno)

Correlated Subqueries

Purpose A correlated subquery is a subquery that references a column from a table referred to in the parent statement. A correlated subquery is evaluated once for each row processed by the parent statement. The parent statement can be a Select, Update, or Delete statement. A correlated subquery answers a multiple-part question in which the answer depends on the value in each row processed by the parent statement. For example, you can use a correlated subquery to determine which employees earn more than the average salaries for their departments. In this case, the correlated subquery specifically computes the average salary for each department.

Syntax

SELECT select_list FROM table1 t_alias1 WHERE expr rel_operator (SELECT column_list FROM table2 t_alias2 WHERE t_alias1.column rel_operator t_alias2.column)

UPDATE table1 t_alias1 SET column = (SELECT expr FROM table2 t_alias2 WHERE t_alias1.column = t_alias2.column)

DELETE FROM table1 t_alias1 WHERE column rel_operator (SELECT expr FROM table2 t_alias2 WHERE t_alias1.column = t_alias2.column)

Notes · Correlated column names in correlated subqueries must be explicitly qualified with the table name of the parent.

Example A The following statement returns data about employees whose salaries exceed their department average. This statement assigns an alias to emp, the table containing the salary information, and then uses the alias in a correlated subquery:

SELECT deptno, ename, sal FROM emp x WHERE sal > (SELECT AVG(sal) FROM emp WHERE x.deptno = deptno) ORDER BY deptno

Example B This is an example of a correlated subquery that returns row values:

SELECT * FROM dept "outer" WHERE 'manager' IN (SELECT managername FROM emp WHERE "outer".deptno = emp.deptno)

Example C This is an example of finding the department number (deptno) with multiple employees:

SELECT * FROM dept main WHERE 1 < (SELECT COUNT(*) FROM emp WHERE deptno = main.deptno)

Example D This is an example of correlating a table with itself:

SELECT deptno, ename, sal FROM emp x WHERE sal > (SELECT AVG(sal) FROM emp WHERE x.deptno = deptno)
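Example D is standard SQL and runs unchanged against Python's sqlite3 module (a stand-in engine; the employee data below is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (deptno INTEGER, ename TEXT, sal INTEGER);
    INSERT INTO emp VALUES (10, 'KING', 5000), (10, 'CLARK', 2450),
                           (20, 'SMITH', 800), (20, 'FORD', 3000);
""")

# The inner query is re-evaluated for each outer row, averaging only the
# salaries in that row's department (the correlated column x.deptno).
rows = conn.execute("""
    SELECT deptno, ename, sal FROM emp x
    WHERE sal > (SELECT AVG(sal) FROM emp WHERE x.deptno = deptno)
    ORDER BY deptno
""").fetchall()
print(rows)
```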

Part I: Reference

This part provides detailed reference information about Progress DataDirect for ODBC drivers.

Note: This part describes the behavior of multiple Progress DataDirect for ODBC drivers. The functionality described may not necessarily apply to your driver or database system.

For details, see the following topics:

· Code Page Values
· ODBC API and Scalar Functions
· Internationalization, Localization, and Unicode
· Designing ODBC applications for performance optimization
· Using Indexes
· Locking and Isolation Levels
· WorkAround Options
· Threading


Chapter 9: Code Page Values

This chapter lists supported code page values, along with a description, for your driver.

For details, see the following topics:

· IANAAppCodePage Values

IANAAppCodePage Values

The following table lists supported code page values for the IANAAppCodePage connection option. See "IANAAppCodePage" for information about this attribute. To determine the correct numeric value (the MIBenum value) for the IANAAppCodePage connection string attribute, perform the following steps:

1. Determine the code page of your database.

2. Determine the MIBenum value that corresponds to your database code page. To do this, go to:

http://www.iana.org/assignments/character-sets

On this web page, search for the name of your database code page. This name will be listed as an alias or the name of a character set, and will have a MIBenum value associated with it.

3. Check the following table to make sure that the MIBenum value you looked up on the IANA Web page is supported by your Progress DataDirect for ODBC driver. If the value is not listed, contact Progress Technical Support to request support for that value.


Table 24: IANAAppCodePage Values

Value (MIBenum) Description

3 US_ASCII

4 ISO_8859_1

5 ISO_8859_2

6 ISO_8859_3

7 ISO_8859_4

8 ISO_8859_5

9 ISO_8859_6

10 ISO_8859_7

11 ISO_8859_8

12 ISO_8859_9

16 JIS_Encoding

17 Shift_JIS

18 EUC_JP

30 ISO_646_IRV

36 KS_C_5601

37 ISO_2022_KR

38 EUC_KR

39 ISO_2022_JP

40 ISO_2022_JP_2

57 GB_2312_80

104 ISO_2022_CN

105 ISO_2022_CN_EXT

109 ISO_8859_13

110 ISO_8859_14

111 ISO_8859_15


113 GBK

2004 HP_ROMAN8

2009 IBM850

2010 IBM852

2011 IBM437

2013 IBM862

2016 IBM-Thai

2024 WINDOWS-31J

2025 GB2312

2026 Big5

2027 MACINTOSH

2028 IBM037

2029 IBM038

2030 IBM273

2033 IBM277

2034 IBM278

2035 IBM280

2037 IBM284

2038 IBM285

2039 IBM290

2040 IBM297

2041 IBM420

2043 IBM424

2044 IBM500

2045 IBM851

2046 IBM855



2047 IBM857

2048 IBM860

2049 IBM861

2050 IBM863

2051 IBM864

2052 IBM865

2053 IBM868

2054 IBM869

2055 IBM870

2056 IBM871

2062 IBM918

2063 IBM1026

2084 KOI8_R

2085 HZ_GB_2312

2086 IBM866

2087 IBM775

2089 IBM00858

2091 IBM01140

2092 IBM01141

2093 IBM01142

2094 IBM01143

2095 IBM01144

2096 IBM01145

2097 IBM01146

2098 IBM01147

2099 IBM01148


2100 IBM01149

2102 IBM1047

2250 WINDOWS_1250

2251 WINDOWS_1251

2252 WINDOWS_1252

2253 WINDOWS_1253

2254 WINDOWS_1254

2255 WINDOWS_1255

2256 WINDOWS_1256

2257 WINDOWS_1257

2258 WINDOWS_1258

2259 TIS_620

2000000939 IBM-939

2000000943 IBM-943_P14A-2000

2000001025 IBM-1025

2000004396 IBM-4396

2000005026 IBM-5026

2000005035 IBM-5035

See also IANAAppCodePage on page 114

Note: The values 2000000939, 2000000943, 2000001025, 2000004396, 2000005026, and 2000005035 are assigned by Progress DataDirect and do not appear in http://www.iana.org/assignments/character-sets.


Chapter 10: ODBC API and Scalar Functions

This chapter lists the ODBC API functions that your Progress DataDirect for ODBC driver supports. In addition, it lists the scalar functions that you use in SQL statements.

For details, see the following topics:

· API Functions
· Scalar Functions

API Functions

Your Progress DataDirect for ODBC driver is Level 1 compliant, that is, it supports all ODBC Core and Level 1 functions. It also supports a limited set of Level 2 functions, as described in the following table.


Table 25: Function Conformance for ODBC 2.x Applications

Core Functions Level 1 Functions Level 2 Functions

Core Functions: SQLAllocConnect, SQLAllocEnv, SQLAllocStmt, SQLBindCol, SQLBindParameter, SQLCancel, SQLColAttributes, SQLConnect, SQLDescribeCol, SQLDisconnect, SQLDrivers, SQLError, SQLExecDirect, SQLExecute, SQLFetch, SQLFreeConnect, SQLFreeEnv, SQLFreeStmt, SQLGetCursorName, SQLNumResultCols, SQLPrepare, SQLRowCount, SQLSetCursorName, SQLTransact

Level 1 Functions: SQLColumns, SQLDriverConnect, SQLGetConnectOption, SQLGetData, SQLGetFunctions, SQLGetInfo, SQLGetStmtOption, SQLGetTypeInfo, SQLParamData, SQLPutData, SQLSetConnectOption, SQLSetStmtOption, SQLSpecialColumns, SQLStatistics, SQLTables

Level 2 Functions: SQLBrowseConnect, SQLDataSources, SQLDescribeParam, SQLExtendedFetch (forward scrolling only), SQLMoreResults, SQLNativeSql, SQLNumParams, SQLParamOptions, SQLSetScrollOptions

The functions for ODBC 3.x Applications that the driver supports are listed in the following table. Any additions to these supported functions or differences in the support of specific functions are listed in "ODBC Compliance."

Table 26: Function Conformance for ODBC 3.x Applications

SQLAllocHandle, SQLBindCol, SQLBindParameter, SQLBrowseConnect (except for Progress), SQLBulkOperations, SQLCancel, SQLCloseCursor, SQLColAttribute, SQLColumns, SQLConnect, SQLCopyDesc, SQLDataSources, SQLDescribeCol, SQLDisconnect, SQLDriverConnect, SQLDrivers, SQLEndTran, SQLError, SQLExecDirect, SQLExecute, SQLExtendedFetch, SQLFetch, SQLFetchScroll (forward scrolling only), SQLFreeHandle, SQLFreeStmt, SQLGetConnectAttr, SQLGetCursorName, SQLGetData, SQLGetDescField, SQLGetDescRec, SQLGetDiagField, SQLGetDiagRec, SQLGetEnvAttr, SQLGetFunctions, SQLGetInfo, SQLGetStmtAttr, SQLGetTypeInfo, SQLMoreResults, SQLNativeSql, SQLNumParams, SQLNumResultCols, SQLParamData, SQLPrepare, SQLPutData, SQLRowCount, SQLSetConnectAttr, SQLSetCursorName, SQLSetDescField, SQLSetDescRec, SQLSetEnvAttr, SQLSetStmtAttr, SQLSpecialColumns, SQLStatistics, SQLTables, SQLTransact

See also ODBC Compliance on page 33


Scalar Functions

This section lists the scalar functions that ODBC supports. Your database system may not support all of these functions. Refer to the documentation for your database system to find out which functions are supported. Also, depending on the driver that you are using, not all of the scalar functions may be supported. To check which scalar functions are supported by a driver, use the SQLGetInfo ODBC function.

You can use these scalar functions in SQL statements using the following syntax:

{fn scalar-function}

where scalar-function is one of the functions listed in the following tables. For example:

SELECT {fn UCASE(NAME)} FROM EMP

Table 27: Scalar Functions

String Functions Numeric Functions Timedate Functions System Functions

ASCII ABS CURDATE CURSESSIONID

BIT_LENGTH ACOS CURTIME CURRENT_USER

CHAR ASIN DATEDIFF DATABASE

CHAR_LENGTH ATAN DAYNAME IDENTITY

CONCAT ATAN2 DAYOFMONTH USER

DIFFERENCE BITAND DAYOFWEEK

HEXTORAW BITOR DAYOFYEAR

INSERT CEILING EXTRACT

LCASE COS HOUR

LEFT COT MINUTE

LENGTH DEGREES MONTH

LOCATE EXP MONTHNAME

LOWER FLOOR NOW

LTRIM LOG QUARTER

OCTET_LENGTH LOG10 SECOND

RAWTOHEX MOD WEEK

REPEAT PI YEAR

REPLACE POWER CURRENT_DATE


RIGHT RADIANS CURRENT_TIME

RTRIM RAND CURRENT_TIMESTAMP

SOUNDEX ROUND

SPACE ROUNDMAGIC

SUBSTR SIGN

SUBSTRING SIN

UCASE SQRT

UPPER TAN

TRUNCATE

String Functions The following table lists the string functions that ODBC supports. The string functions listed accept the following arguments:

· string_exp can be the name of a column, a string literal, or the result of another scalar function, where the underlying data type is SQL_CHAR, SQL_VARCHAR, or SQL_LONGVARCHAR.

· start, length, and count can be the result of another scalar function or a literal numeric value, where the underlying data type is SQL_TINYINT, SQL_SMALLINT, or SQL_INTEGER.

The string functions are one-based; that is, the first character in the string is character 1. Character string literals must be surrounded in single quotation marks.
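One-based positions are the norm for SQL string functions generally; SQLite's substr, used here via Python's sqlite3 module as a stand-in (not the ODBC {fn ...} escape), follows the same convention:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Position 1 is the first character, so substr(s, 1, 4) takes the first four.
value = conn.execute("SELECT substr('DataDirect', 1, 4)").fetchone()[0]
print(value)
```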

Table 28: Scalar String Functions

Function Returns

ASCII(string_exp) ASCII code value of the leftmost character of string_exp as an integer.

BIT_LENGTH(string_exp) The length in bits of the string expression. [ODBC 3.0 only]

CHAR(code) The character with the ASCII code value specified by code. code should be between 0 and 255; otherwise, the return value is data-source dependent.

CHAR_LENGTH(string_exp) The length in characters of the string expression, if the string expression is of a character data type; otherwise, the length in bytes of the string expression (the smallest integer not less than the number of bits divided by 8). (This function is the same as the CHARACTER_LENGTH function.) [ODBC 3.0 only]


CHARACTER_LENGTH(string_exp) The length in characters of the string expression, if the string expression is of a character data type; otherwise, the length in bytes of the string expression (the smallest integer not less than the number of bits divided by 8). (This function is the same as the CHAR_LENGTH function.) [ODBC 3.0 only]

CONCAT(string_exp1, string_exp2) The string resulting from concatenating string_exp2 and string_exp1. The string is system dependent.

DIFFERENCE(string_exp1, string_exp2) An integer value that indicates the difference between the values returned by the SOUNDEX function for string_exp1 and string_exp2.

INSERT(string_exp1, start, length, string_exp2) A string where length characters have been deleted from string_exp1 beginning at start and where string_exp2 has been inserted into string_exp1 beginning at start.

LCASE(string_exp) Uppercase characters in string_exp converted to lowercase.

LEFT(string_exp, count) The leftmost count characters of string_exp.

LENGTH(string_exp) The number of characters in string_exp, excluding trailing blanks and the string termination character.

LOCATE(string_exp1, string_exp2[, start]) The starting position of the first occurrence of string_exp1 within string_exp2. If start is not specified, the search begins with the first character position in string_exp2. If start is specified, the search begins with the character position indicated by the value of start. The first character position in string_exp2 is indicated by the value 1. If string_exp1 is not found, 0 is returned.

LTRIM(string_exp) The characters of string_exp with leading blanks removed.

OCTET_LENGTH(string_exp) The length in bytes of the string expression. The result is the smallest integer not less than the number of bits divided by 8. [ODBC 3.0 only]

POSITION(character_exp IN character_exp) The position of the first character expression in the second character expression. The result is an exact numeric with an implementation-defined precision and a scale of 0. [ODBC 3.0 only]

REPEAT(string_exp, count) A string composed of string_exp repeated count times.

REPLACE(string_exp1, string_exp2, string_exp3) Replaces all occurrences of string_exp2 in string_exp1 with string_exp3.

RIGHT(string_exp, count) The rightmost count of characters in string_exp.

RTRIM(string_exp) The characters of string_exp with trailing blanks removed.

SOUNDEX(string_exp) A data source dependent string representing the sound of the words in string_exp.


SPACE(count) A string consisting of count spaces.

SUBSTRING(string_exp, start, length) A string derived from string_exp beginning at the character position start for length characters.

UCASE(string_exp) Lowercase characters in string_exp converted to uppercase.

Numeric Functions

The following table lists the numeric functions that ODBC supports. The numeric functions listed accept the following arguments:

· numeric_exp can be a column name, a numeric literal, or the result of another scalar function, where the underlying data type is SQL_NUMERIC, SQL_DECIMAL, SQL_TINYINT, SQL_SMALLINT, SQL_INTEGER, SQL_BIGINT, SQL_FLOAT, SQL_REAL, or SQL_DOUBLE.

· float_exp can be a column name, a numeric literal, or the result of another scalar function, where the underlying data type is SQL_FLOAT.

· integer_exp can be a column name, a numeric literal, or the result of another scalar function, where the underlying data type is SQL_TINYINT, SQL_SMALLINT, SQL_INTEGER, or SQL_BIGINT.

Table 29: Scalar Numeric Functions

Function Returns

ABS(numeric_exp) Absolute value of numeric_exp.

ACOS(float_exp) Arccosine of float_exp as an angle in radians.

ASIN(float_exp) Arcsine of float_exp as an angle in radians.

ATAN(float_exp) Arctangent of float_exp as an angle in radians.

ATAN2(float_exp1, float_exp2) Arctangent of the x and y coordinates, specified by float_exp1 and float_exp2 as an angle in radians.

CEILING(numeric_exp) Smallest integer greater than or equal to numeric_exp.

COS(float_exp) Cosine of float_exp as an angle in radians.

COT(float_exp) Cotangent of float_exp as an angle in radians.

DEGREES(numeric_exp) Number of degrees converted from numeric_exp radians.

EXP(float_exp) Exponential value of float_exp.

FLOOR(numeric_exp) Largest integer less than or equal to numeric_exp.

LOG(float_exp) Natural log of float_exp.


LOG10(float_exp) Base 10 log of float_exp.

MOD(integer_exp1, integer_exp2) Remainder of integer_exp1 divided by integer_exp2.

PI() Constant value of pi as a floating-point number.

POWER(numeric_exp, integer_exp) Value of numeric_exp to the power of integer_exp.

RADIANS(numeric_exp) Number of radians converted from numeric_exp degrees.

RAND([integer_exp]) Random floating-point value using integer_exp as the optional seed value.

ROUND(numeric_exp, integer_exp) numeric_exp rounded to integer_exp places right of the decimal (left of the decimal if integer_exp is negative).

SIGN(numeric_exp) Indicator of the sign of numeric_exp. If numeric_exp < 0, -1 is returned. If numeric_exp = 0, 0 is returned. If numeric_exp > 0, 1 is returned.

SIN(float_exp) Sine of float_exp, where float_exp is an angle in radians.

SQRT(float_exp) Square root of float_exp.

TAN(float_exp) Tangent of float_exp, where float_exp is an angle in radians.

TRUNCATE(numeric_exp, integer_exp) numeric_exp truncated to integer_exp places right of the decimal. (If integer_exp is negative, truncation is to the left of the decimal.)

Date and Time Functions

The following table lists the date and time functions that ODBC supports. The date and time functions listed accept the following arguments:

· date_exp can be a column name, a date or timestamp literal, or the result of another scalar function, where the underlying data type can be represented as SQL_CHAR, SQL_VARCHAR, SQL_DATE, or SQL_TIMESTAMP.

· time_exp can be a column name, a time or timestamp literal, or the result of another scalar function, where the underlying data type can be represented as SQL_CHAR, SQL_VARCHAR, SQL_TIME, or SQL_TIMESTAMP.

· timestamp_exp can be a column name; a time, date, or timestamp literal; or the result of another scalar function, where the underlying data type can be represented as SQL_CHAR, SQL_VARCHAR, SQL_TIME, SQL_DATE, or SQL_TIMESTAMP.

Table 30: Scalar Time and Date Functions

Function Returns

CURRENT_DATE() Current date. [ODBC 3.0 only]


CURRENT_TIME[(time-precision)] Current local time. The time-precision argument determines the seconds precision of the returned value. [ODBC 3.0 only]

CURRENT_TIMESTAMP([timestamp-precision]) Current local date and local time as a timestamp value. The timestamp-precision argument determines the seconds precision of the returned timestamp. [ODBC 3.0 only]

CURDATE() Current date as a date value.

CURTIME() Current local time as a time value.

DAYNAME(date_exp) Character string containing a data-source-specific name of the day for the day portion of date_exp.

DAYOFMONTH(date_exp) Day of the month in date_exp as an integer value (1-31).

DAYOFWEEK(date_exp) Day of the week in date_exp as an integer value (1-7).

DAYOFYEAR(date_exp) Day of the year in date_exp as an integer value (1-366).

EXTRACT({YEAR | MONTH | DAY | HOUR | MINUTE | SECOND} FROM datetime_value) Any of the date and time terms can be extracted from datetime_value.

HOUR(time_exp) Hour in time_exp as an integer value (0-23).

MINUTE(time_exp) Minute in time_exp as an integer value (0-59).

MONTH(date_exp) Month in date_exp as an integer value (1-12).

MONTHNAME(date_exp) Character string containing the data-source-specific name of the month.

NOW() Current date and time as a timestamp value.

QUARTER(date_exp) Quarter in date_exp as an integer value (1-4).

SECOND(time_exp) Second in time_exp as an integer value (0-59).


TIMESTAMPADD(interval, integer_exp, time_exp) Timestamp calculated by adding integer_exp intervals of type interval to time_exp. interval can be one of the following values:

SQL_TSI_FRAC_SECOND
SQL_TSI_SECOND
SQL_TSI_MINUTE
SQL_TSI_HOUR
SQL_TSI_DAY
SQL_TSI_WEEK
SQL_TSI_MONTH
SQL_TSI_QUARTER
SQL_TSI_YEAR

Fractional seconds are expressed in billionths of a second.

TIMESTAMPDIFF(interval, time_exp1, time_exp2) Integer number of intervals of type interval by which time_exp2 is greater than time_exp1. interval has the same values as TIMESTAMPADD. Fractional seconds are expressed in billionths of a second.

WEEK(date_exp) Week of the year in date_exp as an integer value (1-53).

YEAR(date_exp) Year in date_exp. The range is data-source dependent.

System Functions

The following table lists the system functions that ODBC supports.

Table 31: Scalar System Functions

Function Returns

DATABASE() Name of the database corresponding to the connection handle (hdbc).

IFNULL(exp,value) value, if exp is null.

USER() Authorization name of the user.

11

Internationalization, Localization, and Unicode

This chapter provides an overview of how internationalization, localization, and Unicode relate to each other. It also provides a background on Unicode, and how it is accommodated by Unicode and non-Unicode ODBC drivers.

For details, see the following topics:

· Internationalization and Localization
· Unicode Character Encoding
· Unicode and Non-Unicode ODBC Drivers
· Driver Manager and Unicode Encoding on UNIX/Linux

Internationalization and Localization

Software that has been designed for internationalization is able to manage different linguistic and cultural conventions transparently and without modification. The same binary copy of an application should run on any localized version of an operating system without requiring source code changes.

Software that has been designed for localization includes language translation (such as text messages, icons, and buttons), cultural data (such as dates, times, and currency), and other components (such as input methods and spell checkers) for meeting regional market requirements.

Properly designed applications can accommodate a localized interface without extensive modification. The applications can be designed, first, to run internationally, and, second, to accommodate the language- and cultural-specific elements of a designated locale.


Locale

A locale represents the language and cultural data chosen by the user and dynamically loaded into memory at runtime. The locale settings are applied to the operating system and to subsequent application launches. While language is a fairly straightforward item, cultural data is a little more complex. Dates, numbers, and currency are all examples of data that is formatted according to cultural expectations. Because cultural preferences are bound to a geographic area, country is an important element of locale. Together these two elements (language and country) provide a precise context in which information can be presented. Locale presents information in the language and form that is best understood and appreciated by the local user.

Language

A locale's language is specified by the ISO 639 standard. The following table lists some commonly used language codes.

Language Code   Language
en              English
nl              Dutch
fr              French
es              Spanish
zh              Chinese
ja              Japanese
vi              Vietnamese

Because language is correlated with geography, a language code might not capture all the nuances of usage in a particular area. For example, French and Canadian French may use different phrases and terms to mean different things even though basic grammar and vocabulary are the same. Language is only one element of locale.

Country

The locale's country identifier is also specified by an ISO standard, ISO 3166, which describes valid two-letter codes for all countries. ISO 3166 defines these codes in uppercase letters. The following table lists some commonly used country codes.

Country Code   Country
US             United States
FR             France
IE             Ireland
CA             Canada
MX             Mexico

The country code provides more contextual information for a locale and affects a language©s usage, word spelling, and collation rules.

Variant

A variant is an optional extension to a locale. It identifies a custom locale that is not possible to create with just language and country codes. Variants can be used by anyone to add additional context for identifying a locale. The locale en_US represents English (United States), but en_US_CA represents even more information and might identify a locale for English (California, U.S.A.). Operating system or software vendors can use these variants to create more descriptive locales for their specific environments.

Unicode Character Encoding

In addition to locale, the other major component of internationalizing software is the use of the Universal Codeset, or Unicode. Most developers know that Unicode is a standard encoding that can be used to support multilingual character sets. Unfortunately, understanding Unicode is not as simple as its name would indicate. Software developers have used a number of character encodings, from ASCII to Unicode, to solve the many problems that arise when developing software applications that can be used worldwide.

Background

Most legacy computing environments have used ASCII character encoding developed by the ANSI standards body to store and manipulate character strings inside software applications. ASCII encoding was convenient for programmers because each ASCII character could be stored as a byte. The initial version of ASCII used only 7 of the 8 bits available in a byte, which meant that applications could use only 128 different characters. This version of ASCII could not account for European characters and was completely inadequate for Asian characters. Using the eighth bit to extend the total range of characters to 256 added support for most European characters. Today, ASCII refers to either the 7-bit or 8-bit encoding of characters.

As the need increased for applications with additional international support, ANSI again increased the functionality of ASCII by developing an extension to accommodate multilingual software. The extension, known as the Double-Byte Character Set (DBCS), allowed existing applications to function without change, but provided for the use of additional characters, including complex Asian characters. With DBCS, characters map to either one byte (for example, American ASCII characters) or two bytes (for example, Asian characters). The DBCS environment also introduced the concept of an operating system code page that identified how characters would be encoded into byte sequences in a particular computing environment. DBCS encoding provided a cross-platform mechanism for building multilingual applications.

The DataDirect for ODBC UNIX and Linux drivers can use double-byte character sets. The drivers normally use the character set defined by the default locale "C" unless explicitly pointed to another character set. The default locale "C" corresponds to the 7-bit US-ASCII character set. Use the following procedure to set the locale to a different character set:

1. Add the following line at the beginning of applications that use double-byte character sets:

setlocale (LC_ALL, "");

This is a standard UNIX function. It selects the character set indicated by the environment variable LANG as the one to be used by X/Open-compliant character-handling functions. If this line is not present, or if LANG is not set or is set to NULL, the default locale "C" is used.

2. Set the LANG environment variable to the appropriate character set. The UNIX command locale -a can be used to display all supported character sets on your system. For more information, refer to the man pages for "locale" and "setlocale."


Using a DBCS, however, was not ideal; many developers felt that there was a better way to solve the problem. A group of leading software companies joined forces to form the Unicode Consortium. Together, they produced a new solution to building worldwide applications: Unicode. Unicode was originally designed as a fixed-width, uniform two-byte designation that could represent all modern scripts without the use of code pages. The Unicode Consortium has continued to evaluate new characters, and the current number of supported characters is over 109,000.

Although it seemed to be the perfect solution to building multilingual applications, Unicode started off with a significant drawback: it would have to be retrofitted into existing computing environments. To use the new paradigm, all applications would have to change. As a result, several standards-based transliterations were designed to convert two-byte fixed Unicode values into more appropriate character encodings, including, among others, UTF-8, UCS-2, and UTF-16.

UTF-8 is a standard method for transforming Unicode values into byte sequences that maintain transparency for all ASCII codes. UTF-8 is recognized by the Unicode Consortium as a mechanism for transforming Unicode values and is popular for use with HTML, XML, and other protocols. UTF-8 is, however, currently used primarily on AIX, HP-UX, Solaris, and Linux.

UTF-16 is a superset of UCS-2, with the addition of some special characters in surrogate pairs. UTF-16 is the standard encoding for Windows platforms. Microsoft recommends using UTF-16 for new applications.

See "Unicode Support" to determine which encodings your driver supports.

See also Unicode Support on page 47

Unicode Support in Databases

Recently, database vendors have begun to support Unicode data types natively in their systems. With Unicode support, one database can hold multiple languages. For example, a large multinational corporation could store expense data in the local languages for the Japanese, U.S., English, German, and French offices in one database. Not surprisingly, the implementation of Unicode data types varies from vendor to vendor. For example, the Microsoft SQL Server 2000 implementation of Unicode provides data in UTF-16 format, while Oracle provides Unicode data types in UTF-8 and UTF-16 formats. A consistent implementation of Unicode not only depends on the operating system, but also on the database itself.

Unicode Support in ODBC

Prior to the ODBC 3.5 standard, all ODBC access to function calls and string data types was through ANSI encoding (either ASCII or DBCS). Applications and drivers were both ANSI-based. The ODBC 3.5 standard specified that the ODBC Driver Manager be capable of mapping both Unicode function calls and string data types to ANSI encoding as transparently as possible. This meant that ODBC 3.5-compliant Unicode applications could use Unicode function calls and string data types with ANSI drivers because the Driver Manager could convert them to ANSI. Because of character limitations in ANSI, however, not all conversions are possible. The ODBC Driver Manager version 3.5 and later, therefore, supports the following configurations:

· ANSI application with an ANSI driver
· ANSI application with a Unicode driver
· Unicode application with a Unicode driver
· Unicode application with an ANSI driver

A Unicode application can work with an ANSI driver because the Driver Manager provides limited Unicode-to-ANSI mapping. The Driver Manager makes it possible for a pre-3.5 ANSI driver to work with a Unicode application. What distinguishes a Unicode driver from a non-Unicode driver is the Unicode driver's capacity to interpret Unicode function calls without the intervention of the Driver Manager, as described in the following section.

Unicode and Non-Unicode ODBC Drivers

The way in which a driver handles function calls from a Unicode application determines whether it is considered a "Unicode driver."

Function Calls

Instead of the standard ANSI SQL function calls, such as SQLConnect, Unicode applications use "W" (wide) function calls, such as SQLConnectW. If the driver is a true Unicode driver, it can understand "W" function calls and the Driver Manager can pass them through to the driver without conversion to ANSI. The Progress DataDirect for ODBC for Apache Cassandra driver supports "W" function calls.

If a driver is a non-Unicode driver, it cannot understand "W" function calls, and the Driver Manager must convert them to ANSI calls before sending them to the driver. The Driver Manager determines the ANSI encoding system to which it must convert by referring to a code page. On Windows, this reference is to the Active Code Page. On non-Windows platforms, it is to the IANAAppCodePage connection string attribute, part of the odbc.ini file. The following examples illustrate these conversion streams for the Progress DataDirect for ODBC drivers.

The Driver Manager on UNIX and Linux determines the type of Unicode encoding of both the application and the driver, and performs conversions when the application and driver use different types of encoding. This determination is made by checking two ODBC attributes: SQL_ATTR_APP_UNICODE_TYPE and SQL_ATTR_DRIVER_UNICODE_TYPE, which can be set for either the environment, using SQLSetEnvAttr, or the connection, using SQLSetConnectAttr. "Driver Manager and Unicode Encoding on UNIX/Linux" describes in detail how this is done.

See also Driver Manager and Unicode Encoding on UNIX/Linux on page 186

Unicode Application with a Non-Unicode Driver

An operation involving a Unicode application and a non-Unicode driver incurs more overhead because function conversion is involved.

Windows

1. The Unicode application sends UCS-2/UTF-16 function calls to the Driver Manager.
2. The Driver Manager converts the function calls from UCS-2/UTF-16 to ANSI. The type of ANSI is determined by the Driver Manager through reference to the client machine's Active Code Page.
3. The Driver Manager sends the ANSI function calls to the non-Unicode driver.
4. The driver returns ANSI argument values to the Driver Manager.
5. The Driver Manager converts the function calls from ANSI to UCS-2/UTF-16 and returns these converted calls to the application.


UNIX and Linux

1. The Unicode application sends function calls to the Driver Manager. The Driver Manager expects the string arguments in these function calls to be UTF-8 or UTF-16 based on the value of the SQL_ATTR_APP_UNICODE_TYPE attribute. Note that the SQL_ATTR_APP_UNICODE_TYPE attribute can be set for the environment, using SQLSetEnvAttr, or the connection, using SQLSetConnectAttr.
2. The Driver Manager converts the function calls from UTF-8 or UTF-16 to ANSI. The type of ANSI is determined by the Driver Manager through reference to the client machine's value for the IANAAppCodePage connection string attribute.
3. The Driver Manager sends the converted ANSI function calls to the non-Unicode driver.
4. The driver returns ANSI argument values to the Driver Manager.
5. The Driver Manager converts the function calls from ANSI to UTF-8 or UTF-16 and returns these converted calls to the application.

Unicode Application with a Unicode Driver

An operation involving a Unicode application and a Unicode driver that use the same Unicode encoding is efficient because no function conversion is involved. If the application and the driver each use different types of encoding, there is some conversion overhead. See "Driver Manager and Unicode Encoding on UNIX/Linux" for details.

Windows

1. The Unicode application sends UCS-2 or UTF-16 function calls to the Driver Manager.
2. The Driver Manager does not have to convert the UCS-2/UTF-16 function calls to ANSI. It passes the Unicode function call to the Unicode driver.
3. The driver returns UCS-2/UTF-16 argument values to the Driver Manager.
4. The Driver Manager returns UCS-2/UTF-16 function calls to the application.

UNIX and Linux

1. The Unicode application sends function calls to the Driver Manager. The Driver Manager expects the string arguments in these function calls to be UTF-8 or UTF-16 based on the value of the SQL_ATTR_APP_UNICODE_TYPE attribute. Note that the SQL_ATTR_APP_UNICODE_TYPE attribute can be set for the environment, using SQLSetEnvAttr, or the connection, using SQLSetConnectAttr.
2. The Driver Manager passes Unicode function calls to the Unicode driver. The Driver Manager has to perform function call conversions if the SQL_ATTR_APP_UNICODE_TYPE is different from the SQL_ATTR_DRIVER_UNICODE_TYPE.
3. The driver returns argument values to the Driver Manager. Whether these are UTF-8 or UTF-16 argument values is based on the value of the SQL_ATTR_DRIVER_UNICODE_TYPE attribute.
4. The Driver Manager returns appropriate function calls to the application based on the SQL_ATTR_APP_UNICODE_TYPE attribute value. The Driver Manager has to perform function call conversions if the SQL_ATTR_DRIVER_UNICODE_TYPE value is different from the SQL_ATTR_APP_UNICODE_TYPE value.

See also Driver Manager and Unicode Encoding on UNIX/Linux on page 186

Data

ODBC C data types are used to indicate the type of C buffers that store data in the application. This is in contrast to SQL data types, which are mapped to native database types to store data in a database (data store). ANSI applications bind to the C data type SQL_C_CHAR and expect to receive information bound in the same way. Similarly, most Unicode applications bind to the C data type SQL_C_WCHAR (wide data type) and expect to receive information bound in the same way. Any ODBC 3.5-compliant Unicode driver must be capable of supporting SQL_C_CHAR and SQL_C_WCHAR so that it can return data to both ANSI and Unicode applications.

When the driver communicates with the database, it must use ODBC SQL data types, such as SQL_CHAR and SQL_WCHAR, that map to native database types. In the case of ANSI data and an ANSI database, the driver receives data bound to SQL_C_CHAR and passes it to the database as SQL_CHAR. The same is true of SQL_C_WCHAR and SQL_WCHAR in the case of Unicode data and a Unicode database.

When data from the application and the data stored in the database differ in format, for example, ANSI application data and Unicode database data, conversions must be performed. The driver cannot receive SQL_C_CHAR data and pass it to a Unicode database that expects to receive a SQL_WCHAR data type. The driver or the Driver Manager must be capable of converting SQL_C_CHAR to SQL_WCHAR, and vice versa.

The simplest cases of data communication are when the application, the driver, and the database are all of the same type and encoding, ANSI-to-ANSI-to-ANSI or Unicode-to-Unicode-to-Unicode. There is no data conversion involved in these instances. When a difference exists between data types, a conversion from one type to another must take place at the driver or Driver Manager level, which involves additional overhead. The type of driver determines whether these conversions are performed by the driver or the Driver Manager. "Driver Manager and Unicode Encoding on UNIX/Linux" describes how the Driver Manager determines the type of Unicode encoding of the application and driver.

The following sections discuss two basic types of data conversion in the Progress DataDirect for ODBC driver and the Driver Manager. How an individual driver exchanges different types of data with a particular database at the database level is beyond the scope of this discussion.

See also Driver Manager and Unicode Encoding on UNIX/Linux on page 186

Unicode Driver

The Unicode driver, not the Driver Manager, must convert SQL_C_CHAR (ANSI) data to SQL_WCHAR (Unicode) data, and vice versa, as well as SQL_C_WCHAR (Unicode) data to SQL_CHAR (ANSI) data, and vice versa. The driver must use client code page information (Active Code Page on Windows and the IANAAppCodePage attribute on UNIX/Linux) to determine which ANSI code page to use for the conversions. The Active Code Page or IANAAppCodePage must match the database default character encoding; if it does not, conversion errors are possible.

ANSI Driver

The Driver Manager, not the ANSI driver, must convert SQL_C_WCHAR (Unicode) data to SQL_CHAR (ANSI) data, and vice versa (see "Unicode Support in ODBC" for a detailed discussion). This is necessary because ANSI drivers do not support any Unicode ODBC types. The Driver Manager must use client code page information (Active Code Page on Windows and the IANAAppCodePage attribute on UNIX/Linux) to determine which ANSI code page to use for the conversions. The Active Code Page or IANAAppCodePage must match the database default character encoding. If not, conversion errors are possible.


See also Unicode Support in ODBC on page 182

Default Unicode Mapping

The following table shows the default Unicode mapping for an application's SQL_C_WCHAR variables.

Platform   Default Unicode Mapping
Windows    UCS-2/UTF-16
AIX        UTF-8
HP-UX      UTF-8
Solaris    UTF-8
Linux      UTF-8

Connection Attribute for Unicode

If you do not want to use the default Unicode mappings for SQL_C_WCHAR, a connection attribute is available to override the default mappings. This attribute determines how character data is converted and presented to an application and the database.

Attribute                          Description
SQL_ATTR_APP_WCHAR_TYPE (1061)     Sets the SQL_C_WCHAR type for parameter and column binding to the Unicode type, either SQL_DD_CP_UTF16 (default for Windows) or SQL_DD_CP_UTF8 (default for UNIX/Linux).

You can set this attribute before or after you connect. After this attribute is set, all conversions are made based on the character set specified. For example:

rc = SQLSetConnectAttr (hdbc, SQL_ATTR_APP_WCHAR_TYPE, (void *)SQL_DD_CP_UTF16, SQL_IS_INTEGER);

SQLGetConnectAttr and SQLSetConnectAttr for the SQL_ATTR_APP_WCHAR_TYPE attribute return a SQL State of HYC00 for drivers that do not support Unicode. This connection attribute and its valid values can be found in the file qesqlext.h, which is installed with the product.

Driver Manager and Unicode Encoding on UNIX/Linux

Unicode ODBC drivers on UNIX and Linux can use UTF-8 or UTF-16 encoding. To allow a UTF-8 or UTF-16 application to work with either a UTF-8 or UTF-16 driver, the Driver Manager must be able to determine which type of encoding the application and driver use and, if necessary, convert data between them accordingly.

To make this determination, the Driver Manager supports a set of ODBC attributes that can be set for the environment or the connection. If your application uses both UTF-8 and UTF-16 drivers in the same environment, the encoding should be set for the connection only; otherwise, either method can be used.

· To configure the encoding type for the environment, set the ODBC environment attributes SQL_ATTR_APP_UNICODE_TYPE and SQL_ATTR_DRIVER_UNICODE_TYPE using SQLSetEnvAttr.

· To configure the encoding for the connection only, set the ODBC connection attributes SQL_ATTR_APP_UNICODE_TYPE and SQL_ATTR_DRIVER_UNICODE_TYPE using SQLSetConnectAttr.

The attributes support values of SQL_DD_CP_UTF8 and SQL_DD_CP_UTF16. The default value is SQL_DD_CP_UTF8.

Note: You must specify a value for SQL_ATTR_DRIVER_UNICODE_TYPE when using third-party drivers. For DataDirect drivers, however, the Driver Manager detects the driver's Unicode type by default.

The Driver Manager performs the following steps before actually connecting to the driver.

1. Determine the application Unicode type: Applications that use UTF-16 encoding for their string types need to set SQL_ATTR_APP_UNICODE_TYPE accordingly at connection or, if setting the encoding type for the environment, before connecting to any driver. When the Driver Manager reads this attribute, it expects all string arguments to the ODBC "W" functions to be in the specified Unicode format. This attribute also indicates how the SQL_C_WCHAR buffers must be encoded.

2. Determine the driver Unicode type: The Driver Manager must determine which Unicode encoding the driver supports for its "W" functions. This is done as follows:

   a. SQLGetEnvAttr(SQL_ATTR_DRIVER_UNICODE_TYPE) or SQLGetConnectAttr(SQL_ATTR_DRIVER_UNICODE_TYPE) is called in the driver by the Driver Manager. The driver, if capable, returns either SQL_DD_CP_UTF16 or SQL_DD_CP_UTF8 to indicate to the Driver Manager which encoding it expects.

   b. If the preceding call fails, the Driver Manager looks either in the Data Source section of the odbc.ini specified by the connection string or in the connection string itself for a connection option named DriverUnicodeType. Valid values for this option are 1 (UTF-16) or 2 (UTF-8). The Driver Manager assumes that the Unicode encoding of the driver corresponds to the value specified.

   c. If neither of the preceding attempts is successful, the Driver Manager assumes that the Unicode encoding of the driver is UTF-8.

3. Determine if the driver supports SQL_ATTR_WCHAR_TYPE: SQLSetConnectAttr(SQL_ATTR_WCHAR_TYPE, x) is called in the driver by the Driver Manager, where x is either SQL_DD_CP_UTF8 or SQL_DD_CP_UTF16, depending on the value of the SQL_ATTR_APP_UNICODE_TYPE setting. If the driver returns any error on this call to SQLSetConnectAttr, the Driver Manager assumes that the driver does not support this connection attribute. If an error occurs, the Driver Manager returns a warning. The Driver Manager converts all bound parameter data from the application Unicode type to the driver Unicode type specified by SQL_ATTR_DRIVER_UNICODE_TYPE. It also converts all data bound as SQL_C_WCHAR to the application Unicode type specified by SQL_ATTR_APP_UNICODE_TYPE.

Based on the information it has gathered prior to connection, the Driver Manager either passes function calls through without conversion or, before calling the driver, converts all string arguments to the ODBC "W" functions to either UTF-8 or UTF-16, as required.
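The three-step fallback for determining the driver's Unicode encoding can be sketched as a small decision function. The encoding codes mirror the DriverUnicodeType option values given above (1 = UTF-16, 2 = UTF-8); the function itself is an illustration of the fallback order, not actual Driver Manager source.

```c
#include <assert.h>

/* Encoding codes mirroring the DriverUnicodeType values described above:
   1 = UTF-16, 2 = UTF-8. The 0 value meaning "no answer" is an assumption
   made for this sketch, not an ODBC constant. */
enum { ENC_UNKNOWN = 0, ENC_UTF16 = 1, ENC_UTF8 = 2 };

/* Illustration of the Driver Manager's fallback order:
   (a) trust the driver's reply to SQLGetEnvAttr/SQLGetConnectAttr for
       SQL_ATTR_DRIVER_UNICODE_TYPE;
   (b) otherwise use the DriverUnicodeType option from odbc.ini or the
       connection string;
   (c) otherwise assume UTF-8. */
int resolve_driver_encoding(int attr_reply, int conn_option)
{
    if (attr_reply != ENC_UNKNOWN)
        return attr_reply;      /* step a: driver reported its encoding */
    if (conn_option != ENC_UNKNOWN)
        return conn_option;     /* step b: DriverUnicodeType=1 or =2 */
    return ENC_UTF8;            /* step c: default assumption */
}
```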

References

The Java Tutorials, http://docs.oracle.com/javase/tutorial/i18n/index.html


Unicode Support in the Solaris Operating Environment, May 2000, Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, CA 94303-4900

Chapter 12: Designing ODBC applications for performance optimization

Developing performance-oriented ODBC applications is not easy. Microsoft's ODBC Programmer's Reference does not provide information about system performance. In addition, ODBC drivers and the ODBC driver manager do not return warnings when applications run inefficiently. This chapter contains some general guidelines that have been compiled by examining the ODBC implementations of numerous shipping ODBC applications. These guidelines include:

· Use catalog functions appropriately
· Retrieve only required data
· Select functions that optimize performance
· Manage connections and updates

Following these general rules will help you solve some common ODBC performance problems, such as those listed in the following table.

Table 32: Common Performance Problems Using ODBC Applications

Problem                                       Solution                          See guidelines in...

Network communication is slow.                Reduce network traffic.           "Using Catalog Functions"

The process of evaluating complex SQL         Simplify queries.                 "Using Catalog Functions"
queries on the database server is slow                                          "Selecting ODBC Functions"
and can reduce concurrency.

Excessive calls from the application to       Optimize application-to-driver    "Retrieving Data"
the driver slow performance.                  interaction.                      "Selecting ODBC Functions"

Disk I/O is slow.                             Limit disk input/output.          "Managing Connections and Updates"

For details, see the following topics:

· Using catalog functions
· Retrieving data
· Selecting ODBC functions
· Managing connections and updates

Using catalog functions

Because catalog functions, such as those listed here, are slow compared to other ODBC functions, their frequent use can impair system performance:

· SQLColumns
· SQLForeignKeys
· SQLGetTypeInfo
· SQLSpecialColumns
· SQLStatistics
· SQLTables

SQLGetTypeInfo is included in this list of expensive ODBC functions because many drivers must query the server to obtain accurate information about which types are supported (for example, to find dynamic types such as user-defined types).

Caching information to minimize the use of catalog functions

To return all result column information mandated by the ODBC specification, a driver may have to perform multiple queries, joins, subqueries, or unions to return the required result set for a single call to a catalog function. These particular elements of the SQL language are performance-expensive. Although it is almost impossible to write an ODBC application without catalog functions, their use should be minimized.

By caching information, applications can avoid multiple executions. For example, call SQLGetTypeInfo once in the application and cache the elements of the result set that your application depends on. It is unlikely that any application uses all elements of the result set generated by a catalog function, so the cached information should not be difficult to maintain.
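As an illustration of this caching pattern, the sketch below keeps a few fields of a hypothetical SQLGetTypeInfo result set in a small in-memory table. The field names, sizes, and sample values are assumptions for the sketch, not part of any ODBC API; a real application would populate the cache from the actual result set right after connecting.

```c
#include <assert.h>
#include <string.h>

/* A minimal cache for the SQLGetTypeInfo columns this application cares
   about (hypothetical subset; a real result set has more columns). */
struct type_info {
    char type_name[32];   /* TYPE_NAME column   */
    int  data_type;       /* DATA_TYPE column   */
    int  column_size;     /* COLUMN_SIZE column */
};

#define MAX_TYPES 16
static struct type_info g_type_cache[MAX_TYPES];
static int g_type_count = 0;

/* Called once per interesting row while fetching the SQLGetTypeInfo
   result set, immediately after connecting. */
void cache_type(const char *name, int data_type, int column_size)
{
    if (g_type_count < MAX_TYPES) {
        struct type_info *t = &g_type_cache[g_type_count++];
        strncpy(t->type_name, name, sizeof t->type_name - 1);
        t->type_name[sizeof t->type_name - 1] = '\0';
        t->data_type = data_type;
        t->column_size = column_size;
    }
}

/* Later questions are answered from memory; no further catalog calls. */
const struct type_info *lookup_type(int data_type)
{
    int i;
    for (i = 0; i < g_type_count; i++)
        if (g_type_cache[i].data_type == data_type)
            return &g_type_cache[i];
    return 0;   /* not cached */
}
```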

Avoiding search patterns

Passing NULL arguments or search patterns to catalog functions generates time-consuming queries. In addition, network traffic potentially increases because of unwanted results. Always supply as many non-NULL arguments to catalog functions as possible. Because catalog functions are slow, applications should invoke them efficiently. Any information that the application can send the driver when calling catalog functions can result in improved performance and reliability.

For example, consider a call to SQLTables where the application requests information about the table "Customers." Often, this call is coded as shown, using as many NULL arguments as possible:

rc = SQLTables (hstmt, NULL, 0, NULL, 0, "Customers", SQL_NTS, NULL, 0);

A driver processes this SQLTables call into SQL that looks like this:

SELECT ... FROM SysTables WHERE TableName = 'Customers'
UNION ALL
SELECT ... FROM SysViews WHERE ViewName = 'Customers'
UNION ALL
SELECT ... FROM SysSynonyms WHERE SynName = 'Customers'
ORDER BY ...

In our example, the application provides scant information about the object for which information was requested. Suppose three "Customers" tables were returned in the result set: the first table owned by the user named Beth, the second owned by the sales department, and the third a view created by management. It may not be obvious to the end user which table to choose. If the application had specified the OwnerName argument in the SQLTables call, only one table would be returned and performance would improve. Less network traffic would be required to return only one result row, and unwanted rows would be filtered by the database. In addition, if the TableType argument had been supplied, the SQL sent to the server could be optimized from a three-query union into a single Select statement, as shown:

SELECT ... FROM SysTables WHERE TableName = 'Customers' AND Owner = 'Beth'
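The effect of supplying the TableType argument can be modeled as a count of the catalog sub-queries the driver must generate. This is a hypothetical model of the behavior described above; real drivers generate their own catalog SQL.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the driver-side effect described above. With no
   TableType restriction, the driver must union SysTables, SysViews, and
   SysSynonyms; restricting the type to "TABLE" (or "VIEW") lets it issue
   a single SELECT against one system table. */
int catalog_subqueries(const char *table_type)
{
    if (table_type == 0)
        return 3;                      /* three-way UNION ALL */
    if (strcmp(table_type, "TABLE") == 0 || strcmp(table_type, "VIEW") == 0)
        return 1;                      /* single SELECT */
    return 3;                          /* unknown type: search everything */
}
```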

Using a dummy query to determine table characteristics

Avoid using SQLColumns to determine characteristics about a table. Instead, use a dummy query with SQLDescribeCol. Consider an application that allows the user to choose the columns that will be selected. Should the application use SQLColumns to return information about the columns to the user, or prepare a dummy query and call SQLDescribeCol?

Case 1: SQLColumns Method

rc = SQLColumns (... "UnknownTable" ...);
// This call to SQLColumns will generate a query to the system catalogs...
// possibly a join which must be prepared, executed, and produce a result set
rc = SQLBindCol (...);
rc = SQLExtendedFetch (...);
// user must retrieve N rows from the server
// N = # result columns of UnknownTable
// result column information has now been obtained

Case 2: SQLDescribeCol Method

// prepare dummy query
rc = SQLPrepare (... "SELECT * FROM UnknownTable WHERE 1 = 0" ...);
// query is never executed on the server - only prepared
rc = SQLNumResultCols (...);
for (irow = 1; irow <= NumColumns; irow++) {
   rc = SQLDescribeCol (...);
   // + optional calls to SQLColAttributes
}
// result column information has now been obtained
// Note we also know the column ordering within the table!
// This information cannot be assumed from the SQLColumns example.

In both cases, a query is sent to the server, but in Case 1, the query must be evaluated and form a result set that must be sent to the client. Clearly, Case 2 is the better performing model. To complicate this discussion, consider a database server that does not natively support preparing a SQL statement. The performance of Case 1 does not change, but the performance of Case 2 degrades slightly because the dummy query must now be evaluated instead of only prepared. Because the Where clause of the query always evaluates to FALSE, the query generates no result rows and should execute without accessing table data. Again, for this situation, Case 2 outperforms Case 1.

Retrieving data

To retrieve data efficiently, return only the data that you need, and choose the most efficient method of doing so. The guidelines in this section will help you optimize system performance when retrieving data with ODBC applications.

Retrieving long data

Because retrieving long data across the network is slow and resource-intensive, applications should not request long data (SQL_LONGVARCHAR, SQL_WLONGVARCHAR, and SQL_LONGVARBINARY data) unless it is necessary. Most users do not want to see long data. If the user does need to see these result items, the application can query the database again, specifying only the long columns in the select list. This technique allows the average user to retrieve the result set without having to pay a high performance penalty for network traffic.

Although the best approach is to exclude long data from the select list, some applications do not formulate the select list before sending the query to the ODBC driver (that is, some applications simply SELECT * FROM table_name ...). If the select list contains long data, the driver must retrieve that data at fetch time even if the application does not bind the long data in the result set. When possible, use a technique that does not retrieve all columns of the table.

Reducing the size of data retrieved

Sometimes, long data must be retrieved. When this is the case, remember that most users do not want to see 100 KB, or more, of text on the screen. To reduce network traffic and improve performance, you can reduce the size of data being retrieved to some manageable limit by calling SQLSetStmtAttr with the SQL_ATTR_MAX_LENGTH option. Eliminating SQL_LONGVARCHAR, SQL_WLONGVARCHAR, and SQL_LONGVARBINARY data from the result set is ideal for optimizing performance.

Many application developers mistakenly assume that if they call SQLGetData with a container of size x, the ODBC driver retrieves only x bytes of information from the server. Because SQLGetData can be called multiple times for any one column, most drivers optimize their network use by retrieving long data in large chunks and then returning it to the user when requested. For example:

char CaseContainer[1000];
...
rc = SQLExecDirect (hstmt, "SELECT CaseHistory FROM Cases WHERE CaseNo = 71164", SQL_NTS);
...
rc = SQLFetch (hstmt);
rc = SQLGetData (hstmt, 1, CaseContainer, (SWORD) sizeof(CaseContainer), ...);

At this point, it is more likely that an ODBC driver will retrieve 64 KB of information from the server instead of 1 KB. In terms of network access, one 64-KB retrieval is less expensive than 64 retrievals of 1 KB. Unfortunately, the application may not call SQLGetData again; therefore, the first and only retrieval of CaseHistory would be slowed by the fact that 64 KB of data must be sent across the network.

Many ODBC drivers allow you to limit the amount of data retrieved across the network by supporting the SQL_MAX_LENGTH attribute. This attribute allows the driver to communicate to the database server that only x bytes of data are relevant to the client. The server responds by sending only the first x bytes of data for all result columns. This optimization substantially reduces network traffic and improves client performance. The previous example returned only one row, but consider the case where 100 rows are returned in the result set: the performance improvement would be substantial.

Using bound columns

Retrieving data through bound columns (SQLBindCol) instead of using SQLGetData reduces the ODBC call load and improves performance. Consider the following code fragment:

rc = SQLExecDirect (hstmt, "SELECT <20 columns> FROM Employees WHERE HireDate >= ?", SQL_NTS);
do {
   rc = SQLFetch (hstmt);
   // call SQLGetData 20 times
} while ((rc == SQL_SUCCESS) || (rc == SQL_SUCCESS_WITH_INFO));

Suppose the query returns 90 result rows. In this case, 1891 ODBC calls are made (20 calls to SQLGetData x 90 result rows + 91 calls to SQLFetch). Consider the same scenario that uses SQLBindCol instead of SQLGetData:

rc = SQLExecDirect (hstmt, "SELECT <20 columns> FROM Employees WHERE HireDate >= ?", SQL_NTS);
// call SQLBindCol 20 times
do {
   rc = SQLFetch (hstmt);
} while ((rc == SQL_SUCCESS) || (rc == SQL_SUCCESS_WITH_INFO));

The number of ODBC calls made is reduced from 1891 to 111 (20 calls to SQLBindCol + 91 calls to SQLFetch). In addition to reducing the call load, many drivers optimize how SQLBindCol is used by binding result information directly from the database server into the user's buffer. That is, instead of the driver retrieving information into a container and then copying that information to the user's buffer, the driver simply requests that the information from the server be placed directly into the user's buffer.
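The call-count arithmetic above can be double-checked with two small helper functions. This is an illustration of the bookkeeping only, not ODBC code.

```c
#include <assert.h>

/* Bookkeeping behind the 1891-vs-111 comparison above: SQLGetData costs one
   call per column per row, while SQLBindCol is called once per column before
   the fetch loop. Either way, SQLFetch runs rows + 1 times (the final call
   returns SQL_NO_DATA). */
int calls_with_getdata(int rows, int cols)
{
    return rows * cols + (rows + 1);   /* per-cell calls + fetch calls */
}

int calls_with_bindcol(int rows, int cols)
{
    return cols + (rows + 1);          /* one-time binds + fetch calls */
}
```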

Using SQLExtendedFetch instead of SQLFetch

Use SQLExtendedFetch to retrieve data instead of SQLFetch. The ODBC call load decreases (resulting in better performance) and the code is less complex (resulting in more maintainable code). Most ODBC drivers now support SQLExtendedFetch for forward-only cursors; yet, most ODBC applications use SQLFetch to retrieve data. Consider the examples in "Using Bound Columns", this time using SQLExtendedFetch instead of SQLFetch:

rc = SQLSetStmtOption (hstmt, SQL_ROWSET_SIZE, 100);
// use arrays of 100 elements
rc = SQLExecDirect (hstmt, "SELECT <20 columns> FROM Employees WHERE HireDate >= ?", SQL_NTS);
// call SQLBindCol 1 time specifying row-wise binding
do {
   rc = SQLExtendedFetch (hstmt, SQL_FETCH_NEXT, 0, &RowsFetched, RowStatus);
} while ((rc == SQL_SUCCESS) || (rc == SQL_SUCCESS_WITH_INFO));

Notice the improvement from the previous examples. The initial call load was 1891 ODBC calls. By choosing ODBC calls carefully, the number of ODBC calls made by the application has now been reduced to 4 (1 SQLSetStmtOption + 1 SQLExecDirect + 1 SQLBindCol + 1 SQLExtendedFetch). In addition to reducing the call load, many ODBC drivers retrieve data from the server in arrays, further improving performance by reducing network traffic.


For ODBC drivers that do not support SQLExtendedFetch, the application can enable forward-only cursors using the ODBC cursor library:

rc = SQLSetConnectOption (hdbc, SQL_ODBC_CURSORS, SQL_CUR_USE_IF_NEEDED);

Although using the cursor library does not improve performance, it should not be detrimental to application response time when using forward-only cursors (no logging is required). Furthermore, using the cursor library means that the application can always depend on SQLExtendedFetch being available. This simplifies the code because the application does not require two algorithms (one using SQLExtendedFetch and one using SQLFetch).

See also Using bound columns on page 193

Choosing the right data type

Retrieving and sending certain data types can be expensive. When you are working with data on a large scale, select the data type that can be processed most efficiently. For example, integer data is processed faster than floating-point data. Floating-point data is defined according to internal database-specific formats, usually in a compressed format. The data must be decompressed and converted into a different format so that it can be processed by the wire protocol.

Selecting ODBC functions

The guidelines in this section will help you select which ODBC functions will give you the best performance.

Using SQLPrepare/SQLExecute and SQLExecDirect

Using SQLPrepare/SQLExecute is not always as efficient as SQLExecDirect. Use SQLExecDirect for queries that will be executed once and SQLPrepare/SQLExecute for queries that will be executed multiple times. ODBC drivers are optimized based on the perceived use of the functions that are being executed. SQLPrepare/SQLExecute is optimized for multiple executions of statements that use parameter markers. SQLExecDirect is optimized for a single execution of a SQL statement. Unfortunately, more than 75% of all ODBC applications use SQLPrepare/SQLExecute exclusively.

Consider the case where an ODBC driver implements SQLPrepare by creating a stored procedure on the server that contains the prepared statement. Creating stored procedures involves substantial overhead, but the statement can be executed multiple times. Although creating stored procedures is performance-expensive, the execution cost is minimal because the query is parsed and optimization paths are stored at create-procedure time. Using SQLPrepare/SQLExecute for a statement that is executed only once results in unnecessary overhead. Furthermore, applications that use SQLPrepare/SQLExecute for large single-execution query batches exhibit poor performance. Similarly, applications that always use SQLExecDirect do not perform as well as those that use a logical combination of SQLPrepare/SQLExecute and SQLExecDirect sequences.
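The rule of thumb above (prepare only what will be reused) can be captured in a tiny helper. The enum and function are illustrative only and are not part of any ODBC API.

```c
#include <assert.h>

/* Illustrative decision helper for the guideline above: prepare statements
   that will run more than once; execute single-shot statements directly. */
enum exec_strategy { USE_EXECDIRECT = 0, USE_PREPARE_EXECUTE = 1 };

enum exec_strategy choose_strategy(int expected_executions)
{
    return expected_executions > 1 ? USE_PREPARE_EXECUTE : USE_EXECDIRECT;
}
```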

Using arrays of parameters

Passing arrays of parameter values, for example, for bulk insert operations with SQLPrepare/SQLExecute and SQLExecDirect, can reduce the ODBC call load and network traffic. To use arrays of parameters, the application calls SQLSetStmtAttr with the following attribute arguments:

· SQL_ATTR_PARAMSET_SIZE sets the array size of the parameter.

· SQL_ATTR_PARAMS_PROCESSED_PTR assigns a variable filled by SQLExecute, which contains the number of rows that are actually inserted.

· SQL_ATTR_PARAM_STATUS_PTR points to an array in which status information for each row of parameter values is returned.

Note: ODBC 3.x replaced the ODBC 2.x call to SQLParamOptions with calls to SQLSetStmtAttr using the SQL_ATTR_PARAMSET_SIZE, SQL_ATTR_PARAMS_PROCESSED_PTR, and SQL_ATTR_PARAM_STATUS_PTR arguments.

Before executing the statement, the application sets the value of each data element in the bound array. When the statement is executed, the driver tries to process the entire array contents using one network roundtrip. For example, let us compare the following examples, Case 1 and Case 2.

Case 1: Executing Prepared Statement Multiple Times

rc = SQLPrepare (hstmt, "INSERT INTO DailyLedger (...) VALUES (?,?,...)", SQL_NTS);
// bind parameters
...
do {
   // read ledger values into bound parameter buffers
   ...
   rc = SQLExecute (hstmt);   // insert row
} while (!eof);

Case 2: Using Arrays of Parameters

SQLPrepare (hstmt, "INSERT INTO DailyLedger (...) VALUES (?,?,...)", SQL_NTS);
SQLSetStmtAttr (hstmt, SQL_ATTR_PARAMSET_SIZE, (UDWORD)100, SQL_IS_UINTEGER);
SQLSetStmtAttr (hstmt, SQL_ATTR_PARAMS_PROCESSED_PTR, &rows_processed, SQL_IS_POINTER);
// Specify an array in which to return the status of
// each set of parameters.
SQLSetStmtAttr (hstmt, SQL_ATTR_PARAM_STATUS_PTR, ParamStatusArray, SQL_IS_POINTER);
// pass 100 parameters per execute
// bind parameters
...
do {
   // read up to 100 ledger values into
   // bound parameter buffers
   ...
   rc = SQLExecute (hstmt);   // insert a group of 100 rows
} while (!eof);

In Case 1, if there are 100 rows to insert, 101 network roundtrips are required: one to prepare the statement with SQLPrepare and one for each of the 100 calls to SQLExecute. In Case 2, the call load has been reduced from 100 SQLExecute calls to only 1 SQLExecute call. Furthermore, network traffic is reduced considerably.
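The roundtrip counts behind Case 1 and Case 2 can be stated as simple formulas, assuming (as the text does) that the driver can send a whole parameter array in one network roundtrip.

```c
#include <assert.h>

/* Roundtrip bookkeeping for the two cases above: one trip for SQLPrepare,
   then one trip per SQLExecute. With parameter arrays, each SQLExecute
   covers up to paramset_size rows. */
int roundtrips_row_at_a_time(int rows)
{
    return 1 + rows;   /* prepare + one execute per row */
}

int roundtrips_with_param_arrays(int rows, int paramset_size)
{
    return 1 + (rows + paramset_size - 1) / paramset_size;   /* prepare + batched executes */
}
```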

Using the cursor library

If the driver provides scrollable cursors, do not use the cursor library. The cursor library creates local temporary log files, which are performance-expensive to generate and provide worse performance than native scrollable cursors.

The cursor library adds support for static cursors, which simplifies the coding of applications that use scrollable cursors. However, the cursor library creates temporary log files on the user's local disk drive to accomplish the task. Typically, disk I/O is a slow operation. Although the cursor library is beneficial, applications should not automatically choose to use the cursor library when an ODBC driver supports scrollable cursors natively.


Typically, ODBC drivers that support scrollable cursors achieve high performance by requesting that the database server produce a scrollable result set instead of emulating the capability by creating log files. Many applications use:

rc = SQLSetConnectOption (hdbc, SQL_ODBC_CURSORS, SQL_CUR_USE_ODBC);

but should use:

rc = SQLSetConnectOption (hdbc, SQL_ODBC_CURSORS, SQL_CUR_USE_IF_NEEDED);

Managing connections and updates

The guidelines in this section will help you to manage connections and updates to improve system performance for your ODBC applications.

Managing connections

Connection management is important to application performance. Optimize your application by connecting once and using multiple statement handles, instead of performing multiple connections. Avoid connecting to a data source after establishing an initial connection.

Although gathering driver information at connect time is a good practice, it is often more efficient to gather it in one step rather than two steps. Some ODBC applications are designed to call information-gathering routines that have no record of already attached connection handles. For example, some applications establish a connection and then call a routine in a separate DLL or shared library that reattaches and gathers information about the driver. Applications that are designed as separate entities should pass the already connected HDBC pointer to the data collection routine instead of establishing a second connection.

Another bad practice is to connect and disconnect several times throughout your application to process SQL statements. Connection handles can have multiple statement handles associated with them. Statement handles can provide memory storage for information about SQL statements. Therefore, applications do not need to allocate new connection handles to process SQL statements. Instead, applications should use statement handles to manage multiple SQL statements.

You can significantly improve performance with connection pooling, especially for applications that connect over a network or through the World Wide Web. With connection pooling, closing connections does not close the physical connection to the database. When an application requests a connection, an active connection from the connection pool is reused, avoiding the network round trips needed to create a new connection.

Connection and statement handling should be addressed before implementation. Spending time and thought on connection management improves application performance and maintainability.

Managing commits in transactions

Committing data is extremely disk I/O intensive and slow. If the driver can support transactions, always turn autocommit off.

What does a commit actually involve? The database server must flush back to disk every data page that contains updated or new data. This is not a sequential write but a searched write to replace existing data in the table. By default, autocommit is on when connecting to a data source. Autocommit mode usually impairs system performance because of the significant amount of disk I/O needed to commit every operation.

Some database servers do not provide an Autocommit mode. For this type of server, the ODBC driver must explicitly issue a COMMIT statement and a BEGIN TRANSACTION for every operation sent to the server. In addition to the large amount of disk I/O required to support Autocommit mode, a performance penalty is paid for up to three network requests for every statement issued by an application.
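The cost difference can be sketched as a commit count. This is a rough model of the advice above (counts only, not a measurement): autocommit pays one disk-flushing commit per statement, while manual transactions pay one commit per interval of statements.

```c
#include <assert.h>

/* Commit-count model for the guideline above. */
int commits_autocommit(int statements)
{
    return statements;   /* every statement is its own transaction */
}

int commits_manual(int statements, int interval)
{
    /* one commit per interval of statements (rounded up) */
    return (statements + interval - 1) / interval;
}
```

For 1000 inserts committed every 100 rows, the manual approach performs 10 commits instead of 1000, while still releasing locks often enough to preserve concurrency.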

Although using transactions can help application performance, do not take this tip too far. Leaving transactions active can reduce throughput by holding locks on rows for long times, preventing other users from accessing the rows. Commit transactions in intervals that allow maximum concurrency.

Choosing the right transaction model

Many systems support distributed transactions; that is, transactions that span multiple connections. Distributed transactions are at least four times slower than normal transactions due to the logging and network round trips necessary to communicate between all the components involved in the distributed transaction. Unless distributed transactions are required, avoid using them. Instead, use local transactions when possible.

Using positioned updates and deletes

Use positioned updates and deletes or SQLSetPos to update data. Although positioned updates do not apply to all types of applications, developers should use positioned updates and deletes when it makes sense. Positioned updates (either through UPDATE WHERE CURRENT OF CURSOR or through SQLSetPos) allow the developer to signal the driver to "change the data here" by positioning the database cursor at the appropriate row to be changed. The designer is not forced to build a complex SQL statement, but simply supplies the data to be changed.

In addition to making the application more maintainable, positioned updates usually result in improved performance. Because the database server is already positioned on the row for the Select statement in process, performance-expensive operations to locate the row to be changed are not needed. If the row must be located, the server typically has an internal pointer to the row available (for example, ROWID).

Using SQLSpecialColumns

Use SQLSpecialColumns to determine the optimal set of columns to use in the Where clause for updating data. Often, pseudo-columns provide the fastest access to the data, and these columns can only be determined by using SQLSpecialColumns.

Some applications cannot be designed to take advantage of positioned updates and deletes. These applications typically update data by forming a Where clause consisting of some subset of the column values returned in the result set. Some applications may formulate the Where clause by using all searchable result columns or by calling SQLStatistics to find columns that are part of a unique index. These methods typically work, but can result in fairly complex queries. Consider the following example:

rc = SQLExecDirect (hstmt, "SELECT first_name, last_name, ssn, address, city, state, zip FROM emp", SQL_NTS);
// fetch data
...
rc = SQLExecDirect (hstmt, "UPDATE emp SET address = ? WHERE first_name = ? AND last_name = ? AND ssn = ? AND address = ? AND city = ? AND state = ? AND zip = ?", SQL_NTS);
// fairly complex query

Applications should call SQLSpecialColumns/SQL_BEST_ROWID to retrieve the optimal set of columns (possibly a pseudo-column) that identifies a given record. Many databases support special columns that are not explicitly defined by the user in the table definition but are "hidden" columns of every table (for example, ROWID and TID). These pseudo-columns provide the fastest access to data because they typically point to the exact location of the record. Because pseudo-columns are not part of the explicit table definition, they are not returned from SQLColumns. To determine if pseudo-columns exist, call SQLSpecialColumns.


Consider the previous example again:

...
rc = SQLSpecialColumns (hstmt, ..... 'emp', ...);
...
rc = SQLExecDirect (hstmt, "SELECT first_name, last_name, ssn, address, city, state, zip, ROWID FROM emp", SQL_NTS);
// fetch data and probably "hide" ROWID from the user
...
rc = SQLExecDirect (hstmt, "UPDATE emp SET address = ? WHERE ROWID = ?", SQL_NTS);
// fastest access to the data!

If your data source does not contain special pseudo-columns, the result set of SQLSpecialColumns consists of the columns of the optimal unique index on the specified table (if a unique index exists). Therefore, your application does not need to call SQLStatistics to find the smallest unique index.


Using Indexes

This chapter discusses the ways in which you can improve the performance of database activity using indexes. It provides general guidelines that apply to most databases. Consult your database vendor's documentation for more detailed information.

For details, see the following topics:

· Introduction
· Improving Row Selection Performance
· Indexing Multiple Fields
· Deciding Which Indexes to Create
· Improving Join Performance

Introduction

An index is a database structure that you can use to improve the performance of database activity. A database table can have one or more indexes associated with it. An index is defined by a field expression that you specify when you create the index. Typically, the field expression is a single field name, like emp_id. An index created on the emp_id field, for example, contains a sorted list of the employee ID values in the table. Each value in the list is accompanied by references to the rows that contain that value.


A database driver can use indexes to find rows quickly. An index on the emp_id field, for example, greatly reduces the time that the driver spends searching for a particular employee ID value. Consider the following Where clause:

WHERE emp_id = 'E10001'

Without an index, the server must search the entire database table to find those rows having an employee ID of E10001. By using an index on the emp_id field, however, the server can quickly find those rows.

Indexes may improve the performance of SQL statements. You may not notice this improvement with small tables, but it can be significant for large tables; however, there can be disadvantages to having too many indexes. Indexes can slow down the performance of some inserts, updates, and deletes when the driver has to maintain the indexes as well as the database tables. Also, indexes take additional disk space.

Improving Row Selection Performance

For indexes to improve the performance of selections, the index expression must match the selection condition exactly. For example, if you have created an index whose expression is last_name, the following Select statement uses the index:

SELECT * FROM emp WHERE last_name = 'Smith'

This Select statement, however, does not use the index:

SELECT * FROM emp WHERE UPPER(last_name) = 'SMITH'

The second statement does not use the index because the Where clause contains UPPER(last_name), which does not match the index expression last_name. If you plan to use the UPPER function in all your Select statements and your database supports indexes on expressions, then you should define an index using the expression UPPER(last_name).

Indexing Multiple Fields

If you often use Where clauses that involve more than one field, you may want to build an index containing multiple fields. Consider the following Where clause:

WHERE last_name = 'Smith' AND first_name = 'Thomas'

For this condition, the optimal index field expression is last_name, first_name. This creates a concatenated index.

Concatenated indexes can also be used for Where clauses that contain only the first of two concatenated fields. The last_name, first_name index also improves the performance of the following Where clause (even though no first name value is specified):

last_name = 'Smith'

Consider the following Where clause:

WHERE last_name = 'Smith' AND middle_name = 'Edward' AND first_name = 'Thomas'

If your index fields include all the conditions of the Where clause in that order, the driver can use the entire index. If, however, your index is on two nonconsecutive fields, for example, last_name and first_name, the driver can use only the last_name field of the index.

The driver uses only one index when processing Where clauses. If you have complex Where clauses that involve a number of conditions for different fields and have indexes on more than one field, the driver chooses an index to use. The driver attempts to use indexes on conditions that use the equal sign as the relational operator rather than conditions using other operators (such as greater than). Assume you have an index on the emp_id field as well as the last_name field and the following Where clause:

WHERE emp_id >= 'E10001' AND last_name = 'Smith'

In this case, the driver selects the index on the last_name field. If no conditions have the equal sign, the driver first attempts to use an index on a condition that has a lower and upper bound, and then attempts to use an index on a condition that has a lower or upper bound. The driver always attempts to use the most restrictive index that satisfies the Where clause.

In most cases, the driver does not use an index if the Where clause contains an OR comparison operator. For example, the driver does not use an index for the following Where clause:

WHERE emp_id >= 'E10001' OR last_name = 'Smith'

Deciding Which Indexes to Create

Before you create indexes for a database table, consider how you will use the table. The most common operations on a table are:

· Inserting, updating, and deleting rows
· Retrieving rows

If you most often insert, update, and delete rows, then the fewer indexes associated with the table, the better the performance. This is because the driver must maintain the indexes as well as the database tables, thus slowing down the performance of row inserts, updates, and deletes. It may be more efficient to drop all indexes before modifying a large number of rows, and re-create the indexes after the modifications.

If you most often retrieve rows, you must look further to define the criteria for retrieving rows and create indexes to improve the performance of these retrievals. Assume you have an employee database table and you will retrieve rows based on employee name, department, or hire date. You would create three indexes: one on the dept field, one on the hire_date field, and one on the last_name field. Or perhaps, for the retrievals based on the name field, you would want an index that concatenates the last_name and first_name fields (see "Indexing Multiple Fields" for details).

Here are a few rules to help you decide which indexes to create:

· If your row retrievals are based on only one field at a time (for example, dept = 'D101'), create an index on these fields.
· If your row retrievals are based on a combination of fields, look at the combinations.
· If the comparison operator for the conditions is AND (for example, city = 'Raleigh' AND state = 'NC'), then build a concatenated index on the city and state fields. This index is also useful for retrieving rows based on the city field.


· If the comparison operator is OR (for example, dept = 'D101' OR hire_date > {01/30/89}), an index does not help performance. Therefore, you need not create one.
· If the retrieval conditions contain both AND and OR comparison operators, you can use an index if the OR conditions are grouped. For example:

  dept = 'D101' AND (hire_date > {01/30/89} OR exempt = 1)

  In this case, an index on the dept field improves performance.

· If the AND conditions are grouped, an index does not improve performance. For example:

  (dept = 'D101' AND hire_date > {01/30/89}) OR exempt = 1

See also Indexing Multiple Fields on page 200

Improving Join Performance

When joining database tables, indexes can greatly improve performance. Unless the proper indexes are available, queries that use joins can take a long time. Assume you have the following Select statement:

SELECT * FROM dept, emp WHERE dept.dept_id = emp.dept_id

In this example, the dept and emp database tables are being joined using the dept_id field. When the driver executes a query that contains a join, it processes the tables from left to right and uses an index on the second table's join field (the dept_id field of the emp table). To improve join performance, you need an index on the join field of the second table in the FROM clause. If the FROM clause includes a third table, the driver also uses an index on the field in the third table that joins it to any previous table. For example:

SELECT * FROM dept, emp, addr WHERE dept.dept_id = emp.dept AND emp.loc = addr.loc

In this case, you should have an index on the emp.dept field and the addr.loc field.


Locking and Isolation Levels

This chapter discusses locking and isolation levels and how their settings can affect the data you retrieve. Different database systems support different locking and isolation levels. For Apache Cassandra systems, isolation level 0 (read uncommitted) is supported.

For details, see the following topics:

· Locking
· Isolation Levels
· Locking Modes and Levels

Locking

Locking is a database operation that restricts a user from accessing a table or record. Locking is used in situations where more than one user might try to use the same table or record at the same time. By locking the table or record, the system ensures that only one user at a time can affect the data. Locking is critical in multiuser databases, where different users can try to access or modify the same records concurrently. Although such concurrent database activity is desirable, it can create problems. Without locking, for example, if two users try to modify the same record at the same time, they might encounter problems ranging from retrieving bad data to deleting data that the other user needs. If, however, the first user to access a record can lock that record to temporarily prevent other users from modifying it, such problems can be avoided. Locking provides a way to manage concurrent database access while minimizing the various problems it can cause.


Isolation Levels

An isolation level represents a particular locking strategy employed in the database system to improve data consistency. The higher the isolation level, the more complex the locking strategy behind it. The isolation level provided by the database determines whether a transaction will encounter the following behaviors in data consistency:

Dirty reads
User 1 modifies a row. User 2 reads the same row before User 1 commits. User 1 performs a rollback. User 2 has read a row that has never really existed in the database. User 2 may base decisions on false data.

Non-repeatable reads
User 1 reads a row, but does not commit. User 2 modifies or deletes the same row and then commits. User 1 rereads the row and finds it has changed (or has been deleted).

Phantom reads
User 1 uses a search condition to read a set of rows, but does not commit. User 2 inserts one or more rows that satisfy this search condition, then commits. User 1 rereads the rows using the search condition and discovers rows that were not present before.

Isolation levels represent the database system's ability to prevent these behaviors. The American National Standards Institute (ANSI) defines four isolation levels:

· Read uncommitted (0)
· Read committed (1)
· Repeatable read (2)
· Serializable (3)

In ascending order (0 to 3), these isolation levels provide an increasing amount of data consistency to the transaction. At the lowest level, all three behaviors can occur. At the highest level, none can occur. The success of each level in preventing these behaviors is due to the locking strategies they use, which are as follows:

Read uncommitted (0)
Locks are obtained on modifications to the database and held until end of transaction (EOT). Reading from the database does not involve any locking.

Read committed (1)
Locks are acquired for reading and modifying the database. Locks are released after reading, but locks on modified objects are held until EOT.

Repeatable read (2)
Locks are obtained for reading and modifying the database. Locks on all modified objects are held until EOT. Locks obtained for reading data are held until EOT. Locks on non-modified access structures (such as indexes and hashing structures) are released after reading.

Serializable (3)
All data read or modified is locked until EOT. All access structures that are modified are locked until EOT. Access structures used by the query are locked until EOT.

The following table shows what data consistency behaviors can occur at each isolation level.

Table 33: Isolation Levels and Data Consistency

Level Dirty Read Nonrepeatable Read Phantom Read

0, Read uncommitted Yes Yes Yes

1, Read committed No Yes Yes


2, Repeatable read No No Yes

3, Serializable No No No

Although higher isolation levels provide better data consistency, this consistency can be costly in terms of the concurrency provided to individual users. Concurrency is the ability of multiple users to access and modify data simultaneously. As isolation levels increase, so does the chance that the locking strategy used will create problems in concurrency. The higher the isolation level, the more locking involved, and the more time users may spend waiting for data to be freed by another user. Because of this inverse relationship between isolation levels and concurrency, you must consider how people use the database before choosing an isolation level. You must weigh the trade-offs between data consistency and concurrency, and decide which is more important.

Locking Modes and Levels

Different database systems use various locking modes, but they have two basic modes in common: shared and exclusive. Shared locks can be held on a single object by multiple users. If one user has a shared lock on a record, then a second user can also get a shared lock on that same record; however, the second user cannot get an exclusive lock on that record. Exclusive locks are exclusive to the user that obtains them. If one user has an exclusive lock on a record, then a second user cannot get either type of lock on the same record.

Performance and concurrency can also be affected by the locking level used in the database system. The locking level determines the size of an object that is locked in a database. For example, many database systems let you lock an entire table, as well as individual records. An intermediate level of locking, page-level locking, is also common. A page contains one or more records and is typically the amount of data read from the disk in a single disk access. The major disadvantage of page-level locking is that if one user locks a record, a second user may not be able to lock other records because they are stored on the same page as the locked record.


WorkAround Options

Progress DataDirect has included non-standard connection options in your driver that enable you to take full advantage of packaged ODBC-enabled applications requiring non-standard or extended behavior. To use these options, we recommend that you create a separate user data source for each application.

Your driver features the Extended Options configuration field on the Advanced tab of the driver's Setup dialog box. You can use the Extended Options field to enter undocumented connection options when instructed by Progress DataDirect Technical Support. Alternatively, you can make the change by updating the registry. After you create the data source:

· On Windows, using the registry editor REGEDIT, open the HKEY_CURRENT_USER\SOFTWARE\ODBC\ODBC.INI section of the registry. Select the data source that you created.
· On UNIX/Linux, using a text editor, open the odbc.ini file to edit the data source that you created.

Add the string WorkArounds= (or WorkArounds2=) with a value of n (WorkArounds=n or WorkArounds2=n), where the value n is the cumulative value of all options added together. For example, if you wanted to use both WorkArounds=1 and WorkArounds=8, you would enter in the data source: WorkArounds=9
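On UNIX/Linux, the resulting data-source entry might look like the following sketch. The data source name, driver path, and connection-attribute values here are placeholders for illustration; only the WorkArounds line and the attribute names (which appear elsewhere in this guide) are taken from the text:

```ini
[MyCassandraDSN]
; Placeholder path; use the path where your driver library is installed.
Driver=/path/to/ddcassandra.so
HostName=localhost
PortNumber=9042
KeyspaceName=mykeyspace
; 1 (cursor-commit behavior) + 8 (SQLRowCount) = 9, the cumulative value.
WorkArounds=9
```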

Warning: Each of these options has potential side effects related to its use. An option should only be used to address the specific problem for which it was designed. For example, WorkArounds=2 causes the driver to report that database qualifiers are not supported, even when they are. As a result, applications that use qualifiers may not perform correctly when this option is enabled.

The following list includes both WorkArounds and WorkArounds2.


WorkArounds=1. Enabling this option causes the driver to return 1 instead of 0 if the value for SQL_CURSOR_COMMIT_BEHAVIOR or SQL_CURSOR_ROLLBACK_BEHAVIOR is 0. Statements are prepared again by the driver.

WorkArounds=2. Enabling this option causes the driver to report that database qualifiers are not supported. Some applications cannot process database qualifiers.

WorkArounds=8. Enabling this option causes the driver to return 1 instead of -1 for SQLRowCount. If an ODBC driver cannot determine the number of rows affected by an Insert, Update, or Delete statement, it may return -1 in SQLRowCount. This may cause an error in some products.

WorkArounds=16. Enabling this option causes the driver not to return an INDEX_QUALIFIER. For SQLStatistics, if an ODBC driver reports an INDEX_QUALIFIER that contains a period, some applications return a "tablename is not a valid name" error.

WorkArounds=32. Enabling this option causes the driver to re-bind columns after calling SQLExecute for prepared statements.

WorkArounds=64. Enabling this option results in a column name of Cposition, where position is the ordinal position of the column in the result set. For example, "SELECT col1, col2+col3 FROM table1" produces the column names "col1" and "C2". For result columns that are expressions, SQLColAttributes/SQL_COLUMN_NAME returns an empty string. Use this option for applications that cannot process empty string column names.

WorkArounds=256. Enabling this option causes the value of SQLGetInfo/SQL_ACTIVE_CONNECTIONS to be returned as 1.

WorkArounds=512. Enabling this option prevents ROWID results. This option forces the SQLSpecialColumns function to return a unique index as returned from SQLStatistics.

WorkArounds=2048. Enabling this option causes DATABASE= instead of DB= to be returned. For some data sources, Microsoft Access performs more efficiently when the output connection string of SQLDriverConnect returns DATABASE= instead of DB=.

WorkArounds=65536. Enabling this option strips trailing zeros from decimal results, which prevents Microsoft Access from issuing an error when decimal columns containing trailing zeros are included in the unique index.

WorkArounds=131072. Enabling this option turns all occurrences of the double quote character (") into the accent grave character (`). Some applications always quote identifiers with double quotes. Double quoting can cause problems for data sources that do not return SQLGetInfo/SQL_IDENTIFIER_QUOTE_CHAR = double_quote.

WorkArounds=524288. Enabling this option forces the maximum precision/scale settings. The Microsoft Foundation Classes (MFC) bind all SQL_DECIMAL parameters with a fixed precision and scale, which can cause truncation errors.

WorkArounds=1048576. Enabling this option overrides the specified precision and sets the precision to 256. Some applications incorrectly specify a precision of 0 for character types when the value will be SQL_NULL_DATA.

WorkArounds=2097152. Enabling this option overrides the specified precision and sets the precision to 2000. Some applications incorrectly specify a precision of -1 for character types.

WorkArounds=4194304. Enabling this option converts, for PowerBuilder users, all catalog function arguments to uppercase unless they are quoted.

WorkArounds=16777216. Enabling this option allows MS Access to retrieve Unicode data types as it expects the default conversion to be to SQL_C_CHAR and not SQL_C_WCHAR.

WorkArounds=33554432. Enabling this option prevents MS Access from failing when SQLError returns an extremely long error message.

WorkArounds=67108864. Enabling this option allows parameter bindings to work correctly with MSDASQL.

WorkArounds=536870912. Enabling this option allows re-binding of parameters after calling SQLExecute for prepared statements.

WorkArounds=1073741824. Enabling this option addresses the assumption by the application that ORDER BY columns do not have to be in the SELECT list. This assumption may be incorrect for data sources such as Informix.

WorkArounds2=2. Enabling this option causes the driver to ignore the ColumnSize/DecimalDigits specified by the application and use the database defaults instead. Some applications incorrectly specify the ColumnSize/DecimalDigits when binding timestamp parameters.

WorkArounds2=4. Enabling this option reverses the order in which Microsoft Access returns native types so that Access uses the most appropriate native type. Microsoft Access uses the last native type mapping, as returned by SQLGetTypeInfo, for a given SQL type.

WorkArounds2=8. Enabling this option causes the driver to add the bind offset in the ARD to the pointers returned by SQLParamData. This works around an MSDASQL problem.

WorkArounds2=16. Enabling this option causes the driver to ignore calls to SQLFreeStmt(RESET_PARAMS) and return success without taking other action. It also causes parameter validation not to use the bind offset when validating the charoctetlength buffer. This works around an MSDASQL problem.

WorkArounds2=24. Enabling this option allows a flat-file driver, such as dBASE, to operate properly under MSDASQL.

WorkArounds2=32. Enabling this option appends "DSN=" to a connection string if it is not already included. Microsoft Access requires "DSN" to be included in a connection string.

WorkArounds2=128. Enabling this option causes 0 to be returned by SQLGetInfo(SQL_ACTIVE_STATEMENTS). Some applications open extra connections if SQLGetInfo(SQL_ACTIVE_STATEMENTS) does not return 0.

WorkArounds2=256. Enabling this option causes the driver to return the buffer size for long data on calls to SQLGetData with a buffer size of 0 on columns of SQL type SQL_LONGVARCHAR or SQL_LONGVARBINARY. Applications should always set this workaround when using MSDASQL and retrieving long data.

WorkArounds2=512. Enabling this option causes the flat-file drivers to return old literal prefixes and suffixes for date, time, and timestamp data types. Microsoft Query 2000 does not correctly handle the ODBC escapes that are currently returned as the literal prefix and literal suffix.

WorkArounds2=1024. Enabling this option causes the driver to return "N" for SQLGetInfo(SQL_MULT_RESULT_SETS). ADO incorrectly interprets SQLGetInfo(SQL_MULT_RESULT_SETS) to mean that the contents of the last result set returned from a stored procedure are the output parameters for the stored procedure.

WorkArounds2=2048. Enabling this option causes the driver to accept 2.x SQL type defines as valid. ODBC 3.x applications that use the ODBC cursor library receive errors on bindings for SQL_DATE, SQL_TIME, and SQL_TIMESTAMP columns. The cursor library incorrectly rebinds these columns with the ODBC 2.x type defines.

WorkArounds2=4096. Enabling this option causes the driver to internally adjust the length of empty strings. The ODBC Driver Manager incorrectly translates lengths of empty strings when a Unicode-enabled application uses a non-Unicode driver. Use this workaround only if your application is Unicode-enabled.

WorkArounds2=8192. Enabling this option causes Microsoft Access not to pass the error -7748. Microsoft Access only asks for data as a two-byte SQL_C_WCHAR, which is an insufficient buffer size to store the UCS2 character and the null terminator; thus, the driver returns the warning "01004 Data truncated" and returns a null character to Microsoft Access. Microsoft Access then passes error -7748.


Threading

The ODBC specification mandates that all drivers must be thread-safe; that is, drivers must not fail when database requests are made on separate threads. It is a common misperception that issuing requests on separate threads always results in improved throughput. Because of network transport and database server limitations, some drivers serialize threaded requests to the server to ensure thread safety.

The ODBC 3.0 specification does not provide a method to find out how a driver services threaded requests, although this information is useful to an application. All the Progress DataDirect for ODBC drivers provide this information to the user through the SQLGetInfo information type 1028. The result of calling SQLGetInfo with 1028 is a SQL_USMALLINT flag that denotes the session's thread model.

A return value of 0 denotes that the session is fully thread-enabled and that all requests use the threaded model.

A return value of 1 denotes that the session is restricted at the connection level. Sessions of this type are fully thread-enabled when simultaneous threaded requests are made with statement handles that do not share the same connection handle. In this model, if multiple requests are made from the same connection, the first request received by the driver is processed immediately and all subsequent requests are serialized.

A return value of 2 denotes that the session is thread-impaired and all requests are serialized by the driver.

Consider the following code fragment:

rc = SQLGetInfo (hdbc, 1028, &ThreadModel, NULL, NULL);

if (rc == SQL_SUCCESS) {
    // driver is a DataDirect driver that can report threading information

    if (ThreadModel == 0)
        // driver is unconditionally thread-enabled; application can take
        // advantage of threading

    else if (ThreadModel == 1)
        // driver is thread-enabled when thread requests are from different
        // connections; some applications can take advantage of threading

    else if (ThreadModel == 2)
        // driver is thread-impaired; application should only use threads
        // if it reduces program complexity

} else
    // driver is not guaranteed to be thread-safe; use threading at your
    // own risk

212 The Progress DataDirect for ODBC for Apache Cassandra™ Driver: User©s Guide and Reference: Version 8.0.0 Index

Index

A Config Options connection option 107 configuring address, IP 48 server mode 77 Administrator configuring a data source Windows ODBC and tracing 86 UNIX and Linux 21 Administrator, Data Source using a GUI 61 Windows 18 configuring Java logging 91 Advanced tab 67, 71 configuring security 73 aggregate functions 140 connecting via connection string AIX, See UNIX and Linux data source 74 Apache Cassandra driver connection attribute overview 25 ApplicationUsingThreads 104 Application Using Threads connection option 104 AsciiSize 105 arithmetic operators 153 AuthenticationMethod 106 arrays of parameter values, passing 194 ClusterNodes 106 Ascii Size connection option 105 ConfigOptions 107 Authentication Method connection option 106 CreateMap 108 Autocommit mode 196 DataSourceName 109 DecimalPrecision 110 B DecimalScale 110 Description 111 binary FetchSize 112 literals 152 HostName 113 operators 153 IANAAppCodePage 114 binding InitializationString 115 dynamic parameters 46 JVMArgs 115 parameter markers 46 JVMClasspath 116 bound columns 193 KeyspaceName 117 LogConfigFile 118 LoginTimeout 119 C LogonID 129 caching information to improve performance 190 NativeFetchSize 119 call load, reducing 193 Password 120 catalog functions, using 190 PortNumber 121 character encoding 181 QueryTimeout 121 character string literals 152 ReadConsistency 122 client code page, See code pages ReadOnly 123 cluster ReportCodepageConversionErrors 124 failover 82 ResultMemorySize 124 Cluster Nodes connection option 106 SchemaMap 126 code pages ServerPortNumber 127 IANAAppCodePage attribute 114, 163 SQLEngineMode 128 collection types 38 TransactionMode 129 column VarcharSize 130 names 151 VarintPrecision 130 comparison operators 154 WriteConsistency 131 complex types connection failover normalization 38 cluster 82 support 38 connection handles 196 concatenation operator 154 connection issues 95 conditions 157

The Progress DataDirect for ODBC for Apache Cassandra™ Driver: User©s Guide and Reference: Version 8.0.0 213 Index

connection options data source (continued) Application Using Threads 104 configuring (continued) Ascii Size 105 Windows 18 Authentication Method 106 connecting via connection string 74 Cluster Nodes 106 connecting via logon dialog box 75 Config Options 107 Data Source Administrator Create Map 108 Windows 18 Data Source Name 109 Data Source Name connection option 109 Decimal Precision 110 data types Decimal Scale 110 choosing 194 Description 111 complex 38, 44 Fetch Size 112 database connections, testing 92±93 Host Name 113 databases IANAAppCodePage 114 supported 25 Initialization String 115 date and time functions 176 JVM Arguments 115 date/time literals 153 JVM Classpath 116 DDL (Data Definition Language) 133±134 Keyspace Name 117 ddlogging.properties file 92 Log Config File 118 ddtestlib tool 54, 88 Login Timeout 119 Decimal Precision connection option 110 Native Fetch Size 119 Decimal Scale connection option 110 overview 101 Delete statement 134 Password 120 demoodbc application 92 Port Number 121 Description connection option 111 Query Timeout 121 determining optimal set of columns 197 Read Consistency 122 diagnostic tools 85 Read Only 123 dirty reads 204 Report Codepage Conversion Errors 124 distributed transactions 197 required 61 documentation, about 15 Result Memory Size 124 double-byte character sets in UNIX and Linux 181 Schema Map 126 driver Server Port Number 127 API and scalar functions 169 SQL Engine Mode 128 code page values 163 Transaction Mode 129 connection option descriptions 101 User Name 129 ODBC compliance 33 Varchar Size 130 optimizing application performance 189 Varint Precision 130 supported features 47 Write Consistency 131 threading 211 connection string attributes 101 driver requirements 26 connections Driver to SQL Communication logger 90 optimizing 196 driver, logging for 89 supported 50 drivers contacting Customer Support 16 version string information 34 correlated subqueries 159 dynamic parameters, binding 46 Counter data type 134, 
148 Create Map connection option 108 E cursor library, performance implications of using 195 enabling tracing 86 D environment variables 52 Data Definition Language (DDL) 133±134 environment variable, library path 95 data encryption 64 error messages data retrieval, optimizing 192 general 93 data source UNIX and Linux 93 configuring Windows 93 UNIX and Linux 20, 55

214 The Progress DataDirect for ODBC for Apache Cassandra™ Driver: User©s Guide and Reference: Version 8.0.0 Index

escape clauses IP addresses 48 native 134 IPv4 48 refresh 134 IPv6 48 example application 93 isolation levels Except operator 146 about 204 executing SQL 93 read committed 204 EXISTS predicate 159 read uncommitted 204 Extended Options 64, 67, 71, 207 repeatable read 204 serializable 204 F isolation levels and data consistency compared 204 Fetch Size connection option 112 dirty reads 204 From clause non-repeatable reads 204 Outer Join Escape Sequences 141 phantom reads 204 functions, ODBC ivtestlib tool 54, 88 selecting for performance 194 J G Java logging General Tab 61 configuring 91 Group By clause 143 SQL engine logger 90 Java Logging for the SQL Engine Server 82 Java Path 77 H join in a From clause 142 handles JVM Arguments connection option 115 connection 196 JVM Classpath connection option 116 statement 196 Having clause 144 K Host Name connection option 113 Keyspace Name connection option 117 I L IANAAppCodePage connection option values 163 library path environment variable IANAAppCodePage connection option 114 setting on UNIX/Linux 21 IANAAppCodePage values 163 setting on Windows 18 identifiers 83 Limit clause 148 improving Linux, See UNIX and Linux database performance 199 list type 38 index performance 199 literals join performance 202 about 151 ODBC application performance 97, 189 binary 152 record selection performance 200 character string 152 IN predicate 158 date/time 153 indexes integer 153 deciding which to create 201 numeric 152 improving performance 199 local table overview 199 deleting rows 134 indexing multiple fields 200 locale 180 Initialization String connection option 115 localization 179 Insert statement 135 locking 203 integer literals 153 locking levels 46 internationalization 179 locking modes and levels 205 interoperability issues 96 Log Config File connection option 118 Intersect operator 145


logging, Java
    components 89
    configuring 91
    SQL engine logger 90
logical operators 155
Login Timeout connection option 119
long data, retrieving 192

M
managing connections 196
map type 38
metadata methods 25
MIBenum value 163
Minus operator 146

N
naming conflicts
    identifiers 83
Native escape clause 134
Native Fetch Size connection option 119
NativeFetchSize 98
nested complex types 44
non-repeatable reads 204
normalization
    identifiers 83
numeric functions 175
numeric literals 152

O
ODBC
    API functions 169
    call load, reducing 193
    designing for performance 189
    functions, selecting for performance 194
    how the architecture works 24
    scalar functions 172
    specification 33
ODBC application performance design 97
ODBC conformance 33
ODBC Test 89
ODBC Trace 85
ODBC-native-escape 134
ODBC-refresh-schema-escape 134
odbc.ini (system information) file
    configuring with 55
odbc.ini file
    controlling tracing with 87
Open Database Connectivity, See ODBC specification
operation timeouts
    troubleshooting 98
operators
    arithmetic 153
    comparison 154
    concatenation 154
    logical 155
    precedence 156
optimization, performance 189
Order By clause 147
out of memory errors
    troubleshooting 97
Outer Join Escape Sequences 141

P
parameter markers, binding 46
parameter metadata 48
parameter values, passing arrays of 194
Password connection option 120
performance 97
performance considerations 76
performance optimization
    avoiding catalog functions 190
    avoiding search patterns 191
    commits in transactions 196
    managing connections 196
    overview 189
    reducing the size of retrieved data 192
    retrieving long data 192
    using a dummy query 191
    using bound columns 193
performance, improving
    database using indexes 199
    index 199
    join 202
    record selection 200
phantom reads 204
Port Number connection option 121
positioned updates and deletes 197
predicate
    EXISTS 159
    IN 158
    UNIQUE 159
properties file for Java logging
    JVM 91

Q
Query Timeout connection option 121

R
read committed 204
Read Consistency connection option 122
Read Only connection option 123
read uncommitted 204
ReadConsistency 98


Refresh escape clause 134
Refresh Map (EXT) 137
repeatable read 204
Report Codepage Conversion Errors connection option 124
Result Memory Size connection option 124
retrieving data, optimizing 192

S
scalar functions, ODBC 172
Schema Map connection option 126
scrollable cursors 195
search patterns, avoiding 191
Security tab 73
Select clause 139
Select statement 137
serializable isolation level 204
server mode, configuring 77
Server Port Number connection option 127
Services 77
set type 38
setting library path environment variable
    on UNIX/Linux 21
    on Windows 18
setup issues 95
SQL
    executing 93
    expressions 150
    supported 50
SQL engine logger 90
SQL Engine Mode connection option 128
SQL engine, starting
    UNIX/Linux 80
SQL engine, stopping
    UNIX/Linux 81
SQL engine, using
    UNIX/Linux 80
SQL escape clauses
    native 134
    refresh 134
SQL functionality 25
SQL statement support 133
SQL statements
    Insert 135
    Update 134, 148
SQLExecDirect, advantages of using 194
SQLFetch, disadvantage of using 193
SQLSpecialColumns, determining optimal set of columns 197
standards, ODBC specification compliance 33
starting
    SQL engine server
        UNIX/Linux 80
statement handles, using to manage SQL statements 196
statements supported 50
static cursors 195
stopping the SQL engine server
    UNIX/Linux 81
string functions 173
subqueries
    correlated 159
    overview 158
subquery in a From clause 142
supported databases 25
system functions 178
system information file (.odbc.ini) 55, 87

T
Technical Support, contacting 16
test connect
    UNIX and Linux 22
testing database connections 92–93
threading, overview 211
time functions 176
timeouts, operation
    troubleshooting 98
tools
    diagnostic 85
    other 93
trace log 85
tracing
    creating a trace log 85
    enabling with system information file 87
    enabling with Windows ODBC Administrator 86
tracking JDBC calls 89
Transaction Mode connection option 129
transactions, managing commits 196
troubleshooting
    operation timeouts 98
tuple type 42

U
UCS-2 181
unary operator 153
undocumented connection options 64, 67, 71
Unicode
    character encoding 181
    ODBC drivers 183
    support in databases 182
    support in ODBC 182
Unicode support 47
Union operator 145
UNIQUE predicate 159
UNIX and Linux
    configuring a data source 21
    configuring through the system information (odbc.ini) file 55
    data source configuration 20, 55


UNIX and Linux (continued)
    double-byte character sets 181
    driver names 33
    environment
        DD_INSTALLDIR 54
        ddtestlib tool 54
        introduction 52
        ivtestlib tool 54
        library search path 52
        ODBCINI 53
        ODBCINST 53
        system information file (.odbc.ini) 55
    error messages 93
    system information (odbc.ini) file, configuring through 55
    system requirements 28
Update statement 134, 148
updates, optimizing 196
User Name connection option 129
user-defined types 42
using
    SQL engine
        UNIX/Linux 80
UTF-16 181
UTF-16 Applications 60
UTF-8 181

V
Varchar Size connection option 130
Varint Precision connection option 130
version string information 34

W
Where clause 98, 143
Windows
    configuring a data source using a GUI 61
    Data Source Administrator 18
    data source configuration 18
    driver names 28
    error messages 93
    system requirements 26
Windows ODBC Administrator and tracing 86
wire protocol adapter logger 91
WorkAround options for ODBC drivers 207
Write Consistency connection option 131
WriteConsistency 98
