Amazon Dynamodb Adapter Guide

TIBCO Data Virtualization® Amazon DynamoDB Adapter Guide

Version 8.4

Last Updated: January 25, 2021

Two-Second Advantage® Important Information SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH EMBEDDED OR BUNDLED TIBCO SOFTWARE IS SOLELY TO ENABLE THE FUNCTIONALITY (OR PROVIDE LIMITED ADD-ON FUNCTIONALITY) OF THE LICENSED TIBCO SOFTWARE. THE EMBEDDED OR BUNDLED SOFTWARE IS NOT LICENSED TO BE USED OR ACCESSED BY ANY OTHER TIBCO SOFTWARE OR FOR ANY OTHER PURPOSE. USE OF TIBCO SOFTWARE AND THIS DOCUMENTATION IS SUBJECT TO THE TERMS AND CONDITIONS OF A LICENSE AGREEMENT FOUND IN EITHER A SEPARATELY EXECUTED SOFTWARE LICENSE AGREEMENT, OR, IF THERE IS NO SUCH SEPARATE AGREEMENT, THE CLICKWRAP END USER LICENSE AGREEMENT WHICH IS DISPLAYED DURING DOWNLOAD OR INSTALLATION OF THE SOFTWARE (AND WHICH IS DUPLICATED IN THE LICENSE FILE) OR IF THERE IS NO SUCH SOFTWARE LICENSE AGREEMENT OR CLICKWRAP END USER LICENSE AGREEMENT, THE LICENSE(S) LOCATED IN THE “LICENSE” FILE(S) OF THE SOFTWARE. USE OF THIS DOCUMENTATION IS SUBJECT TO THOSE TERMS AND CONDITIONS, AND YOUR USE HEREOF SHALL CONSTITUTE ACCEPTANCE OF AND AN AGREEMENT TO BE BOUND BY THE SAME.

This document is subject to U.S. and international copyright laws and treaties. No part of this document may be reproduced in any form without the written authorization of TIBCO Software Inc.

TIBCO and the TIBCO logo are either registered trademarks or trademarks of TIBCO Software Inc. in the United States and/or other countries

TIBCO, Two-Second Advantage, TIBCO Spotfire, TIBCO ActiveSpaces, TIBCO Spotfire Developer, TIBCO EMS, TIBCO Spotfire Automation Services, TIBCO Enterprise Runtime for R, TIBCO Spotfire Server, TIBCO Spotfire Web Player, TIBCO Spotfire Statistics Services, S-PLUS, and TIBCO Spotfire S+ are either registered trademarks or trademarks of TIBCO Software Inc. in the United States and/or other countries.

All other product and company names and marks mentioned in this document are the property of their respective owners and are mentioned for identification purposes only.

THIS SOFTWARE MAY BE AVAILABLE ON MULTIPLE OPERATING SYSTEMS. HOWEVER, NOT ALL OPERATING SYSTEM PLATFORMS FOR A SPECIFIC SOFTWARE VERSION ARE RELEASED AT THE SAME TIME. SEE THE README FILE FOR THE AVAILABILITY OF THIS SOFTWARE VERSION ON A SPECIFIC OPERATING SYSTEM PLATFORM.

THIS DOCUMENTATION IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. THIS DOCUMENTATION COULD INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS. CHANGES ARE PERIODICALLY ADDED TO THE INFORMATION HEREIN; THESE CHANGES WILL BE INCORPORATED IN NEW EDITIONS OF THIS DOCUMENTATION. TIBCO SOFTWARE INC. MAY MAKE IMPROVEMENTS AND/OR CHANGES IN THE PRODUCT(S) AND/OR THE PROGRAM(S) DESCRIBED IN THIS DOCUMENTATION AT ANY TIME.

THE CONTENTS OF THIS DOCUMENTATION MAY BE MODIFIED AND/OR QUALIFIED, DIRECTLY OR INDIRECTLY, BY OTHER DOCUMENTATION WHICH ACCOMPANIES THIS SOFTWARE, INCLUDING BUT NOT LIMITED TO ANY RELEASE NOTES AND "READ ME" FILES.

TIBCO Software Inc. Confidential Information

Contents | 1 Contents

Preface ...... 2 Product-Specific Documentation ...... 2 How to Access TIBCO Documentation...... 3 How to Contact TIBCO Support ...... 3 How to Join TIBCO Community ...... 4

TDV Amazon DynamoDB Adapter ...... 5 Deploying the Adapter ...... 5 Basic Tab...... 5 Advanced Tab ...... 8 Connection String Options ...... 8 Logging ...... 39 Querying The Data ...... 41 Advanced Settings...... 42 NoSQL Database...... 46 Automatic Schema Discovery...... 46 Vertical Flattening...... 48 JSON Functions ...... 49 DynamoDB Queries ...... 51 Querying Documents and Lists...... 53 Data Type Mapping ...... 54 Custom Schema Definitions ...... 55 Custom Schema Example ...... 56 SQL Compliance ...... 58 SELECT Statements ...... 59 SELECT INTO Statements...... 62 INSERT Statements ...... 62 UPDATE Statements ...... 63 DELETE Statements ...... 63 EXECUTE Statements ...... 64 Data Model ...... 65 Tables...... 65 Table Columns ...... 65 Stored Procedures ...... 67

TIBCO® Data Virtualization | 2 Preface

Forinformationonthefollowing,seetheTDVUserGuide: • AddingaDataSource • IntrospectingaDataSource • TestingtheConnectiontoYourDataSource Documentation for this and other TIBCO products is available on the TIBCO Documentation site. This site is updated more frequently than any documentation that might be included with the product. To ensure that you are accessing the latest available help topics, please visit: • https://docs.tibco.com

Product-Specific Documentation The following documents form the TIBCO® Data Virtualization(TDV) documentation set: •Users TDV Getting Started Guide TDV User Guide TDV Client Interfaces Guide TDV Tutorial Guide TDV Northbay Example • Administration TDV Installation and Upgrade Guide TDV Administration Guide TDV Active Cluster Guide TDV Security Features Guide • Data Sources TDV Adapter Guides TDV Data Source Toolkit Guide (Formerly Extensibility Guide)

TIBCO® Data Virtualization Preface | 3

• References TDV Reference Guide TDV Application Programming Interface Guide •Other TDV Business Directory Guide TDV Discovery Guide • TIBCO TDV and Business Directory Release Notes Read the release notes for a list of new and changed features. This document also contains lists of known issues and closed issues for this release.

How to Access TIBCO Documentation Documentation for TIBCO products is available on the TIBCO Product Documentation website mainly in the HTML and PDF formats. The TIBCO Product Documentation website is updated frequently and is more current than any other documentation included with the product. To access the latest documentation, visit https://docs.tibco.com. Documentation for TIBCO Data Virtualization is available on https://docs.tibco.com/products/tibco-data-virtualization-server.

How to Contact TIBCO Support You can contact TIBCO Support in the following ways: • For an overview of TIBCO Support, visit https://www.tibco.com/services/support. • For accessing the Support Knowledge Base and getting personalized content about products you are interested in, visit the TIBCO Support portal at https://support.tibco.com. • For creating a Support case, you must have a valid maintenance or support contract with TIBCO. You also need a user name and password to log in to https://support.tibco.com. If you do not have a user name, you can request one by clicking Register on the website.

TIBCO® Data Virtualization 4 | Preface

How to Join TIBCO Community TIBCO Community is the official channel for TIBCO customers, partners, and employee subject matter experts to share and access their collective experience. TIBCO Community offers access to Q&A forums, product wikis, and best practices. It also offers access to extensions, adapters, solution accelerators, and tools that extend and enable customers to gain full value from TIBCO products. In addition, users can submit and vote on feature requests from within the TIBCO Ideas Portal. For a free registration, go to https://community.tibco.com.

TIBCO® Data Virtualization | 5 TDV Amazon DynamoDB Adapter

Requirements and Restrictions The SQL Compliance section shows the SQL syntax supported by the adapter and points out any limitations.

Deploying the Adapter

For instructions on deploying the adapter, refer to the Installation Guide, section Installing the Advanced Adapters.

Basic Tab

Obtaining the Access Key To obtain the credentials for an IAM user, follow the steps below: 1. Sign into the IAM console. 2. In the navigation pane, select Users. 3. To create or manage the access keys for a user, select the user and then select the Security Credentials tab. To obtain the credentials for your AWS root account, follow the steps below: 1. Sign into the AWS Management console with the credentials for your root account. 2. Select your account name or number and select My Security Credentials in the menu that is displayed. 3. Click Continue to Security Credentials and expand the Access Keys section to manage or create root account access keys.

Authenticating with Root Credentials To authenticate using account root credentials, set the following: •AuthScheme: Set this to AwsRootKeys. • AWSAccessKey: The access key associated with the AWS root account.

TIBCO® Data Virtualization 6 | Basic Tab

•AWSSecretKey: The secret key associated with the AWS root account. Note: Use of this authentication scheme is discouraged by Amazon for anything but simple tests. The account root credentials have the full permissions of the user, making this the least secure authentication method.

Authenticating with Temporary Credentials To authenticate using temporary credentials, specify the following: •AuthScheme: Set this to TemporaryCredentials. • AWSAccessKey: The access key of the IAM user to assume the role for. •AWSSecretKey: The secret key of the IAM user to assume the role for. • AWSSessionToken: Your AWS session token. This will have been provided alongside your temporary credentials. The adapter can now request resources using the same permissions provided by long-term credentials (such as IAM user credentials) for the lifespan of the temporary credentials. If you are also using an IAM role to authenticate, you must additionally specify the following: •AWSRoleARN: Specify the Role ARN for the role you'd like to authenticate with. This will cause the adapter to attempt to retrieve credentials for the specified role. •AWSExternalId: Only if required when you assume a role in another account.

Authenticating from an EC2 Instance If you are using the adapter from an EC2 Instance and have an IAM Role assigned to the instance, you can use the IAM Role to authenticate. To do so, set the following properties to authenticate: •AuthScheme: Set this to AwsEC2Roles. Leave AWSAccessKey and AWSSecretKey empty. The adapter will automatically obtain your IAM Role credentials and authenticate with them. If you are also using an IAM role to authenticate, you must additionally specify the following: •AWSRoleARN: Specify the Role ARN for the role you'd like to authenticate with. This will cause the adapter to attempt to retrieve credentials for the specified role. •AWSExternalId: Only if required when you assume a role in another account.

TIBCO® Data Virtualization Basic Tab | 7

Authenticating as an AWS Role In many situations it may be preferable to use an IAM role for authentication instead of the direct security credentials of an AWS root user. To authenticate as an AWS role, set the following: •AuthScheme: Set this to AwsIAMRoles. •AWSRoleARN: Specify the Role ARN for the role you'd like to authenticate with. This will cause the adapter to attempt to retrieve credentials for the specified role. •AWSExternalId: Only if required when you assume a role in another account. If you are connecting to AWS (instead of already being connected such as on an EC2 instance), you must additionally specify the following: • AWSAccessKey: The access key of the IAM user to assume the role for. •AWSSecretKey: The secret key of the IAM user to assume the role for. Note: Roles may not be used when specifying the AWSAccessKey and AWSSecretKey of an AWS root user.

Authenticating with MFA For users and roles that require Multi-factor Authentication, specify the following to authenticate: •AuthScheme: Set this to AwsMFA. • CredentialsLocation: The location of the settings file where MFA credentials are saved. See the Credentials File Location page under Connection String Options for more information. •MFASerialNumber: The serial number of the MFA device if one is being used. •MFAToken: The temporary token available from your MFA device. If you are connecting to AWS (instead of already being connected such as on an EC2 instance), you must additionally specify the following: • AWSAccessKey: The access key of the IAM user for whom MFA will be issued. •AWSSecretKey: The secret key of the IAM user whom MFA will be issued. If you are also using an IAM role to authenticate, you must additionally specify the following: •AWSRoleARN: Specify the Role ARN for the role you'd like to authenticate with. This will cause the adapter to attempt to retrieve credentials for the specified role using MFA.

TIBCO® Data Virtualization 8 | Advanced Tab

•AWSExternalId: Only if required when you assume a role in another account. This will cause the adapter to submit the MFA credentials in a request to retrieve temporary authentication credentials. Note that the duration of the temporary credentials may be controlled via the TemporaryTokenDuration property (default 3600 seconds).

Connecting to DynamoDB Once you have configured the adapter for your desired authentication method, you can optionally set the following to refine data access: • Domain: Set this if you want to use a domain name you have associated with AWS. •AWSRegion: Set this to the region where your Amazon DynamoDB data is hosted.

Advanced Tab

The connection string properties describe the various options that can be used to establish a connection.

Connection String Options The following is the full list of the options you can configure in the connection string for this provider.

Auth Scheme The scheme used for authentication. Accepted entries are: Auto, , AwsRootKeys , AwsIAMRoles , AwsEC2Roles , AwsMFA , AwsCredentialsFile.

Auto Detect A boolean indicating if secondary indexes should be automatically detected based Index on the query used.

AWS Access Your AWS account access key. This value is accessible from your AWS security Key credentials page.

AWS The path to the AWS Credentials File to be used for authentication. Credentials File

TIBCO® Data Virtualization Advanced Tab | 9

AWS The name of the profile to be used from the supplied AWSCredentialsFile. Credentials File Profile

AWS External A unique identifier that might be required when you assume a role in another Id account.

AWS Region The hosting region for your Amazon Web Services.

AWS Role The Amazon Resource Name of the role to use when authenticating. ARN

AWS Secret Your AWS account secret key. This value is accessible from your AWS security Key credentials page.

AWS Session Your AWS session token. Token

Buffer Size Determines the internal buffer size to fill up to before returning results.

Credentials The location of the settings file where MFA credentials are saved. Location

Domain Your AWS domain name. You can optionally choose to associate your domain name with AWS.

Firewall A password used to authenticate to a proxy-based firewall. Password

Firewall Port The TCP port for a proxy-based firewall.

Firewall Server The name or IP address of a proxy-based firewall.

Firewall Type The protocol used by a proxy-based firewall.

Firewall User The user name to use to authenticate with a proxy-based firewall.

Flatten Arrays By default, nested arrays are returned as strings of JSON. The FlattenArrays property can be used to flatten the elements of nested arrays into columns of their own. Set FlattenArrays to the number of elements you want to return from nested arrays.

Flatten Objects Set FlattenObjects to true to flatten object properties into columns of their own. Otherwise, objects nested in arrays are returned as strings of JSON.

TIBCO® Data Virtualization 10 | Advanced Tab

Generate Indicates the user preference as to when schemas should be generated and saved. Schema Files

Insert Mode How to handle values when inserting if the same primary key combination already exists in DynamoDB.

Location A path to the directory that contains the schema files defining tables, views, and stored procedures.

Log Modules Core modules to be included in the log file.

Maximum The maximum number of times to retry a request. Request Retries

Max Rows Limits the number of rows returned rows when no aggregation or group by is used in the query. This helps avoid performance issues at design time.

MFA Serial The serial number of the MFA device if one is being used. Number

MFA Token The temporary token available from your MFA device.

Number Specifies how to handle detected number columns. DynamoDB returns number Column Mode values with a total precision of 38.

Other The other parameters necessary to connect to a data source, such as username and password, when applicable.

Pagesize The maximum number of results to return per page from DynamoDB per request.

Proxy Auth The authentication type to use to authenticate to the ProxyServer proxy. Scheme

Proxy Auto This indicates whether to use the system proxy settings or not. Set Detect ProxyAutoDetect to FALSE to use custom proxy settings. This takes precedence over other proxy settings.

Proxy A semicolon separated list of hosts or IPs that will be exempt from connecting Exceptions through the ProxyServer .

Proxy A password to be used to authenticate to the ProxyServer proxy. Password

Proxy Port The TCP port the ProxyServer proxy is running on.

TIBCO® Data Virtualization Advanced Tab | 11

Proxy Server The hostname or IP address of a proxy to route HTTP traffic through.

Proxy SSL The SSL type to use when connecting to the ProxyServer proxy. Type

Proxy User A user name to be used to authenticate to the ProxyServer proxy.

Readonly You can use this property to enforce read-only access to DynamoDB from the provider.

Retry Wait The minimum number of seconds the provider will wait to retry a request. Time

Row Scan The maximum number of rows to scan to look for the columns available in a table. Depth

Separator The character or characters used to denote hierarchy. Character

SSL Server The certificate to be accepted from the server when connecting using TLS/SSL. Cert

Temporary The amount of time (in seconds) a temporary token will last. Token Duration

Thread Count The number of threads to use when selecting data via a parallel scan. Setting ThreadCount to 1 will disable parallel scans.

Timeout The value in seconds until the timeout error is thrown, canceling the operation.

Type Determines how to determine the data type of columns. Detection Scheme

Use Consistent Whether to always use Consistent Reads or not when querying DynamoDb. Reads

Use Simple Boolean determining if simple names should be used for tables and columns. Names

Auth Scheme The scheme used for authentication. Accepted entries are: Auto, , AwsRootKeys , AwsIAMRoles , AwsEC2Roles , AwsMFA , AwsCredentialsFile.

TIBCO® Data Virtualization 12 | Advanced Tab

Data Type string

Default Value "Auto"

Remarks Use the following options to select your authentication scheme: • Auto: Set this to have the adapter attempt to automatically resolve the proper authentication scheme to use based on the other connection properties specified. • TemporaryCredentials: Set this to leverage temporary security credentials alongside a session token to connect. • AwsRootKeys: Set this to use the root user access key and secret. Useful for quickly testing, but production use cases are encouraged to use something with narrowed permissions. • AwsIAMRoles: Set to use IAM Roles for the connection. • AwsEC2Roles: Set this to automatically use IAM Roles assigned to the EC2 machine the Amazon DynamoDB Adapter is currently running on. • AwsMFA: Set to use multi factor authentication. • AwsCredentialsFile: Set to use a credential file for authentication.

Auto Detect Index A boolean indicating if secondary indexes should be automatically detected based on the query used.

Data Type bool

Default Value true

TIBCO® Data Virtualization Advanced Tab | 13

Remarks In DynamoDB, secondary indexes can be used to more quickly select data from a given table. By default, we attempt to automatically detect an index to use based on the query. However, this may not always be desirable. If you have control over the query and would prefer to specify the index yourself, set AutoDetectIndex to false and simply use the SecondaryIndex pseudo column to specify which index to use (if any).

AWS Access Key Your AWS account access key. This value is accessible from your AWS security credentials page.

Data Type string

Default Value ""

Remarks Your AWS account access key. This value is accessible from your AWS security credentials page: 1. Sign into the AWS Management console with the credentials for your root account. 2. Select your account name or number and select My Security Credentials in the menu that is displayed. 3. Click Continue to Security Credentials and expand the Access Keys section to manage or create root account access keys.

AWS Credentials File The path to the AWS Credentials File to be used for authentication.

Data Type string

Default Value ""

TIBCO® Data Virtualization 14 | Advanced Tab

Remarks The path to the AWS Credentials File to be used for authentication. See https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html for more information.

AWS Credentials File Profile The name of the profile to be used from the supplied AWSCredentialsFile.

Data Type string

Default Value "default"

Remarks The name of the profile to be used from the supplied AWSCredentialsFile. See https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html for more information.

AWS External Id A unique identifier that might be required when you assume a role in another account.

Data Type string

Default Value ""

Remarks A unique identifier that might be required when you assume a role in another account.

AWS Region The hosting region for your Amazon Web Services.

TIBCO® Data Virtualization Advanced Tab | 15

Data Type string

Default Value "NORTHERNVIRGINIA"

Remarks The hosting region for your Amazon Web Services. Available values are OHIO, NORTHERNVIRGINIA, NORTHERNCALIFORNIA, OREGON, CAPETOWN, HONGKONG, MUMBAI, OSAKA, SEOUL, SINGAPORE, SYDNEY, TOKYO, CENTRAL, BEIJING, NINGXIA, FRANKFURT, IRELAND, LONDON, MILAN, PARIS, STOCKHOLM, BAHRAIN, SAOPAULO, GOVCLOUDEAST, and GOVCLOUDWEST.

AWS Role ARN The Amazon Resource Name of the role to use when authenticating.

Data Type string

Default Value ""

Remarks When authenticating outside of AWS, it is common to use a Role for authentication instead of your direct AWS account credentials. Entering the AWSRoleARN will cause the Amazon DynamoDB Adapter to perform a role based authentication instead of using the AWSAccessKey and AWSSecretKey directly. The AWSAccessKey and AWSSecretKey must still be specified to perform this authentication. You cannot use the credentials of an AWS root user when setting RoleARN. The AWSAccessKey and AWSSecretKey must be those of an IAM user.

AWS Secret Key Your AWS account secret key. This value is accessible from your AWS security credentials page.

TIBCO® Data Virtualization 16 | Advanced Tab

Data Type string

Default Value ""

Remarks Your AWS account secret key. This value is accessible from your AWS security credentials page: 1. Sign into the AWS Management console with the credentials for your root account. 2. Select your account name or number and select My Security Credentials in the menu that is displayed. 3. Click Continue to Security Credentials and expand the Access Keys section to manage or create root account access keys.

AWS Session Token Your AWS session token.

Data Type string

Default Value ""

Remarks Your AWS session token. This value can be retrieved in different ways.

Buffer Size Determines the internal buffer size to fill up to before returning results.

Data Type string

TIBCO® Data Virtualization Advanced Tab | 17

Default Value "20000"

Remarks In order to provide maximal performance from Amazon DynamoDB we use multiple threads on all Scans to retrieve data. Setting the BufferSize will determine the size of the buffer all threads feed into before we internally start blocking until you read more results.

Credentials Location The location of the settings file where MFA credentials are saved.

Data Type string

Default Value "%APPDATA%\\CData\\AmazonDynamoDB Data Provider\\CredentialsFile.txt"

Remarks MFA credentials are short lived and typically expire after an hour. At that point, a different MFAToken must be specified to continue connecting. MFA tokens only work once, so in the case of multi-threaded or multi-process applications, the credentials must be centrally located and shared to avoid problems. When MFAToken is not specified, this property does nothing. If left unspecified, the default location is "%APPDATA%\\CData\\AmazonDynamoDB Data Provider\\CredentialsFile.txt" with %APPDATA% being set to the user's configuration directory:

Platform %APPDATA%

Windows The value of the APPDATA environment variable

Mac ~/Library/Application Support

Linux ~/.config

TIBCO® Data Virtualization 18 | Advanced Tab

Domain Your AWS domain name. You can optionally choose to associate your domain name with AWS.

Data Type string

Default Value "amazonaws.com"

Remarks If you do not have a unique AWS domain name, leave this value specified as amazonaws.com.

Firewall Password A password used to authenticate to a proxy-based firewall.

Data Type string

Default Value ""

Remarks This property is passed to the proxy specified by FirewallServer and FirewallPort, following the authentication method specified by FirewallType.

Firewall Port The TCP port for a proxy-based firewall.

Data Type string

TIBCO® Data Virtualization Advanced Tab | 19

Default Value ""

Remarks This specifies the TCP port for a proxy allowing traversal of a firewall. Use FirewallServer to specify the name or IP address. Specify the protocol with FirewallType.

Firewall Server The name or IP address of a proxy-based firewall.

Data Type string

Default Value ""

Remarks This property specifies the IP address, DNS name, or host name of a proxy allowing traversal of a firewall. The protocol is specified by FirewallType: Use FirewallServer with this property to connect through SOCKS or do tunneling. Use ProxyServer to connect to an HTTP proxy. Note that the adapter uses the system proxy by default. To use a different proxy, set ProxyAutoDetect to false.

Firewall Type The protocol used by a proxy-based firewall.

Data Type string

Default Value "NONE"

TIBCO® Data Virtualization 20 | Advanced Tab

Remarks This property specifies the protocol that the adapter will use to tunnel traffic through the FirewallServer proxy. Note that by default the adapter connects to the system proxy; to disable this behavior and connect to one of the following proxy types, set ProxyAutoDetect to false.

Type Default Description Port

TUNNE 80 When this is set, the adapter opens a connection to Amazon DynamoDB and L traffic flows back and forth through the proxy.

SOCKS4 1080 When this is set, the adapter sends data through the SOCKS 4 proxy specified by FirewallServer and FirewallPort and passes the FirewallUser value to the proxy, which determines if the connection request should be granted.

SOCKS5 1080 When this is set, the adapter sends data through the SOCKS 5 proxy specified by FirewallServer and FirewallPort. If your proxy requires authentication, set FirewallUser and FirewallPassword to credentials the proxy recognizes.

To connect to HTTP proxies, use ProxyServer and ProxyPort. To authenticate to HTTP proxies, use ProxyAuthScheme, ProxyUser, and ProxyPassword.

Firewall User The user name to use to authenticate with a proxy-based firewall.

Data Type string

Default Value ""

TIBCO® Data Virtualization Advanced Tab | 21

Remarks The FirewallUser and FirewallPassword properties are used to authenticate against the proxy specified in FirewallServer and FirewallPort, following the authentication method specified in FirewallType.

Data Type string

Default Value ""

Remarks By default, nested arrays are returned as strings of JSON. The FlattenArrays property can be used to flatten the elements of nested arrays into columns of their own. This is only recommended for arrays that are expected to be short. Set FlattenArrays to the number of elements you want to return from nested arrays. The specified elements are returned as columns. The zero-based index is concatenated to the column name. Other elements are ignored. For example, you can return an arbitrary number of elements from an array of strings: ["FLOW-MATIC","LISP","COBOL"] When FlattenArrays is set to 1, the preceding array is flattened into the following table:

Column Name Column Value

languages_0 FLOW-MATIC

TIBCO® Data Virtualization 22 | Advanced Tab

Flatten Objects Set FlattenObjects to true to flatten object properties into columns of their own. Otherwise, objects nested in arrays are returned as strings of JSON.

Data Type bool

Default Value true

Remarks Set FlattenObjects to true to flatten object properties into columns of their own. Otherwise, objects nested in arrays are returned as strings of JSON. The property name is concatenated onto the object name with an underscore to generate the column name. For example, you can flatten the nested objects below at connection time: [ { "grade": "A", "score": 2 }, { "grade": "A", "score": 6 }, { "grade": "A", "score": 10 }, { "grade": "A", "score": 9 }, { "grade": "B", "score": 14 } ] When FlattenObjects is set to true and FlattenArrays is set to 1, the preceding array is flattened into the following table:

Column Name Column Value

grades_0_grade A

grades_0_score 2

Generate Schema Files Indicates the user preference as to when schemas should be generated and saved.

TIBCO® Data Virtualization Advanced Tab | 23

Data Type string

Default Value "Never"

Remarks GenerateSchemaFiles enables you to save the table definitions identified by Automatic Schema Discovery. This property outputs schemas to .rsd files in the path specified by Location. Available settings are the following: • Never: A schema file will never be generated. • OnUse: A schema file will be generated the first time a table is referenced, provided the schema file for the table does not already exist. • OnStart: A schema file will be generated at connection time for any tables that do not currently have a schema file. • OnCreate: A schema file will be generated by when running a CREATE TABLE SQL query. Note that if you want to regenerate a file, you will first need to delete it.

Generate Schemas with SQL When you set GenerateSchemaFiles to OnUse, the adapter generates schemas as you execute SELECT queries. Schemas are generated for each table referenced in the query.

Generate Schemas on Connection Another way to use this property is to obtain schemas for every table in your database when you connect. To do so, set GenerateSchemaFiles to OnStart and connect.

Editing Schemas Schema files have a simple format that makes them easy to modify. See Custom Schema Definitions for more information.

TIBCO® Data Virtualization 24 | Advanced Tab

Insert Mode How to handle values when inserting if the same primary key combination already exists in DynamoDB.

Data Type string

Default Value "REPLACE"

Remarks If InsertMode is set to DONTREPLACE, an exception will be thrown from DynamoDB if you attempt to insert a value where the primary key combination already exists in DynamoDB. By default there will be no exception and the item in DynamoDB will be replaced by the newly inserted value.

Location A path to the directory that contains the schema files defining tables, views, and stored procedures.

Data Type string

Default Value ""

Remarks The path to a directory which contains the schema files for the adapter (.rsd files for tables and views, .rsb files for stored procedures). The Location property is only needed if you would like to customize definitions (e.g., change a column name, ignore a column, etc.) or extend the data model with new tables, views, or stored procedures. The schema files are deployed alongside the adapter assemblies. You must also ensure that Location points to the folder that contains the schema files. The folder location can be a relative path from the location of the executable.

TIBCO® Data Virtualization Advanced Tab | 25

Log Modules Core modules to be included in the log file.

Data Type string

Default Value ""

Remarks Only the modules specified (separated by ';') will be included in the log file. By default all modules are included.

Maximum Request Retries The maximum number of times to retry a request.

Data Type string

Default Value ""

Remarks MaximumRequestRetries is the maximum number of times the adapter will retry a request when the problem has been detected as temporary (errors like "unknown error", network issues, and exceeding the maximum threshold per table). In this case on the first retry the adapter will back off and wait for the amount of time designated by RetryWaitTime. If that request fails, the adapter will double the time and then double again until the adapter has exhausted the available retries. For example, if RetryWaitTime is set to 2 seconds and MaximumRequestRetries is set to 5, the wait times will be as follows: 0 -> 2 -> 4 -> 8 -> 16 -> 32.

Max Rows Limits the number of rows returned rows when no aggregation or group by is used in the query. This helps avoid performance issues at design time.

TIBCO® Data Virtualization 26 | Advanced Tab

Data Type int

Default Value -1

Remarks Limits the number of rows returned rows when no aggregation or group by is used in the query. This helps avoid performance issues at design time.

MFA Serial Number The serial number of the MFA device if one is being used.

Data Type string

Default Value ""

Remarks You can find the device for an IAM user by going to the AWS Management Console and viewing the user's security credentials. For virtual devices, this is actually an Amazon Resource Name (such as arn:aws:iam::123456789012:mfa/user).

MFA Token The temporary token available from your MFA device.

Data Type string

Default Value ""

TIBCO® Data Virtualization Advanced Tab | 27

Remarks If MFA is required, this value will be used along with the MFASerialNumber to retrieve temporary credentials to login. The temporary credentials available from AWS will only last up to 1 hour. Once the time is up, the connection must be updated to specify a new MFA token so that new credentials may be obtained.

Number Column Mode Specifies how to handle detected number columns. DynamoDB returns number values with a total precision of 38.

Data Type string

Default Value "DOUBLE"

Remarks By default, all numbers will be specified as double. However, depending on your needs, you may wish to handle number columns as either double or decimal. In addition, you can also set the value to string to report all number columns as strings if you are using the Amazon DynamoDB Adapter in an environment that cannot handle the maximum precision size of DynamoDB.

Other The other parameters necessary to connect to a data source, such as username and password, when applicable.

Data Type string

Default Value ""

Remarks The Other property is a semicolon-separated list of name-value pairs used in connection parameters specific to a data source.

TIBCO® Data Virtualization 28 | Advanced Tab

Integration and Formatting

DefaultColumnSize Sets the default length of string fields when the data source does not provide column length in the metadata. The default value is 2000.

ConvertDateTimeToGMT Whether to convert date-time values to GMT, instead of the local time of the machine.

RecordToFile=filename Records the underlying socket data transfer to the specified file.

Pagesize The maximum number of results to return per page from DynamoDB per request.

Data Type string

Default Value ""

Remarks The Pagesize property affects the maximum number of results to return per page from DynamoDB. By default, DynamoDB will return up to 1MB of data per page. The Pagesize indicates the maximum number of items to return per page. DynamoDB may return fewer than this number if the total size of the items exceeds 1MB before hitting it.

Proxy Auth Scheme The authentication type to use to authenticate to the ProxyServer proxy.

Data Type string

Default Value "BASIC"

TIBCO® Data Virtualization Advanced Tab | 29

Remarks This value specifies the authentication type to use to authenticate to the HTTP proxy specified by ProxyServer and ProxyPort. Note that the adapter will use the system proxy settings by default, without further configuration needed; if you want to connect to another proxy, you will need to set ProxyAutoDetect to false, in addition to ProxyServer and ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed. The authentication type can be one of the following: • BASIC: The adapter performs HTTP BASIC authentication. • DIGEST: The adapter performs HTTP DIGEST authentication. • NEGOTIATE: The adapter retrieves an NTLM or Kerberos token based on the applicable protocol for authentication. • PROPRIETARY: The adapter does not generate an NTLM or Kerberos token. You must supply this token in the Authorization header of the HTTP request. If you need to use another authentication type, such as SOCKS 5 authentication, see Firewall Type.

Proxy Auto Detect This indicates whether to use the system proxy settings or not. Set ProxyAutoDetect to FALSE to use custom proxy settings. This takes precedence over other proxy settings.

Data Type bool

Default Value true

Remarks This takes precedence over other proxy settings, so you'll need to set ProxyAutoDetect to FALSE in order use custom proxy settings. NOTE: When this property is set to True, the proxy used is determined as follows: • A search from the JVM properties (http.proxy, https.proxy, socksProxy, etc.) is performed.

TIBCO® Data Virtualization 30 | Advanced Tab

• In the case that the JVM properties don't exist, a search from java.home/lib/net.properties is performed. • In the case that java.net.useSystemProxies is set to True, a search from the SystemProxy is performed. • In Windows only, an attempt is made to retrieve these properties from the Internet Options in the registry. To connect to an HTTP proxy, see ProxyServer. For other proxies, such as SOCKS or tunneling, see FirewallType.

Proxy Exceptions A semicolon separated list of hosts or IPs that will be exempt from connecting through the ProxyServer .

Data Type string

Default Value ""

Remarks The ProxyServer will be used for all addresses, except for addresses defined in this property. Use semicolons to separate entries. Note that the adapter will use the system proxy settings by default, without further configuration needed; if you want to explicitly configure proxy exceptions for this connection, you will need to set ProxyAutoDetect to false, and configure ProxyServer and ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed.

Proxy Password A password to be used to authenticate to the ProxyServer proxy.

Data Type string

Default Value ""

TIBCO® Data Virtualization Advanced Tab | 31

Remarks This property is used to authenticate to an HTTP proxy server that supports NTLM (Windows), Kerberos, or HTTP authentication. To specify the HTTP proxy, you can set ProxyServer and ProxyPort. To specify the authentication type, set ProxyAuthScheme. If you are using HTTP authentication, additionally set ProxyUser and ProxyPassword to HTTP proxy. If you are using NTLM authentication, set ProxyUser and ProxyPassword to your Windows password. You may also need these to complete Kerberos authentication. For SOCKS 5 authentication or tunneling, see Firewall Type. By default, the adapter uses the system proxy. If you want to connect to another proxy, set ProxyAutoDetect to false.

Proxy Port The TCP port the ProxyServer proxy is running on.

Data Type string

Default Value "80"

Remarks The port the HTTP proxy is running on that you want to redirect HTTP traffic through. Specify the HTTP proxy in ProxyServer. For other proxy types, see Firewall Type.

Proxy Server The hostname or IP address of a proxy to route HTTP traffic through.

Data Type string

TIBCO® Data Virtualization 32 | Advanced Tab

Default Value ""

Remarks The hostname or IP address of a proxy to route HTTP traffic through. The adapter can use the HTTP, Windows (NTLM), or Kerberos authentication types to authenticate to an HTTP proxy. If you need to connect through a SOCKS proxy or tunnel the connection, see Firewall Type. By default, the adapter uses the system proxy. If you need to use another proxy, set ProxyAutoDetect to false.

Proxy SSL Type The SSL type to use when connecting to the ProxyServer proxy.

Data Type string

Default Value "AUTO"

Remarks This property determines when to use SSL for the connection to an HTTP proxy specified by ProxyServer. This value can be AUTO, ALWAYS, NEVER, or TUNNEL. The applicable values are the following:

AUTO Default setting. If the URL is an HTTPS URL, the adapter will use the TUNNEL option. If the URL is an HTTP URL, the component will use the NEVER option.

ALWAY The connection is always SSL enabled. S

NEVER The connection is not SSL enabled.

TUNNEL The connection is through a tunneling proxy: The proxy server opens a connection to the remote host and traffic flows back and forth through the proxy.

TIBCO® Data Virtualization Advanced Tab | 33

Proxy User A user name to be used to authenticate to the ProxyServer proxy.

Data Type string

Default Value ""

Remarks The ProxyUser and ProxyPassword options are used to connect and authenticate against the HTTP proxy specified in ProxyServer. You can select one of the available authentication types in ProxyAuthScheme. If you are using HTTP authentication, set this to the username of a user recognized by the HTTP proxy. If you are using Windows or Kerberos authentication, set this property to a username in one of the following formats: user@domain domain\user

Readonly You can use this property to enforce read-only access to DynamoDB from the provider.

Data Type bool

Default Value false

Remarks If this property is set to true, the adapter will allow only SELECT queries. INSERT, UPDATE, DELETE, and stored procedure queries will cause an error to be thrown.

TIBCO® Data Virtualization 34 | Advanced Tab

Retry Wait Time The minimum number of seconds the provider will wait to retry a request.

Data Type string

Default Value ""

Remarks The value of this property is doubled on every retry to determine how long to wait until the next retry. Specify the maximum number of retries with MaximumRequestRetries.

Row Scan Depth The maximum number of rows to scan to look for the columns available in a table.

Data Type string

Default Value "50"

Remarks Since DynamoDB is schemaless, the columns in a table must be determined by scanning table rows. This value determines the maximum number of rows that will be scanned. See Table Columns for more details on the types discovered and how to define a static schema instead, an alternative that gives you more control over the projected columns.

Separator Character The character or characters used to denote hierarchy.

TIBCO® Data Virtualization Advanced Tab | 35

Data Type string

Default Value "."

Remarks In order to flatten out structures such as Maps and List attributes in DynamoDB, we need some specifier that states what the separation is between those columns and other columns. If this value is "." and a column comes back with the name address.city, this indicates that there is a mapped attribute with a child called city. If your data has columns that already use a single period within the attribute name, set the SeparatorCharacter to a different character or characters.

SSL Server Cert The certificate to be accepted from the server when connecting using TLS/SSL.

Data Type string

Default Value ""

Remarks If using a TLS/SSL connection, this property can be used to specify the TLS/SSL certificate to be accepted from the server. Any other certificate that is not trusted by the machine will be rejected. This property can take the forms:

Description Example

A full PEM Certificate (example shortened for brevity) -----BEGIN CERTIFICATE----- MIIChTCCAe4CAQAwDQYJKoZIhv...... Qw== -----END CERTIFICATE-----

A path to a local file containing the certificate C:\cert.cer

TIBCO® Data Virtualization 36 | Advanced Tab

The public key (example shortened for brevity) -----BEGIN RSA PUBLIC KEY----- MIGfMA0GCSq...... AQAB -----END RSA PUBLIC KEY-----

The MD5 Thumbprint (hex values can also be either space ecadbdda5a1529c58a1e9e09828d70e4 or colon separated)

The SHA1 Thumbprint (hex values can also be either 34a929226ae0819f2ec14b4a3d904f801c space or colon separated) bb150d

If not specified, any certificate trusted by the machine will be accepted. Use '*' to signify to accept all certificates (not recommended for security concerns).

Temporary Token Duration The amount of time (in seconds) a temporary token will last.

Data Type string

Default Value "3600"

Remarks Temporary tokens are used with both MFA and Role based authentication. Temporary tokens will eventually time out, at which time a new temporary token must be obtained. For situations where MFA is not used, this is not a big deal. The Amazon DynamoDB Adapter will internally request a new temporary token once the temporary token has expired. However, for MFA required connection, a new MFAToken must be specified in the connection to retrieve a new temporary token. This is a more intrusive issue since it requires an update to the connection by the user. The maximum and minimum that can be specified will depend largely on the connection being used. For Role based authentication, the minimum duration is 900 seconds (15 minutes) while the maximum if 3600 (1 hour). Even if MFA is used with role based authentication, 3600 is still the maximum. For MFA authentication by itself (using an IAM User or root user), the minimum is 900 seconds (15 minutes), the maximum is 129600 (36 hours).

TIBCO® Data Virtualization Advanced Tab | 37

Thread Count The number of threads to use when selecting data via a parallel scan. Setting ThreadCount to 1 will disable parallel scans.

Data Type string

Default Value "4"

Remarks Parallel scans allow data to be retrieved faster by splitting up the retrieval process across multiple threads. This can greatly improve performance when scanning data in Amazon DynamoDB. However, this will also consume your read units for a table much faster than a single thread. Consider your available cores, bandwidth, and read units for your tables before increasing the ThreadCount.

Timeout The value in seconds until the timeout error is thrown, canceling the operation.

Data Type string

Default Value "60"

Remarks If the Timeout property is set to 0, operations do not time out: They run until they complete successfully or encounter an error condition. If Timeout expires and the operation is not yet complete, the adapter throws an exception.

Type Detection Scheme Determines how to determine the data type of columns.

TIBCO® Data Virtualization 38 | Advanced Tab

Data Type string

Default Value ""

Remarks

None Setting TypeDetectionScheme to None will return all columns as string type. Note: Even when set to None, the column names will still be scanned when Header=True.

RowSca Setting TypeDetectionScheme to RowScan will scan rows to heuristically determine the data n type. The RowScanDepth determines the number of rows to be scanned. If no value is specified, RowScan will be used by default.

Use Consistent Reads Whether to alyways use Consistent Reads or not when querying DynamoDb.

Data Type bool

Default Value false

Remarks This parameter is not supported on global secondary indexes. If you scan or query using a secondary index, Constistent Reads will not be used even if the property is set to true.

Use Simple Names Boolean determining if simple names should be used for tables and columns.

Data Type bool

TIBCO® Data Virtualization Logging | 39

Default Value false

Remarks Boolean determining if simple names should be used for tables and columns. DynamoDB tables and columns can use special characters in names that are normally not allowed in standard databases. UseSimpleNames makes the adapter easier to use with traditional database tools. Setting UseSimpleNames to true will simplify the names of tables and columns returned. It will enforce a naming scheme such that only alphanumeric characters and the underscore are valid for displayed table and column names. Any non-alphanumeric characters will be converted to an underscore.

Logging

The adapter uses log4j to generate log files. The settings within the log4j configuration file will be used by the adapter to determine the type of messages to log. The following categories can be specified: Error: Only error messages will be logged. Info: Both Error and Info messages will be logged. Debug: Error, Info, and Debug messages will be logged. The Other property of the adapter can be used to set Verbosity to specify the amount of detail to be included in the log file, i.e. Verbosity=4; You can use Verbosity to specify the amount of detail to include in the log within a category. The following verbosity levels are mapped to the log4j categories: 0 = Error 1-2 = Info 3-5 = Debug For example, if the log4j category is set to DEBUG, the Verbosity option can be set to 3 for the minimum amount of debug information or 5 for the maximum amount of debug information.

TIBCO® Data Virtualization 40 | Logging

Note that the log4j settings override the Verbosity level specified. The adapter will never log at a Verbosity level greater than what is configured in the log4j properties. In addition, if Verbosity is set to a level less than the log4j category configured, Verbosity will default to the minimum value for that particular category. For example, if Verbosity is set to a value less than three and the Debug category is specified, the Verbosity will default to 3. Here is a breakdown of the Verbosity levels and the information that they log: 1 - Will log the query, the number of rows returned by it, the start of execution and the time taken, and any errors. 2 - Will log everything included in Verbosity 1 and HTTP headers. 3 - Will additionally log the body of the HTTP requests. 4 - Will additionally log transport-level communication with the data source. This includes SSL negotiation. 5 - Will additionally log communication with the data source and additional details that may be helpful in troubleshooting problems. This includes interface commands.

Configure Logging for the Amazon DynamoDB Adapter By default, logging is turned on without debugging. If debugging information is desired, the following line from the TDV Server's log4j.properties file can be uncommented (default location of this file is: C:\Program Files\TIBCO\TDV Server \conf\server).

log4j.logger.com.cdata=DEBUG The TDV Server will need to be restarted after changing the log4j.properties file, which can be accomplished by running the composite.bat script located at C:\Program Files\TIBCO\TDV Server \conf\server. Note reauthenticating to the TDV Studio will be required after restarting the server. An example of the calls would be: .\composite.bat monitor restart All logs for the adapter will be written to the "cs_cdata.log" file as specified in the log4j properties. Note the "log4j.logger.com.cdata=DEBUG" option is not required if the "Debug Output Enabled" option is set to true within the TDV Studio. To accomplish this, navigate to Administrator -> Configuration to display the configuration window. Then expand Server -> Configuration -> Debugging and set the Debug Output Enabled option to True.

TIBCO® Data Virtualization Querying The Data | 41

Querying The Data

DynamoDB has two distinct operations to read data from tables: query and scan. The Amazon DynamoDB Adapter will automatically attempt to figure out which type to use based on the WHERE clause of the SQL statement.

Query vs. Scan In DynamoDB, a query is used when some of the data can be filtered before results are returned. This is done by the use of partition keys and sort keys that are defined on the table to perform the filter. It is typically much faster than a scan. A scan is performed when anything other than a partition key or a sort key is used to filter the data. In the case of the scan, data is only filtered after it is returned. This is much slower since it will cause DynamoDB to return everything and simply exclude nonmatching results from the response. In the worst cases it will return full pages of data that are simply empty due to no matching results.

Secondary Indexes Secondary indexes in DynamoDB behave like tables in themselves that are only accessible off of a given table. A secondary index behaves like a view either with predefined columns or simply with all columns being accessible. The main difference is that different partition and sort keys may be selected for them. There is no restriction on which keys to use - they may be the same as the partition and sort keys for the table itself, the same as ones from a different secondary index, or completely unique to that index. The adapter by default will attempt to determine if a secondary index should be used to perform a query operation. This is done by analyzing the WHERE clause of the SQL statement. If it is determined that a query should be used and that a specific index should be used for that query, then the secondary index will automatically be selected. This behavior can be disabled by setting AutoDetectIndex to false. A secondary index can be explicitly specified by using the SecondaryIndex pseudo column. This will override AutoDetectIndex and will be used for both query and scan operations.

Query Examples A query operation will be submitted to DynamoDB if the WHERE clause includes a single PartitionKey with an '=' operator and optionally up to one SortKey using any DynamoDB supported operator. For instance:

TIBCO® Data Virtualization 42 | Advanced Settings

SELECT * FROM Table WHERE PartitionKey = 'x'

SELECT * FROM Table WHERE PartitionKey = 'x' AND SortKey > 'y' AND SortKey < 'z' Additionally, if AutoDetectIndex is set to true, the adapter will attempt to automatically determine an index based on the specific keys used. For instance: SELECT * FROM Table WHERE SecondaryIndexPartitionKey = 'x' The preceding query would result in a query against the SecondaryIndex, provided that it was the only one that used the specified PartitionKey. A secondary index may be explicitly specified via the SecondaryIndex pseudo column. This will cause the specified index to be used regardless of if a Query or Scan is submitted: SELECT * FROM Table WHERE SecondaryIndex = 'IndexName'

Parallel Scans For better performance when scanning results, parallel scans may be used. Parallel scans are a way to use multiple threads to execute the same SELECT statement. Each thread is responsible for retrieving an equal share of the data. Because threads retrieve data independently, it can exponentially improve performance. However, performance gains using this method will be limited to CPU performance, number of cores, bandwidth constraints, and read units for the table. A machine with more cores and more bandwidth may benefit from using additional threads whereas a machine with fewer cores and less bandwith available will see no benefit. Amazon DynamoDB offers no equivalent for parallel scans when querying results. Queries can only be executed in a single thread. By default, parallel scans are enabled. They can be disabled or changed by updating the ThreadCount connection property. Please be aware that using parallel scans will consume your allocated read units for a given table at a much faster rate than using a single thread. If you are getting throttle exceptions, it is recommended to decrease the ThreadCount or set it to 1.

Advanced Settings

Accessing NoSQL Tables The adapter implements Automatic Schema Discovery that is highly configurable. The following sections outline the adapter's defaults and link to ways to further customize.

TIBCO® Data Virtualization Advanced Settings | 43

Flattening Nested JSON By default, the adapter projects columns over the properties of objects. Arrays are returned as JSON strings, by default. You can use the following properties to access array elements, including objects nested in arrays. FlattenArrays: Set this property to the number of array elements that you want to return as column values. You can also use this property with FlattenObjects to extract the properties of objects nested in arrays. FlattenObjects: By default, this is true; that is, the properties of objects and nested objects are returned as columns. When you set FlattenArrays, objects nested in the specified array elements are also flattened and returned as columns. Other mechanisms for accessing nested objects are detailed in NoSQL Database.

Inferring the Data Type You can use the following properties to configure automatic data type detection, which is enabled by default. TypeDetectionScheme: You can use this property to enable or disable automatic type detection based on the value specified in RowScanDepth. RowScanDepth: This property determines the number of rows that will be scanned to determine column data types.

Fine Tuning Data Access You can use the following properties to gain greater control over Amazon DynamoDB API features and the strategies the adapter uses to surface them: GenerateSchemaFiles: This property enables you to persist table metadata in static schema files that are easy to customize, to change column data types, for example. You can set this property to "OnStart" to generate schema files for all tables in your database at connection. Or, you can generate schemas as you execute SELECT queries to tables. The resulting schemas are based on the connection properties you use to configure Automatic Schema Discovery . NumberColumnMode: This property enables you to specify an SQL data type for the Amazon DynamoDB "number" type, such as double, decimal, or string. InsertMode: Inserts in Amazon DynamoDB have several key differences from SQL inserts. An insert in Amazon DynamoDB is an upsert; that is, a row that already exists is replaced by the incoming row. Also, the primary key must be specified.

TIBCO® Data Virtualization 44 | Advanced Settings

By default, the adapter will perform an upsert based on the primary key. You can use this connection property to throw an error instead of overwriting existing records. To use the resulting schema files, set the Location property to the folder containing the schemas. UseSimpleNames: Amazon DynamoDB supports attribute names with special characters that many database-oriented tools do not support. In addition, Amazon DynamoDB table names can include dots and dashes -- the adapter interprets dots within table names as hierarchy separators that enable you to drill down to nested fields, similar to XPath. You can use this property to replace any nonalphanumeric character with an underscore. SeparatorCharacter: You can use this property to more easily access nested fields when Querying Documents and Lists ; specify the hierarchy separator with this property. By default, this character is the '.' (dot) character.

Fine Tuning Performance You can use the following connection properties to avoid "maximum throughput exceeded" errors.

Using Paging Effectively You can use the Pagesize property to optimize use of your provisioned throughput, based on the size of your items and Amazon DynamoDB's 1MB page size. Set this property to the number of items to return. Generally, a smaller page size reduces spikes in throughput that cause throttling. A smaller page size also inserts pauses between requests. This interval evens out the distribution of requests and allows more requests to be successful by avoiding throttling. You can also take advantage of secondary indexes by specifying them in the WHERE clause. You can specify partition keys and sort keys in the WHERE clause to improve the performance of query operations. Based on these inputs, the adapter automatically detects the secondary index for a query operation. See Querying The Data for more information on scan operations vs. query operations.

TIBCO® Data Virtualization NoSQL Database | 45

Setting a Retry Interval You can set the following properties to retry queries instead of throwing a temporary error such as "maximum throughput exceeded": RetryWaitTime: The minimum number of seconds the adapter will wait to retry a request. MaximumRequestRetries: The maximum number of times to retry a request.

Connecting Through a Firewall or Proxy To connect through the Windows system proxy, set only the Amazon DynamoDB authentication properties and, if necessary, Region and Domain. To connect to other proxies, set ProxyAutoDetect to false and in addition set the following. To authenticate to an HTTP proxy, set ProxyAuthScheme, ProxyUser, and ProxyPassword, in addition to ProxyServer and ProxyPort. To connect to other proxies, set FirewallType, FirewallServer, and FirewallPort. To tunnel the connection, set FirewallType to TUNNEL. To authenticate to a SOCKS proxy, set FirewallType to SOCKS5. Additionally, specify FirewallUser and FirewallPassword.

Troubleshooting the Connection To show adapter activity from query execution to HTTP calls, use Logfile and Verbosity. The examples of common connection errors below show how to use these properties to get more context. Contact the support team for help tracing the source of an error or circumventing a performance issue. Authentication errors: Typically, recording a Logfile at Verbosity 4 is necessary to get full details on an authentication error. Queries time out: A server that takes too long to respond will exceed the adapter's client-side timeout. Often, setting the Timeout property to a higher value will avoid a connection error. Another option is to disable the timeout by setting the property to 0. Setting Verbosity to 2 will show where the time is being spent.

NoSQL Database

Amazon DynamoDB is a schemaless database that provides high performance, availability, and scalability. These features are not necessarily incompatible with a standards-compliant query language like SQL-92. In this section we will show various schemes that the adapter offers to bridge the gap with relational SQL and a document database.

TIBCO® Data Virtualization 46 | NoSQL Database

The adapter models the schemaless Amazon DynamoDB tables into relational tables and translates SQL queries into Amazon DynamoDB queries to get the requested data. The adapter offers two ways, Automatic Schema Discovery and Custom Schema Definitions, to model Amazon DynamoDB tables as relational tables. The Automatic Schema Discovery scheme automatically finds the data types in a Amazon DynamoDB table by scanning a configured number of rows of the table. You can use RowScanDepth, FlattenArrays, and FlattenObjects to control the relational representation of the tables in Amazon DynamoDB. Optionally, you can use Custom Schema Definitions to project your chosen relational structure on top of a Amazon DynamoDB table. This allows you to define your chosen column names, their data types, and the location of their values in the Amazon DynamoDB table.

Automatic Schema Discovery The adapter automatically infers a relational schema by inspecting a series of Amazon DynamoDB items in a table. You can use the RowScanDepth property to define the number of items the adapter will scan to do so. The columns identified during the discovery process depend on the FlattenArrays and FlattenObjects properties.

Flattening Objects If FlattenObjects is set, all nested objects will be flattened into a series of columns. For example, consider the following item: { id: 12, name: "Lohia Manufacturers Inc.", address: {street: "Main Street", city: "Chapel Hill", state: "NC"}, offices: ["Chapel Hill", "London", "New York"], annual_revenue: 35600000 } This document will be represented by the following columns:

Column Name Data Example Value Type

id Integer 12

TIBCO® Data Virtualization NoSQL Database | 47

name String Lohia Manufacturers Inc.

address_street String Main Street

address_city String Chapel Hill

address_state String NC

offices String ["Chapel Hill", "London", "New York"]

annual_revenu Double 35,600,000 e

If FlattenObjects is not set, then the address_string, address_city, and address_state columns will not be broken apart. The address column of type string will instead represent the entire object. Its value would be {street: "Main Street", city: "Chapel Hill", state: "NC"}. See JSON Functions for more details on working with JSON aggregates.

Flattening Arrays The FlattenArrays property can be used to flatten array values into columns of their own. This is only recommended for arrays that are expected to be short, for example the coordinates below: "coord": [ -73.856077, 40.848447 ] The FlattenArrays property can be set to 2 to represent the array above as follows:

Column Name Data Type Example Value

coord_0 Float -73.856077

coord_1 Float 40.848447

It is best to leave other unbounded arrays as they are and piece out the data for them as needed using JSON Functions.

TIBCO® Data Virtualization 48 | NoSQL Database

Vertical Flattening It is possible to retrieve an array of objects as if it were a separate table. Take the following JSON structure from the restaurants collection for example: { "restaurantid" : "30075445", "address" : { "building" : "1007", "coord" : "-73.856077, 40.848447", "street" : "Morris Park Ave", "zipcode" : "10462" }, "borough" : "Bronx", "cuisine" : "Bakery", "grades" : "{ "date" : 1393804800000, "grade" : "A", "score" : 2 }, { "date" : 1378857600000, "grade" : "A", "score" : 6 }, { "date" : 1358985600000, "grade" : "A", "score" : 10 }, { "date" : 1322006400000, "grade" : "A", "score" : 9 }, { "date" : 1299715200000, "grade" : "B", "score" : 14 }", "name" : "Morris Park Bake Shop", } Vertical flattening will allow you to retrieve the grades array as a separate table: SELECT * FROM "restaurants.grades" This query returns the following data set:

date gra sco restaurantid _inde de re x

1393804800000 A 2 30075445 1

TIBCO® Data Virtualization NoSQL Database | 49

1378857600000 A 6 30075445 2

1358985600000 A 10 30075445 3

You may also want to include information from the base restaurants table. You can do this with a join. SELECT restaurants.name, "restaurants.grades".* FROM "restaurants" JOIN "restaurants.grades" ON restaurants.restaurantid = "restaurants.grades".restaurantid WHERE restaurants.restaurantid = 30075445 AND "restaurants.grades".restaurantid = 30075445 This query returns the following data set:

name restaura date gra sco _inde ntid de re x

Morris Park Bake Shop 30075445 1393804800000 A 2 1

Morris Park Bake Shop 30075445 1378857600000 A 6 2

Morris Park Bake Shop 30075445 1358985600000 A 10 3

Morris Park Bake Shop 30075445 1322006400000 A 9 4

Morris Park Bake Shop 30075445 1299715200000 B 14 5

JSON Functions The adapter can return JSON structures as column values. The adapter enables you to use standard SQL functions to work with these JSON structures. The examples in this section use the following array: [ { "grade": "A", "score": 2 }, { "grade": "A", "score": 6 }, { "grade": "A", "score": 10 }, { "grade": "A", "score": 9 }, { "grade": "B", "score": 14 } ]

TIBCO® Data Virtualization 50 | NoSQL Database

JSON_EXTRACT The JSON_EXTRACT function can extract individual values from a JSON object. The following query returns the values shown below based on the JSON path passed as the second argument to the function: SELECT Name, JSON_EXTRACT(grades,'[0].grade') AS Grade, JSON_EXTRACT(grades,'[0].score') AS Score FROM Students;

Column Name Example Value

Grade A

Score 2

JSON_COUNT The JSON_COUNT function returns the number of elements in a JSON array within a JSON object. The following query returns the number of elements specified by the JSON path passed as the second argument to the function: SELECT Name, JSON_COUNT(grades,'[x]') AS NumberOfGrades FROM Students;

Column Name Example Value

NumberOfGrades 5

JSON_SUM The JSON_SUM function returns the sum of the numeric values of a JSON array within a JSON object. The following query returns the total of the values specified by the JSON path passed as the second argument to the function: SELECT Name, JSON_SUM(score,'[x].score') AS TotalScore FROM Students;

Column Name Example Value

TotalScore 41

TIBCO® Data Virtualization NoSQL Database | 51

JSON_MIN The JSON_MIN function returns the lowest numeric value of a JSON array within a JSON object. The following query returns the minimum value specified by the JSON path passed as the second argument to the function: SELECT Name, JSON_MIN(score,'[x].score') AS LowestScore FROM Students;

Column Name Example Value

LowestScore 2

JSON_MAX The JSON_MAX function returns the highest numeric value of a JSON array within a JSON object. The following query returns the maximum value specified by the JSON path passed as the second argument to the function: SELECT Name, JSON_MAX(score,'[x].score') AS HighestScore FROM Students;

Column Name Example Value

HighestScore 14

DynamoDB Queries Because Amazon DynamoDB is a NoSQL data source, queries need to be handled a bit differently than standard relational databases.

Value-Sensitive Queries The lack of a required data type for a given column means that you could store different types of data in a single column. For instance, one row could have a String called EmailAddresses and another could have a StringSet also called EmailAddresses. For these and other kinds of cases, the adapter largely determines what data type to use based on the values in the query.

TIBCO® Data Virtualization 52 | NoSQL Database

For instance, say you have an Items table where the PartNumber could store either a String or a Number. To get back a part with the PartNumber of the number value 12345, you would issue the following query: SELECT Name, Location, Quantity, PartNumber FROM Items WHERE PartNumber = 12345 Alternatively, the PartNumber could have been stored as the string "12345". To get back a part with the PartNumber of the literal string 12345, issue the following query: SELECT Name, Location, Quantity, PartNumber FROM Items WHERE PartNumber = '12345' If the data type of the specified value is not ambiguous, it is always used before the autodetected data type. In both of these cases if a parameter was used instead of of a hardcoded value, then the data type of the parameter would be used to determine what type to submit to Amazon DynamoDB.

Detected Column Data Type If a value is not obvious based purely on the detected data type, the adapter compares it to the autodetected column. For instance, if you want to insert a column called Coordinates into the Location table, your insert would look like: INSERT INTO Locations (Address, Coordinates) VALUES ('123 Fake Street', '[40.7127, 74.0059]') Based on the input value alone, the detected data type is a string. However, because a Coordinates column was previously autodetected, the adapter inserts a NumberSet and not a simple String. If a Coordinates column was not autodetected when scanning the Locations table, the data type of the inserted value is used. In this case, we could still resolve that the insert is a NumberSet, but it will cost a bit more overhead to do this.

Count Amazon DynamoDB supports 2 different methods of of using the COUNT aggregate function. To simply return the number of Items in you table, issue the following query: SELECT COUNT(*) FROM MyTable The Amazon DynamoDB Adapter will read the ItemCount from the DescribeTable Action. This avoids using too many read units to scan the full table. However, DynamoDB updates this value approximately every six hours and recent changes might not be reflected in this value. Issuing the below example queries will instead scan the full table for count:

TIBCO® Data Virtualization NoSQL Database | 53

SELECT COUNT(*) FROM MyTable WHERE MyInt > 10 SELECT COUNT(MyInt) FROM MyTable

Querying Documents and Lists Amazon DynamoDB documents and lists are supported with the Amazon DynamoDB Adapter. You can access documents and lists directly at the root level or use the '.' character as a hierarchy divider to drill down to documents and lists.

Reporting Values in Documents and Lists When data types are autodetected, they are reported down to the lowest level that can be reliably detected. For instance, a document called Customer with a child called Address and a child on Address called Street would be represented by the column Customer.Address.Street. However, this process does not apply to Lists since a list could have any number of entries. Once a List or a Set is detected, additional values are not reported as being available in the table schema.

Getting Back Unreported Values If there are attributes that frequently do not have a value and thus are not autodetected, these can still be retrieved by specifying the correct path to them. For instance, to get the Special attribute from the Customer document: SELECT [Customer.Address.Street], [Customer.Special] FROM MyTable Once a List has been detected, additional values are not reported. But individual values on the list can be referenced by specifying '.' and a number. For instance: SELECT [MyList.0], [MyList.1.Email], [MyList.1.Age] FROM MyTable This will retrieve the first value on the list and the second value's Email and Age attributes.

Inserting Documents and Lists Inserts in Amazon DynamoDB require that the full object is specified during insert. Insert a document or list at the root. Pass in the full JSON aggregate. For instance: INSERT INTO MyTable (PrimaryKey, EmailAddresses, Address, MyList) VALUES ('uniquekey', '["[email protected]", "[email protected]"]', '{"Street":"123 Fake Street", "City":"Chapel Hill", "Zip":"27713"}', '[{"S":"somestr"},{"NS":[1,2]},{"N":4}]') In this case, the EmailAddress is inserted as a StringSet, Address is inserted as a document, and MyList is inserted as a list.

TIBCO® Data Virtualization 54 | Data Type Mapping

Updating Documents and Lists Updates are supported using the same syntax that is available during selects. Documents and Lists can be specified using the '.' character to specify hierarchy. For instance: UPDATE MyTable SET [EmailAddress.0]='[email protected]', [EmailAddress.1]='[email protected]', [Address.Street]='123 Fake Street', [Address.City]='Chapel Hill', [Address.Zip]='27713', [MyList.0]='somestr', [MyList.1]='[1,2]', [MyList.2]=4 WHERE PrimaryKey='uniquekey' Note that EmailAddress and MyList must be autodetected to resolve how to handle EmailAddress differently from MyList. If you are in doubt about whether or not something will be automatically detected, specifying the full JSON to update will always work.

Data Type Mapping

The adapter maps types from the data source to the corresponding data type available in the schema. The table below documents these mappings.

Amazon DynamoDB CData Schema

String string

Binary string

Number float

StringSet string

NumberSet string

BinarySet string

Map string

List string

Boolean bool

Null string

TIBCO® Data Virtualization Data Type Mapping | 55

Custom Schema Definitions In addition to Automatic Schema Discovery the adapter also allows you to statically define the schema for your Amazon DynamoDB table. Let's consider a schema for the restaurants data set. Below is an example item from the table: { "address":{ "building":"461", "coord":[ -74.138492, 40.631136 ], "street":"Port Richmond Ave", "zipcode":"10302" }, "borough":"Staten Island", "cuisine":"Other", "grades":[

], "name":"Indian Oven", "restaurant_id":"50018994" }

Defining a Custom Schema You can define a custom schema to extract out nested properties as their own columns. The following schema uses the other:bsonpath property to define where the data for a particular column should be retrieved from. Using this model you can flatten arbitrary levels of hierarchy. The collection attribute specifies the collection to parse. The collection attribute gives you the flexibility to use multiple schemas for the same collection. If collection is not specified, the filename determines the collection that is parsed. In Custom Schema Example, you will find the complete schema that contains the example above.

TIBCO® Data Virtualization 56 | Data Type Mapping

Custom Schema Example In this section is a complete schema. The info section enables a relational view of a Amazon DynamoDB table. For more details, see Custom Schema Definitions. The table below allows the SELECT, INSERT, UPDATE, and DELETE commands as implemented in the GET, POST, MERGE, and DELETE sections of the schema below. Use the tablename attribute to specify the name of the table you want to parse. You can use the tablename attribute to define multiple schemas for the same table. If tablename is not specified, the filename determines the table that is parsed. Use the pathseparator attribute to specify the character or characters to use to separate the fields in path to a given column. This will only be used when parsing the other:path input. If not specified, the default is the '.' character. Note: Amazon DynamoDB is case sensitive. Your tablename and specified paths must match the case of how your fields appear in Amazon DynamoDB. Copy the rows@next input as-is into your schema. The operations, such as dynamodbadoSelect, are internal implementations and can also be copied as is.

TIBCO® Data Virtualization Data Type Mapping | 57

TIBCO® Data Virtualization 58 | SQL Compliance

SQL Compliance

The Amazon DynamoDB Adapter supports several operations on data, including querying, deleting, modifying, and inserting.

SELECT Statements See SELECT Statements for a syntax reference and examples. See NoSQL Database for information on the capabilities of the Amazon DynamoDB API.

INSERT Statements See INSERT Statements for a syntax reference and examples, as well as retrieving the new records' Ids.

UPDATE Statements The primary key Id is required to update a record. See UPDATE Statements for a syntax reference and examples.

DELETE Statements The primary key Id is required to delete a record. See DELETE Statements for a syntax reference and examples.

EXECUTE Statements Use EXECUTE or EXEC statements to execute stored procedures. See EXECUTE Statements for a syntax reference and examples.

Names and Quoting Table and column names are considered identifier names; as such, they are restricted to the following characters: [A-Za-z0-9_:@]. To use a table or column name with characters not listed above, the name must be quoted using double quotes ("name") in any SQL statement. Strings must be quoted using single quotes (e.g., 'John Doe').

Transactions and Batching Transactions are not currently supported.

TIBCO® Data Virtualization SQL Compliance | 59

Additionally, the adapter does not support batching of SQL statements. To execute multiple commands, you can create multiple instances and execute each separately. Or, use Batch Processing.

SELECT Statements A SELECT statement can consist of the following basic clauses. SELECT INTO FROM JOIN WHERE GROUP BY HAVING UNION ORDER BY LIMIT

SELECT Syntax The following syntax diagram outlines the syntax supported by the Amazon DynamoDB adapter: SELECT { [ TOP ] { * | { [ [ AS ] ] | { | } .* } [ , ... ] } [ INTO csv:// [ filename= ] [ ;delimiter=tab ] ] { FROM [ [ AS ] ] } [ WHERE ] } | SCOPE_IDENTITY()

::= | | @

TIBCO® Data Virtualization 60 | SQL Compliance

| ? | COUNT( * | { } ) | { AVG | MAX | MIN | SUM | COUNT } ( ) | |

::= { { = | != | > | < | >= | <= | IS NULL | IS NOT NULL | BEGINSWITH | CONTAINS | NOT CONTAINS | BETWEEN | IN | NOT IN | LIKE | NOT LIKE | AND | OR } [ ] } [ { AND | OR } ... ]

Examples Return all columns: SELECT * FROM Account Rename a column: SELECT "Name" AS MY_Name FROM Account Search data: SELECT * FROM Account WHERE FirstName <> 'Bob'; The Amazon DynamoDB APIs support the following operators in the WHERE clause: =, !=, >, <, >=, <=, IS NULL, IS NOT NULL, BEGINSWITH, CONTAINS, NOT CONTAINS, BETWEEN, IN, NOT IN, LIKE, NOT LIKE, AND, OR. SELECT * FROM Account WHERE FirstName <> 'Bob';

Projection Functions

JSON_AVG(json, jsonpath) Computes the average value of a JSON array within a JSON object. json: The column containing JSON data. jsonpath: The path to the json array.

JSON_COUNT(json, jsonpath) Returns the number of elements in a JSON array within a JSON object. json: The column containing JSON data. jsonpath: The path to the json array.

TIBCO® Data Virtualization SQL Compliance | 61

JSON_MAX(json, jsonpath) Gets the maximum value in a JSON array within a JSON object. json: The column containing JSON data. jsonpath: The path to the json array.

JSON_MIN(json, jsonpath) Gets the minimum value in a JSON array within a JSON object. json: The column containing JSON data. jsonpath: The path to the json array.

JSON_SUM(json, jsonpath) Computes the sum of the elements in a JSON within a JSON object. json: The column containing JSON data. jsonpath: The path to the json array.

JSON_EXTRACT(json, jsonpath) Selects any value in a JSON array or object. The path to the array is specified in the jsonpath argument. Return value is numeric or null. json: The JSON document to extract. jsonpath: The XPath used to select the nodes. The JSONPath must be a string constant. The values of the nodes selected will be returned in a token-separated list.

XML_EXTRACT(xml, xpath [, separator]) Extracts an XML document using the specified XPath to flatten the XML. A comma is used to separate the outputs by default, but this can be changed by specifying the third parameter. xml: The XML document to extract. xpath: The XPath used to select the nodes. The nodes selected will be returned in a token-separated list. separator: The optional token used to separate the items in the flattened response. If this is not specified, the separator will be a comma.

TIBCO® Data Virtualization 62 | SQL Compliance

SELECT INTO Statements You can use the SELECT INTO statement to export formatted data to a file.

Data Export with an SQL Query The following query exports data into a file formatted in comma-separated values (CSV): SELECT Id, Name INTO "csv://Account.txt" FROM "Account" WHERE FirstName = 'Bob' You can specify other formats in the file URI. The possible delimiters are tab, semicolon, and comma with the default being comma. The following example exports tab-separated values: SELECT Id, Name INTO "csv://Account.txt;delimiter=tab" FROM "Account" WHERE FirstName = 'Bob'

INSERT Statements To create new records, use INSERT statements.

INSERT Syntax The INSERT statement specifies the columns to be inserted and the new column values. You can specify the column values in a comma-separated list in the VALUES clause: INSERT INTO ( [ , ... ] ) VALUES ( { | NULL } [ , ... ] )

::= | @ | ? | You can use the executeUpdate method of the Statement and PreparedStatement classes to execute data manipulation commands and retrieve the rows affected. To retrieve the Id of the last inserted record use getGeneratedKeys. Additionally, set the RETURN_GENERATED_KEYS flag of the Statement class when you call prepareStatement. String cmd = "INSERT INTO Account (Name) VALUES (?)"; PreparedStatement pstmt = connection.prepareStatement(cmd,Statement.RETURN_GENERATED_KEYS); pstmt.setString(1, "John"); int count = pstmt.executeUpdate();

TIBCO® Data Virtualization SQL Compliance | 63

System.out.println(count+" rows were affected"); ResultSet rs = pstmt.getGeneratedKeys(); while(rs.next()){ System.out.println(rs.getString("Id")); } connection.close();

UPDATE Statements To modify existing records, use UPDATE statements.

Update Syntax The UPDATE statement takes as input a comma-separated list of columns and new column values as name-value pairs in the SET clause. UPDATE SET { = } [ , ... ] WHERE { Id = } [ { AND | OR } ... ]

::= | @ | ? | You can use the executeUpdate method of the Statement or PreparedStatement classes to execute data manipulation commands and retrieve the rows affected. String cmd = "UPDATE Account SET Name='John' WHERE Id = ?"; PreparedStatement pstmt = connection.prepareStatement(cmd); pstmt.setString(1, "1"); int count = pstmt.executeUpdate(); System.out.println(count + " rows were affected"); connection.close();

DELETE Statements To delete from a table, use DELETE statements.

DELETE Syntax The DELETE statement requires the table name in the FROM clause and the row's primary key in the WHERE clause. ::= DELETE FROM WHERE { Id = } [ { AND | OR } ... ]

::= | @ | ?

TIBCO® Data Virtualization 64 | SQL Compliance

| You can use the executeUpdate method of the Statement or PreparedStatement classes to execute data manipulation commands and retrieve the number of affected rows. Connection connection = DriverManager.getConnection("jdbc:dynamodb:Access Key=xxx;Secret Key=xxx;Domain=amazonaws.com;Region=OREGON;",); String cmd = "DELETE FROM Account WHERE Id = ?"; PreparedStatement pstmt = connection.prepareStatement(cmd); pstmt.setString(1, "1"); int count=pstmt.executeUpdate(); connection.close();

EXECUTE Statements To execute stored procedures, you can use EXECUTE or EXEC statements. EXEC and EXECUTE assign stored procedure inputs, referenced by name, to values or parameter names.

Stored Procedure Syntax To execute a stored procedure as an SQL statement, use the following syntax:

{ EXECUTE | EXEC } { [ @ ] = } [ , ... ]

::= | @ | ? |

Example Statements Reference stored procedure inputs by name: EXECUTE my_proc @second = 2, @first = 1, @third = 3; Execute a parameterized stored procedure statement: EXECUTE my_proc second = @p1, first = @p2, third = @p3;

TIBCO® Data Virtualization Data Model | 65

Data Model

The adapter allows you to access data in Amazon DynamoDB using a standard database-like interface. Amazon DynamoDB is a highly scalable NoSQL cloud database that is very different from a regular database. In this section we describe how we model schemaless Amazon DynamoDB tables as regular Tables and Stored Procedures. The adapter can dynamically detect schemas at connection time. See Automatic Schema Discovery for more information on defining schemas implicitly at connection time. This method is useful if the structure of your data is volatile. You can also persist schemas in static schema definitions. The adapter's schema files have a simple format. See Custom Schema Definitions for more information on defining and extending static schemas.

Tables The list of tables is dynamically retrieved from your Amazon DynamoDB account. You can use the CreateTable stored procedure to create a new table, or you can create a table using the Amazon Web Services Admin Console. Because DynamoDB tables are partitioned based on their key, you should take care in selecting a proper key based on the query requirements of your table. Refer to the documentation for DynamoDB for more information about using best practices to model data in DynamoDB tables. DynamoDB supports two types of primary keys: • Hash Primary Key: This is a single-column key. • Hash and Range Primary Key: This is a two-column key that includes a hash column and a range column. The adapter will model all key attributes in DynamoDB as key columns.

Table Columns Since Amazon DynamoDB tables are schemaless, the adapter offers the following two mechanisms to uncover the schema.

TIBCO® Data Virtualization 66 | Data Model

Dynamic Schemas The columns of a table are dynamically determined by scanning data in the first few rows. You can adjust the number of rows that are used by modifying the RowScanDepth property. In addition to the name of the column, the row scan also determines the data type. The following table shows how the different data types supported by Amazon DynamoDB are modeled in the adapter.

Amazon Modele Encodin Sample Value Dynamo-DB d Type g Type

Boolean Boolean Not USA Required

String String Not USA Required

Blob String Not Required

Number Double Not 24.0 Required

String Array String JSON ["USA","Canada","UK"] Array

Number Array String JSON [20,200.5,500] Array

Blob Array JSON JSON ["ABCD","EFGH"] Array Array

Document JSON JSON {"Address":"123 Fake Object Object Street","City":"Chapel Hill","Zip":"27516"}

List JSON JSON [{"S":"mystring"},{"NS":[1, Array Array 2]},{"N":4}]

TIBCO® Data Virtualization Data Model | 67

Static Schemas Instead of using dynamically discovered schemas, you can define your own schemas. This will give you more control over the projected columns and also enable you to use other data types such as boolean, datetime, etc. Refer to the CreateSchema Stored Procedure in order to create your own schema. You can simply specify the FileName (fullpath) and TableName of the new schema file, which should match with the name of the Amazon DynamoDB table, and edit the column listing to use it for your own table.

Schemaless Operations While the schema of the table is necessary to report metadata, data may be selected, inserted, updated, or deleted from columns that do not exist in the schema. Columns that do not already exist in the table schema will have their data types dynamically determined based on the data that is specified. See DynamoDB Queries for more information.

Stored Procedures Stored procedures are available to complement the data available from the NoSQL Database. It may be necessary to update data available from a view using a stored procedure because the data does not provide for direct, table-like, two-way updates. In these situations, the retrieval of the data is done using the appropriate view or table, while the update is done by calling a stored procedure. Stored procedures take a list of parameters and return back a dataset that contains the collection of tuples that constitute the response.

Amazon DynamoDB Adapter Stored Procedures

Name Description

CreateSchema Creates a schema file for the specified table or view.

CreateTable Creates a table in DynamoDB.

CreateSchema Creates a schema file for the specified table or view.

TIBCO® Data Virtualization 68 | Data Model

Input

Name Type Description

TableName String The name of the table or view.

FileName String The full file path and name of the schema to generate. Ex : 'C:\\Users\\User\\Desktop\\SmartSheet\\sheet.rsd'

Output

Name Type Description

Result String Returns Success or Failure.

CreateTable Creates a table in DynamoDB.

Input

Name Type Description

Tablename String The name of the table to create. A minimum of 3 characters and maximum of 255 characters are allowed.

Keyschema_at String The attribute name of a primary key for the schema. At least one tributename# is required. A maximum of 2 can be specified.

Keyschema_ke String The type of key for a primary key for the schema. At least one is ytype# required. A maximum of 2 can be specified. The allowed values are HASH, RANGE.

Provisionedthr String A long representing the maximum number of strongly consistent oughput_read reads consumed per second before DynamoDB returns a capacityunits ThrottlingException.

Provisionedthr String A long representing the maximum number of strongly consistent oughput_write writes consumed per second before DynamoDB returns a capacityunits ThrottlingException.

TIBCO® Data Virtualization Data Model | 69

Attributedefini String The name of an attribute or column to add to the table. Minimum tions_attribute length of 1, Maximum length of 255. name#

Attributedefini String The type of attribute or column to add to the table. Enter S, N, or B tions_attribute for string, number or boolean. type# The allowed values are S, N, B.

Globalseconda String One or more global secondary indexes. The maximum is five and ryindexes_ind the minimum 1. The name of the global secondary index, which exname# must be unique only within this table. Minimum length of 3; maximum 255 characters.

Globalseconda String The full KeySchema JSON structure, which consists of an array of ryindexes_key attribute names and key types. There must be at least one entry schema# and a maximum of two can be specified.

Globalseconda String The full Projection JSON structure, which represents attributes ryindexes_proj that are copied (projected) from the table into an index. These are ection# in addition to the primary key attributes and index key attributes, which are automatically projected. A minimum of one projection object and a maximum of 20 are in the list.

Globalseconda String A long representing the maximum number of strongly consistent ryindexes_pro reads consumed per second before DynamoDB returns a visionedthrou ThrottlingException. ghput_readcap acityunits#

Globalseconda String A long representing the maximum number of strongly consistent ryindexes_pro writes consumed per second before DynamoDB returns a visionedthrou ThrottlingException. ghput_writeca pacityunits#

Localsecondar String One or more global secondary indexes. The maximum is five. The yindexes_inde minimum is 1. The name of the global secondary index must be xname# unique only within this table. Minimum length of 3; maximum 255 characters.

Localsecondar String The full KeySchema JSON structure, which consists of an array of yindexes_keys attribute names and key types. There must be at least one entry chema# and a maximum of two can be specified.

TIBCO® Data Virtualization 70 | Data Model

Localsecondar String The full Projection JSON structure, which represents attributes yindexes_proje that are copied (projected) from the table into an index. These are ction# in addition to the primary key attributes and index key attributes, which are automatically projected. A minimum of one projection object and a maximum of 20 are in the list.

Output

Name Type Description

Attributedefini String The name of an attribute on the table. This will be either a tions_attribute primary key, a global secondary index, or a local secondary name# index.

Attributedefini String The type of an attribute on the table. This will be either a tions_attribute primary key, a global secondary index, or a local secondary type# index.

Createdondate String The date and time when the table was created in UNIX epoch time time.

Globalseconda String The name of the global secondary index. ryindexes_ind exname#

Globalseconda String The total size of the global secondary index, in bytes. ryindexes_ind DynamoDB updates this value approximately every six hours. exsizebytes# Recent changes might not be reflected in this value.

Globalseconda String The current status of the global secondary index. Valid values ryindexes_ind are CREATING, UPDATING, DELETING, and ACTIVE. exstatus#

Globalseconda String The number of items in the global secondary index. DynamoDB ryindexes_ite updates this value approximately every six hours. Recent mcount# changes might not be reflected in this value.

Globalseconda String The date and time of the last provisioned throughput decrease ryindexes_pro for this index. visionedthrou ghput_lastdecr easedatetime#

TIBCO® Data Virtualization Data Model | 71

Globalseconda String The date and time of the last provisioned throughput increase ryindexes_pro for this index. visionedthrou ghput_lastincr easedatetime#

Globalseconda String The number of provisioned throughput decreases for this index ryindexes_pro during this UTC calendar day. For current maximums on visionedthrou provisioned throughput decreases, see Limits in the Amazon ghput_number DynamoDB Developer Guide. ofdecreasestod ay#

Globalseconda String The maximum number of strongly consistent reads consumed ryindexes_pro per second before DynamoDB returns a ThrottlingException. visionedthrou Eventually consistent reads require less effort than strongly ghput_readcap consistent reads, so a setting of 50 ReadCapacityUnits per acityunits# second provides 100 eventually consistent ReadCapacityUnits per second.

Globalseconda String The maximum number of writes consumed per second before ryindexes_pro DynamoDB returns a ThrottlingException. visionedthrou ghput_writeca pacityunits#

Keyschema_at String The name of a key attribute. tributename#

Keyschema_ke String The type of key for a primary key for the schema. ytype#

Localsecondar String The name of the local secondary index. The name must be yindexes_inde unique among all other indexes on this table. xname#

Provisionedthr String A long representing the maximum number of strongly oughput_read consistent reads consumed per second before DynamoDB capacityunits returns a ThrottlingException.

Provisionedthr String A long representing the maximum number of strongly oughput_write consistent writes consumed per second before DynamoDB capacityunits returns a ThrottlingException.

TIBCO® Data Virtualization 72 | Data Model

TIBCO® Data Virtualization