SQream DB Release 2021.1
Sep 27, 2021

Contents:

1 First steps with SQream DB
1.1 Preparing for this tutorial
1.2 Create a new database for playing around in
1.3 Creating your first table
1.4 Listing tables
1.5 Inserting rows
1.6 Queries
1.7 Deleting rows
1.8 Saving query results to a CSV or PSV file
2 Guides
2.1 Full list of guides
2.1.1 Inserting data
2.1.1.1 Data loading overview
2.1.1.2 Data loading considerations
2.1.1.2.1 Verify data and performance after load
2.1.1.2.2 File source location for loading
2.1.1.2.3 Supported load methods
2.1.1.2.4 Unsupported data types
2.1.1.2.5 Extended error handling
2.1.1.2.6 Best practices for CSV
2.1.1.2.7 Best practices for Parquet
2.1.1.2.7.1 Type support and behavior notes
2.1.1.2.8 Best practices for ORC
2.1.1.2.8.1 Type support and behavior notes
2.1.1.3 Further reading and migration guides
2.1.1.3.1 Insert from CSV
2.1.1.3.1.1 1. Prepare CSVs
2.1.1.3.1.2 2. Place CSVs where SQream DB workers can access
2.1.1.3.1.3 3. Figure out the table structure
2.1.1.3.1.4 4. Bulk load the data with COPY FROM
2.1.1.3.1.5 Loading different types of CSV files
2.1.1.3.1.6 Loading a standard CSV file from a local filesystem
2.1.1.3.1.7 Loading a PSV (pipe separated value) file
2.1.1.3.1.8 Loading a TSV (tab separated value) file
2.1.1.3.1.9 Loading a text file with non-printable delimiter
2.1.1.3.1.10 Loading a text file with multi-character delimiters
2.1.1.3.1.11 Loading files with a header row
2.1.1.3.1.12 Loading files formatted for Windows (\r\n)
2.1.1.3.1.13 Loading a file from a public S3 bucket
2.1.1.3.1.14 Loading files from an authenticated S3 bucket
2.1.1.3.1.15 Loading files from an HDFS storage
2.1.1.3.1.16 Saving rejected rows to a file
2.1.1.3.1.17 Stopping the load if a certain amount of rows were rejected
2.1.1.3.1.18 Load CSV files from a set of directories
2.1.1.3.1.19 Rearrange destination columns
2.1.1.3.1.20 Loading non-standard dates
2.1.1.3.2 Insert from Parquet
2.1.1.3.2.1 1. Prepare the files
2.1.1.3.2.2 2. Place Parquet files where SQream DB workers can access them
2.1.1.3.2.3 3. Figure out the table structure
2.1.1.3.2.4 4. Verify table contents
2.1.1.3.2.5 5. Copying data into SQream DB
2.1.1.3.2.6 Working around unsupported column types
2.1.1.3.2.7 Modifying data during the copy process
2.1.1.3.2.8 Further Parquet loading examples
2.1.1.3.2.9 Loading a table from a directory of Parquet files on HDFS
2.1.1.3.2.10 Loading a table from a bucket of files on S3
2.1.1.3.3 Insert from ORC
2.1.1.3.3.1 1. Prepare the files
2.1.1.3.3.2 2. Place ORC files where SQream DB workers can access them
2.1.1.3.3.3 3. Figure out the table structure
2.1.1.3.3.4 4. Verify table contents
2.1.1.3.3.5 5. Copying data into SQream DB
2.1.1.3.3.6 Working around unsupported column types
2.1.1.3.3.7 Modifying data during the copy process
2.1.1.3.3.8 Further ORC loading examples
2.1.1.3.3.9 Loading a table from a directory of ORC files on HDFS
2.1.1.3.3.10 Loading a table from a bucket of files on S3
2.1.1.3.4 Migrating from Oracle
2.1.1.3.4.1 1. Preparing the tools and login information
2.1.1.3.4.2 2. Export the desired schema
2.1.1.3.4.3 Examples
2.1.1.3.4.4 Dump all tables
2.1.1.3.4.5 Dump only specific tables
2.1.1.3.4.6 3. Convert the Oracle dump to standard SQL
2.1.1.3.4.7 Example
2.1.1.3.4.8 4. Figure out the database structures
2.1.1.3.4.9 Remove unsupported attributes
2.1.1.3.4.10 Match data types
2.1.1.3.4.11 Additional considerations
2.1.1.3.4.12 5. Create the tables in SQream DB
2.1.1.3.4.13 Example
2.1.1.3.4.14 6. Export tables to CSVs
2.1.1.3.4.15 Using SQL*Plus to export data lists
2.1.1.3.4.16 Creating CSVs using stored procedures
2.1.1.3.4.17 CSV generation considerations
2.1.1.3.4.18 7. Place CSVs where SQream DB workers can access
2.1.1.3.4.19 8. Bulk load the CSVs
2.1.1.3.4.20 Example
2.1.1.3.4.21 9. Rewrite Oracle queries
2.1.2 Client drivers for v2021.1
2.1.2.1 Client driver downloads
2.1.2.1.1 All operating systems
2.1.2.1.2 Windows
2.1.2.1.3 Linux
2.1.2.1.3.1 Python (pysqream)
2.1.2.1.3.2 Installing the Python connector
2.1.2.1.3.3 Prerequisites
2.1.2.1.3.4 1. Python
2.1.2.1.3.5 2. PIP
2.1.2.1.3.6 3. OpenSSL for Linux
2.1.2.1.3.7 4. Cython (optional)
2.1.2.1.3.8 Install via pip
2.1.2.1.3.9 Upgrading an existing installation
2.1.2.1.3.10 Validate the installation
2.1.2.1.3.11 SQLAlchemy examples
2.1.2.1.3.12 A simple connection example
2.1.2.1.3.13 Pulling a table into Pandas
2.1.2.1.3.14 API Examples
2.1.2.1.3.15 Explaining the connection example
2.1.2.1.3.16 Using the cursor
2.1.2.1.3.17 Reading result metadata
2.1.2.1.3.18 Loading data into a table
2.1.2.1.3.19 Reading data from a CSV file for load into a table
2.1.2.1.3.20 Using SQLAlchemy ORM to create tables and fill them with data
2.1.2.1.3.21 pysqream API reference
2.1.2.1.3.22 C++ Driver
2.1.2.1.3.23 Installing the C++ driver
2.1.2.1.3.24 Prerequisites
2.1.2.1.3.25 Getting the library
2.1.2.1.3.26 Extract the tarball archive
2.1.2.1.3.27 Examples
2.1.2.1.3.28 Testing the connection to SQream DB
2.1.2.1.3.29 Compiling and running the application
2.1.2.1.3.30 Creating a table and inserting values
2.1.2.1.3.31 Compiling and running the application
2.1.2.1.3.32 JDBC
2.1.2.1.3.33 Installing the JDBC driver
2.1.2.1.3.34 Prerequisites
2.1.2.1.3.35 Getting the JAR file
2.1.2.1.3.36 Extract the zip archive
2.1.2.1.3.37 Setting up the Class Path
2.1.2.1.3.38 Connect to SQream DB with a JDBC application
2.1.2.1.3.39 Driver class
2.1.2.1.3.40 Connection string
2.1.2.1.3.41 Connection parameters
2.1.2.1.3.42 Connection string examples
2.1.2.1.3.43 Sample Java program
2.1.2.1.3.44 ODBC
2.1.2.1.3.45 Install and Configure ODBC on Windows
2.1.2.1.3.46 Installing the ODBC Driver
2.1.2.1.3.47 Prerequisites
2.1.2.1.3.48 Visual Studio 2015 Redistributables
2.1.2.1.3.49 Administrator Privileges