AWS Marketplace Getting Started
Actian Vector
VAWS-51-GS-14

Copyright © 2018 Actian Corporation. All Rights Reserved.

This Documentation is for the end user’s informational purposes only and may be subject to change or withdrawal by Actian Corporation (“Actian”) at any time. This Documentation is the proprietary information of Actian and is protected by the copyright laws of the United States and international treaties. The software is furnished under a license agreement and may be used or copied only in accordance with the terms of that agreement. No part of this Documentation may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, or for any purpose without the express written permission of Actian.

To the extent permitted by applicable law, ACTIAN PROVIDES THIS DOCUMENTATION “AS IS” WITHOUT WARRANTY OF ANY KIND, AND ACTIAN DISCLAIMS ALL WARRANTIES AND CONDITIONS, WHETHER EXPRESS OR IMPLIED OR STATUTORY, INCLUDING WITHOUT LIMITATION, ANY IMPLIED WARRANTY OF MERCHANTABILITY, OF FITNESS FOR A PARTICULAR PURPOSE, OR OF NON-INFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT WILL ACTIAN BE LIABLE TO THE END USER OR ANY THIRD PARTY FOR ANY LOSS OR DAMAGE, DIRECT OR INDIRECT, FROM THE USE OF THIS DOCUMENTATION, INCLUDING WITHOUT LIMITATION, LOST PROFITS, BUSINESS INTERRUPTION, GOODWILL, OR LOST DATA, EVEN IF ACTIAN IS EXPRESSLY ADVISED OF SUCH LOSS OR DAMAGE.

The manufacturer of this Documentation is Actian Corporation.

For government users, the Documentation is delivered with “Restricted Rights” as set forth in 48 C.F.R. Section 12.212, 48 C.F.R. Sections 52.227-19(c)(1) and (2) or DFARS Section 252.227-7013 or applicable successor provisions.
Actian, Actian DataCloud, Actian DataConnect, Actian X, Versant, PSQL, Actian Director, Actian Vector, Actian Vector in Hadoop, EDBC, Enterprise Access, Ingres, OpenROAD, and Vectorwise are trademarks or registered trademarks of Actian Corporation and its subsidiaries. All other trademarks, trade names, service marks, and logos referenced herein belong to their respective companies.

Contents

1. Introduction
   What Is Actian Analytics Database - Vector?
   Vector Technology
   Vectorised Processing--Calculating Query Answers Fast
   Storage Innovations--Beating the Disk Bottleneck
   Vector Amazon Machine Image (AMI)

2. Deploying Vector from the AWS Marketplace
   Access the Vector AMI
   Launch and Deploy the Vector AMI

3. Running Queries, Creating Databases and Tables, and Loading Data
   Using Actian Director
   Download and Install Actian Director
   Start Director and Connect to the Vector EC2 Instance
   Run Queries with Actian Director
   Load Sample Data Using Vector CLI and Director
   Using the Vector Command Line Interface
   Start the Vector Command Line Interface
   Run Queries with Actian Vector CLI
   Load Sample Data Using the Vector CLI
   Remove the Sample Databases
   More About Actian Vector

A. More Sample Queries

B. Configuring Actian Vector Enterprise Edition on AWS Marketplace
   1. Choose an Instance Type
   2. Choose an Instance Size
   3. Select a Price Plan

C. Configuring Storage for Vector on AWS
   AWS EC2 Storage Concepts and Options
   AWS EC2 Root Device Volume
   Storage Options
   Tuning Volume Layout for Performance
   Instance Store Volumes
   EBS Volumes
   Configuring Vector to Use the Newly Set Up Disks

D. Migrating Vector Databases Between AMIs

1. Introduction

This section contains the following topics:
   What Is Actian Analytics Database - Vector?
   Vector Technology
   Vector Amazon Machine Image (AMI)

What Is Actian Vector?
Actian Vector (hereafter Vector) is a next generation database management system from the Actian family of database products. Vector is targeted at analytical database applications: applications that need to process large volumes of data and perform complex operations on it to derive useful information. Typical examples include data warehousing, data mining, and reporting. Vector is optimized to work with both memory- and disk-resident datasets, allowing it to efficiently process large amounts of data (hundreds of gigabytes).

Note: Although it is fast for data analysis, Vector is not meant to be used for traditional transaction processing. For OLTP, you can use other products from Actian such as Ingres Database.

Vector Technology

Vector introduces a new way of storing data and a completely new mechanism for evaluating queries. Innovations such as vectorised processing, compression, and columnar data layout allow analytical queries to be run fast on a single server, even a laptop.

Vectorised Processing--Calculating Query Answers Fast

The most distinctive feature of Vector is the “vectorised” method it uses for evaluating queries. Rather than operating on single values from single table records at a time, Vector makes the CPU operate on “vectors,” which are arrays of values from many different records. Such vectorised execution brings out the best in modern CPU technology. It brings to the world of databases the high performance that modern computers exhibit for scientific calculation, gaming, and multimedia applications.

The technical basis for efficiency of vectorised processing is that modern chip technology (be it Intel, AMD, or IBM manufactured) now uses deeply pipelined CPU designs. Keeping all pipelines full, and thus efficiency near peak, is impossible for traditional database engines primarily due to code complexity.
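The difference between evaluating one record at a time and evaluating one vector at a time can be sketched in Python. This is only an illustration of the control flow, not Vector's implementation: the function names, the batch size of 1024, and the price and quantity columns are assumptions made for the example. Pure Python also does not reproduce the speed-up, which in a real engine comes from tight compiled loops over arrays that stay in the CPU cache.

```python
from array import array

# Hypothetical batch size; a real engine would size vectors to fit the CPU cache.
VECTOR_SIZE = 1024


def sum_tuple_at_a_time(prices, quantities, min_qty):
    """Row-at-a-time evaluation: filter, multiply, and add for each record in turn."""
    total = 0
    for price, qty in zip(prices, quantities):
        if qty >= min_qty:
            total += price * qty
    return total


def select_ge(values, threshold):
    """Primitive: positions within one vector where value >= threshold."""
    return [i for i, v in enumerate(values) if v >= threshold]


def multiply_selected(a, b, positions):
    """Primitive: element-wise product of two vectors at the selected positions."""
    return [a[i] * b[i] for i in positions]


def sum_vector_at_a_time(prices, quantities, min_qty, vector_size=VECTOR_SIZE):
    """Vector-at-a-time evaluation: each primitive runs over a whole batch of values."""
    total = 0
    for start in range(0, len(prices), vector_size):
        price_vec = prices[start:start + vector_size]
        qty_vec = quantities[start:start + vector_size]
        positions = select_ge(qty_vec, min_qty)
        total += sum(multiply_selected(price_vec, qty_vec, positions))
    return total


# Example columns (integer values keep the two results exactly comparable).
prices = array("q", range(1, 5001))
quantities = array("q", [i % 7 for i in range(5000)])
```

Both functions compute the same filtered sum over the two columns. The vectorised version splits the work into simple primitives that each loop over one batch, which is the shape of code that pipelined CPUs execute efficiently.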
Similarly, crucial CPU features such as Intel's Streaming SIMD Extensions (SSE) and Advanced Vector Extensions (AVX) are not used well by traditional database systems. Vectorised processing changes that. It provides efficiency that traditionally is only obtained by computer programs handwritten for one particular task.

Also, because of the high clock frequency of current CPUs, database systems now need to treat main memory access as a significant cost factor. Vector tackles this by ensuring that the vectors it operates on fit inside the CPU caches, avoiding unnecessary (and in multi-core systems, often contended) main memory access.

Vector takes advantage of multi-core systems by handling multiple queries concurrently or by running single queries in parallel.

The improved overall computational efficiency of Vector over traditional commercial relational database technology is at least an order of magnitude for long running analytical queries.

Storage Innovations--Beating the Disk Bottleneck

Any database system with such a high computational speed runs the risk of becoming I/O bound. For this reason, the second major component of Vector consists of storage innovations designed for high I/O throughput. These innovations include:

• Columnar data layout
• Advanced compression
• Storage indexes

The Vector storage mechanism uses columnar data layout, which allows analytical queries to avoid disk access for columns not involved in a query. While you can generally think of Vector storage as a column store, Vector can mix columnar and row-based storage so that certain columns that are always accessed together get stored in the same disk block. Layout decisions are handled automatically by the system, but can also be controlled by the user.

To further avoid I/O becoming a performance bottleneck, Vector introduces a number of advanced compression schemes. These schemes are designed for fast decompression.
Therefore, accessing compressed data in Vector means that less data needs to come from disk, yet queries do not slow down due to decompression.

Finally, Vector uses storage indexes. The storage indexes are small and store the minimum and maximum value per data block. The storage index, which is automatically created and maintained, enables the execution engine to rapidly identify candidate data blocks.

Vector Amazon Machine Image (AMI)

An Amazon Machine Image (AMI) is a special type of virtual appliance used to create a virtual machine within the Amazon Elastic Compute Cloud (“EC2”). The AMI serves as the basic unit of deployment for services delivered using EC2.

An AMI is a template that contains a software configuration (for example, an operating system, an application server, and applications). An AMI provides the information required to launch an instance, which is a virtual server in the cloud. You specify an AMI when you launch an instance, and you can launch as many instances from the AMI as you need. [1]

The Vector AMI is a Linux image that contains:

• Linux operating system: CentOS 7.4
• Vector Community Edition or Enterprise Edition
• Sample database table
• Sample data: real-world data from the U.S. Bureau of Transportation

It is deployed in the Amazon Web Services (AWS) cloud. For more information, see Deploying Vector from the AWS Marketplace.

[1] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instances-and-amis.html

2. Deploying Vector from the AWS Marketplace

This section contains the following topics:
   Access the Vector AMI
   Launch and Deploy the Vector AMI

Access the Vector AMI

The Actian Vector AMI is available from the AWS Marketplace at the following locations:

• Community Edition: https://aws.amazon.com/marketplace/pp/B07FXYD6GX
  The Community Edition is limited in size to 100 GB.
• Enterprise Edition: https://aws.amazon.com/marketplace/pp/B07FMYGCJL Vector Enterprise Edition has no data limits and entitles you to