Vertica Overview Data Sheet
Analytics and Big Data

The Vertica Analytics Platform delivers speed, scalability, and built-in machine learning that today’s most analytically intensive workloads demand, whether in the public clouds, on-premises, on Hadoop, or any hybrid combination.

Key Benefits

Step Up to the Fastest, Most Flexible Big Data Analytics Platform

What should you look for in a data analytics warehouse to address today’s and tomorrow’s data challenges? Consider the following Vertica capabilities:

■ Unify your analytics, not the data: Vertica’s Unified Analytics Warehouse allows you to combine data siloes that are growing exponentially, without moving the data.
■ Save on both storage and computational charges: While cloud-based data storage is low cost, analyzing that data can lead to prohibitively expensive compute charges. Vertica in Eon Mode manages dynamic workloads, so you can spin up storage and compute resources as you need them, and spin them down afterward to eliminate unnecessary costs.
■ Meet business expectations: Users don’t want to wait for results. Vertica provides the scalability to meet service level agreements (SLAs) and business needs with the best TCO and fastest ROI, including the ability to dedicate compute resources to individual use cases without replicating the data.
■ Embrace popular tools: Vertica provides robust and powerful SQL and is certified to work with all of your tools, not just those from your primary vendor or limited to a single infrastructure. Use the extract, transform, load (ETL) tools or SQL-based visualizations of your choosing.

Key Features

At the core of the Vertica Advanced Analytics Platform is a column-oriented, relational database built specifically to handle today’s analytic workloads. This powerful analytics platform provides you with:

■ Complete and advanced SQL-based analytical functions to provide powerful SQL analytics.
■ A clustered approach to storing big data, offering superior query and analytic performance.
■ Better compression, requiring less hardware and storage than comparable data analytics solutions.
■ Flexibility and scalability to easily ramp up when workloads increase.
■ Better load throughput and concurrency with querying.
■ In-database machine learning addressing every step in the ML process: algorithms, R support, Python extensibility, and more.
■ Analyze data in place and in any format, including complex data types like Maps, Arrays, and Structs in Parquet on S3 or HDFS, for open SQL-based analytics and new use cases.
■ Vertica in Eon Mode provisions dynamic workloads as needed, separates storage and compute, and enables workload isolation to serve multiple departments without duplicating the data. Vertica in Enterprise Mode is ideal for stable workloads and regular queries.
■ Runs in the clouds, including Google Cloud Platform (GCP), Azure, AWS, and VMware clouds; and runs on-premises with commodity hardware and support for a range of object stores, such as Apache Hadoop HDFS for communal storage, MinIO, and Pure Storage FlashBlade S3.

Figure 1. Vertica’s open architecture and rich ecosystem
Built-in Functions: Geospatial, Machine Learning, Event Series, Text Analytics, Time Series, Pattern Matching, Real-Time, Regression, Statistics
Analyze External Tables in the Right Place: Parquet, HDFS, Amazon S3, Apache ORC
Data Visualization: Logi Analytics, Looker, Power BI, Qlik, Tableau; ODBC, JDBC, OLEDB
On-Premises Deployment Options: OpenStack, Commodity Hardware, MinIO, Pure Storage, Hadoop
Cloud Deployment Options: Microsoft Azure, Amazon Web Services, Google Cloud Platform
User Defined Functions: R, Python, Java, SQL, C/C++
Security Integrations: LDAP, FIPS, Kerberos, Voltage
User Defined Loads: Data Transformation (Spark), Messaging (Kafka), ETL (Attunity, Informatica)

Figure 2. Vertica built-in machine learning process flow
Phases shown: Business Understanding, Data Analysis & Understanding, Data Preparation, Modeling, Evaluation, Deployment

Product Overview

Vertica provides blazingly fast speed (queries run 10–50X faster), exabyte scale (store 10–30X more data per server), and broad ecosystem integration (use any business intelligence tools, ETL tools, storage, etc.) at a much lower cost than traditional data warehouses or cloud-only data warehouses.

The Power to Handle Today’s Massive Data Volumes

Modern businesses must manage more data sources than ever before: no longer just CRM and ERP systems, but also IoT sensors, social media data, Web logs and data streams, gas and electrical grids, and mobile networks, just to name a few. Organizations that are truly data-driven must manage this explosive data growth, and discover the patterns and trends that can lead to new business opportunities, as well as repeat business from their customers.

Vertica answers these needs. It handles data at exabyte scale, and enables your organization to unify data siloes across multiple cloud and hybrid (cloud and on-premises) environments. Not only can Vertica manage massive data volumes, it keeps you from getting locked into a single cloud vendor. Use the tools of your choice, and take full advantage of the underlying infrastructure you already have in place, with portability across multi-cloud, on-prem, and Hadoop data lakes.

Operationalize Machine Learning at Scale

Not long ago, data science was limited by the inability to base models on full data volumes, which led to inaccurate predictions. To make matters worse, the majority of machine learning initiatives never make their way into production at all, so only portions of the organization benefit from the work of data scientists.

With Vertica, you can finally operationalize machine learning, so that you can understand, and act on, what that data is telling you, with the speed and scalability to make a difference. Vertica’s in-database machine learning supports the entire predictive analytics process with massively parallel processing and a familiar SQL interface, allowing data scientists and analysts to build their models using their preferred tools and languages to embrace the power of big data and accelerate business outcomes with no limits and no compromises. More details below.

imbalanced data processing, missing value imputation and more.
■ Create, train, and test advanced machine learning models on massive data sets.
■ Evaluate model-level statistics including ROC tables and confusion matrices.
■ Revert to previous model iterations using model management and version control features.
■ Massively Parallel Processing (MPP) architecture allows you to build and deploy models at petabyte-scale with extreme

The Technology Big Data Demands

Vertica is built from the ground up to tackle the challenges of big data analytics. Its massively parallel processing system addresses the most demanding analytics use cases in the industry. Its columnar store offers aggressive data compression, so it delivers extremely fast results, reducing query times from hours to minutes, or minutes to seconds, something outdated row-store technologies cannot achieve.

Vertica offers advanced SQL-based analytics, from graph analysis, to triangle counting, to Monte Carlo simulations, to time series and geospatial, and more. All this can be applied to your “hot” data loaded directly into Vertica for the most demanding use cases.

You also get choices. Vertica is the only advanced analytics platform that can analyze data in HDFS, in S3 Object Storage, and within the Vertica data warehouse itself, including the ability to join these disparate data sets into unified analytics. Perhaps most important, Vertica offers the broadest choice of deployment modes: Vertica in Eon Mode for dynamic workloads that benefit from a separation of compute and storage architecture, and Vertica in Enterprise Mode for more predictable workloads on servers with tightly coupled storage. You can choose the architecture that works for you today.

Protection of data everywhere it goes: SecureData Cloud provides security in the cloud across Hybrid IT systems. NiFi integration enables IoT protection at the edge.
Protection that scales with big data: SecureData delivers protection that scales with the growth of nodes, data volumes and data types.

Unifying Today’s Big Data Siloes

Vertica ensures that all the time, money, and effort you’ve put into storing your data turns into business value. It provides a unified analytics platform that can analyze data where it resides (HDFS or Cloud Object Storage) and in all popular formats: ORC, Parquet, JSON, or ROS (native Vertica).

Along with eliminating data center maintenance, public clouds have provided the architectural advantage of separating data compute and storage, and provisioning on-demand. But high compute charges from any given cloud vendor can quickly sink your big data budget. Vertica solves this problem when deployed in Eon Mode. Analytics