White paper Business

The world’s fastest analytic White paper Business

Contents The world’s fastest analytic database 01 02 03 Introduction 3 What is Exasol? 6 Value proposition 10 Key capabilities 4 Core architecture 6

Other features 9 04 05 Deployment benefits 12 Conclusion 14 01 White paper Business 3 Introduction

Exasol was founded in Supported from Exasol offices As a and Nuremberg, Germany with a in Germany, UK, USA and Brazil analytics engine, either single purpose – to engineer and with partners across Europe, standalone or integrated with the world’s fastest database Israel and Japan, over 300 Hadoop, Exasol is being used in a for analytics, with no limits on organisations are improving their wide range of Big Data use cases, data volumes. For more than business operational efficiencies including accelerating standard a decade, the company has and/or delivering excellent reporting, running multi-user focused exclusively on delivering customer service using Exasol. ad-hoc analytics, and performing ultra-fast, massively scalable, Exasol customers span many complex modelling using analytic performance. market sectors including Digital predictive in-database analytics. Media, Retail, Communications, The company’s flagship product, Financial Services, Manufacturing Exasol, is a high-performance, in- and Research. memory, MPP database designed specifically for analytics. In 2011, Designed from the ground up, Exasol set a new performance using state-of-the-art software record in the TPC-H benchmark techniques and principles, Exasol for clustered, decision support runs on low cost, commodity , and in 2014 extended x86 hardware; scales from 10’s these performance and price/ GBs to 100’s TBs data; is quick to performance records. Exasol implement and delivers extreme remains in the number one performance, without cost and position for all volumes of data, complexity. from 100GB right up to 100TB. 01 White paper Business 4 Key capabilities

In-memory technology Column-based storage Massively Parallel Processing

Innovative in-memory algorithms and compression Exasol was developed as a parallel enable large amounts of data to Columnar storage and compression system based on a shared nothing be processed in main memory for reduces the number of I/O operations architecture. Queries are distributed dramatically faster access times. and amount of data needed for across all nodes in a cluster using processing in main memory and optimised, parallel algorithms that accelerates performance. process data locally in each node’s main memory.

High user concurrency Scalability Tuning-free database

Thousands of users can simultaneously Linear scalability lets you to extend Intelligent algorithms continuously access and analyse large amounts of your system and increase performance monitor usage and perform self-tuning, data without compromising query by adding additional nodes optimizing system performance and performance. minimizing administrative work. 01 White paper Business 5 Key capabilities

Fast access to all data Comprehensive Hadoop Advanced connectivity sources integration In addition to existing JDBC, ODBC

Exasol’s data virtualization framework Data that is in any native format and .NET interfaces, Exasol now also (“virtual schemas”) and high- supported by HCatalog can be loaded supports a web socket-based SQL performance data import framework directly from HDFS, which makes interface. Using this new interface, allow users to connect to new data high-speed analysis of structured almost any platform can easily access sources more easily and analyze them and unstructured data easy and Exasol, even if there is no dedicated faster than before. straightforward. Data is transferred fast driver available. Exasol already offers a and in a parallelized manner. Python adapter based on this API.

Various use cases Advanced In-Database

Exasol is a very flexible solution and Analytics can be integrated into various business User Defined Functions (UDF) allow models. Exasol can be deployed either in-database advanced analytics to be as a software-only solution, as an easily run using R, Python, Lua and appliance or in the cloud (EXACloud, Java. Microsoft Azure or Amazon Web Services). 02 White paper Business 6 What is Exasol? Core architecture Logical Achitecture

Exasol’s architecture can be seen in Figure 1. Exasol offers a user- friendly, web-based graphical user interface (EXAOperation), its own cluster management system (EXAClusterOS) as well as its own storage management module (EXAStorage). Exasol supports ANSI ExaSolution standard SQL 2008 (including all analytical functions) as well as a large selection of the commonly-used Oracle SQL dialect. Its support of parts of Oracle SQL language set is of particular benefit when migrating from Oracle-based applications; there is minimal need, if any at all, to re-write or change the code. ExaStorage

Additional “hot standby” servers in the cluster guarantee redundancy and resilience. Should a server go down, one of the ExaOperation “hot standby” servers automatically takes over and the cluster can ExaClusterOS continue to operate. The defective server can then be removed from the cluster and replaced without taking the Exasol database offline, the new server then becomes the new “hot standby” server. CentOS/Linux CentOS/Linux CentOS/Linux Server Server Server

Figure 1. Exasol Logical Architecture 02 White paper Business 7 What is Exasol? Core architecture

Exasol has high degrees of automation built in to its design to ensure that it delivers high performance with minimal need for expensive DBA resource to operate it. Some of the key areas of automation are:

Automatic distribution of data Automatic duplication of data Automatic selection of evenly across all servers in the cluster across servers to ensure data integrity in innovative compression the event of a server failure algorithms

that are data type specific and optimized for in-memory processing. Automatic compression of data Automatic monitoring and These algorithms also work independently on each node to ensure at column-level with identical images logging of system resources optimal performance stored both in main memory and on (RAM, Disk, CPU) to aid in capacity persistent media (hard disk) to optimize planning performance. 02 White paper Business 8 What is Exasol? Core architecture

The decreasing cost of RAM has to maximise the benefits of in-memory through the EXAoperation GUI. Adding new encouraged a number of vendors to processing. In real-world deployments the servers is also a straightforward process. develop in-memory options for their optimiser adapts over time to the workload Once the new hardware is connected in existing database products. It is important presented, in order to deliver optimum to the cluster, data is automatically re- to note that Exasol has been architected performance from the database resources distributed across current and new nodes and designed from the outset as an in- available. Therefore, workload management as a background process, and users can still memory database. This is not an «add-on» at the system level is significantly be running queries whilst this is on-going. feature and unlike a number of competing automated, reducing the need for skilled products, with Exasol it is not necessary to and expensive DBA resource. In specific use-cases, where workload is store the entire database in memory. highly diverse and/or user priorities vary For optimum performance, sufficient In addition, as part of the workload significantly at different times of day, Exasol memory is required to service the query management, Exasol automatically provides workload management through workload presented. As with persistent monitors and logs its resource utilization. user prioritisation. This facilitates manual storage on disk, compression is also helpful Therefore as workloads increase (i.e. configuration of users’ or groups of users’ in this respect. This affords considerable more data, more users, more complex (roles) priorities by weighting resource flexibility and advantages in terms of queries) and the performance of Exasol allocation and, if required, defining a balancing memory, servers, performance starts to degrade, the system monitoring schedule. and cost. information is available to help determine how much more memory to allocate to Since Exasol’s raw performance is derived the database in each server (scale up), or from in-memory processing, Exasol has if necessary, what number of new nodes developed a highly intelligent, cost-based (servers) need to be added to the cluster query optimizer, based on self-learning (scale out) to maintain performance levels. algorithms, that manages what data is kept in-memory. An important feature of the Scaling up and allocating more memory optimiser is the automatic creation and to the database in each server is a management of join indices in-memory, straightforward command executed 02 White paper Business 9 What is Exasol? Other features

Exasol supports standard interfaces for increasingly use analytics to support their Another unique feature is the analytic integrating upstream (Data Integration) operational business, data in the data Skyline function for preference analytics. and downstream (BI) tools. The standard warehouse must be regularly adjusted and interfaces that ship with the Exasol software updated. Preference analytics addresses the include ODBC, JDBC, and ADO.NET. Exasol fundamental problems of traditional data has already been tested and is working in Exasol Advanced Edition also offers a mining approaches, where the everincreasing production environments with BI Tools powerful analytics framework. Users can volumes of data and multiplicity of variables and Data Integration tools from all the run code that has been written in R, Python, mean that sorting, filtering and weighting leading vendors including: Informatica, Lua or Java as user-defined functions lead to sub-optimal analyses. Talend, Pentaho, Tableau, Business Objects, (UDFs) parallelized in the database with Cognos and Microstrategy. extremely high in-memory performance. In A real-life example is when choosing the best addition, any other programming languages investment funds. Here, using conventional Furthermore, support for the native Oracle can be integrated, so that users are no approaches to ensure a continuous objective OCI interface enables extremely fast and longer limited by the currently extensive analysis while taking into account risk, returns parallelized data exchange with Oracle standard languages that are available. and countless other metrics is anything else database systems. but simple. Skyline calculates the usually Furthermore, calculations based on the rather small subsets of such funds, which A further point of differentiation is that MapReduce principle in Hadoop systems can actually be shortlisted on grounds of the a SQL pre-processor allows users to can be run directly in the SQL engine and defined criteria. transform existing proprietary queries into combined inside a SQL statement with ANSI-compliant SQL without having to standard SQL. make changes to the original queries. Exasol Advanced Edition includes a Exasol also has a bulk loader that is powerful data virtualization framework easy-to-use and compatible with data (virtual schemas) and an extendible and integration tools. One powerful feature highly-flexible integration framework (ETL lies in the ability to process compressed UDFs). This enables a very flexible and data, e.g. in ZIP format, that enables even high-performance integration of Hadoop- faster data transfer rates. As organizations based data in Exasol. 03 White paper Business 10 Value proposition

Exasol delivers high performance The graph in figure 2 illustrates the analytics on a system that is easy to performance advantage of Exasol at all use, highly scalable, fast to implement scale factors and the detailed TPC-H results and extremely cost effective. can be found on the Transaction Processing Performance Council website at Implementing Exasol does not require http://www.tpc.org/tpch/results/ you to replace your existing systems. It tpch_perf_results.asp. can be implemented alongside existing infrastructure as a complementary solution, to provide the high performance analytics that your current systems cannot deliver. TPC-H Performance at all Scale Factors This complementary approach enables you to continue to leverage your existing 11 000 000 investments and prove the value that 10 000 00 high performance analytics will deliver to your business without impacting existing 9 000 000 operations. If appropriate you can then plan 8 000 000 a phased migration of existing analytics 7 000 000 applications to Exasol over time. 6 000 000 5 000 000 1st Position Exasol’s advantage in terms of the 4 000 000 significant Price/Performance it delivers is 3 000 000 2nd Position clearly demonstrated by the independent Performance (QphH) 2 000 000 Transaction Processing Performance 3rd Position Council TPC-H benchmark, where Exasol 1 000 000 holds the number one position, by a 0 4th Position significant margin over other solutions, 100GB 300GB 1TB 3TB 10TB 100TB for both raw performance and price- performance on data volumes ranging from TPC-H Scale Factor 300GB through to 100TB. Figure 2: Exasol performance advantage 03 11 Value proposition

The exceptional analytics capabilities of the advanced edition open up a wide range of powerful new ways to analyze your business data. These can be roughly split into two categories:

These applications are characterized by The expansion, integration extremely large data volumes that have to or even replacement of be analyzed in a short time using complex algorithms. traditional, platform-specific analytic applications, The powerful capabilities of the advanced edition are based on open frameworks for where the maximum amount of data that data integration and analytical application can be processed is highly limited by the development. The integration of these platform architecture. This is often the case functionalities can be done easily via the in systems such as MATLAB or SAS. standard SQL interface using UDFs (user defined functions). This open approach means that companies can now plan The creation of new high- and build solutions that leverage existing technology investments. This differentiates performance computing Exasol from many competitors who often (HPC) applications pursue closed, proprietary approaches that force customers into classic vendor lock-ins. that are only made possible with MPP technology (such as Exasol) and its scalable data analytics capabilities.

04 White paper Business 12 Deployment benefits

Exasol offers considerable flexibility in deployment, enabling highly efficient utilisation of both hardware and software licensing:

Exasol requires a lower The performance of Exasol specification of (commodity) depends on the relationship hardware between database size, to deliver a given workload; system usage as well as the amount of memory Exasol licensing is based available. optionally on RAM allocated (Note: There is no need to ensure you have enough memory capacity to the application on the for your entire data volumes.) maximum amount of raw Performance is at its most optimum data that can be processed when all the data needed to process queries is always in memory. – there are no end-user limits, and the storage of infrequently used or However, performance with less RAM entirely new, but as yet unexplored, (and lower licensing cost) typically data sets is not penalised or the still meets SLA requirements. maximum amount of raw data that can be processed. 04 White paper Business 13 Deployment benefits

Higher workload demands Several databases can be run Exasol supports a large can be satisfied flexibly on a single hardware cluster percentage of the Oracle through either increasing memory in ideal for supporting multi-tenant language dialect, existing nodes (scale up) and/or adding applications; which means that Oracle new database nodes (scale out). environments can be migrated quickly, efficiently and with minimal effort.

Exasol includes a SQL Exasol’s powerful analytics Exasol includes a data pre-processor framework virtualization framework that enables applications to be offers a seamless integration of and called “virtual schemas” as well migrated fast without having to with other technologies. as a high performance data customize the application itself. integration framework, so you can connect to and analyze data from more sources than ever before. 05 White paper Business 14 Conclusion

Do you want to speed up your Take a look and see the power Or if you would like to discuss current BI platform or business of Exasol for yourself – test our your requirements with one of our analytics? Do you want to predict solution for free: experts please contact us at complex correlations using predictive analytics? Or do you www.exasol.com/download [email protected] need to ensure large numbers of your users have access to BI applications and analytics in general? If you would like to offer your organization really useful “big data” solutions, then Exasol is for you – it’s the platform that allows you to deliver value fast, easily and affordably. Exasol AG Neumeyerstr. 22 – 26 Follow us for the latest content: 90411 Nuremberg Germany www.exasol.com

Tel: +49 911 23991-0 About this Whitepaper: Information listed here may change after the data sheet has been printed (July 2018). Exasol is a registered trademark. All trademarks named are protected Email: [email protected] and the property of their respective owner. © 2018, Exasol AG | All rights reserve