THE NEW BIG DATA WORLD – GETTING MORE VALUE FROM DATA: THE RISE OF HIGH-PERFORMANCE DATABASES

TECHNOLOGY WHITE PAPER – MARCH 2019

Contents

Foreword - The world has changed

1. TECTONIC SHIFTS IN THE DATABASE MARKET
   Relational database vendors are challenged
   Mainstream players have reacted with acquisitions
   SAP pushes in-memory databases to primetime

2. NEW DATA MANAGEMENT TECHNOLOGIES
   From SQL to NoSQL to NewSQL
   Hadoop for big data: the rise and fall?
   NoSQL databases: simpler than SQL
   NewSQL databases
   Cloud data warehousing

3. CLOUD-BASED DATA WAREHOUSING: THE SNOWFLAKE CASE
   Snowflake, a ‘superfunded’ company
   Snowflake’s approach: 100% ‘as-a-service’

4. IN-MEMORY ANALYTICS DATABASE: THE EXASOL CASE
   Exasol, a hidden champion

5. CONCLUSION

Foreword - The world has changed

The database market has seen seismic change in the last decade. Trends including cloud computing, Big Data and IoT, and AI and automation have contributed to an explosion in data volumes. This has shaken up the typical enterprise data management ecosystem and created fertile ground for a new generation of startups offering high-volume, high-performance applications. Many mainstream relational database players have reacted by acquiring the newcomers. Other big players such as SAP have championed in-memory databases.

Relational databases still dominate globally. But while the segment is expected to keep growing, driven by factors such as in-memory SQL databases, its global share is forecast to fall steadily. At the other end of the market, NoSQL and Hadoop databases surged by an estimated 66% per year from 2013-18 and are expected to account for 18% of the total database market by 2021. The excitement around new database technologies has been reflected in a wave of pre-IPO funding rounds dominated by Cloudera and Snowflake. However, cloud innovations and plummeting storage costs have driven consolidation in the Hadoop space. NoSQL databases, which offer simplicity and speed in big data applications, and NewSQL, which adds the ACID guarantees of relational databases, have seen significant activity and interest. To examine some aspects of the evolving database market in more detail, we profile cloud-based data warehousing software vendor Snowflake, a ‘superfunded’ company offering 100% ‘as-a-service’; and we look at Exasol, which develops in-memory analytics databases for big data use cases. In conclusion, we see a continuing tussle between Oracle and Microsoft for the #1 spot in the database market, Hadoop being challenged by its own complexity, acquisitions in new database technologies as incumbents continue to battle AWS, and plenty of potential unicorns in the data management space.

1. Tectonic shifts in the database market

Relational database vendors are challenged

Gartner estimated the size of the database management systems market at USD 38.8bn for 2017, up 12.7%¹, after an 8.1% rise in 2016 and a 1.6% decline in 2015. For more than two decades, this market has been dominated by three players: Oracle, which had 41% market share in 2016, Microsoft (20%) and IBM (15%). Relational databases based on Structured Query Language (SQL) have become the backbone of mainstream enterprise application software like ERP or CRM that flourished in the 1990s and 2000s. (SQL was first developed by IBM in 1974 and made commercially available by Oracle five years later.) As they constantly improved in performance, driven by the data warehousing and business intelligence tools that appeared throughout the 1980s and 90s, relational databases were fast enough to address all of an organisation's data processing needs.

FIG. 1: DATABASE MANAGEMENT SYSTEMS MARKET SHARES (2016)
[pie chart: Oracle 41%, Microsoft 20%, IBM 15%, Other (Amazon, Teradata, SAP) 24%]

Over the last decade, the database market has undergone massive changes, thanks to the nature of the data being generated, its volume, and the processing capability needed to make sense of it all. According to an IDC white paper published in November 2018², the total volume of global data has exploded, rocketing to 33 zettabytes (ZB) in 2018 from less than 5 ZB in 2005, and is projected to reach 175 ZB in 2025. One zettabyte is equivalent to a trillion gigabytes (GB).

FIG. 2: ANNUAL SIZE OF THE ‘GLOBAL DATASPHERE’ IN ZETTABYTES (2010-2025E)
[chart: growth to 175 ZB by 2025]
Source: IDC and Seagate, The Digitisation of the World, November 2018

IT departments of large and small companies alike are scrambling to bridge this data infrastructure gap, having failed to anticipate a data explosion primarily caused by six converging trends:

Cloud computing

The public cloud services Amazon Web Services (2002), Google Cloud Platform (2008) and Microsoft Azure (2010) have been designed to facilitate agile projects and provide flexible, demand-oriented infrastructure and services. Using inexpensive technologies such as software-defined storage (SDS), they made data storage and the general use of data analytics accessible to organisations of all sizes and across all departments. These three players developed their own data management systems: 1) Amazon launched Amazon RDS (relational database), Amazon DynamoDB (NoSQL database), Amazon Aurora (relational database) and Amazon Redshift (data warehouse); 2) Google launched Google Cloud SQL (SQL database), Google Cloud Bigtable (wide column store), Google Cloud Spanner (NewSQL database), Google Cloud Datastore (NoSQL database) and Google BigQuery (data warehouse); 3) Microsoft launched Microsoft Azure SQL Database and Microsoft Azure SQL Data Warehouse.

Big Data

As we have already seen, data volumes have increased massively. Social media platforms create plenty of it: the overall data footprint of eBay, Facebook and Twitter now is 500, 300 and 300

¹ Gartner, State of the Operational DBMS Market, 2018, 6th September 2018.
² IDC and Seagate, The Digitisation of the World, November 2018.

petabytes, respectively (a petabyte is equivalent to one million GB). And the Internet of Things (IoT) – data from phones, cameras, connected cars, wearables, sensors and operational control centers – generates vast volumes of data and needs new technologies for analysis, more scalable storage, streaming solutions and operational data systems. New applications also need to be created on top of the data, for example to offer dynamic pricing or personalised marketing campaigns. The complexity of IoT, where data is processed either on servers, in public or private clouds, or at the endpoint or edge of devices such as cameras or vending machines, has created new challenges for data storage and analysis.

External data

Data is no longer created only within the organisation itself. It can now be easily enriched with all manner of external sources, for example: social media, public statistics, geospatial data, weather forecasts, competition analysis and event-driven data.

Self-service data analytics

By design, access to legacy data warehouses was restricted to centralised IT departments. The decentralisation of data and the advent of easier-to-use analysis tools like Tableau, Qlik, or Microsoft Power BI have enabled staff to access and analyse data and get more insights from it. User requirements have also changed dramatically: dashboards can be updated within seconds and include the most recent data, meaning that the patience of users waiting for reports, analyses and results is increasingly short.

Operational business intelligence and automation

Data analytics no longer simply measures historical data. It is being built into all kinds of operational processes and is predictive, deterministic and self-learning. Completely automated and self-learning processes make decisions based on data at a speed and accuracy that humans cannot match.

Artificial intelligence

With recent advances in infrastructure and tools, data science is much more accessible and now allows for automated decision-making. Applications such as face recognition, self-driving cars and predictive analytics based on self-learning algorithms are radically changing business processes.

These trends have shaken up the typical enterprise data management ecosystem, which is no longer limited to structured data, relational databases, ETL (Extract Transform Load) tools, enterprise data warehouses, and traditional business intelligence tools. In particular, structured data, semi-structured data, unstructured data and machine-based data (IoT, logs, sensors) can now be stored in a ‘data lake’, which can then be connected to enterprise data warehouses.

Another consequence of these changes is the advent of Amazon Web Services (AWS) as the fourth-largest database vendor in the market (with a cloud-only offering). In 2017, AWS had an estimated 9%+ market share of the whole database market, with revenues in this area up an impressive 112%.

FIG. 3: TYPICAL MODERN DATA MANAGEMENT ECOSYSTEM

STRUCTURED DATA SEMI-STRUCTURED DATA UNSTRUCTURED DATA MACHINE-GENERATED DATA

RELATIONAL DATABASES (SQL, NEWSQL) | NON-RELATIONAL DATABASES (NOSQL)

ETL TOOLS (EXTRACT TRANSFORM LOAD)

DATA LAKE

DATA WAREHOUSE

BUSINESS ANALYTICS AND BUSINESS INTELLIGENCE TOOLS
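The flow sketched in Fig. 3 – operational databases feeding a warehouse through ETL before analytics – can be miniaturised in a few lines. Below is a hedged sketch using Python's built-in SQLite module; the table names, columns and figures are invented for illustration and stand in for real source systems and warehouse appliances:

```python
import sqlite3

# Toy ETL step: Extract raw orders from an operational store,
# Transform (aggregate revenue per customer), Load into a warehouse table.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?)",
                [("acme", 100.0), ("acme", 50.0), ("globex", 75.0)])

# Extract
rows = src.execute("SELECT customer, amount FROM orders").fetchall()

# Transform: aggregate in application code (real ETL tools push this down)
totals = {}
for customer, amount in rows:
    totals[customer] = totals.get(customer, 0.0) + amount

# Load into the 'warehouse' (here simply another SQLite database)
dwh = sqlite3.connect(":memory:")
dwh.execute("CREATE TABLE revenue_by_customer (customer TEXT, revenue REAL)")
dwh.executemany("INSERT INTO revenue_by_customer VALUES (?, ?)", totals.items())

print(sorted(dwh.execute("SELECT * FROM revenue_by_customer").fetchall()))
# [('acme', 150.0), ('globex', 75.0)]
```

In a modern stack the same extract step would more often land raw records in a data lake first, with the transformation deferred until query time.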

Mainstream players have reacted with acquisitions

Relational database management software leaders have had to embrace these trends, but at the same time deal with hundreds of thousands of customers who need backwards compatibility with legacy IT environments.

In the mid-2000s, start-ups including Vertica, Greenplum, Netezza, DatAllegro, Aster Data, ParAccel and Exasol acknowledged that greater performance was going to be needed to deal with increasing data volumes. They developed solutions in the form of large-volume, high-performance data warehousing applications designed and built for analytical use cases. Alternative solutions also emerged, such as column-based data stores (like Sybase IQ in 1995) and in-memory databases, but these were confined to high-end applications. The start-ups' products started to see success and most of them were acquired by global IT players (see Fig. 4). ParAccel, which has served as the core component of Amazon Redshift, was acquired by Actian (formerly Ingres) in 2013. These acquisitions enabled global vendors to temporarily address performance issues and consolidate their market positions.

These newly acquired technologies contained their own weaknesses. For example, they are all built on a kernel of the PostgreSQL database (a pre-existing database based on Ingres), which was built in the 1970s, revamped in the 1990s, and difficult to change. This constraint made it practically impossible to replace 30-40-year-old code and rebuild their solutions from scratch. So instead of migrating their database product offerings to the new technology platforms, these large incumbents kept their old platforms and tried to fix their legacy systems by adding elements of the new technologies to their existing products. The result has been a mix of approaches and a hybrid architecture. The legacy players also attempted to lock in customers by offering ‘end-to-end’ business solutions (from databases to BI solutions) of which the relational database was only one component. For example, for use in corporate data centers deployed as private clouds, Oracle launched its Exadata Database Machine in 2008, an Oracle database running on dedicated hardware. Exadata was adapted to public clouds in 2015.

FIG. 4: ACQUISITIONS IN DATABASES AND DATA WAREHOUSING

Company Funding Price/acquirer

DatAllegro USD 63m USD 200m by Microsoft (2008)

Netezza USD 63m USD 1,700m by IBM (2010)

Greenplum USD 97m USD 300m by EMC (2010)

Vertica Systems USD 31m USD 350m by HP (2011)

Aster Data USD 53m USD 295m by Teradata (2011)

Source: Company Data; Bryan, Garnier & Co ests.

SAP pushes in-memory databases to primetime

Before it launched HANA, SAP was not a database player, even though in 1997 it had acquired Max DB, a relational database with an in-memory engine that was developed in the late 1970s at the Technical University of Berlin. For years, SAP relied on IBM, Oracle and Microsoft to provide the database that ran its enterprise application software products. In 2010, the acquisition of Sybase for USD 5.8bn gave SAP the Sybase ASE relational database (which served as the basis for Microsoft SQL Server) and the Sybase IQ column store.

FIG. 5: SAP HANA REVENUES (2011-2013) (EUR M)
[bar chart: 160 (2011), 392 (2012), 633 (2013)]
Source: SAP. From 2014 onwards, SAP stopped reporting SAP HANA revenues

The launch in November 2010 of the SAP HANA in-memory database, sold as an application running on certified hardware, was a revolution in the database world. SAP now mandated its own in-memory database for running its business applications, at the expense of third-party relational providers. The initial purpose of SAP HANA was to carry out real-time data analytics and, from 2011, SAP used it to power SAP Business Warehouse, its data warehousing system. But SAP HANA was quickly used to manage data transactions too, powering SAP Business Suite (2013), then its replacement SAP S/4HANA (2015) and all SAP cloud applications. Today, SAP HANA is the mandatory database for running SAP's Core and Cloud applications.

The HANA technology had several advantages from the start: 1) speed – the ability to handle several billion records in seconds; 2) scalability – designed to respond positively to the increase in the number of cores per server; 3) in-memory data compression of between 5 and 20x; and 4) data management simplification – with no need to create data aggregates, there is no duplication. In addition, HANA processed in-memory data in both rows and columns, and could handle data management standards for relational databases and OLAP (OnLine Analytical Processing) multi-dimensional analysis. Finally, with its ability to handle transactional and analytical data, SAP HANA was designed to eliminate the use of ETL (Extract Transform Load) tools, stored procedures, materialised views and OLAP cubes.

SAP HANA was made possible by the arrival of standard servers at affordable prices, equipped with several terabytes of memory and multi-core processors. With blade servers then sold for EUR 40,000 a unit (now down to EUR 15,000), massively parallel computing became widespread. In-memory data processing technology has therefore long been reduced to the notion of a cache or database accelerator. While all operations take place in-memory, each transaction is automatically captured and copied onto a disk with redundancy and high availability.

In response to the success of SAP HANA, which grew from nothing in 2010 to EUR 633m in revenues by 2013 and now reaches over 21,000 customers, relational database incumbents moved to embrace in-memory technology and protect their installed base. During 2013 and 2014, IBM, Oracle and Microsoft all launched in-memory options on top of their relational databases, rather than releasing an independent in-memory database.

Over the years, SAP HANA has evolved to become an open and governed data platform, with several product extensions and innovations for orchestrating and managing enterprise business data within multi-cloud and hybrid environments. To integrate Big Data stored in Hadoop with enterprise data in SAP HANA, SAP offers a separate in-memory query engine, SAP Vora, which was launched in 2015. And with the acquisition in 2016 of Big Data as-a-service (BDaaS) provider Altiscale, SAP was able to create its SAP Cloud Platform Big Data Services offering based on Hadoop and Spark. This acts as a data refinery, with the ability to cleanse and transform terabytes of raw data – including semi-structured and unstructured external data like clicks, logs, IoT, text, images, and video – before it is surfaced through SAP HANA for further, more detailed analysis.

As of today, the key components of the SAP HANA Data Management Suite include SAP Data Hub, which creates a central metadata management catalog; SAP HANA; SAP Enterprise Architecture Designer, which creates architecture documentation; and SAP Cloud Platform Big Data Services.

FIG. 6: ARCHITECTURE OF THE SAP HANA DATA MANAGEMENT SUITE
[diagram: SAP Intelligent Enterprise Suite, SAP Leonardo, SAP Analytics Cloud and third-party applications sit on top of the SAP HANA Data Management Suite (SAP Data Hub, SAP HANA, SAP Enterprise Architecture Designer, SAP Cloud Platform Big Data Services), spanning on-premises, multi-cloud and hybrid deployments]

Source: SAP
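The row-versus-column trade-off that HANA blends can be illustrated with plain Python lists. This is a schematic sketch of the two layouts only, not SAP's implementation, and the records are invented:

```python
# Schematic contrast between row-oriented and column-oriented storage.
# An analytical query (sum one attribute) touches only that attribute's
# contiguous array in a column store, instead of scanning whole records.

rows = [  # row store: one record per entry (good for transactions)
    {"id": 1, "region": "EU", "revenue": 100},
    {"id": 2, "region": "US", "revenue": 250},
    {"id": 3, "region": "EU", "revenue": 175},
]

columns = {  # column store: one array per attribute (good for analytics)
    "id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "revenue": [100, 250, 175],
}

# Row store: scan every record to sum a single attribute
total_row = sum(r["revenue"] for r in rows)

# Column store: read a single contiguous array
total_col = sum(columns["revenue"])

assert total_row == total_col == 525
```

Repeated values within a column (like "EU" above) also compress far better than mixed-type rows, which is one source of the 5-20x compression claimed for in-memory column engines.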

2. New data management technologies

From SQL to NoSQL to NewSQL

Relational databases still account for around 82% of the global market for database management systems. However, this has fallen from 87% in 2013, and IDC anticipates that it will fall further, to 77% by 2021. That said, the relational database segment is still expected to grow by 6-7% per year over the next three years, driven in particular by in-memory SQL databases or in-memory options and the strong growth of NewSQL databases. NoSQL and NewSQL will not replace SQL databases, but will co-exist with them within technology stacks for the foreseeable future. At the other end of the market, NoSQL and Hadoop databases have increased their market share from 1% in 2013 to an estimated 11% in 2018, with revenues up from USD 366m to USD 4.6bn. These databases have surged by an estimated 66% per year on average over the period. IDC anticipates that this segment will see average annual growth of 33% from 2017-2021, reaching USD 10bn or 18% of the total database market by 2021.

FIG. 7: DATABASE MANAGEMENT SYSTEMS MARKET (2013-2021E)
[stacked chart, USD m: relational DBMS, legacy non-relational DBMS and dynamic DBMS (NoSQL, Hadoop), 2013-2021E]
Source: IDC

The excitement around NoSQL, NewSQL and Hadoop, which are aimed at replacing relational databases, is reflected by the funding of a significant number of startups in these areas (see Fig. 8).

FIG. 8: PRE-IPO FUNDING OF NO/NEWSQL, HADOOP AND CLOUD EDW VENDORS (USD M)
[bar chart: Cloudera and Snowflake lead at roughly USD 1,000m and USD 929m respectively, ahead of MongoDB, Hortonworks, MapR, Databricks, DataStax, MarkLogic, Elastic, Neo4j, Couchbase and MemSQL, ranging down to around USD 108m]
Source: Crunchbase

The table below recaps the financing rounds that have taken place since early 2016 in the database and data warehousing areas. 2018 saw plenty of activity, with USD 713m raised over two rounds for Snowflake Computing, which is seen as the new ‘unicorn’ in the space. The other significant rounds for the year include Yellowbrick Data (USD 92m), Neo4j (USD 80m), NuoDB (USD 31m) and MemSQL (USD 30m). In 2017, the two biggest rounds were Databricks (USD 140m) and Snowflake again (USD 105m).

Valuation multiples achieved by Hortonworks, Cloudera, MongoDB and Elastic – the four companies in this space that have gone public since 2014 – show strong investor appetite for these stocks as they generate strong growth, reduce their losses, and become potential champions in their specific area or M&A targets for the likes of IBM, Oracle, SAP and others.

FIG. 10: MARKET CAPITALISATION OF NEW DATA MANAGEMENT TECH VENDORS
Elastic: USD 5.19bn (+107% since IPO)
MongoDB: USD 4.59bn (+181% since IPO)
Cloudera: USD 1.72bn (-23% since IPO)
Hortonworks: USD 1.24bn* (+88% since IPO)
*As of 2 January 2019, when it left the stock market following the merger with Cloudera.

FIG. 9: FINANCING ROUNDS FOR NO/NEWSQL, HADOOP AND CLOUD EDW VENDORS

Date | Company | Round | No. investors | Amount | Lead financing
Dec. 2018 | NuoDB | Venture | 5 | USD 31m | Temenos
Nov. 2018 | Neo4j | Series E | 5 | USD 80m | Morgan Stanley, One Peak
Oct. 2018 | Yellowbrick Data | Series B | 2 | USD 48m | Next47
Oct. 2018 | Snowflake Computing | Series F | 8 | USD 450m | Sequoia
Oct. 2018 | Elastic | IPO | – | USD 252m | Valuation: USD 2.5bn
Aug. 2018 | Yellowbrick Data | Series A | 5 | USD 44m | Menlo Ventures
May 2018 | MemSQL | Series D | 5 | USD 30m | Glynn, GV
Jan. 2018 | Snowflake Computing | Series E | 3 | USD 263m | Altimeter, Iconiq, Sequoia
Oct. 2017 | MongoDB | IPO | – | USD 256m | Valuation: USD 1.6bn
Sep. 2017 | MariaDB | Series C | 6 | USD 27m | Alibaba
Sep. 2017 | Snowflake Computing | Series D | 7 | USD 105m | Iconiq
Sep. 2017 | MapR Technologies | Venture | 7 | USD 56m | Lightspeed
Aug. 2017 | Databricks | Series D | 4 | USD 140m | Andreessen Horowitz
May 2017 | MariaDB | Venture | 1 | EUR 25m | European Investment Bank
May 2017 | Cockroach Labs | Series B | 5 | USD 27m | Redpoint
Apr. 2017 | Cloudera | IPO | – | USD 225m | Valuation: USD 1.9bn
Dec. 2016 | Databricks | Series C | 3 | USD 60m | New Enterprise Associates
Nov. 2016 | Neo4j | Series D | 4 | USD 36m | –
Aug. 2016 | MapR Technologies | Venture | 7 | USD 50m | Future Fund
Jul. 2016 | Elastic | Series D | 2 | USD 58m | –
Apr. 2016 | MemSQL | Series C | 7 | USD 36m | Caffeinated, REV
Mar. 2016 | Cockroach Labs | Series A | 4 | USD 20m | Index
Mar. 2016 | Couchbase | Series F | 7 | USD 30m | Sorenson
Feb. 2016 | NuoDB | Series B | 3 | USD 17m | –

Source: Crunchbase

Current EV/(N+1) sales multiples are impressive for Elastic and MongoDB:

FIG. 11: EV/(N+1) SALES MULTIPLES AS OF 8TH JANUARY 2019
Elastic: 19x | MongoDB: 18.5x | Cloudera: 3.2x
Source: Company data; Thomson Reuters

Hadoop for big data: the rise and fall?

The initial release of Hadoop took place in 2006. Hadoop is a collection of open-source software tools that facilitate the use of computer clusters to solve issues involving massive amounts of data. Its core has two elements: storage, the Hadoop Distributed File System or HDFS; and processing, the MapReduce programming model. Hadoop splits files into large blocks and distributes them across nodes in a cluster. It then transfers packaged code into the nodes to process the data in parallel. This approach allows the dataset to be processed faster and more efficiently than in a conventional supercomputer architecture, which uses a parallel file system where computation and data are distributed via high-speed networking.

An ecosystem of additional software packages for managing data has been developed around Hadoop (see Fig. 12).

FIG. 12: THE HADOOP DATA MANAGEMENT ECOSYSTEM
Spark – Cluster-computing framework for data analytics and machine-learning algorithms
Hive – Data warehousing software providing a SQL-like query language, HiveQL
HBase – Non-relational distributed database for storing large quantities of sparse data, for serving data-driven websites
Phoenix – Massively parallel relational database based on SQL, using HBase as its data store
Impala – Massively parallel SQL query engine for analytics
Pig – Platform for creating programmes on Hadoop
ZooKeeper – Distributed computing system
Flume – Log data collection
Sqoop – Data transfer between relational databases and Hadoop
Oozie – Workflow scheduling
Storm – Distributed stream processing computation framework

Hadoop can be deployed on-premise, or in a public or private cloud: Microsoft Azure (with Microsoft Azure HDInsight), Amazon EC2 and S3 (with Amazon Elastic MapReduce), Google Cloud Platform (with Google Cloud Dataproc), SAP Cloud Platform (SAP Big Data Services), and Oracle Cloud Platform (Oracle Big Data Cloud).

The main Hadoop software distributions are Cloudera, Hortonworks and MapR. Hortonworks and Cloudera have been listed in the US since December 2014 and April 2017, respectively. Their revenues surged at over +40% per year until 2017, after which growth rates slowed in 2018, when the mid-point of company guidance implied +21% for Cloudera and +30% for Hortonworks. Cloudera's shares dropped 40% on 4th April 2018 when it reported its FY18 results, due to disappointing guidance for FY19.

FIG. 13: REVENUES OF CLOUDERA AND HORTONWORKS (2014-2018E) (USD M)
[Cloudera (FYE 31/1 N+1): 109.1, 166.0, 261.0, 367.4, 445.0; Hortonworks (FYE 31/12 N): 46.0, 121.9, 184.5, 261.8, 340.5]
Source: Cloudera; Hortonworks

In October 2018, Cloudera announced it was acquiring Hortonworks through an all-stock ‘merger of equals’, leaving Cloudera shareholders with 60% of the combined company. The transaction was completed in January 2019. We understand the deal as a signal that the Hadoop market could no longer sustain two big competitors, with several trends driving the change:

The shift to public cloud
AWS, Microsoft Azure, and Google Cloud offer their own managed Hadoop/Spark services as part of fully integrated offerings that are cheaper to buy and scale.

Faster, better, and cheaper cloud databases
The advent of cloud services like Google BigQuery in 2011 eliminated the need to run Hadoop or Spark. While Hadoop is used to handle ad-hoc distributed SQL queries, Google BigQuery runs ad-hoc queries on any amount of data stored in its object storage service without having to load it into special storage. Other cloud database services – Snowflake, Google BigTable, Amazon Aurora, and Microsoft Azure Cosmos DB – are similarly massive-scale, highly flexible, globally distributed ‘pay-as-you-go’ databases.

Plummeting storage costs
On cloud storage services like AWS S3, Azure Blob Storage, and Google Cloud Storage, a terabyte of object storage costs about USD 20 a month, compared to around USD 100 a month on the Hadoop Distributed File System (HDFS).

Python and R data science running on containers
Hadoop's revolutionary innovation was that it offered both a storage and compute environment for applications written in Java. However, this was out of step with data scientists working on machine learning in Python and R, who need native deployment of Python/R models to iterate on and improve machine learning models.
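The split-map-shuffle-reduce flow described above can be mimicked in single-process Python; a real Hadoop job runs the same phases distributed across the nodes holding each HDFS block. The word-count data below is invented for illustration:

```python
from collections import defaultdict

# Single-process sketch of MapReduce word counting.
# Hadoop would run map() on the nodes holding each data block,
# then shuffle the intermediate (key, value) pairs to reducer nodes.

blocks = ["big data big", "data lake", "big lake"]  # stand-ins for HDFS blocks

def map_phase(block):
    # Emit an intermediate (word, 1) pair for every word in the block
    for word in block.split():
        yield word, 1

# Shuffle: group intermediate pairs by key
groups = defaultdict(list)
for block in blocks:
    for key, value in map_phase(block):
        groups[key].append(value)

# Reduce: aggregate the values collected for each key
counts = {key: sum(values) for key, values in groups.items()}
print(counts)  # {'big': 3, 'data': 2, 'lake': 2}
```

The point of the pattern is that both map() and the per-key reduction are embarrassingly parallel, which is what lets Hadoop move the code to the data rather than the data to the code.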

NoSQL databases: simpler than SQL

The concept of ‘NoSQL’ databases emerged around 2009. The term was introduced by IT developer Johan Oskarsson to label a growing number of non-relational, distributed data stores including Google BigTable, Amazon Dynamo and the newly launched Amazon DocumentDB. Most early NoSQL systems did not attempt to provide the atomicity, consistency, isolation and durability (ACID) guarantees that relational databases usually offer. NoSQL databases enable the storage and retrieval of data modelled in means other than tabular relations. They are increasingly used in big data and real-time web applications due to their design simplicity, simpler scaling to clusters of machines, and finer control over availability. The data structures used by NoSQL databases (key-value, wide column, graph, or document) make some operations faster in NoSQL than in relational databases.

There are four main categories of NoSQL database:

Key-value (KV) stores
These databases use dynamic data such as session data and machine-generated data used for analysis and for systems and device coordination. They treat the data as a single collection that may have different fields for every record. Key-value databases often use far less memory to store the same data than relational databases, which can lead to large performance gains in certain workloads.

Wide column stores
Data is collected in very large structures for high-performance analytics. By storing data in columns rather than rows, the database can access the data it needs to answer a query more precisely, without the need to scan and discard unwanted data in rows. Wide column stores are well-suited for data warehouses involving highly complex queries.

Document-oriented databases
These are used for managing semi-structured data in document formats such as JSON or XML. Document databases store all the information for a given object in a single instance in the database, and every stored object can be different from every other. This makes mapping objects into the database a simple task and typically eliminates anything similar to object-relational mapping. This makes document stores attractive for programming web applications, which are subject to continual change and where speed of deployment is an important issue.

Graph databases
These are organised in a graph structure for relationship and pattern analysis, for example fraud detection or semantic text analysis. The graph directly relates data items in the store to a collection of nodes of data, with edges representing the relationships between the nodes. These relationships allow data in the store to be linked together directly, and querying relationships within a graph database is fast because they are perpetually stored within the database itself. Relationships can be intuitively visualised using graph databases, making them useful for heavily inter-connected data.

FIG. 14: CLASSIFICATION OF NOSQL DATABASES
Wide column stores: Apache Accumulo (based on Google BigTable), Apache Cassandra (DataStax), Apache HBase, Druid, Micro Focus Vertica
Document-oriented databases: Amazon DocumentDB, Apache CouchDB, ArangoDB, BaseX, Clusterpoint, Couchbase Server, Elasticsearch, Microsoft Azure Cosmos DB, IBM Notes/Domino, MarkLogic, MongoDB, Qualcomm Qizx, RethinkDB, SAP OrientDB
Key-value stores: Aerospike Database, Amazon DynamoDB, Apache Ignite, Apache ZooKeeper, ArangoDB, Basho Riak, Couchbase Server, FoundationDB, InfinityDB, Oracle BerkeleyDB, SAP OrientDB, SciDB
Graph databases: AllegroGraph, Apache Giraph, ArangoDB, Objectivity InfiniteGraph, MarkLogic, Neo4j, OpenLink Virtuoso, SAP OrientDB
Source: Bryan, Garnier & Co
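The four NoSQL data models above can be caricatured with built-in Python structures. This is a sketch of the data shapes only – real products add distribution, persistence and query languages on top – and all names and records are invented:

```python
# Minimal caricatures of the four NoSQL data models.

# Key-value store: an opaque value behind a key (cf. DynamoDB, Riak)
kv = {"session:42": {"user": "ada", "cart": ["book"]}}

# Document store: self-describing records whose fields vary per
# document, with no fixed schema (cf. MongoDB, CouchDB)
documents = [
    {"_id": 1, "name": "ada", "tags": ["admin"]},
    {"_id": 2, "name": "bob", "email": "bob@example.com"},  # different fields
]

# Wide column store: rows holding sparse column-family maps (cf. Cassandra, HBase)
wide = {"row1": {"profile:name": "ada", "stats:visits": 7}}

# Graph store: nodes plus labelled edges (cf. Neo4j)
nodes = {"ada", "bob"}
edges = [("ada", "KNOWS", "bob")]

# One toy "query" per model
assert kv["session:42"]["user"] == "ada"                          # lookup by key
assert [d["_id"] for d in documents if d.get("tags")] == [1]      # field filter
assert wide["row1"]["stats:visits"] == 7                          # column access
assert [(s, t) for s, r, t in edges if r == "KNOWS"] == [("ada", "bob")]
```

The graph "query" hints at why relationship traversal is cheap in a graph store: the edges are materialised directly rather than reconstructed through joins at query time.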

16 | THE NEW BIG DATA WORLD 17 MongoDB was the first listed FIG.FIG 15 15: MONGODB - REVENUES AND NON-GAAP OPERATING MARGIN (FYE 31/1) pure-play NoSQL database software vendor, with a NASDAQ IPO in REVENUES (USD M) (LEFT SCALE) NON-GAAP OPERATING MARGIN (%) (RIGHT SCALE) October 2017. The company is still -36.2% heavily loss-making, with a -49.2% non-GAAP operating margin -64.0% -229.0 forecasted at -36% at the mid-point of company guidance for FY19. -91.7% Losses are reducing year-on-year while revenues are growing around 154.5 50% annually.

The second NoSQL database 101.4 pure-player to be listed was Elastic, 65.3 which develops and sells the -168.1% - Elasticsearch document-oriented 40.8 database and search and analytics engine. Listed on NYSE in October FY15 FY16 FY17 FY18 FY19E

2018, this company is also loss- Source: MongoDB making, with a non-GAAP operating margin of -20% for the fiscal year FIG 16 ending 30th April 2018. Revenues, FIG. 16: ELASTIC - REVENUES AND NON-GAAP OPERATING MARGIN (FYE 30/4) on the other hand, grew 81% during REVENUES (USD M) (LEFT SCALE) that fiscal year. NON-GAAP OPERATING MARGIN (%) (RIGHT SCALE)

255 159.9 -17.6% -20.0% 88.2 -31.2% FY17 FY18 FY19E

Source: Elastic; Bryan, Garnier & Co

NewSQL databases

One of the reasons why relational databases have been so successful for decades is that they have a set of characteristics that guarantee their validity: Atomicity, Consistency, Isolation, and Durability (ACID). Atomicity guarantees that each transaction is treated as a single unit that either succeeds completely or fails completely. Consistency means that any data written to the database must be valid according to all defined rules; this prevents database corruption by an illegal transaction, but does not guarantee that a transaction is correct. Isolation ensures that concurrently executed transactions leave the database in the same state that would have been obtained if the transactions were executed sequentially. Finally, durability guarantees that once a transaction has been committed, it will remain committed even in the case of a power outage or crash.

The issue with NoSQL databases is their lack of compliance with the ACID rules when processing ‘high-profile’ data (e.g. financial data or orders) with strong transactional and consistency requirements. NewSQL, a concept which emerged in 2011 from 451 Group analyst Matthew Aslett, seeks to offer the same scalable performance as NoSQL databases for transaction workloads while maintaining the ACID guarantees of a relational database. Unlike most NoSQL databases, NewSQL databases support the relational data model and use SQL as their primary interface. Applications typically targeted by NewSQL databases have a large number of short-lived, repetitive transactions that touch a small subset of data. Unlike NoSQL databases, NewSQL databases are not easy to categorise by technical approach, as they vary significantly in their internal architecture. Some support only transactional workloads, but others support hybrid transactional/analytical workloads, which some NoSQL databases support too.

There are three main categories of NewSQL database:

• Shared-nothing (SN) databases. These are designed to operate in a distributed cluster of ‘shared-nothing’ nodes, in which each node is independent and self-sufficient and owns a subset of the data, while none of the nodes shares memory or disk storage. Some SN database platforms are in-memory, like Altibase, Apache Ignite, Exasol, GridGain, MariaDB Clustrix and VoltDB.

• Storage engines for SQL. These are optimised storage engines for SQL databases, providing the same programming interface as SQL but scaling better than the built-in engines.

• Transparent sharding. These NewSQL databases provide a horizontal data micro-partition layer to automatically split databases across multiple nodes.

FIG. 17: CLASSIFICATION OF NEWSQL DATABASES
Shared-nothing databases: Altibase, Apache Ignite, CockroachDB, Exasol, Google Spanner, GridGain, HarperDB, MariaDB Clustrix, MemSQL, Microsoft Azure Cosmos DB, NuoDB, Trafodion, VoltDB
Storage engines for SQL: Infobright, MyRocks, MariaDB ColumnStore, Microsoft SQL Server (with ColumnStore and InMemory features), Oracle InnoDB, Oracle MySQL Cluster, TokuDB
Transparent sharding: ScaleBase, Vitess
Source: Bryan, Garnier & Co
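These ACID properties are easiest to see in code. The sketch below uses Python's standard-library sqlite3 module (a single-node illustration of transactional SQL, not a NewSQL system) to show atomicity and consistency: a transfer either commits both updates, or rolls back entirely when a CHECK rule would be violated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(amount):
    # Atomicity: both UPDATEs succeed together, or the transaction rolls back.
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
                (amount,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'",
                (amount,),
            )
    except sqlite3.IntegrityError:
        pass  # consistency: the CHECK rule rejected the write; no partial update remains

transfer(60)  # succeeds: alice 100 -> 40, bob 0 -> 60
transfer(60)  # would leave alice at -20, so it is rolled back entirely

print(dict(conn.execute("SELECT name, balance FROM accounts")))
# -> {'alice': 40, 'bob': 60}
```

The second transfer illustrates why a mere "best effort" NoSQL write path is not enough for orders or payments: without the rollback, Bob would have been credited money that was never debited.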

Cloud data warehousing

An increasing number of data warehouses are now available in the cloud rather than on-premise. Cloud data warehouses have the advantage of being provisioned in minutes and without any technical expertise. They suit small and mid-sized companies, business analysts, and other non-technical users who want to access, store and process large amounts of data at low cost. Initially focused on simple data loading and reporting, cloud data warehouses now support complex use cases such as IoT analytics, real-time analytics, fraud analysis and 360-degree customer views at the hundreds-of-terabytes (and even petabyte) scale. And while cloud data warehouses were initially only deployed for new projects, some organisations are migrating their on-premise data warehouses to the cloud, whether it is a public, private or hybrid cloud.

While the on-premise data warehousing software market is dominated by Oracle, IBM, Microsoft, Teradata and SAP, the cloud data warehouse market is led by Amazon Web Services (Redshift), Snowflake, Google (BigQuery), and Oracle (Autonomous Data Warehouse Cloud). Challengers in this market include Teradata (IntelliCloud), IBM (Db2 Warehouse on Cloud), Microsoft (Azure SQL Data Warehouse), and Cloudera (Hortonworks HDP). Smaller players exist, like MarkLogic, Pivotal, Exasol and Yellowbrick Data, while some large players like Alibaba, Micro Focus (Vertica) and Huawei have a weak presence.

As for access, Amazon and Google provide a fully managed data warehousing service that is available exclusively on their own public cloud infrastructure. Oracle’s cloud data warehouse is only available within the Oracle Cloud. IBM’s is available as a managed cloud service on IBM’s cloud infrastructure (SoftLayer), although it can also be deployed on all major cloud providers, with the potential ‘downside’ of the customer having to manage the software themselves. However, most of the other independent players have a multi-cloud approach, using Amazon Web Services, Microsoft Azure, Google Cloud and/or their own data centers to run their software. Finally, Yellowbrick Data, which exited stealth in July 2018 and has since raised USD 92m, has its data warehouse available on-premise, on private clouds, in colocation and on the edge (IoT), and plans to be available on public clouds in 2019.

3. Cloud-based data warehousing: the Snowflake case

Snowflake, a ‘superfunded’ company

The cloud-based data warehousing software vendor Snowflake Computing is, in our view, a good example of the new generation of successful data management companies that have emerged from the cloud revolution. The company develops and sells a cloud-based storage and analytics platform, the Snowflake Elastic Data Warehouse (EDW), which uses a cloud-based infrastructure and has been available since June 2015. As a testimony to its success, large companies like Capital One, Nielsen, and Adobe use Snowflake EDW. New customers in 2018 include leading brands such as Netflix, Office Depot, Netgear, and Yamaha. In our view, Snowflake’s success can be attributed to the strategic vision of its founders, the involvement of its CEO Bob Muglia, and a successful funding series.

“In many cases, Snowflake is the only product that’s viable for customers to move to. Customers demand a level of scale that Amazon Redshift just can’t do. We can scale to a large number of queries. Because we’re just a very large cloud service, we throw the power of the cloud at running things at the same time. Every single day, we run almost 20 million queries a day.”
Bob Muglia, CEO, Snowflake Computing

Snowflake Computing was founded in 2012 in San Mateo, California, by CTO Benoît Dageville, Thierry Cruanes (Product Architect) and Marcin Zukowski. Benoît Dageville spent 15 years at Oracle as the lead architect for parallel execution and a key architect in the SQL Manageability group. Before Oracle, he worked at Bull, where he helped define the architecture and lead database performance efforts for its parallel systems. Thierry Cruanes spent 13 years at Oracle, working on the optimisation and parallelisation layers in Oracle databases. Before Oracle, he spent seven years at IBM and Sema, working on data mining techniques. Finally, Marcin Zukowski was co-founder and CEO of Vectorwise, an analytical database software vendor sold to Actian in 2010. During its first two years of existence, Snowflake operated in stealth mode, looking to create data warehouse software designed for the cloud.

The company came out of stealth mode in 2014, just after the appointment of CEO Bob Muglia. At Microsoft between 1988 and 2011, he had worked as the product manager for SQL Server, Vice-President Windows NT, Server Application Group and .NET Services Group, then Senior Vice-President Servers & Tools. Then at Juniper Networks he served as EVP Software & Solutions (2011-2013).

In June 2015, Snowflake announced the general availability of Snowflake EDW, as well as the first in a series of funding rounds (see Fig. 18 and 19).

FIG. 18: SNOWFLAKE COMPUTING – SUMMARY OF FUNDING ROUNDS
June 2015 – Lead investor: Altimeter Capital; other investors: Redpoint Ventures, Sutter Hill, Wing VC – USD 71m raised
April 2017 – Lead investors: Iconiq Capital, Madrona Venture Group; other investors: existing investors – USD 105m raised
January 2018 – Lead investors: Iconiq Capital, Altimeter Capital, Sequoia Capital – USD 263m raised / USD 1.5bn valuation
October 2018 (pre-IPO) – Lead investors: Sequoia Capital, Meritech Capital; other investors: existing investors – USD 450m raised / USD 3.5bn valuation
Source: Crunchbase

FIG. 19: SNOWFLAKE COMPUTING – TOTAL FUNDING AND VALUATION (USD M). Funding rounds of USD 26m (2014), USD 45m (June 2015), USD 105m (April 2017), USD 263m (January 2018) and USD 450m (October 2018); total valuation of USD 1,500m in January 2018, rising to USD 3,500m in October 2018.
Source: Snowflake company data

In January 2018, CEO Bob Muglia publicly stated that the USD 263m round would probably be the last before a possible IPO. At that time, the company had more than 1,000 customers, up from 450 in April 2017, and expected this to double over 2018. Snowflake employed 330 staff, up from 175 in April 2017, and expected this to also almost double, to 600, during 2018. The two large funding rounds in 2018 were aimed at helping Snowflake in four areas:

1. continuing to expand its multi-cloud strategy beyond Amazon Web Services and Microsoft Azure;
2. growing sales teams across and outside the US to meet increasing demand for the only cloud-built data warehouse;
3. investing further in Snowflake’s data warehouse-as-a-service by growing its engineering team in San Mateo, CA, Bellevue, WA, and Berlin, Germany;
4. delivering innovations such as Snowflake Data Sharing, also known as ‘The Data Sharehouse’, which further consolidates its lead over legacy cloud and on-premise solutions.

Snowflake’s approach: 100% ‘as-a-service’

Snowflake’s data warehouse-as-a-service uses the SQL language to manage data. It can work with a mix of structured data such as CSV files and tables, as well as semi-structured data like JSON, Avro, XML and Parquet. It uses the Amazon Web Services S3 service as a cloud infrastructure. Since September 2018, Snowflake has also been generally available on Microsoft Azure, integrated with Azure services such as Microsoft Azure Data Lake Store and Microsoft Azure Power BI. The platform also employs several new features, for example limitless storage accounts, accelerated networking and storage soft delete. As of today, Snowflake is available in the Azure East-US-2 region (in Virginia), with more regions to be added in the future(3). Whether it is running on Amazon Web Services or Microsoft Azure, from the US and Dublin, Snowflake generally costs USD 40 per terabyte of data per month for on-demand storage, and USD 23 per terabyte per month for capacity storage. From Frankfurt, pricing is USD 45 and USD 24.50 respectively.

The Snowflake EDW platform has the following features:

• The cloud storage layer is engineered to scale independently of compute resources. It can process data loading and unloading without impacting running queries. Snowflake uses micro-partitions (‘sharding’) to store customer data, which are compressed in columns and encrypted.

• On the compute layer, data processing is performed by virtual warehouses, which retrieve the data required from the storage layer to satisfy queries, then cache it locally, along with query results, to improve the performance of future queries. Multiple virtual warehouses can simultaneously operate on the same data.

• The services layer authenticates user sessions, manages and enforces security, performs query compilation and optimisation, and coordinates transactions. It uses a distributed metadata store for global state management. Metadata processing is powered by a separate integrated sub-system, which allows data processing without using query compute resources.

FIG. 20: SNOWFLAKE EDW PRODUCT ARCHITECTURE – a services layer (security, transactions, management, metadata, loading, development, optimisation) on top of virtual-warehouse compute for ad-hoc analytics, backed by centralised storage.
Source: Snowflake Computing, company data

The company was initially competing with Amazon Redshift, another cloud-based data warehousing system. However, as an increasing number of large firms move towards cloud adoption and look for solutions that can handle structured, semi-structured and transactional data, Snowflake has been increasingly competing with Oracle, IBM and Teradata. The company often picks up customers disillusioned with Hadoop, and it benefits from the IoT wave, as the semi-structured data generated by sensors is complicated to manage with a structured approach.

Bob Muglia’s long-term vision for Snowflake is that the company’s data lake technology becomes a platform for building data applications. This could be for data sharing between customers or to sell that data. He foresees companies building businesses on top of the data stored in the Snowflake data warehouse, much as companies have built businesses on top of the Salesforce platform.
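The micro-partition idea behind Snowflake's storage layer can be sketched in a few lines of Python. This is a deliberately simplified illustration of the concept, not Snowflake's implementation: the names (`MicroPartition`, `build_partitions`) and the tiny four-row partition size are our own assumptions. Each partition carries min/max metadata, and a range query skips ("prunes") any partition whose metadata rules it out, without reading its data.

```python
from dataclasses import dataclass

PARTITION_ROWS = 4  # real micro-partitions hold tens of MB; tiny here for illustration

@dataclass
class MicroPartition:
    rows: list      # (order_id, amount) tuples; a real engine stores columns, compressed
    min_id: int     # zone-map style metadata, kept by the services layer
    max_id: int

def build_partitions(rows):
    parts = []
    for i in range(0, len(rows), PARTITION_ROWS):
        chunk = rows[i:i + PARTITION_ROWS]
        ids = [oid for oid, _ in chunk]
        parts.append(MicroPartition(chunk, min(ids), max(ids)))
    return parts

def query(parts, lo, hi):
    """Return amounts for order_ids in [lo, hi], scanning only partitions
    whose min/max metadata overlaps the range (partition pruning)."""
    hits, scanned = [], 0
    for p in parts:
        if p.max_id < lo or p.min_id > hi:
            continue  # pruned: skipped without touching the partition's data
        scanned += 1
        hits += [amt for oid, amt in p.rows if lo <= oid <= hi]
    return hits, scanned

parts = build_partitions([(i, i * 10) for i in range(1, 13)])  # 12 rows -> 3 partitions
amounts, scanned = query(parts, 5, 7)
print(amounts, scanned)  # -> [50, 60, 70] 1  (two of three partitions pruned)
```

The same metadata-first design is what lets the services layer answer some questions without spinning up any virtual warehouse at all.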

3 Microsoft Azure has at least 14 cloud regions in the US and 54 globally.

4. In-memory analytics database: the Exasol case

Exasol, a hidden champion

Headquartered in Nuremberg (Germany), Exasol develops and sells a high-performance, in-memory, column-oriented relational database management and data warehousing system specifically designed for analytics. It launched its first product in 2005. Either standalone or integrated with Hadoop, Exasol is used in a wide range of Big Data use cases, including accelerating standard reporting, running multi-user ad-hoc analytics, and performing complex modelling. Exasol’s ability to pull in data in real time allows end-users to compress the multi-step process of data collection and pre-processing into a rapid-execution analytics engine. This means users can perform quick ad-hoc analysis and answer critical business questions in minutes.

“Exasol’s vision is to provide the most extensible, powerful and platform-independent data analytics framework for customers to govern, connect and understand the entirety of the data flowing through their organisations. As they transition to data-driven processes, companies need to see and leverage all the data necessary for the success of their business, wherever the data may be. For this they will need an incredibly powerful and intelligent data framework which will permit new tools, technologies and processes to be implemented on demand, as smoothly and as simply as possible.”
Aaron Auld, CEO, Exasol

Exasol has offices in Germany, France, the UK and the US, and sells its products through partners in the rest of Europe, Israel and the USA, but also directly to customers in India, Australia and South America. As of today, more than 140 companies use Exasol to run business intelligence and data analytics, including well-known names such as Adidas, Zalando, GfK, King Digital Entertainment, Vodafone, Xing, Revolut, Dailymotion, and Postbank.

Exasol’s technology was created by one of its founders, Falko Mattasch, in the 1990s. While researching intelligent algorithms in the Parallel Processing Department at the University of Jena (Germany), he showed how data analytics could benefit from parallel processing. After leaving academia, he implemented a new prototype for a parallel database system, which served as the basis for the technology behind Exasol. It took years for the company to develop a complete database, as everything was created in-house without recourse to open source database cores such as PostgreSQL. This led to a software system that offered optimum performance and was highly scalable and ‘tuning-free’. As data was continuously changed and updated, Exasol designed the system so that not all data had to be stored in the main memory. The main memory served as a large cache, from which content was placed in local storage when it was not efficient to fit everything into the cache. That way, it was possible to analyse more data even more quickly than with a pure in-memory database. Current main shareholders are board members, founders, the German state-owned bank KfW and the Swiss early-stage venture firm Mountain Partners.

Exasol’s core differentiation lies in its proprietary technology architecture, which combines in-memory processing with MPP (Massively Parallel Processing) and results in extremely high query speed at large data volumes. The platform integrates with a wide range of front-end analytics tools, such as Tableau and MicroStrategy.

FIG. 21: EXASOL PRODUCT ARCHITECTURE – applications (ad-hoc reporting, real-time data science, advanced analytics, predictive analytics, business intelligence, logical data warehouse) running analytics interfaces (SQL, MapReduce, R, Python, Java, Lua, Skyline, geospatial, graph) on the parallel in-memory database, fed by Hadoop systems, EXAStorage, databases and cloud storage.
Source: Exasol company data
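The performance argument behind a column-oriented, MPP design of the kind Exasol uses can be illustrated with a deliberately simplified Python sketch. Plain lists stand in for compressed in-memory columns, and contiguous slices stand in for cluster nodes; none of this is Exasol's actual code. The point: an aggregate over one column touches only that column's array, and the scan splits naturally into independent partial sums.

```python
# Row store: each record is a tuple; an aggregate must walk every field of every row.
row_store = [(i, f"customer-{i % 100}", float(i % 500)) for i in range(100_000)]

# Column store: each attribute is one contiguous array, so a scan of 'amount'
# touches only the data it needs -- the layout a column-oriented engine keeps in RAM.
col_store = {
    "order_id": [r[0] for r in row_store],
    "customer": [r[1] for r in row_store],
    "amount":   [r[2] for r in row_store],
}

def total_row(rows):
    # Reads all three fields of every row just to sum one of them.
    return sum(r[2] for r in rows)

def total_col(cols, n_workers=4):
    # MPP-style split: each (simulated) worker sums a contiguous slice of the
    # column; a coordinator adds the partial results.
    col = cols["amount"]
    step = -(-len(col) // n_workers)  # ceiling division
    partials = [sum(col[i:i + step]) for i in range(0, len(col), step)]
    return sum(partials)

# Both layouts agree on the answer; only the amount of data touched differs.
assert total_row(row_store) == total_col(col_store)
```

On real hardware the columnar scan also benefits from cache locality and per-column compression, which is why analytical engines pay the price of a more complex write path to get this layout.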

5. Conclusion

In this report, we have reviewed the deep changes happening in the database world, and the new generation of players aiming to challenge longstanding leaders such as Oracle, Microsoft and IBM. The question now is how the competitive landscape will be reshaped ten years on. Will Oracle remain the #1 database player? Can Amazon Web Services be the new #1 in this market? Is Hadoop definitely a fallen ‘star’? How will the market consolidate? What could be the next database ‘unicorns’? Let’s try to address these questions.

Oracle, in our view, will be increasingly challenged by Amazon Web Services, but this does not mean it will lose its leadership in the database market. Recently, AWS CEO Andy Jassy declared that by the end of 2018 or mid-2019, 88% of Amazon’s databases that ran on Oracle would be on an Amazon database instead. He also noted that Amazon moved its data warehouse from Oracle to its own service, Redshift, in early November 2018. SAP’s long-term ambition is to replace Oracle databases running on its application software with SAP HANA. Other defections may follow. But Oracle has the strength of an installed base (several hundred thousand customers) which is not fully dependent on Amazon or SAP. Oracle’s databases are also evolving, and the new Oracle Autonomous Database is adding resilience. Oracle also has the financial might to acquire cloud database or data warehouse players if they add value to its offerings and for shareholders. So, while it is not dying yet, its competitive position is not as comfortable as it used to be.

AWS can steal the #1 position in databases. But with a market share of around 9%, it could take a long time to get there. However, Amazon’s growth comes from the cloud: for every cloud customer it gets, there is a potential customer for DynamoDB, Aurora and Redshift. That said, its database and data warehouse offerings are tied to its cloud, so a Microsoft Azure or Google Cloud customer will not use AWS database or data warehouse services.

Hadoop is losing ground because of its complexity but is not dead in our view. The 2019 merger between Cloudera and Hortonworks means its growth is unlikely to be as strong as expected. We think that independent players may well disappear, probably through M&A. However, almost all the large software vendors have made significant investments in Hadoop open source technologies, which suggests it is here to stay and will probably evolve, supported by the open source Apache Software Foundation.

As for market consolidation, we saw a significant wave of M&A deals between 2008 and 2013, before the advent of NoSQL and NewSQL databases and cloud data warehouses. Many small database players have emerged since the beginning of the decade, with only a handful achieving significant scale or funding. Nonetheless, some have become ‘unicorns’, most recently Snowflake Computing in data warehousing. We believe there will be an opportunity for incumbent database companies to acquire some of these players to stay competitive with AWS. As Elastic and MongoDB demonstrated, EV/sales valuation multiples can be very rich. Consequently, adding a sizeable premium to the valuation of the next IPOs in this space is tempting.

Finally, who are the new unicorns? In data management, they are now MongoDB, Elastic and Snowflake. There are plenty of potential unicorns out there too: we think Neo4j, DataStax (Cassandra), Databricks, Cockroach Labs, MemSQL, Yellowbrick Data, Couchbase, and MariaDB could be the next ones. In Europe, there are only a few players, such as Elastic in the Netherlands and MariaDB in Finland, but Exasol, a German company, has the potential to benefit from the cloud data warehouse wave, as its technology is based on a solid and powerful in-memory database.

White Paper Authors

Falk Müller-Veerse, Partner, Technology ([email protected])
Gregory Ramirez, Equity Research Analyst, Software & IT Services ([email protected])

Bryan, Garnier & Co leverages in-depth sector expertise to create fruitful and long-lasting relationships between investors and European growth companies.

Berk Kirca, Vice President, Investment Banking ([email protected])
Priyanshu Bhattacharya, Vice President, Investment Banking ([email protected])

About Bryan, Garnier & Co

Bryan, Garnier & Co is a European, full-service, growth-focused independent investment banking partnership founded in 1996. The firm provides equity research, sales and trading, private and public capital raising as well as M&A services to growth companies and their investors. It focuses on key growth sectors of the economy including Technology, Healthcare, Consumer and Business Services. Bryan, Garnier & Co is a fully registered broker-dealer authorised and regulated by the FCA in Europe and FINRA in the U.S. Bryan, Garnier & Co is headquartered in London, with additional offices in Paris, Munich and New York. The firm is a member of the London Stock Exchange and Euronext.

INVESTMENT BANKING

PARIS: Greg Revenu, Managing Partner ([email protected]); Olivier Beaudouin, Partner, Technology & Smart Industries ([email protected]); Guillaume Nathan, Partner, Digital Media & Business Services ([email protected]); Thibaut De Smedt, Partner, Application Software ([email protected]); Philippe Patricot, Managing Director, Technology & IT Services ([email protected])
MUNICH: Falk Müller-Veerse, Partner ([email protected]); Lars Dürschlag, Managing Director, Technology ([email protected])

Bryan, Garnier & Co Technology Equity Research Coverage

EQUITY RESEARCH ANALYST TEAM

Richard-Maxime Beaudoux, Video Games, Payment & Security ([email protected])
Thomas Coudry, Telecoms & Media ([email protected])
Gregory Ramirez, Software & IT Services ([email protected])
David Vignon, Industrial Software ([email protected])
Fréderic Yoboué, Semiconductors ([email protected])

EQUITY CAPITAL MARKETS AND EQUITY DISTRIBUTION

Pierre Kiecolt-Wahl, Partner ([email protected])
Nicolas-Xavier de Montaigut, Partner, Head of Distribution ([email protected])
Nicolas d’Halluin, Partner, Head of US Distribution ([email protected])

5 Analysts | 50+ Stocks Covered. With more than 150 professionals based in London, Paris, Munich and New York, Bryan, Garnier & Co combines the services and expertise of a top-tier investment bank with a long-term client focus.

LONDON: Beaufort House, 15 St. Botolph Street, London EC3A 7BB, UK. T: +44 (0) 20 7332 2500, F: +44 (0) 20 7332 2559. Authorised and regulated by the Financial Conduct Authority (FCA).
PARIS: 26 Avenue des Champs Elysées, 75008 Paris, France. T: +33 (0) 1 56 68 75 00, F: +33 (0) 1 56 68 75 01. Regulated by the Financial Conduct Authority (FCA) and the Autorité de Contrôle Prudentiel et de Résolution (ACPR).
MUNICH: Widenmayerstrasse 29, 80538 Munich, Germany. T: +49 89 242 262 11, F: +49 89 242 262 51.
NEW YORK: 750 Lexington Avenue, New York, NY 10022, USA. T: +1 (0) 212 337 7000, F: +1 (0) 212 337 7002. FINRA and SIPC member.

IMPORTANT INFORMATION

This document is classified under the FCA Handbook as being investment research (independent research). Bryan Garnier & Co Limited has in place the measures and arrangements required for investment research as set out in the FCA’s Conduct of Business Sourcebook.

This report is prepared by Bryan Garnier & Co Limited, registered in England Number 03034095, and its MIFID branch registered in France Number 452 605 512. Bryan Garnier & Co Limited is authorised and regulated by the Financial Conduct Authority (Firm Reference Number 178733) and is a member of the London Stock Exchange. Registered address: Beaufort House, 15 St. Botolph Street, London EC3A 7BB, United Kingdom.

This Report is provided for information purposes only and does not constitute an offer, or a solicitation of an offer, to buy or sell relevant securities, including securities mentioned in this Report and options, warrants or rights to or interests in any such securities. This Report is for general circulation to clients of the Firm and as such is not, and should not be construed as, investment advice or a personal recommendation. No account is taken of the investment objectives, financial situation or particular needs of any person.

The information and opinions contained in this Report have been compiled from and are based upon generally available information which the Firm believes to be reliable but the accuracy of which cannot be guaranteed. All components and estimates given are statements of the Firm’s, or an associated company’s, opinion only, and no express representation or warranty is given or should be implied from such statements. All opinions expressed in this Report are subject to change without notice. To the fullest extent permitted by law, neither the Firm nor any associated company accepts any liability whatsoever for any direct or consequential loss arising from the use of this Report. Information may be available to the Firm and/or associated companies which is not reflected in this Report. The Firm or an associated company may have a consulting relationship with a company which is the subject of this Report.

This Report may not be reproduced, distributed or published by you for any purpose except with the Firm’s prior written permission. The Firm reserves all rights in relation to this Report.

Past performance information contained in this Report is not an indication of future performance. The information in this Report has not been audited or verified by an independent party and should not be seen as an indication of returns which might be received by investors. Similarly, where projections, forecasts, targeted or illustrative returns or related statements or expressions of opinion are given (“Forward Looking Information”), they should not be regarded as a guarantee, prediction or definitive statement of fact or probability. Actual events and circumstances are difficult or impossible to predict and will differ from assumptions. A number of factors, in addition to the risk factors stated in this Report, could cause actual results to differ materially from those in any Forward Looking Information.

Disclosures specific to clients in the United Kingdom
This Report has not been approved by Bryan Garnier & Co Limited for the purposes of section 21 of the Financial Services and Markets Act 2000 because it is being distributed in the United Kingdom only to persons who have been classified by Bryan Garnier & Co Limited as professional clients or eligible counterparties. Any recipient who is not such a person should return the Report to Bryan Garnier & Co Limited immediately and should not rely on it for any purposes whatsoever.

NOTICE TO US INVESTORS
This research report (the “Report”) was prepared by Bryan Garnier & Co Limited for information purposes only. The Report is intended for distribution in the United States to “Major US Institutional Investors” as defined in SEC Rule 15a-6 and may not be furnished to any other person in the United States. Each Major US Institutional Investor which receives a copy of this Report by its acceptance hereof represents and agrees that it shall not distribute or provide this Report to any other person. Any US person that desires to effect transactions in any security discussed in this Report should call or write to our US affiliated broker, Bryan Garnier Securities, LLC, 750 Lexington Avenue, New York, NY 10022. Telephone: 1-212-337-7000.

This Report is based on information obtained from sources that Bryan Garnier & Co Limited believes to be reliable and, to the best of its knowledge, contains no misleading, untrue or false statements, but which it has not independently verified. Neither Bryan Garnier & Co Limited nor Bryan Garnier Securities LLC makes any guarantee, representation or warranty as to its accuracy or completeness. Expressions of opinion herein are subject to change without notice. This Report is not an offer to buy or sell any security.

Bryan Garnier Securities, LLC and/or its affiliate, Bryan Garnier & Co Limited, may own more than 1% of the securities of the company(ies) which is (are) the subject matter of this Report, may act as a market maker in the securities of the company(ies) discussed herein, may manage or co-manage a public offering of securities for the subject company(ies), may sell such securities to or buy them from customers on a principal basis, and may also perform or seek to perform investment banking services for the company(ies).

Bryan Garnier Securities, LLC and/or Bryan Garnier & Co Limited are unaware of any actual, material conflict of interest of the research analyst who prepared this Report, and are also not aware that the research analyst knew or had reason to know of any actual, material conflict of interest at the time this Report is distributed or made available.
