White Paper

Why Enterprise NoSQL Matters

By Evan Quinn, Senior Principal Analyst

March 2013

This ESG White Paper was commissioned by MarkLogic and is distributed under license from ESG.

© 2013 by The Enterprise Strategy Group, Inc. All Rights Reserved.

Contents

Executive Summary
The Alternative to Aging Enterprise Relational Databases: NoSQL Databases
Requirements for Enterprise NoSQL
MarkLogic Is a Proven Candidate for Enterprise NoSQL
The Bigger Truth

All trademark names are property of their respective companies. Information contained in this publication has been obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise Strategy Group, Inc., is in violation of U.S. copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have any questions, please contact ESG Client Relations at 508.482.0188.


Executive Summary

The good, old-fashioned relational database, a well-understood technology with a known list of providers for two decades, has faced disruption since the turn of the millennium, and the disruption is peaking now. The rise of cloud applications, big data analytics, mobile computing, sophisticated content and asset management solutions, and social media has pushed the once dependable relational database to the edge of, and sometimes past, its abilities. Because it was originally architected to work with hardware from yesteryear, the older relational database may struggle to take optimal advantage of the dramatic price/performance improvements and innovations found in adjunct technologies like multi-core processors and storage. Add in previously inconceivable requirements for scalability, plus the still substantial margins enjoyed by long-standing enterprise database providers, and modern database buyers have found motivation to look for fresh alternatives.

In the vacuum formed between older databases and new use cases, an explosion of roughly 50 new commercial and open source databases, often referred to collectively as “NoSQL databases,” has come to market. ESG prefers to interpret the term “NoSQL” to mean “Not Only SQL” given that many of the new databases do support Structured Query Language (SQL). But suffice it to say that pent-up demand to better address post-2000 use cases has produced a throng of new database choices. Yet therein lies another, ironic, challenge for the database buyer: too much choice. Fortunately, if you require enterprise-class features in a NoSQL database, the number of choices shrinks to a few, and MarkLogic stands out as a clear leader in the “Enterprise NoSQL” category.

“Enterprise NoSQL” means a database designed to deal with modern, post-2000 application use cases that also has the features to support enterprise-grade compliance; transaction processing; tools for DBAs and administrators; 24x365 availability; scalability adapting to multiple cost-effective hardware choices; and APIs and connectors for a wide variety of third-party services wanted by developers. The shorter definition: security, policy management, availability, scalability, ACID, DBA and administrator tools, connectors and APIs. An even shorter definition: Enterprise NoSQL is a NoSQL database you would bet your business on, a database like MarkLogic.

The MarkLogic database moves easily between the schema-less approach used for advanced web, rich content, and document solutions and full ACID transaction processing. Its native shared-nothing architecture enables near-linear scalability across a wide variety of hardware choices, including “commodity hardware.” Its government-certified security and policy management enable fine-grained access, data privacy, and retention control. It offers a variety of high availability approaches, rather than only one, for both shared-disk and shared-nothing implementations, including point-in-time recovery. And MarkLogic’s administration tools are API-based, offering the flexibility to fit into existing workflows and consoles, with prebuilt integration to several popular systems management tools. MarkLogic also offers specific solutions that address multi-national class needs, such as a “big data search” with support for over 200 languages, and metadata repositories used by some of the world’s largest media organizations. In the big data arena, MarkLogic offers optimized connectors to leading big data platform and visualization tools such as Hadoop, Tableau, and IBM Cognos.

Between its true enterprise-class database capabilities, and its wide range of proven solutions, MarkLogic belongs on the short list of those database buyers who are interested in NoSQL, but not willing to sacrifice the features needed by enterprises.


The Alternative to Aging Enterprise Relational Databases: NoSQL Databases

The relational databases enterprises have depended on for over two decades for nearly all applications are being replaced by a newer set of databases that offer data models and features more finely tuned to post-2000 applications. Two pressures stand out in ESG’s recent IT spending intentions survey: managing data growth rated as the number four most important IT priority reported by respondents, and cost reduction initiatives rated as the number one business priority for IT cited by organizations (see Figure 1). Given that growth of data and pressure on costs, the well-known lofty prices of enterprise databases have contributed to interest in alternative databases.[1]

The high cost of classic enterprise databases, however, isn’t just about price. Relational databases were not originally designed for the rich and pervasive content of today’s collaboration, social media, and mobile applications. The older relational model struggles with the complex queries and compute requirements associated with advanced data analytics. Thus, many of the leading business initiatives that need IT’s help today don’t marry well with the long-standing enterprise relational database, and those organizations that choose to stick with older relational models literally pay for their choice by throwing more hardware at the issue, and using workarounds.

Figure 1. 2013 Business Initiatives with the Greatest Impact on IT Spending Decisions

Which of the following business initiatives do you believe will have the greatest impact on your organization's IT spending decisions over the next 12-18 months? (Percent of respondents, N=540, three responses accepted)

Cost reduction initiatives 44%

Business process improvement initiatives 31%

Security/risk management initiatives 31%

Regulatory compliance 25%

Providing our employees with the mobile devices and applications they need to maximize productivity 24%

Source: Enterprise Strategy Group, 2013

NoSQL databases have stepped in as the alternatives to legacy enterprise databases. While the name “NoSQL” implies a lack of query language or data structure, in fact the dozens of NoSQL databases available in the market cover a wide range of query approaches and data models; thus “NoSQL” really means “not only SQL.” Regardless, most NoSQL databases not only offer data models lined up with more recent application types, but they are also typically designed natively to take advantage of advances in in-memory computing, storage, and virtualized infrastructures including clouds. Speed and scalability, across a variety of deployment choices, are often calling cards of NoSQL databases.

1 Source: ESG Research Report, 2013 IT Spending Intentions Survey, January 2013.


Yet, ESG is not comfortable recommending all NoSQL databases to enterprises and large organizations. Several of the primary reasons for ESG’s reluctance are spelled out in Figure 2, which lists the most important IT priorities reported by respondents to ESG’s recent survey. Many priorities are related to security and compliance, yet few NoSQL databases measure up to enterprise-class needs, and even fewer meet government-mandated security and compliance requirements. But enterprise-class requirements do not end at security and compliance for NoSQL databases. The interest in virtualization, cloud, and private cloud suggests the need for scalability across variable and elastic infrastructures. Data backup, recovery, and business continuity needs mean that NoSQL databases need to operate in the context of 24x7x365 IT operations. In short, the newer applications that fit well with NoSQL databases are no less important to business, government, and IT than legacy enterprise applications. Let’s drill a little further into what ESG sees as the features needed for “enterprise” NoSQL databases.

Figure 2. 2013 Most Important IT Priorities

Which of the following would you consider to be your organization’s most important IT priorities over the next 12 months? (Percent of respondents, N=540, ten responses accepted)

Information security initiatives 29%
Improve data backup and recovery 27%
Increased use of server virtualization 26%
Manage data growth 25%
Data center consolidation 24%
Desktop virtualization 22%
Use cloud infrastructure services 22%
Major application deployments or upgrades 22%
Deploying applications on or for new mobile devices 20%
Improve collaboration capabilities 20%
Regulatory compliance initiatives 20%
Business continuity/disaster recovery programs 20%
Business intelligence/data analytics initiatives 20%
Mobile workforce enablement 19%
Building a “private cloud” infrastructure 19%

Source: Enterprise Strategy Group, 2013

Requirements for Enterprise NoSQL

What differentiates NoSQL from Enterprise NoSQL? ESG counts five specific areas that enterprises look to for certain characteristics and attributes in databases, NoSQL or not:

1. Security, Policy Management, and Compliance: Large organizations, and particularly governments, want fine-grained security and policy control at the database level, not just at the application or endpoint device level. And governments and verticals dealing with particularly sensitive information often also require security certification(s). Note that a properly architected NoSQL database will not pay a performance penalty in exchange for security. Naturally, auditing tools need to be part of the NoSQL solution as well.


2. High Availability: Important databases need to offer a variety of high-availability options to ensure enterprise-grade uptime. Given that databases may use different storage scenarios, and may be distributed in several ways, enterprise NoSQL databases should offer a variety of availability and recoverability options.

3. Scalability: Most NoSQL databases offer a single scaling approach, and thus customers must configure hardware and infrastructure to support that singular method. Enterprise-grade NoSQL, however, offers several scalability options that adapt to the hardware/infrastructure and applications workload footprint.

4. ACIDity: Many NoSQL databases do not fully support transactional computing. However, so many application scenarios require true transaction processing that it makes little sense to run two databases, one for schema-less approaches and one for transactions. NoSQL databases that also support distributed transactions in their entirety offer huge benefits for large organizations.

5. Administrative Tools: Nearly every database offers some kind of administration console, but how far does it go? In truth, most IT operations groups of larger organizations already have committed to a system management framework, so enterprise-grade administration tools for a database not only offer a rich set of capabilities for the DBA, but also integrate easily into existing major system management solutions. And, the administration tools should offer some customization in order to fit into workflows and process automation, and should also offer features to support development and testing like cloning.

At the same time, the enterprise-class NoSQL database still has to deliver more effectively for several application types than its older, more schema-bound relational siblings. The modern applications NoSQL databases may work particularly well with include a wider variety of content types than relational databases comfortably deal with, unique scalability requirements, more flexible indexing that is useful in analytics, and the ability to dynamically adapt to shifting data formats, versus a situation where developers and DBAs are forced to explicitly define all formats for the database. Table 1 lists some of the well-known application use cases where NoSQL often shines.

Table 1. Application Use Cases Potentially Well-served by NoSQL Databases

Application Use Case: Potential NoSQL Advantage

Search: No predefinition of searchable items required; simple to adapt as data and search parameters change; no data replication required.

Content Applications: NoSQL databases that use XML as an organizing principle for content, regardless of its original form, can flexibly store and retrieve content of, in essence, any type, supporting both authoring and content distribution. Naturally, this approach helps with Web and mobile applications.

Digital Asset Management: By using metadata to organize digital assets, regardless of source and consumption model, NoSQL databases easily support huge and complex media-oriented solutions, including social media.

Big Data Analytics: What would a discussion of NoSQL be without mentioning the most famous use case, big data analytics? But it is true that the key-value pair nature of many NoSQL databases is more readily able to deal with a voluminous mix of structured and semi-structured data, and effectively processes complex analytical queries using far fewer computing resources and at far higher speeds than relational databases.

Source: Enterprise Strategy Group, 2013.

How many NoSQL databases cross the divide of offering the many benefits of NoSQL while maintaining all the features required by larger organizations? Out of literally dozens of NoSQL databases in the market, ESG has only been able to identify a few that offer the flexibility of NoSQL and also meet the requirements of enterprises, and the MarkLogic database definitely ranks as one of those few.
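To make the “Search” row of Table 1 more concrete, here is a minimal sketch of indexing documents without predefining searchable fields. It is a toy illustration in Python, not how MarkLogic or any particular NoSQL product implements search; the names `SchemalessIndex`, `iter_values`, and `tokenize`, and the sample documents, are all hypothetical.

```python
from collections import defaultdict
import re


def tokenize(value):
    """Lowercase a value and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", str(value).lower())


def iter_values(node):
    """Yield every leaf value in a nested dict/list document, whatever its shape."""
    if isinstance(node, dict):
        for child in node.values():
            yield from iter_values(child)
    elif isinstance(node, list):
        for child in node:
            yield from iter_values(child)
    else:
        yield node


class SchemalessIndex:
    """Toy inverted index: no field list or schema is declared up front."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> ids of documents containing it

    def add(self, doc_id, document):
        for value in iter_values(document):
            for token in tokenize(value):
                self.postings[token].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every token in the query."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return result


# Two documents with entirely different shapes go into the same index.
index = SchemalessIndex()
index.add("a1", {"title": "Quarterly report", "author": "Kim"})
index.add("v7", {"caption": "Report from the field", "tags": ["video", "news"]})
print(sorted(index.search("report")))      # ['a1', 'v7']
print(sorted(index.search("video news")))  # ['v7']
```

The point of the sketch is only that a new document shape can be added without changing any declared schema; a production search engine would add ranking, stemming, persistence, and much more.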

MarkLogic Is a Proven Candidate for Enterprise NoSQL

Before listing how the NoSQL MarkLogic database meets and surpasses the aforementioned “enterprise” NoSQL database requirements, perhaps it makes sense to understand why MarkLogic is able to do so. MarkLogic’s first set of customers were media publishers and government agencies. That means that MarkLogic had to immediately develop features that were up to snuff for some of the most demanding customers on the planet. In addition, the dozen years in the market have enabled MarkLogic to respond to and develop for a far wider range of customers than most other NoSQL databases; many NoSQL databases have only been in market for two years or less.

ESG also looks at the lineage of a NoSQL database vendor’s executive team for both NoSQL and enterprise-grade leadership. On the enterprise side of MarkLogic, for example, CEO and President Gary Bloom spent over a dozen years at Oracle, and led Oracle’s database business. On the NoSQL side, founder Christopher Lindblad had overall responsibility for the creation of the Ultraseek Server, Infoseek’s enterprise search application, which is now one of Autonomy’s principal products following the acquisition of Verity. And Lindblad focused his post-doctoral research at the Massachusetts Institute of Technology on high-speed networks and real-time video processing.

But lineage and history with customers aside, the proof is in the database and related software, and that is where MarkLogic sets itself apart from the NoSQL throng. Let’s look specifically at how MarkLogic measures up to ESG’s criteria for “enterprise” databases.

Table 2. MarkLogic Database Enterprise Features

Enterprise Feature Area: MarkLogic Database Attributes

Security, Policy Management, and Compliance: MarkLogic provides the most comprehensive governmental-class security of any NoSQL database. Using flexible policy management, it offers internal, database-layer authentication, as well as permission management with rich options for document security, and retention management. It supports client-to-node encryption, and works with multiple encryption systems in order to meet varying compliance requirements. Its auditing features, and overall security, have received a long list of certifications including Common Criteria (ISO/IEC 15408). And MarkLogic accomplishes all of this without impacting scalability or performance.

High Availability: MarkLogic, unlike most NoSQL databases, offers multiple scenarios for HA for various storage approaches. Customers may choose from configurable HA for shared-nothing deployments, and/or local disk failover for HA within shared disk implementations. Asynchronous replication is used for DR (Disaster Recovery) purposes, and configurable journal archiving synchronization enables point-in-time recovery, whether from a complete backup or from snapshots.

Scalability: The native shared-nothing architecture of MarkLogic offers the best approach for near-linear scalability. Deployments easily scale horizontally to handle growing data loads. Given MarkLogic's early focus as a database supporting documents, it was designed from the get-go to effectively deal with huge jumps in data volume.

ACIDity: For enterprises, there is no substitute for ACID transactions. Even with the additional burden placed on developers to work around the non-ACID transaction approaches in many other NoSQL databases, the risk associated with non-ACID approaches is not worth it for transactional applications. These workarounds, fortunately, are not required with MarkLogic because it is fully ACID-compliant.

Administrative Tools: MarkLogic’s approach to offering plentiful APIs for management, process automation and workflow enables it to fit into enterprise-class systems management and IT operations applications. It integrates readily with HP systems management and any open-source Nagios-based solution. And the built-in administration tools cover all necessary bases, from access control, to information management, to audit, and compliance.

Source: Enterprise Strategy Group, 2013.
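As an illustration of why the ACIDity requirement matters for multi-document updates, the following sketch shows the all-or-nothing (atomicity) behavior that a transactional database provides. It is a toy in-memory example in Python under stated assumptions: `TinyDocumentStore`, `transact`, `debit`, and `credit` are hypothetical names, and the code says nothing about how MarkLogic implements transactions. Isolation, durability, and distribution across nodes are deliberately left out.

```python
import copy


class TransactionError(Exception):
    """Raised when a transaction cannot be applied."""


class TinyDocumentStore:
    """Toy in-memory document store with all-or-nothing multi-document updates."""

    def __init__(self):
        self.documents = {}

    def transact(self, operations):
        """Apply every operation or none of them (the atomicity part of ACID)."""
        working = copy.deepcopy(self.documents)  # stage changes on a private copy
        for doc_id, update in operations:
            if doc_id not in working:
                raise TransactionError(f"unknown document: {doc_id}")
            update(working[doc_id])
        self.documents = working  # publish all staged changes together


def debit(amount):
    """Build an update that subtracts amount, refusing to overdraw."""
    def apply(doc):
        if doc["balance"] < amount:
            raise TransactionError("insufficient funds")
        doc["balance"] -= amount
    return apply


def credit(amount):
    """Build an update that adds amount."""
    def apply(doc):
        doc["balance"] += amount
    return apply


store = TinyDocumentStore()
store.documents = {"alice": {"balance": 100}, "bob": {"balance": 20}}

# Both changes succeed, so both become visible together.
store.transact([("alice", debit(30)), ("bob", credit(30))])
print(store.documents)  # {'alice': {'balance': 70}, 'bob': {'balance': 50}}

try:
    # The credit is staged first, then the debit fails; neither change is kept.
    store.transact([("bob", credit(500)), ("alice", debit(500))])
except TransactionError as err:
    print("rolled back:", err)
print(store.documents)  # same balances as after the first transaction
```

Without this guarantee, application code would have to detect the failed debit and manually undo the credit, which is the kind of developer workaround the table refers to.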


The Bigger Truth

While applications in the 2000s have shifted from strictly transactional, as often found in ERP-style applications, towards more semi-structured content and advanced analytics, the needs of large organizations for reliability, availability, and security have not changed. In fact, one could argue that as “enterprise applications” become more visible outside the firewall, from social media to e-commerce, from partner exchanges to online customer self-service, the criticality of their uptime and dependability exceeds that of more inward-facing applications. In the realm of big data, the excuses for the older, slower data warehouse with latency challenges and a lack of availability are no longer tolerable. If big data, which is more oriented to where an organization is going and what it should be doing versus what has already happened, helps organizations make better tactical and strategic decisions, it certainly should count as “mission-critical.”

It is apparent to ESG that many customers, rightfully, have turned away from older relational databases to newer alternatives that deal more effectively with the more flexible content and data requirements of web, mobile, social, and big data era applications. NoSQL databases, by their native design and nature, better address these modern application use cases. But far too many of the NoSQL database options on the market do not stand up to the test of “enterprise.” Yes, many do a better job handling semi-structured data, content-rich applications and advanced analytics than their relational grandparents. But how long will they stay up? How secure are they? Can they actually flex to different infrastructures, and scale as demand peaks and recedes? Do they only support the schema-less approach, or will they also deal effectively with full transactional requirements? These are not mutually exclusive use cases, after all. And most importantly, how many NoSQL databases offer a long list of customers and solutions, all of which exhibit the advantages of NoSQL without sacrificing the requirements of enterprises? MarkLogic is a gleaming example of a NoSQL database that meets the criteria of large enterprise, governmental, and multinational computing, and that already offers a long list of proof points from satisfied customers.


20 Asylum Street | Milford, MA 01757 | Tel: 508.482.0188 Fax: 508.482.0218 | www.esg-global.com