Database Comparison

A Comparison of MarkLogic Features and Data Model DATASHEET

MarkLogic’s Key Features

MarkLogic® handles the volume, velocity, and variety of data like other NoSQL solutions, and it has built, tested, and deployed the enterprise features necessary to run mission-critical applications. In general, relational databases have all of the features that make them trusted. Newer NoSQL databases have gained traction for being powerful and agile, but they lack, or have not hardened, the enterprise features that would make them trusted. MarkLogic was designed from the start to be an Enterprise NoSQL database that is powerful, agile, and trusted. MarkLogic’s key features include the following:

• Flexible Data Model – Multi-model database for handling JSON, XML, RDF, geospatial, and binary data
-- Document data (JSON, XML) – Structured and unstructured data
-- Graph data (RDF triples) – Semantic facts and relationships
• Search and Query – Lightning fast, sub-second search and query across all of your data using sophisticated indexes
• Semantics – Triple store integrated into MarkLogic to store RDF and query it with SPARQL
• Scalability and Elasticity – Scale to petabytes of data without over-provisioning or over-spending
• ACID Transactions – Transactional integrity, even at speed and scale, and even across a cluster
• Certified Security – Government-grade, granular, role-based access control with NIAP Common Criteria Certification
• Hadoop Integration – MarkLogic Connector for Hadoop to move data back and forth and use HDFS as native storage
• Other Important Features – Monitoring and Management, Cloud Deployment (AWS), Tiered Storage, Bitemporal Support, Customizable Backup and Failover, Point-in-time Recovery, Flexible Replication, Real-Time Alerting, Geospatial Support, and Application Services

Comparing Databases by Feature

Flexible Data Model

• MarkLogic – MarkLogic natively stores JSON, XML, RDF, and more: a multi-model approach that is schema-agnostic and allows you to load data as-is.
• Relational Databases – RDBMSes require predetermined, static schemas. Adding a single column can cost millions and take years.
• Other NoSQL Databases – In general, NoSQL databases are more flexible than RDBMSes, particularly document stores.

Search and Query

• MarkLogic – MarkLogic has search built into its core. MarkLogic also gives you the ability to run complex queries across multiple data types, utilizing a wide selection of fully customizable indexes.
• Relational Databases – Full-text search requires other software and a lot of work to set up the indexes. A single query may also not be able to access all data.
• Other NoSQL Databases – Full-text search may require other software. Many NoSQL databases also have simple indexes and limit queries to one or two indexes at most.

Semantics

• MarkLogic – MarkLogic includes an RDF triple store and is the only enterprise-grade database that can combine documents, data, and triples.
• Relational Databases – RDF triple stores are treated as separate systems that can only be connected to an RDBMS.
• Other NoSQL Databases – RDF triple stores are treated as separate systems that are then connected to a NoSQL database if necessary.


Scalability and Elasticity

• MarkLogic – MarkLogic can scale out to hundreds of nodes, powering production applications with hundreds of billions of documents. Data is partitioned and balanced automatically, and nodes can be added or subtracted in minutes based on application demand.
• Relational Databases – RDBMSes scale up on a single machine, often with a big price tag. Performance will hit a ceiling when data reaches a certain size, and there is no option to scale down when there is unused capacity.
• Other NoSQL Databases – Some NoSQL systems scale out, but determining how to partition data across many nodes is a challenge. Most NoSQL databases lack the ability to scale up and down elastically.

ACID Transactions

• MarkLogic – MarkLogic maintains full ACID properties and also supports XA distributed transactions.
• Relational Databases – RDBMSes have had ACID transactions as a standard for over a decade, which helped make them the standard choice for mission-critical applications.
• Other NoSQL Databases – Virtually every NoSQL database has sacrificed ACID transactions or is unable to maintain transactional consistency across a distributed system.

Certified Security

• MarkLogic – MarkLogic obtained a NIAP Common Criteria Certification and is operational on classified government systems.
• Relational Databases – RDBMSes have been around for decades and have had the time to become hardened in large enterprise systems where security is a priority.
• Other NoSQL Databases – No other NoSQL database carries a NIAP Common Criteria Certification, and many have not been around long enough to have been certified.

HA/DR

• MarkLogic – MarkLogic has deep support for failover within and between clusters, and its shared-nothing architecture means there is no single point of failure.
• Relational Databases – Although it took almost a decade, HA/DR is now a standard feature in most RDBMSes. Depending on architecture, there may still be a single point of failure.
• Other NoSQL Databases – Most NoSQL databases do not have transactional integrity, which makes backup and disaster recovery difficult. High availability is an option for several scale-out databases.

Hadoop Integration

• MarkLogic – MarkLogic easily connects with Hadoop and performs as the perfect database for Hadoop.
• Relational Databases – Some RDBMSes connect with Hadoop, but ETL is often involved when moving data back and forth.
• Other NoSQL Databases – Many NoSQL databases connect to Hadoop, though functionality varies tremendously.
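To make the Semantics comparison above more concrete, the following sketch shows what combining documents and triples in a single query can mean. It uses plain Python structures only. It is not MarkLogic's triple store, its API, or SPARQL, and the `match` and `people_in_city` helpers, the sample facts, and the document URIs are all hypothetical.

```python
# Hypothetical data: RDF-style facts as (subject, predicate, object) tuples.
triples = [
    ("/people/ada.json", "worksFor", "AcmeCorp"),
    ("/people/bob.json", "worksFor", "Globex"),
    ("AcmeCorp", "locatedIn", "London"),
]

# Hypothetical documents keyed by URI, kept next to the triples.
documents = {
    "/people/ada.json": {"name": "Ada", "role": "engineer"},
    "/people/bob.json": {"name": "Bob", "role": "analyst"},
}


def match(subject=None, predicate=None, obj=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def people_in_city(city):
    """Combine a graph lookup (triples) with a document lookup."""
    companies = {s for (s, _, _) in match(predicate="locatedIn", obj=city)}
    names = []
    for person, _, employer in match(predicate="worksFor"):
        if employer in companies and person in documents:
            names.append(documents[person]["name"])
    return sorted(names)
```

Calling `people_in_city("London")` follows the `locatedIn` and `worksFor` facts and then reads the name from the matching document. With separate systems, that second step would be a call to a different database.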

More Detail on the Flexible Data Model

MarkLogic is considered a multi-model database for its ability to store, manage, and search JSON and XML documents and graph data (RDF triples). This approach provides flexibility in modeling data and moving data. It avoids the lost fidelity and functionality from data conversion when moving data between the database, middle tier, and front-end of an application. It also helps avoid brittle ETL and makes it much easier to load data from different sources and adapt to changes over time.

Data does not require a pre-defined schema before loading it into MarkLogic because MarkLogic is schema-agnostic. Unlike with relational databases, it is possible to change the data without mapping it to a fixed schema or hiding data in opaque objects. You can still store all of the information that you would find in the row of a relational table, and you can persist relationships, hierarchies, and metadata that would be difficult or impossible to model with a static, relational schema.
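As a rough illustration of loading data as-is, the sketch below keeps differently shaped JSON documents in a plain Python dictionary and builds a simple word index over every value. It is a toy model of the schema-agnostic idea, not MarkLogic's API or its index implementation. The `load` and `search` functions and the sample documents are hypothetical.

```python
import json
import re
from collections import defaultdict

# Hypothetical in-memory store: URI -> parsed document. No schema is declared.
documents = {}
# Toy word index: lowercase word -> set of URIs whose values contain it.
word_index = defaultdict(set)


def _walk(value):
    """Yield every scalar value in a nested JSON structure."""
    if isinstance(value, dict):
        for child in value.values():
            yield from _walk(child)
    elif isinstance(value, list):
        for child in value:
            yield from _walk(child)
    else:
        yield value


def load(uri, raw_json):
    """Store a document as-is and index the words in all of its values."""
    doc = json.loads(raw_json)
    documents[uri] = doc
    for scalar in _walk(doc):
        for word in re.findall(r"\w+", str(scalar).lower()):
            word_index[word].add(uri)


def search(word):
    """Return the sorted URIs of documents containing the word."""
    return sorted(word_index.get(word.lower(), set()))


# Two documents with different shapes load without any schema change.
load("/orders/1.json", '{"customer": "Ada", "items": [{"sku": "A1", "qty": 2}]}')
load("/people/ada.json", '{"name": "Ada", "address": {"city": "London"}, "tags": ["vip"]}')
```

Here `search("ada")` finds both documents even though one stores the value under `customer` and the other under `name`, and neither shape was declared up front.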

Comparing Databases by Data Type

JSON documents (query language: JavaScript)

• MarkLogic – MarkLogic stores JSON documents natively. You get all of MarkLogic’s production-proven indexing, data management, and security capabilities for the Web’s predominant data format.
• Relational Databases – RDBMSes may market limited JSON compatibility, typically achieved by treating JSON documents as a binary large object or text block. Indexing and retrieving this data will take a long time relative to properly formatted relational data.
• NoSQL Databases – Some NoSQL databases store JSON documents natively. Others may convert it to proprietary formats or treat it the same way as RDBMSes.
• Triple Stores – Triple stores are generally single purpose, and no other triple stores are designed to store and query both RDF and JSON in the same database as MarkLogic can.

XML documents (query language: XQuery)

• MarkLogic – MarkLogic stores XML documents natively, using a compressed tree format that preserves its hierarchical nature. MarkLogic can also store other related formats such as SGML, FpML, HTML, and many more.
• Relational Databases – RDBMSes may have an XML storage option as an add-on, but they are not optimized to store, manage, and search XML. You still end up with a silo, and you often run into performance issues.
• NoSQL Databases – Most NoSQL databases do not offer XML support. Some specialized legacy databases exist, but they lack the scalability and flexibility associated with modern NoSQL databases.
• Triple Stores – Triple stores are generally single purpose, and no triple store is designed to handle both RDF triples and XML documents in the same system like MarkLogic.

RDF triples (query language: SPARQL)

• MarkLogic – MarkLogic stores RDF triples natively, and they are indexed using a specialized triple index and cached using a triples cache for optimal performance at scale. MarkLogic is also the only database that can store RDF triples right alongside JSON and XML.
• Relational Databases – RDBMSes may have an option as an add-on to handle RDF triples. However, they are often hindered at scale and in trying to handle complex queries across the graph structure.
• NoSQL Databases – Some graph databases offer steps to convert RDF into their chosen format. There are no other document stores, wide-column stores, or key-value stores that can handle RDF triples.
• Triple Stores – RDF triple stores are designed specifically to store and query RDF triples. But, because they are single purpose, they cannot run combination queries and generally lack enterprise features.

Geospatial data (query languages: XQuery, JavaScript)

• MarkLogic – MarkLogic stores geospatial data natively. MarkLogic supports multiple geospatial data types such as GML, KML, and GeoRSS.
• Relational Databases – Many RDBMSes can handle geospatial data but the degree to which the data is indexed and searchable varies.
• NoSQL Databases – Many NoSQL databases can handle geospatial data but the degree to which the data is indexed and searchable varies.
• Triple Stores – Triple stores are generally single purpose, and no other triple stores are designed to handle both RDF triples and geospatial data like MarkLogic.


Large binaries (query language: N/A, not indexed)

• MarkLogic – MarkLogic can handle a wide variety of large binary files, and can index all of the associated metadata to make the content searchable.
• Relational Databases – RDBMSes have evolved to handle large binaries in various ways, but they are often stored as large BLOBs or CLOBs, and searching the data is a challenge.
• NoSQL Databases – Many NoSQL databases can store large binaries, though how they do it varies. Like RDBMSes, they do not have a built-in way to index the metadata, which means a poor search experience.
• Triple Stores – Triple stores are generally single purpose, and require an additional system that stores content such as images, videos, text, etc.

Relational data (query language: SQL)

• MarkLogic – Relational data can be easily ingested and modeled using a document model in MarkLogic. JSON and XML are great for field-level metadata and sparse data. In one example, over 600 RDBMS tables were mapped to 13 XML schema files in MarkLogic.
• Relational Databases – Storing relational data in tables with rows and columns, and querying the data with SQL, has been considered the traditional default approach for the past 30 or more years.
• NoSQL Databases – Some NoSQL databases offer limited support for mapping relational data to their non-relational model. These databases’ basic query engines and indexes will limit the use of this data in production.
• Triple Stores – Triple stores are generally single purpose, and there is usually a connector involved to move data between an RDBMS and the triple store.

*Note: RDF triple stores can be considered NoSQL databases but are broken out as a separate category for the purposes of this comparison.

The above table shows how MarkLogic spans multiple categories and is the only database platform that gives you the full feature set necessary to manage all of today’s data. The ability to store and query all of these different data types, often referred to as “polyglot persistence,” makes it possible to reduce the complexity of an IT environment. It also means that developers can get going faster and develop more powerful applications, spending less time and code on data transformation and brittle ETL by keeping data in the format that makes the most sense.
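The relational data row of the table describes modeling relational data as documents. A minimal sketch of that idea, assuming two hypothetical tables (`customers` and `orders`) represented as lists of rows, folds the child rows into one JSON document per customer. The table names, columns, and the `rows_to_documents` helper are illustrative only and are not taken from the 600-table example.

```python
import json

# Hypothetical relational rows, as they might come out of two tables.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 1, "total": 40.0},
]


def rows_to_documents(parent_rows, child_rows, key, child_field):
    """Nest child rows under their parent row, one document per parent."""
    docs = []
    for parent in parent_rows:
        doc = dict(parent)  # copy so the source rows stay untouched
        doc[child_field] = [
            # Drop the foreign key: nesting already expresses the relationship.
            {k: v for k, v in child.items() if k != key}
            for child in child_rows
            if child[key] == parent[key]
        ]
        docs.append(doc)
    return docs


customer_docs = rows_to_documents(customers, orders, "customer_id", "orders")
ada_json = json.dumps(customer_docs[0], sort_keys=True)
```

A customer with no orders, like Bob here, simply gets an empty `orders` list, so no join table or placeholder row is needed to represent the sparse case.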

About MarkLogic

For more than a decade, MarkLogic has delivered a powerful, agile, and trusted Enterprise NoSQL database platform that enables organizations to turn all data into valuable and actionable information. For more information, visit www.marklogic.com.

© 2015 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. This technology is protected by U.S. Patent No. 7,127,469B2, U.S. Patent No. 7,171,404B2, U.S. Patent No. 7,756,858 B2, and U.S. Patent No 7,962,474 B2. MarkLogic is a trademark or registered trademark of MarkLogic Corporation in the United States and/or other countries. All other trademarks mentioned are the property of their respective owners.

MARKLOGIC CORPORATION 999 Skyway Road, Suite 200 San Carlos, CA 94070 +1 650 655 2300 | +1 877 992 8885 | www.marklogic.com