Database Comparison
A Comparison of MarkLogic Features and Data Model
DATASHEET

MarkLogic’s Key Features

MarkLogic® handles the volume, velocity, and variety of data like other NoSQL solutions, and has built, tested, and deployed the enterprise features necessary to run mission-critical applications.

In general, relational databases have all of the features that make them trusted. Newer NoSQL databases have gained traction for being powerful and agile, but they lack, or have not hardened, the enterprise features needed to make them trusted. MarkLogic was designed from the start to be an Enterprise NoSQL database that is powerful, agile, and trusted.

MarkLogic’s key features include the following:

• Flexible Data Model – Multi-model database for handling JSON, XML, RDF, geospatial, and binary data
  - Document data (JSON, XML) – Structured and unstructured data
  - Graph data (RDF triples) – Semantic facts and relationships
• Search and Query – Lightning-fast, sub-second search and query across all of your data using sophisticated indexes
• Semantics – Triple store integrated into MarkLogic to store RDF data and query it with SPARQL
• Scalability and Elasticity – Scale to petabytes of data without over-provisioning or over-spending
• ACID Transactions – Transactional integrity, even at speed and scale, and even across a cluster
• Certified Security – Government-grade, granular, role-based access control with NIAP Common Criteria Certification
• Hadoop Integration – MarkLogic Connector for Hadoop to move data back and forth and use HDFS as native storage
• Other Important Features – Monitoring and Management, Cloud Deployment (AWS), Tiered Storage, Bitemporal Support, Customizable Backup and Failover, Point-in-time Recovery, Flexible Replication, Real-Time Alerting, Geospatial Support, and Application Services

Comparing Databases by Feature

• Flexible Data Model
  - MarkLogic – MarkLogic natively stores JSON, XML, RDF, and more: a multi-model approach that is schema agnostic and allows you to load data as-is.
  - Relational Databases – RDBMSes require predetermined, static schemas. Adding a single column can cost millions and take years.
  - Other NoSQL Databases – In general, NoSQL databases are more flexible than RDBMSes, particularly document stores.

• Search and Query
  - MarkLogic – MarkLogic has search built into its core. MarkLogic also gives you the ability to run complex queries across multiple data types, using a wide selection of fully customizable indexes.
  - Relational Databases – Full-text search requires other software and a lot of work to set up the indexes. Even then, a single query may not be able to access all data.
  - Other NoSQL Databases – Full-text search may require other software. Many NoSQL databases also have simple indexes and limit queries to one or two indexes at most.

• Semantics
  - MarkLogic – MarkLogic includes an RDF triple store and is the only enterprise-grade database that can combine documents, data, and triples.
  - Relational Databases – RDF triple stores are considered separate systems that can only be connected to an RDBMS.
  - Other NoSQL Databases – RDF triple stores are considered separate systems that are then connected to a NoSQL database if necessary.

• Scalability and Elasticity
  - MarkLogic – MarkLogic can scale out to hundreds of nodes, powering production applications with hundreds of billions of documents. Data is partitioned and balanced automatically, and nodes can be added or subtracted in minutes based on application demand.
  - Relational Databases – RDBMSes scale up on a single machine, often with a big price tag. Performance will hit a ceiling when data reaches a certain size, and there is no option to scale down when there is unused capacity.
  - Other NoSQL Databases – Some NoSQL systems scale out, but determining how to partition data across many nodes is a challenge. Most NoSQL databases lack the ability to scale up and down elastically.

• ACID Transactions
  - MarkLogic – MarkLogic maintains full ACID properties and also supports XA distributed transactions.
  - Relational Databases – RDBMSes have had ACID transactions as a standard for over a decade, which helped make them the standard choice for mission-critical applications.
  - Other NoSQL Databases – Virtually every NoSQL database has sacrificed ACID transactions, or is unable to maintain transactional consistency across a distributed system.

• Certified Security
  - MarkLogic – MarkLogic obtained a NIAP Common Criteria Certification and is operational on classified government systems.
  - Relational Databases – RDBMSes have been around for decades and have had the time to become hardened in large enterprise systems where security is a priority.
  - Other NoSQL Databases – No other NoSQL database carries a NIAP Common Criteria Certification, and many have not been around long enough to have been certified.

• HA/DR
  - MarkLogic – MarkLogic has deep support for failover within and between clusters, and its shared-nothing architecture means there is no single point of failure.
  - Relational Databases – Although it took almost a decade, HA/DR is now a standard feature in most RDBMSes. Depending on architecture, there may still be a single point of failure.
  - Other NoSQL Databases – Most NoSQL databases do not have transactional integrity, which complicates backup and disaster recovery. High availability is an option for several scale-out databases.

• Hadoop Integration
  - MarkLogic – MarkLogic easily connects with Hadoop and performs as the perfect database for Hadoop.
  - Relational Databases – Some RDBMSes connect with Hadoop, but ETL is often involved when moving data back and forth.
  - Other NoSQL Databases – Many NoSQL databases connect to Hadoop, though functionality varies tremendously.

More Detail on the Flexible Data Model

MarkLogic is considered a multi-model database for its ability to store, manage, and search JSON and XML documents and graph data (RDF triples). This approach provides flexibility in modeling data and moving data. It avoids the loss of fidelity and functionality that data conversion causes when moving data between the database, middle tier, and front end of an application. It also helps avoid brittle ETL and makes it much easier to load data from different sources and adapt to changes over time.

Data does not require a pre-defined schema before it is loaded into MarkLogic, because MarkLogic is schema-agnostic. Unlike with relational databases, it is possible to change the data without mapping it to a fixed schema or hiding data in opaque objects. You can still store all of the information that you would find in the row of a relational table, and you can persist relationships, hierarchies, and metadata that would be difficult or impossible to model with a static, relational schema.

Comparing Databases by Data Type

• JSON documents (query language: JavaScript)
  - MarkLogic – MarkLogic stores JSON documents natively. You get all of MarkLogic’s production-proven indexing, data management, and security capabilities for the Web’s predominant data format.
  - Relational Databases – RDBMSes may market limited JSON compatibility, typically achieved by treating JSON documents as a binary large object or text block. Indexing and retrieving this data will take a long time relative to properly formatted relational data.
  - NoSQL Databases – Some NoSQL databases store JSON documents natively. Others may convert it to proprietary formats or treat it the same way as RDBMSes.
  - Triple Stores – Triple stores are generally single purpose, and no other triple stores are designed to store and query both RDF and JSON in the same database as MarkLogic can.

• XML documents (query language: XQuery)
  - MarkLogic – MarkLogic stores XML documents natively, using a compressed tree format that preserves its hierarchical nature. MarkLogic can also store other related formats such as SGML, FpML, HTML, and many more.
  - Relational Databases – RDBMSes may have an XML storage option as an add-on, but they are not optimized to store, manage, and search XML. You still end up with a silo, and you often run into performance issues.
  - NoSQL Databases – Most NoSQL databases do not offer XML support. Some specialized legacy databases exist, but they lack the scalability and flexibility associated with modern NoSQL databases.
  - Triple Stores – Triple stores are generally single purpose, and no triple store is designed to handle both RDF triples and XML documents in the same system like MarkLogic.

• RDF triples (query language: SPARQL)
  - MarkLogic – MarkLogic stores RDF triples natively, and they are indexed using a specialized triple index and cached using a triples cache for optimal performance at scale. MarkLogic is also the only database that can store RDF triples right alongside JSON and XML.
  - Relational Databases – RDBMSes may have an option as an add-on to handle RDF triples. However, they are often hindered at scale and in trying to handle complex queries across the graph structure.
  - NoSQL Databases – Some graph databases offer steps to convert RDF into their chosen format. There are no other document stores, wide-column stores, or key-value stores that can handle RDF triples.
  - Triple Stores – RDF triple stores are designed specifically to store and query RDF triples. But because they are single purpose, they cannot run combination queries and generally lack enterprise features.

• Geospatial data (query languages: XQuery, JavaScript)
  - MarkLogic – MarkLogic stores geospatial data natively. MarkLogic supports multiple geospatial data types such as GML, KML, and GeoRSS.
  - Relational Databases – Many RDBMSes can handle geospatial data, but the degree to which the data is indexed and searchable varies.
  - NoSQL Databases – Many NoSQL databases can handle geospatial data, but the degree to which the data is indexed and searchable varies.
  - Triple Stores – Triple stores are generally single purpose, and no other triple stores are designed to handle both RDF triples and geospatial data like MarkLogic.

• Large binaries (query language: N/A, not indexed)
  MarkLogic can handle RDBMSes have evolved Many NoSQL databases Triple stores are generally a wide variety