Hype Cycle for Storage and Data Protection Technologies, 2020

Published 6 July 2020 - ID G00441602 - 78 min read

By Analyst Julia Palmer

Initiatives: Data Center Infrastructure

This Hype Cycle evaluates storage and data protection technologies in terms of their business impact, adoption rate and maturity level to help IT leaders build a stable, scalable, efficient and agile storage and data protection platform for digital business initiatives.

Analysis

What You Need to Know

The storage and data protection market is evolving to address new challenges in enterprise IT such as exponential data growth, changing demands for skills, rapid digitalization and globalization of business, requirements to connect and collect everything, and expansion of data privacy and sovereignty laws. Requirements for robust, scalable, simple and performant storage are on the rise. As the data center no longer remains the center of data, IT leaders expect storage to evolve from being delivered by rigid appliances in core data centers to flexible storage platforms capable of enabling hybrid cloud data flow at the edge and in the public cloud.

Here, Gartner has assessed 24 of the most relevant storage and data protection technologies that IT leaders must evaluate to address the fast-evolving needs of the enterprise. For more information about how peer I&O leaders view the technologies aligned with this Hype Cycle, see “2020-2022 Emerging Technology Roadmap for Large Enterprises.”

The Hype Cycle

IT leaders responsible for storage and data protection must cope with the rapidly changing requirements of digital business, exponential data growth, introduction of new workloads, and the desire to leverage public cloud and enable edge capabilities. This research informs I&O leaders and infrastructure technology vendors about innovative storage technologies that are entering the market, and shows how Gartner evaluates highly hyped technologies or concepts and how quickly enterprises are adopting innovative technologies.

More than half of the technologies reviewed in the 2020 Hype Cycle are poised to mature over the next five to 10 years, while 60% of technologies have the potential to deliver high benefits if driven by genuine business requirements. To provide readers with clearer, more focused research that supports their analysis and planning, this year we have removed a number of innovation profiles that are no longer hyped. We have only included the ones that are most relevant to IT leaders today, as well as those with a strong link to the storage and data protection Hype Cycle and its theme.

There are three new innovation profiles that have been added in 2020: computational storage, container backup and dHCI. While very different in their value propositions, these technologies reflect IT leaders’ priorities to take advantage of new flash technologies, improve and modernize data protection, and leverage new deployment modes for storage and data protection platforms.

Fast-moving technologies this year include storage-class memory SSDs, NVMe-oF and hyperconvergence, which all continue to show increased adoption rates, driven largely by a desire to leverage storage software innovation to enable performant, yet resilient, storage infrastructure based on industry-standard hardware.

Figure 1. Hype Cycle for Storage and Data Protection Technologies, 2020

The Priority Matrix

The Priority Matrix maps the benefit rating for each technology against the length of time before Gartner expects it to reach the beginning of mainstream adoption. This alternative perspective can help users determine how to prioritize their storage hardware, software and data protection technology investments, and adoption.

In general, companies should begin with fast-moving technologies that are rated transformational or high in business benefits and are likely to reach mainstream adoption quickly. These technologies tend to have the most dramatic impact on business processes, revenue or cost-cutting efforts. After these transformational technologies, users are advised to evaluate high-impact technologies that will reach mainstream adoption status in the near term, and work downward and to the right from there.

Organizations that have not already done so should evaluate and implement continuous data protection and virtual machine backup and recovery to drive improved resiliency and data protection efficiency. They should also consider implementation of distributed file systems and object storage to address the growing needs of unstructured data. Hyperconverged infrastructure solutions are increasing in popularity, experiencing year-over-year growth while replacing storage arrays for enterprises looking to improve simplicity of management and streamline implementation in the data center and at the edge.

Figure 2. Priority Matrix for Storage and Data Protection Technologies, 2020

Off the Hype Cycle

The cloud storage gateways innovation profile has been removed because it became Obsolete Before Plateau.

The following profiles have been removed because they have reached maturity and are no longer hyped:

■ Enterprise endpoint backup

■ Virtual machine backup and recovery

The following profiles have changed as stated:

■ NVMe and NVMe-oF was changed to NVMe-oF.

■ Storage-class memory has been split into two profiles: persistent memory DIMMs and storage-class memory SSDs.

On the Rise

Data Transport and Edge Appliances

Analysis By: Raj Bala; Julia Palmer; Santhosh Rao

Definition: Data transport and edge appliances are physical devices capable of transporting bulk data to cloud infrastructure and platform services (CIPS) providers via package carriers rather than solely relying on a network to transfer data. Such appliances are often designed with ruggedized cases to be self-contained shipping units. The appliances can optionally be equipped with compute capacity in order to preprocess data before it is transported to the cloud.

Position and Adoption Speed Justification: Data transport and edge appliances are emerging as an efficient means of transporting large quantities of data when no network or less-than-ideal network conditions exist. Data transport and edge appliances are playing an important role in enabling data transfer from data centers or edge locations to CIPS environments for processing and analytics. Gartner clients are evaluating ways to enable a continuous collection of data centralized in the cloud at significant scale, for which network-based transfer is simply too limited.

User Advice: Enterprises are increasingly interested in using CIPS for an ever-expanding set of workloads, but find migrating such workloads and the data they require to be a challenge. As a general guide, moving 100TB of data in a 24-hour period requires roughly a 10 Gbps network link. Such large network links may not be feasible depending on the amount of new data being generated per day.
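
The arithmetic behind that guideline is simple; the short calculation below is illustrative only, assuming a fully utilized link and ignoring protocol overhead.

```python
# Illustrative back-of-the-envelope check (not from the report): the sustained
# link speed needed to move a dataset within a transfer window, assuming full
# utilization and no protocol overhead.

def required_gbps(terabytes: float, hours: float) -> float:
    bits = terabytes * 1e12 * 8   # decimal terabytes to bits
    seconds = hours * 3600
    return bits / seconds / 1e9   # gigabits per second

if __name__ == "__main__":
    # 100TB in 24 hours needs roughly a 10 Gbps link ...
    print(f"100TB in 24h: {required_gbps(100, 24):.1f} Gbps")
    # ... while a 1 Gbps link tops out near 10TB in the same window.
    print(f"10TB in 24h: {required_gbps(10, 24):.1f} Gbps")
```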

The physical movement of data is merely part of the challenge. Planning the procedure and preparing the data take more effort and often more time than the shipment itself. Start planning these data shipments well in advance of the date on which you need to ship the appliance.

Business Impact: Getting data to the public cloud can be challenging due to network bottlenecks. There are distinct advantages to shipping data using transfer appliances when the data is unwieldy and the network bandwidth is constrained. Enterprise backups, for example, can be seeded at the public cloud target such that only incremental backups need to be sent to the cloud. Data can be collected in low- or no-network conditions to then be processed using public cloud services.

Benefit Rating: Moderate

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Amazon Web Services (AWS); Backblaze; Google; IBM; Microsoft; Oracle; Wasabi

Recommended Reading: “Market Guide for Cloud IaaS Data Transport and Edge Appliances”

Hybrid Cloud Storage

Analysis By: Raj Bala; Julia Palmer

Definition: Hybrid cloud storage encompasses a number of deployment patterns with varying underlying technologies. It can take the form of purpose-built hybrid cloud storage appliances, software-defined storage, broader storage systems with hybrid cloud features or the use of storage technologies from within colocation facilities connected by private network link to cloud service providers. The common thread among the varying patterns is the notion of a seamless bridge between disparate data centers and public cloud storage services.

Position and Adoption Speed Justification: The term “hybrid cloud storage” was first used in 2009 by vendors in the cloud storage gateway segment to describe their nascent offerings. Those early hybrid cloud products treated public cloud storage as an archive tier for infrequently used, low-value data. But the current market for hybrid cloud storage has moved well past the early products in the cloud storage gateway market. Hybrid cloud storage is now used for modern workloads that transform data using the elasticity that public cloud compute provides. These workloads typically start off as large, bulky datasets that require transformation to a smaller result. Examples include videos and a broad range of analytics-oriented data. In the case of videos, artifacts of a video are collected over time and then rendered into a final result using the compute capabilities of public cloud IaaS providers.

User Advice: Evaluate vendors of hybrid cloud storage across two imperatives: tactical and strategic uses. The tactical approach includes uses such as tiering data to the cloud. The strategic approach includes using public cloud compute services to transform data into usable results. Most vendors focused on tactical use cases are unable to provide the strategic, transformational capabilities that are emerging in the market.

Business Impact: Tactical uses of hybrid cloud storage have been available for nearly a decade. These solutions are often designed such that data is not easily readable in the public cloud due to the opaque storage formats used by vendors. As a result, these methods limit the full breadth of functionality that can be unlocked in the cloud.

The strategic uses of hybrid cloud storage are often developed with modern approaches in mind. As such, vendors have taken care to ensure that data can not only be read in the public cloud, but also modified and synchronized back to its source. This end-to-end capability requires that providers of hybrid cloud storage solutions integrate deeply with the cloud service provider in a manner that far exceeds the functionality required to simply tier to the cloud.

Benefit Rating: High

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Amazon Web Services (AWS); CTERA Networks; Hammerspace; Microsoft; Nasuni; NetApp; Peer Software; Qumulo; Vcinity

Recommended Reading: “Top Five Approaches to Hybrid Cloud Storage — An Analysis of Use Cases, Benefits and Limitations”

“Market Guide for Hybrid Cloud Storage”

Container-Native Storage

Analysis By: Julia Palmer; Arun Chandrasekaran

Definition: Container-native storage (CNS) is specifically designed to support container workloads and focuses on addressing unique cloud-native scale and performance demands while providing deep integration with container orchestration systems. CNS is designed to align with microservices architecture principles and adhere to the requirements of container-native data services, such as being hardware-agnostic, API-driven and based on a distributed software architecture.

Position and Adoption Speed Justification: As a result of growing interest around containers, many container-based applications in enterprise production environments now require support for stateful data persistence. In order to address this demand, vendors are now delivering container-native storage solutions and integrated systems platforms that have been specifically designed to run cloud-native applications. The majority of those solutions are deployed as a software-only product on commodity hardware, while some are being offered as an integrated appliance (compute and storage bundled together). While the technology for data persistence is not yet standardized, the common foundation is typically based on a distributed, software-defined single pool of storage. End users prefer platforms where application containers and persistent storage services are running on the same platform, similar to a hyperconverged solution. In addition, the entire stack is most often orchestrated with Kubernetes to create container life cycle integration and enable self-service operations for developers. Some of these solutions can be deployed on a wide choice of commodity hardware on-premises and as software in the public cloud. By running on top of highly available and robust container platforms, container-native storage frees the DevOps team from managing hardware components. However, IT leaders need to exercise caution, as the technology is constantly evolving and many vendors in this space are early-stage startups.

User Advice: The adoption of containers is rapidly growing in the enterprise, especially for new applications, and is starting to expand beyond the initial stateless use cases. At this stage, end users should evaluate container-native storage systems for cloud-native applications. Container-native storage is designed for containers and can employ data services with a level of granularity not typically possible with a solution that is not purpose-built for these workloads.

Most traditional storage array solutions have Container Storage Interface (CSI) plug-ins and are suitable for supporting containerized applications. However, they lack true integration with orchestration systems and large-scale container deployments, because the granularity of storage operations in traditional storage arrays is at the physical or virtual volume level as opposed to the container level. I&O leaders should choose storage solutions aligned with microservices architecture principles that adhere to the requirements of container-native data services, such as being hardware-agnostic, API-driven, based on a distributed architecture, and capable of supporting edge, core or public cloud deployments. In addition, end users should assess vendors’ delivery of quality customer support and a consistent pricing model, given that the container ecosystem is rapidly evolving with unproven vendor business models.
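
For orientation, the sketch below shows the basic provisioning primitive that container-native and CSI-backed storage expose to applications: a persistent volume claim requested against a storage class through the Kubernetes API. It uses the official Kubernetes Python client; the storage class, claim and namespace names are hypothetical, and real CNS products layer replication, snapshots and other data services behind this same interface.

```python
# Minimal sketch: request container-granular storage by creating a
# PersistentVolumeClaim against a (hypothetical) CSI-backed StorageClass.
from kubernetes import client, config

config.load_kube_config()            # or load_incluster_config() inside a pod
core_v1 = client.CoreV1Api()

pvc_manifest = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "orders-db-data"},        # hypothetical claim name
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "cns-block",           # hypothetical CNS/CSI class
        "resources": {"requests": {"storage": "20Gi"}},
    },
}

core_v1.create_namespaced_persistent_volume_claim(
    namespace="prod", body=pvc_manifest
)
```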

Business Impact: Containers can deliver agility in the end-to-end life cycle of deploying applications. Containers and related DevOps tooling can streamline the process of creating, testing, deploying and scaling applications. VMware’s Tanzu, Google’s Anthos and Red Hat’s OpenShift are making it easier to deploy and manage containers and the existing storage ecosystem will likely align efforts around the dominant platforms, all of which are ultimately Kubernetes based. But applying traditional approaches to storage for an otherwise streamlined container infrastructure can be a bottleneck to agility. Container-native storage aims to eliminate the bottlenecks to achieving agility in the end-to-end process of building and deploying applications.

Benefit Rating: High

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Arrikto; Diamanti; MayaData; Portworx; Rancher; Red Hat; Robin.io; StorageOS

Recommended Reading: “An I&O Leader’s Guide to Storage for Containerized Workloads”

“The Future of Software-Defined Storage in Data Center, Edge and Hybrid Cloud”

“Decision Point for Selecting Stateful Container Storage”

Immutable Data Vault

Analysis By: Michael Hoeck

Definition: An immutable data vault is an isolated backup solution designed to complement the existing primary and disaster recovery backup infrastructure. Isolated from production in an air-gap architecture, it mitigates the impact of malware, ransomware or malicious insider attacks. Immutable data vaults have immutability and improved security properties and are used in conjunction with ransomware and malware tools to detect and cleanse data in the vault. Immutable data vaults are often a combination of multiple vendor technologies and services.

Position and Adoption Speed Justification: While financial services organizations, led by industry-led initiatives such as Sheltered Harbor, have been rapidly adopting immutable data vaults and focusing on building out cyber resilience capabilities, the broader market has not been aware of these offerings. As the threat and impact of data destruction events (malware, ransomware, malicious insider activities) grow, awareness and adoption of these capabilities will broaden.

User Advice: The idea of keeping a separate copy of business-critical systems and data on a media type resistant to change, placed in a secure location, can be traced back to the first tape cartridges stored in an off-site document vault. This is often referred to as an “air-gap” copy. Even though organizations have moved away from using tape as a deep archive, the need for secured, immutable storage has remained.

Immutable data vaults are storage environments or products intended to supplement existing backup infrastructure. They often encompass multiple vendor technologies combined into a solution that scans, cleanses and repairs data in order to be fully functional for operational recovery. Since immutable data vaults are not the first line of defense for disaster recovery, fast recovery speed may not be as necessary. However, close proximity to recovery compute capacity is often recommended. The proximity of compute capabilities differentiates immutable data vaults from data bunkers, which did not need compute capabilities for their required functionality.

Air-gapping of storage environments is one method of isolation, but other requirements for isolation also need to be evaluated. Correctly securing the storage device and any recovery catalogs or software configuration files is also important. Be mindful that the data stored within an immutable data vault may also contain the agent or infectious code, as well as infected or encrypted data. Only scanning, cleansing and repairing that data will prevent the reinfection of other systems during the recovery and restoration process.
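
Air-gap isolation is an architectural property rather than a single product feature, but WORM (write once, read many) retention on the vaulted copies is one common building block. The sketch below is an illustration under assumed names, not a description of any vendor listed here: it applies a compliance-mode retention lock to a backup object using AWS S3 Object Lock via boto3, and presumes the bucket was created with Object Lock enabled.

```python
# Illustrative sketch: store a backup copy under a retention lock so it cannot
# be altered or deleted before the lock expires. Bucket and key names are
# hypothetical; the bucket must have been created with Object Lock enabled.
from datetime import datetime, timedelta, timezone

import boto3

s3 = boto3.client("s3")
retain_until = datetime.now(timezone.utc) + timedelta(days=90)

with open("catalog.bak", "rb") as backup_file:
    s3.put_object(
        Bucket="example-vault-backups",
        Key="daily/2020-07-06/catalog.bak",
        Body=backup_file,
        ObjectLockMode="COMPLIANCE",            # cannot be shortened or removed
        ObjectLockRetainUntilDate=retain_until,
    )
```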

Business Impact: While recognized as significant to the financial services industry, business impact is measured differently based on an organization’s own risk tolerance assessment and its assurance of recovery. The actual impacts will vary based on regulatory changes or audit findings for industries, the desire of individual organizations to protect their critical data from potential destruction, and the increasing impact and changing behavior of malware, ransomware, and insider malicious activities.

Benefit Rating: High

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Continuity Software; EMC; IBM (Business Resiliency Services)

Recommended Reading: “Avoid Ransomware Disasters With a Better Backup and Recovery Strategy”

“Market Guide for IT Resilience Orchestration”

“Magic Quadrant for Disaster Recovery as a Service”

“Magic Quadrant for Data Center Backup and Recovery Solutions”

dHCI

Analysis By: Tony Harvey; Julia Palmer

Definition: Distributed HCI is a three-tier storage architecture using separate compute and storage nodes. Storage can be either scale-out external controller-based storage or dedicated SDS. dHCI offers the simplified management model of HCI while allowing compute and storage to be expanded independently.

Position and Adoption Speed Justification: External controller-based (ECB) and software-defined storage (SDS) vendors have introduced distributed HCI (dHCI) products to target customers who want the ease-of-use of hyperconverged systems, but also need asymmetric scaling of compute and storage. dHCI can also support bare metal workloads, and can provide more predictable latency and higher storage throughput than hyperconverged.

Datrium introduced the first dHCI solution (DVX) in 2016. Other vendors followed, including NetApp HCI in 2017 and HPE Nimble Storage dHCI in 2019. dHCI is gaining traction and enabling growth in integrated infrastructure systems (IIS) as more vendors enter the market.

User Advice: dHCI solutions offer many of the advantages of ECB storage solutions combined with the simplicity and VM level storage provisioning capabilities of hyperconverged solutions. I&O leaders should evaluate dHCI solutions when they have workloads that:

■ Require a mix of different server sizes and configurations.

■ Consume large amounts of storage capacity.

■ Have unbalanced compute and storage growth requirements.

■ Demand extremely high transaction rates or throughput.

■ Require predictable latency.

When considering implementing a dHCI solution, I&O leaders should:

■ Identify specific workloads or initiatives where a dHCI system would be suitable.

■ Implement jointly by server, storage, and virtualization teams, as skills and project alignment from all three is required.

■ Deploy first as a proof of concept to ensure performance, availability, automation, and ease-of-use expectations are met.

Business Impact: dHCI can provide increased agility and reduced service costs, as well as increase the likelihood of meeting service levels. I&O leaders who successfully implement a dHCI solution will view dHCI as a strategic investment that enables an automated, agile architecture that delivers the flexibility and scalability required for modern business.

Benefit Rating: Moderate

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Datrium; HPE; NetApp

Recommended Reading: “Market Share Analysis: Data Center Hardware Integrated Systems, Worldwide, 4Q19 Update”

“How I&O Leaders Should Leverage New dHCI Solutions”

Computational Storage

Analysis By: Jeff Vogel; Julia Palmer

Definition: Computational storage (CS) combines processing and storage media to allow applications to run on the storage media, offloading host processing from the main memory of the CPU. The aim is to reduce data movement by processing data directly within the storage device. CS involves more sophisticated processing capabilities located on the storage device. CS storage products employ greater processing power in the form of ASICs and low-power CPU cores on the SSD.

Position and Adoption Speed Justification: Computational storage brings computing power to storage to reduce performance inefficiencies and latency-sensitive issues in the movement of data between storage and compute resources. CS-based systems may include data compression, encryption and redundant array of independent disks (RAID) management. As data volumes increase, movement of data becomes a bottleneck. As storage sizes normally vastly exceed memory, the data has to be read in from the storage media. This impedes application performance, undermining real-time analysis for most datasets. The principle that storage is separate from processing remains a core tenet of most enterprise IT systems. Data-intensive applications such as AI/ML, high-performance computing, analytics, high-frequency trading and immersive and mixed-reality streaming stand to benefit the most by removing the bottleneck. Edge computing remains an opportunity, along with applications that favor distributed processing. Furthermore, putting processing on the storage media may provide substantial performance gains, better memory management and energy savings over storage systems comprising industry-standard solid-state media.
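
The data-movement argument can be made concrete with a simple back-of-the-envelope model; the sketch below is conceptual only, uses no vendor API, and its record counts and selectivity are assumptions. It compares the bytes that cross the storage interconnect when a selective filter runs on the host versus when it is pushed down to the drive.

```python
# Conceptual model of the saving computational storage targets: a predicate
# pushed down to the drive returns only matching records, instead of shipping
# the full dataset to host memory for filtering. All figures are assumptions.
RECORD_SIZE = 512             # bytes per record
TOTAL_RECORDS = 200_000_000   # roughly 100GB resident on the device
SELECTIVITY = 0.002           # fraction of records the query actually needs

host_filter_bytes = TOTAL_RECORDS * RECORD_SIZE
device_filter_bytes = int(TOTAL_RECORDS * SELECTIVITY) * RECORD_SIZE

print(f"Filter on host:   {host_filter_bytes / 1e9:,.1f} GB crosses the bus")
print(f"Filter on device: {device_filter_bytes / 1e9:,.1f} GB crosses the bus")
```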

User Advice: The CS market is still in the development phase, with a handful of vendors fielding proof-of-concept (POC), limited and small-volume production systems. Early use cases for CS are rapidly emerging, including machine learning processing, real-time data analytics, high-frequency trading, multimedia production, and high-performance computing. Applications that work across multiple CS nodes will perform best and offer the greatest benefits. CS system architecture is more complex: it may require applications to be recompiled, may require additional APIs, or may require the host system to be aware of the services that are provided by the CS system.

I&O leaders are advised to explore possible benefits that can be gained from specific use cases, but should carefully weigh the cost vs. performance gains. This is especially true where certain workloads are very input/output-bound and would benefit the most from processing in storage. The segment is led by small startups, so I&O leaders are advised to perform sufficient due diligence. Also, monitor ongoing Storage Networking Industry Association (SNIA) CS Technical Work Group (TWG) developments where standards and interoperability work is being actively pursued with over 20 companies participating.

Business Impact: There is both a cost and a time factor involved in shuffling terabytes of data around. CS can provide material performance benefits to data-intensive applications, especially in edge computing. Combined with its low power footprint, CS increases the performance-per-watt ratio, thereby decreasing power consumption costs for applications at the edge. CS provides a programmable platform for value-added storage services such as erasure coding and database analytics functions, substantially reducing application and server/compute costs. CS is complementary to container workloads. Building more powerful compute into solid-state media controllers will increase storage efficiencies and lower overall application costs, allowing the application to directly access the NAND flash chips inside the CS drives. A material increase in bandwidth between the application and the solid-state media device/controller is achieved by taking advantage of the common flash interconnect channel standard. However, the challenge facing computational storage vendors is how to broaden the target audience beyond a few highly advanced customers with the resources and capabilities to perform significant internal application development and testing with clear ROI advantages.

Benefit Rating: High

Market Penetration: Less than 1% of target audience

Maturity: Emerging

Sample Vendors: Eideticom; NETINT Technologies; NGD Systems; Nyriad; Samsung; ScaleFlux

Recommended Reading: “Prepare Your Storage and Data Management Strategy for the Impact of Artificial Intelligence Workloads”

“Market Share Analysis: Solid-State Drives, Worldwide, 2018”

Container Backup

Analysis By: Jerry Rozeman; Nik Simpson; Santhosh Rao

Definition: Container backup solutions primarily protect data in containers that use persistent storage, by managing the process of taking additional copies of that data with the integrated capability to restore that data in a consistent state. Container backup solutions are offered either as part of a traditional backup solution, as part of the storage solution or are delivered as a specialized solution.

Position and Adoption Speed Justification: Container backup solutions are an emerging technology that protects organizations against data loss in containerized environments. Containers require different backup procedures compared with traditional applications running in a virtual or physical machine, owing to their stateless architecture. Leading backup vendors are beginning to protect persistent data by integrating with snapshot capabilities provided by Container Storage Interface (CSI) plug-ins offered by storage vendors. Container backup is largely nascent; however, the adoption curve of containers in enterprise environments accelerates the need for data protection solutions in this space.
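
The CSI snapshot integration mentioned above can be illustrated with a minimal sketch; it assumes the official Kubernetes Python client, a CSI driver with a VolumeSnapshotClass installed, and hypothetical object names. A container backup product adds scheduling, cataloging, application consistency and restore workflows on top of this primitive.

```python
# Minimal sketch: request a CSI VolumeSnapshot of a PVC through the Kubernetes
# API. Class, claim and namespace names are hypothetical.
from kubernetes import client, config

config.load_kube_config()
crd_api = client.CustomObjectsApi()

snapshot = {
    "apiVersion": "snapshot.storage.k8s.io/v1",
    "kind": "VolumeSnapshot",
    "metadata": {"name": "orders-db-snap-20200706"},
    "spec": {
        "volumeSnapshotClassName": "csi-block-snapclass",   # assumed to exist
        "source": {"persistentVolumeClaimName": "orders-db-data"},
    },
}

crd_api.create_namespaced_custom_object(
    group="snapshot.storage.k8s.io",
    version="v1",
    namespace="prod",
    plural="volumesnapshots",
    body=snapshot,
)
```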

User Advice: Organizations adopting containers in their enterprise should evaluate the need for container backup based on application criticality. In addition, adopting container backup solutions is not only a technology choice; it also needs to align with the organization’s structure. Unlike with traditional infrastructure, container backup operations may be performed by the application engineering team. Therefore, a certain degree of coordination between the engineering team and the backup team may be required. Protecting containers also requires additional investments in backup infrastructure. Organizations should adopt a strategy for container backup, just as with every other data source in their enterprise. Organizations might need to adopt specialized container backup solutions first, as these lead in focus while broader enterprise vendor offerings mature over time.

Business Impact: Container backup solutions address the data loss risk associated with container environments. As the use of containers in enterprise production environments slowly starts to rise, the effort to protect that data should be in line with enterprise requirements, just as it is with all enterprise data. In addition, backing up containers in enterprise environments will further propel the adoption of containers, as it better aligns with enterprise data protection policies. However, container backup will be another data source to look after from a data protection governance perspective, increasing the load on the organization.

Benefit Rating: Moderate

Market Penetration: Less than 1% of target audience

Maturity: Emerging

Sample Vendors: Cohesity; Commvault; Dell EMC; IBM; Kasten; Portworx; Rubrik; Trilio

Management Software-Defined Storage

Analysis By: Julia Palmer; Chandra Mukhyala

Definition: Management software-defined storage (MSDS) coordinates the delivery of storage services to enable greater storage agility. It is deployed in combination with traditional storage, improving and enabling robust policy management, I/O optimization and automation functions to configure, manage, analyze and provision underlying storage resources. Products in the management SDS category enable abstraction, mobility, virtualization, storage resource management (SRM) and I/O optimization of storage resources to reduce cost and enable portability.

Position and Adoption Speed Justification: While MSDS products are not widely adopted by enterprise end users, they could revolutionize storage architectural approaches and storage consumption models over time. The concept of abstracting and separating physical or virtual storage services by splitting the control plane (action signals) regarding storage from the data plane (how data actually flows) is foundational to SDS. This is achieved largely through programmable interfaces (such as APIs), which are still evolving. MSDS requests will negotiate capabilities through software that, in turn, will translate those capabilities into storage services that meet a defined policy or SLA. Storage virtualization abstracts storage resources, which is also foundational to MSDS, whereas the concepts of policy-based automation and orchestration — possibly triggered and managed by applications and hypervisors — are key differentiators between simple virtualization and MSDS.

User Advice: MSDS targets end-user use cases where the ultimate goal is to improve or extend existing storage capabilities and reduce operating expenditure (opex). However, value propositions and leading use cases of MSDS are not clear, as the technology itself is fragmented by many subcategories. When looking at different products, identify and focus on use cases applicable to your enterprise, and investigate each product for its capabilities.

Implement proofs of concept (POCs) to determine a product’s suitability for broader deployments.

The top reasons for interest in MSDS, as gathered from interactions with Gartner clients, include:

■ Improving the management and agility of the heterogenous storage infrastructure through better programmability, interoperability, automation and orchestration

■ Hybrid and multicloud enablement

■ Storage virtualization and abstraction

■ Performance improvement by optimizing and aggregating storage I/O

■ Data analytics and search

■ Opex reductions via reducing the demands of administrators

■ Capital expenditure (capex) reductions via more efficient utilization of existing storage systems for life cycle management

Despite the promise of SDS, some storage point solutions have been rebranded as SDS to present a higher value proposition versus built-in storage features, and these need to be carefully examined for ROI benefits.

Business Impact: Management SDS’s ultimate value is to provide broad capability in the policy management and orchestration of many storage resources. While some management SDS products focus on enabling provisioning and automation of storage resources, more comprehensive solutions feature robust utilization and management of heterogeneous storage services. Such solutions allow mobility between different types of storage platforms on-premises, at the edge and in the public cloud. As a subset of MSDS, I/O optimization products can reduce storage response times, improve storage resource utilization and control costs by deferring major infrastructure upgrades. The benefits of MSDS are improved operational efficiency, achieved by unifying storage management practices and providing a common management layer across different storage technologies. The operational ROI of management SDS will depend on IT leaders’ ability to quantify the impact of improved ongoing data management, increased operational excellence and reduction of opex.

Benefit Rating: Moderate

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: DataCore Software; Dell EMC; Hammerspace; HubStor; IBM; Igneous; Komprise; Leonovus; Nodeum; Peer Software

Recommended Reading: “The Future of Software-Defined Storage in Data Center, Edge and Hybrid Cloud”

“Top Five Approaches to Hybrid Cloud Storage — An Analysis of Use Cases, Benefits and Limitations”

“Competitive Landscape: Infrastructure Software-Defined Storage”

“Market Guide for Hybrid Cloud Storage”

At the Peak

NVMe-oF

Analysis By: Julia Palmer; Joseph Unsworth; Joe Skorupa

Definition: Nonvolatile memory express over fabrics (NVMe-oF) is a network protocol that takes advantage of the parallel-access and low-latency features of NVMe PCIe devices. NVMe-oF enables tunneling the NVMe command set over additional transports beyond PCIe, carrying commands over various networked interfaces to remote subsystems across a data center network. The specification defines a common protocol interface and is designed to work with high-performance fabric technologies, including Fibre Channel, InfiniBand and Ethernet with RDMA (RoCEv2 or iWARP), or TCP.

Position and Adoption Speed Justification: NVMe is a storage protocol that is being used within solid-state arrays and servers. It takes advantage of the latest nonvolatile memory to address the needs of extreme-low-latency workloads and is now broadly deployed. However, NVMe-oF, which requires a storage network, is still emerging and developing at different rates depending on the network encapsulation method. Today, many NVMe-oF offerings that use fifth-generation and/or sixth-generation Fibre Channel (FC-NVMe) are available, but adoption of NVMe-oF within 25/50/100 Gigabit Ethernet is slower. In November 2018, the NVMe standards body ratified NVMe/TCP as a new transport mechanism. In the future, it’s likely that TCP/IP will evolve to be an important data center transport for NVMe-oF. The NVMe-oF protocol can take advantage of high-speed networks and will accelerate the adoption of next-generation storage architectures, such as disaggregated compute, scale-out software-defined storage and hyperconverged and composable infrastructures, bringing very low-latency application access to the mainstream enterprise. Unlike server-attached flash storage, shared accelerated NVMe and NVMe-oF can scale out to high capacity with high-availability features and be managed from a central location, serving dozens of compute clients. Most storage array vendors have already debuted at least one NVMe-oF-capable product, with nearly all vendors expected to do so during 2021.

User Advice: Buyers should clearly identify workloads where the scalability and performance of NVMe-based solutions and NVMe-oF justify the premium cost of an end-to-end NVMe deployment. There is a select variety of highly performant workloads that can utilize the technology, such as AI/ML, high-performance computing (HPC), in-memory databases or transaction processing. Next, identify appropriate potential storage platform, NIC/HBA and network fabric suppliers to verify that interoperability testing has been performed and that reference customers are available. Buyers that do not require the immediate performance gains, and want to avoid the associated costs, should instead investigate how simply and nondisruptively their existing products can migrate to NVMe-oF, to ensure investment protection for the future.

Most storage vendors already offer solid-state arrays with internal NVMe storage, and during the next 12 months an increasing number of infrastructure vendors will offer support of NVMe-oF connectivity to the compute hosts. HCI vendors will deliver NVMe storage in an integrated offering during the next 12 to 18 months, but customers need to verify the availability of NVMe-oF networks between HCI nodes to see significant performance improvement. Similarly, when customers require NVMe-oF storage networks that encompass switches, host bus adapters (HBAs), and OS kernel drivers, IT infrastructure modernization will be required.

This potential requirement to make major changes to data center storage networking and servers is slowing down the adoption of NVMe-oF solutions in mainstream enterprises. However, due to the better interoperability and availability of NVMe-oF over Fibre Channel within the next two years, I&O leaders implementing NVMe-oF will likely choose to deploy it within an existing Fibre Channel SAN infrastructure. Investment protection for customers with existing fifth-generation or sixth-generation FC SANs is compelling because customers can implement new, fast NVMe storage arrays and connect them via NVMe-oF to servers while using the same media. Therefore, old and new storage, network switches and host bus adaptors can run together in the same FC-based storage network (SAN), with SCSI and NVMe storage separated by zones, as long as compatible fifth- or sixth-generation FC equipment is used. Furthermore, as support for NVMe-oF expands and matures, I&O leaders will have an additional choice of deploying either NVMe-oF with RDMA (RoCEv2) or NVMe-oF over TCP/IP-based products to leverage the latest Ethernet deployments, thereby easing the transition and providing investment protection.
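
For orientation, the sketch below shows what attaching a host to an NVMe/TCP target looks like in practice, wrapping the standard Linux nvme-cli discovery and connect commands in Python. The target address and NQN are hypothetical placeholders; Fibre Channel and RDMA transports follow the same pattern with different transport arguments.

```python
# Illustrative only: attach a Linux host to an NVMe/TCP target with nvme-cli
# (assumes the nvme-tcp kernel module and nvme-cli are installed, and that the
# address and NQN below, which are placeholders, point at a real target).
import subprocess

TARGET_ADDR = "192.0.2.50"
TARGET_NQN = "nqn.2020-07.com.example:array01.subsys01"

# Ask the target's discovery controller which subsystems it exports.
subprocess.run(
    ["nvme", "discover", "-t", "tcp", "-a", TARGET_ADDR, "-s", "8009"],
    check=True,
)

# Connect to one subsystem; its namespaces then appear as /dev/nvmeXnY devices.
subprocess.run(
    ["nvme", "connect", "-t", "tcp", "-a", TARGET_ADDR, "-s", "4420",
     "-n", TARGET_NQN],
    check=True,
)

# Confirm the remote namespaces are now visible to the host.
subprocess.run(["nvme", "list"], check=True)
```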

Business Impact: Today, NVMe SSDs and NVMe-oF offerings can have a dramatic impact on business use cases where low-latency requirements are critical to the bottom line. Though requiring potential infrastructure enhancements, the clear benefits these technologies can provide will immediately attract high-performance computing customers who can quickly show a positive ROI. Designed for all low-latency workloads where performance is a business differentiator, NVMe SSDs with NVMe-oF will deliver architectures that extend and enhance the capabilities of modern general-purpose solid-state arrays. Most workloads will not need the multimillion-IOPS performance that these new technologies offer, but most customers are demanding the lower, consistent response times provided by NVMe-based systems.

Benefit Rating: High

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Dell EMC; Excelero; Hitachi Vantara; IBM; Kaminario; Lightbits; NetApp; Pavilion Data Systems; Pure Storage; StorCentric

Recommended Reading: “Top 10 Technologies That Will Drive the Future of Infrastructure and Operations”

“2019 Strategic Roadmap for Storage”

“Prepare Your Storage and Data Management Strategy for the Impact of Artificial Intelligence Workloads”

“Critical Capabilities for Solid-State Arrays”

File Analysis

Analysis By: Michael Hoeck

Definition: File analysis software analyzes, indexes, searches, tracks and reports on file metadata and file content. FA solutions are offered as both on-premises and SaaS options. FA software reports on detailed metadata and contextual information to enable better information governance, risk management and data management actions against unstructured data.

Position and Adoption Speed Justification: File analysis (FA) solutions assist organizations in managing the ever-expanding repository of unstructured “dark” data. This includes file shares, email databases, content services platforms, content collaboration platforms and cloud-based productivity platforms, such as Microsoft Office 365 and Google G Suite. The primary use cases for FA software for unstructured data environments include:

■ Organizational efficiency and cost optimization

■ Regulatory compliance

■ Risk mitigation

■ Text analytics

The desire to mitigate business risks (including security and privacy risks), identify sensitive data, optimize storage cost and implement information governance are key factors driving the adoption of FA software. The hype associated with the growing trend of privacy regulations such as GDPR and CCPA has greatly raised interest in and awareness of FA software. When exposed through the use of FA software, the potential value of contextually rich unstructured data is capturing the interest of data and analytics teams.

User Advice: Organizations should use FA software to better grasp the risk of their unstructured data footprint, including where it resides and who has access to it, and to expose another rich dataset for driving business decisions. Use it to clean up old file shares containing redundant, obsolete and trivial (ROT) data, which can be defensibly disposed of or relocated to optimize data infrastructure. Data visualization maps created by FA software can be presented to better identify the value and risk of the data. This, in turn, can enable IT, line of business (LOB) and compliance organizations to make better-informed decisions regarding classification, data governance, storage management and content migration. For example, apply FA to regulatory compliance projects, such as CCPA, GDPR or other privacy regulations, to readily identify sensitive personal data or key corporate intellectual property data.
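
At its simplest, the metadata sweep that FA products automate resembles the sketch below; it is a generic illustration rather than any vendor's implementation, and the share path and staleness threshold are assumptions. Commercial tools add content indexing, classification, permissions analysis and reporting.

```python
# Generic illustration: walk a file share and flag files untouched for several
# years as candidates for archiving or defensible disposal.
import time
from pathlib import Path

SHARE_ROOT = Path("/mnt/fileshare")    # hypothetical mount point
STALE_AFTER_DAYS = 3 * 365

cutoff = time.time() - STALE_AFTER_DAYS * 86400
stale_bytes = 0

for path in SHARE_ROOT.rglob("*"):
    if path.is_file():
        stats = path.stat()
        if stats.st_atime < cutoff and stats.st_mtime < cutoff:
            stale_bytes += stats.st_size
            print(f"stale: {path} ({stats.st_size} bytes)")

print(f"Total reclaimable: {stale_bytes / 1e9:.1f} GB")
```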

Business Impact: FA software reduces business risk and inefficiencies hidden in unstructured data sources often considered “dark” data. They improve management of information governance and operational efficiency practices by:

■ Eliminating or quarantining sensitive data

■ Identifying access permission issues and protecting intellectual property

■ Optimizing storage utilization by finding and eliminating redundant and outdated data

■ Feeding data into corporate retention initiatives through the utilization of standard and custom file attributes

FA software also assists in classifying valuable business data so it can be more easily found and leveraged, as well as supporting e-discovery, data migration and analytics.

Benefit Rating: Moderate

Market Penetration: 5% to 20% of target audience

Maturity: Adolescent

Sample Vendors: Active Navigation; Adlib; Condrey; Ground Labs; Index Engines; SailPoint; Stealthbits Technologies; Titus; Varonis; Veritas Technologies

Recommended Reading: “Market Guide for File Analysis Software”

“Top 5 Emerging Cost Optimization Opportunities for Storage and Data Protection”

“Beyond GDPR: Five Technologies to Borrow From Security to Operationalize Privacy”

Cloud Data Backup

Analysis By: Jerry Rozeman; Chandra Mukhyala; Michael Hoeck

Definition: Policy-based, cloud data backup tools back up and restore production data generated natively in the cloud. The data can be generated by SaaS applications (e.g., Microsoft Office 365 or Salesforce) or by infrastructure as a service (IaaS) compute services (e.g., Amazon Elastic Compute Cloud [Amazon EC2] instances). Backup copies can be stored in the same or a different cloud location, or on-premises in the data center, where restore/recovery options should be offered in terms of restore granularity and recovery location.

Position and Adoption Speed Justification: Backup of data generated natively in public cloud is an emerging requirement, because cloud providers focus on infrastructure high availability and disaster recovery, but are not responsible for application or user data loss. Most SaaS applications’ natively included data protection capabilities are not true backup, and they lack secure access control and consistent recovery points to recover from internal and external threats.

As Microsoft Office 365 (O365) gains more momentum, O365 backup capabilities have begun to emerge from mainstream backup vendors and small vendors. IaaS data backup, on the other hand, is a more nascent area that caters to organizations’ need to back up production data generated in the IaaS cloud. Native backup of IaaS has usually resorted to snapshots and scripting, which may lack application consistency, restore options, data mobility, storage efficiency and policy-based automation. However, more data center backup vendors now offer improved cloud storage backup capabilities that automate snapshot management and address some cloud-native limitations.
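
The “snapshots and scripting” approach referred to above typically looks like the sketch below, which uses the AWS SDK (boto3) to trigger an EBS snapshot. The volume ID is a hypothetical placeholder, the copy is only crash-consistent, and cataloging, retention enforcement and application consistency are exactly what a backup product adds on top.

```python
# Hedged sketch of snapshot scripting for IaaS volumes: start an EBS snapshot
# and tag it. The volume ID is a hypothetical placeholder; this gives a
# crash-consistent copy only, with no catalog or application quiescing.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="Nightly snapshot of orders-db data volume",
    TagSpecifications=[{
        "ResourceType": "snapshot",
        "Tags": [{"Key": "retention-days", "Value": "30"}],
    }],
)
print("Snapshot started:", response["SnapshotId"])
```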

User Advice: Before migrating critical on-premises applications to SaaS or IaaS, organizations need a thorough understanding of cloud-native backup and recovery capabilities and should compare them to their situations today. If the native capabilities seem to fall short (e.g., in application consistency, security requirements and recovery point objective [RPO]), factor additional backup costs into the total cost of ownership (TCO) calculation before migrating to the cloud. Organizations planning to use cloud-native recovery mechanisms should ensure that their contracts with cloud providers clearly specify the capabilities and costs associated with the following items in terms of native data protection:

■ Backup/restore methods — This describes how user data backup and restore are done, including any methods to prevent users from purging their own “backup copies” and to speed up recovery after a propagated attack, such as ransomware.

■ Backup/restore performance — Some users have observed poor recovery time objectives (RTOs) when restoring or recovering data from cloud object storage.

■ Retention period — This measures how long cloud providers can retain native backups free of charge or with additional cost.

■ Clear expectations in writing, if not service-level agreement (SLA) guarantees, regarding recovery time objectives — RTO measures how long it takes to restore at different granular levels, such as a file, a mailbox or an entire application.

■ Additional storage cost due to backup — Insist on concrete guidelines on how much storage IaaS’s native snapshots will consume, so that organizations can predict backup storage cost.

For third-party backup tools, focus on ease of cloud deployment, policy automation for easy management, data mobility, storage efficiency and flexible options in terms of backup/recovery granularity and location.

Business Impact: As more production workloads migrate to the cloud (in the form of SaaS or IaaS), it has become critical to protect data generated natively in the cloud. Deploying data protection for cloud-based workloads is an additional investment; however, this is often an afterthought, because it was not part of the business case. Without additional protection of cloud- based data, customers face additional risks, due to the impact of data loss, data corruption or ransomware attacks on their data.

SaaS and IaaS providers typically offer infrastructure resiliency and availability to protect their systems from site failures. However, when data is lost due to their infrastructure failure, the providers are not financially responsible for the value of lost data, and provide only limited credit for the period of downtime. When data is lost to user errors, software corruption or malicious attacks, user organizations are fully responsible themselves. The more critical cloud-generated data is, the more critical it is for users to provide recoverability of such data.

Benefit Rating: Moderate

Market Penetration: 5% to 20% of target audience

Maturity: Emerging

Sample Vendors: Actifio; Cohesity; Commvault; Dell EMC; Druva; Rubrik; Spanning Cloud Apps; Veeam; Veritas Technologies

Recommended Reading: “Adopt Microsoft Office 365 Backup for Damage Control and Fast Recovery After Malicious Attacks”

“Debunking the Myth of Using EFSS for Backup”

Sliding Into the Trough

Open-Source Storage

Analysis By: Julia Palmer; Arun Chandrasekaran

Definition: Open-source storage is a form of software-defined storage for which the source code is made available to the public through a free distribution license. Open-source storage supports many of the same features as proprietary storage, including support of structured and unstructured data, as well as heterogeneous management.

Position and Adoption Speed Justification: Although open-source storage (OSS) has been around for over a decade, it has been mainly adopted by hyperscalers, technical service providers and large organizations. Recent innovations in x86 hardware and flash, combined with an innovative open-source ecosystem, are making open-source storage attractive for cloud and big data workloads and as a potential alternative to proprietary storage. Cloud computing, microservices application architectures, big data analytics and information archiving push the capacity, pricing and performance frontiers of traditional scale-up storage architectures. This has led to a renewed interest in open-source software as a means to achieve high scalability in capacity and performance at lower acquisition costs.

Open-source platforms such as Kubernetes and TensorFlow are backed by large, innovative communities of developers and vendors, as are storage offerings from vendors such as Red Hat (Gluster Storage, Ceph Storage), SUSE (Ceph) and DDN (Lustre). Collectively, they provide enterprises with a broad selection of options to consider for use cases such as cloud storage, big data, stateful microservices workloads and archiving. There are also open-source storage projects aimed at container-based storage, such as MinIO, OpenEBS, Longhorn and Rook.
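
As a concrete example of the container-focused open-source option, the sketch below uses the MinIO Python SDK against a self-hosted endpoint to create a bucket and store an object; the endpoint, credentials and object names are hypothetical placeholders.

```python
# Minimal sketch: write a backup artifact to a self-hosted, S3-compatible
# MinIO deployment using its Python SDK. Endpoint and credentials are
# placeholders for illustration only.
from minio import Minio

mc = Minio(
    "minio.example.internal:9000",
    access_key="EXAMPLE_ACCESS_KEY",
    secret_key="EXAMPLE_SECRET_KEY",
    secure=False,                      # assume TLS is terminated elsewhere
)

if not mc.bucket_exists("dev-archive"):
    mc.make_bucket("dev-archive")

mc.fput_object("dev-archive", "db/orders-2020-07-06.dump", "/tmp/orders.dump")
```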

User Advice: Although open-source storage offers a less-expensive upfront alternative to proprietary storage, IT leaders need to weigh the benefits, risks and costs accurately. Some enterprise IT leaders often overestimate the benefits and underestimate the costs and risks. Conversely, with the emerging maturity of open-source storage solutions, enterprise IT buyers should not overlook the value proposition of these solutions. IT leaders should actively deploy pilot projects, identify internal champions, train storage teams and prepare the overall organization for this disruptive trend. Although source code can be downloaded for free, it is advisable to use a commercial distribution and to obtain support through a vendor, because OSS requires significant effort and expertise to install, maintain and support. IT leaders deploying “open core” or “freemium” storage products need to carefully evaluate any downsides of lock-in against the perceived benefits attained. The proprietary software version often comes in the form of add-on modules, retained features or management tools that function on top of OSS.

In most cases, open-source storage is not positioned as general-purpose storage. Therefore, choose use cases that leverage the strengths of open-source platforms — for example, batch processing or a low-cost archive for Hadoop and development/testing (dev/test) use cases for containers — and use them appropriately. It is important to focus on hardware design and choose cost-effective reference architectures that are certified by the vendors and for which support is delivered in an integrated manner. Overall, on-premises integration, management automation and customer support should be key priorities when selecting open-source storage solutions.

Business Impact: Open-source storage is playing an important role in enabling cost-effective, scalable platforms for new cloud and big data workloads. Today, Gartner clients are evaluating open-source storage across block, file and object protocols. Gartner is seeing adoption among technology firms’ service provider clients, as well as in research and academic environments. Big data, dev/test and private cloud use in enterprises are also promising use cases for open-source storage, where Gartner is witnessing keen interest. As data continues to grow at a frantic pace, open-source storage will enable customers to store and maintain data, particularly unstructured data, at a lower acquisition cost, with “good enough” availability, performance and manageability.

Benefit Rating: Moderate

Market Penetration: 5% to 20% of target audience

Maturity: Emerging

Sample Vendors: BeeGFS; Cloudera; DDN; iXsystems; MayaData; MinIO; OpenIO; Rancher; Red Hat; SUSE

Recommended Reading: “What Innovation Leaders Must Know About Open-Source Software”

“Peer Connect Perspectives: Developing an Open-Source Strategy”

“Use Service-Level Requirements to Drive Decisions Between Commercial and Self-Support for Open-Source Software”

“Four Steps to Adopt Open-Source Software as Part of the DevOps Toolchain”

Copy Data Management

Analysis By: Chandra Mukhyala; Michael Hoeck

Definition: Copy data management (CDM) products capture application-consistent copies of production data in application-native format to create live “golden images” in secondary storage systems. Virtual copies of the golden image can be mounted for instant recovery and other use cases. Because the golden image is in application-native format, any application that needs read-only access to a point-in-time copy of the production data can use the golden image on the secondary system instead of accessing the production storage.

Position and Adoption Speed Justification: CDM is not a widely understood concept, although it is more than seven years old and more backup/recovery and hyperconverged integrated systems (HCIS) vendors are promoting it. Adoption rates vary greatly among different vendors and products because of different functions and varying sales focus. Although DevOps teams often need test/development workflow automation, they don’t typically evaluate the same product with the backup team. In fact, the concept and value proposition of activating backup data for test/development is foreign to many organizations. The main challenge faced by CDM products is that they have to target different buying centers and decision makers. The lack of products from major vendors and inconsistent use of the term “CDM” impede greater adoption.

User Advice: Organizations face increased cost and productivity challenges, due to the management complexities of provisioning multiple application copies for test/development. CDM products should be evaluated to improve time to market and reduce storage waste. CDM could also be useful for organizations that are looking for active access to secondary data sources for reporting or analytics due to its separation from the production environment. Enterprises should also look at opportunities for database and application archiving for storage reduction or governance initiatives to further justify investment. Due to the short history of the new architecture and vendors, new use cases beyond the common ones (e.g., backup and test/development enablement) are not field-proven and should be approached with caution.

Business Impact: IT organizations have historically used different hardware and software products to deliver backup, archive, replication, test/development, legacy application archiving and other data-intensive services with little control or management across these services. This results in overinvestment in storage capacity, software licenses and operating expenditure (opex) costs associated with managing multiple solutions. CDM facilitates the use of one copy of data for many or all of these functions via virtual copies, dramatically reducing the need for multiple physical copies of data and enabling organizations to cut the costs associated with multiple disparate software licenses and storage islands.

The separation of the virtual “golden image” from the production environment can facilitate aggressive recovery point objectives (RPOs) and recovery time objectives (RTOs). In the case of test/development, CDM improves the workflow process and operational efficiency by giving database administrators and application developers more self-service capabilities.

Benefit Rating: High

Market Penetration: 5% to 20% of target audience

Maturity: Adolescent

Sample Vendors: Actifio; Catalogic Software; Cohesity; Delphix; Rubrik; Veritas Technologies

Recommended Reading: “Market Insight: Differentiate Your Data Protection Portfolio by Exposing Backup Data for Use Cases Beyond Data Restore”

Infrastructure SDS Analysis By: Julia Palmer; Chandra Mukhyala

Definition: Infrastructure software-defined storage (SDS) abstracts storage software from the underlying hardware providing common provisioning and data services across IT infrastructures regardless of locality and hardware technology. It can be deployed as a virtual machine, container or as storage software on a bare-metal industry standard server, allowing organizations to deploy a storage-as-software package on-premises, at the edge or in the public cloud. This creates a storage platform that can be accessed by file, block or object protocols.

Position and Adoption Speed Justification: Infrastructure SDS changes the delivery model and potentially the economics of enterprise storage infrastructures. Whether deployed independently, or as an element of a hyperconverged infrastructure, SDS alters how organizations buy and deploy enterprise storage. Following web-scale IT’s lead, I&O leaders are deploying SDS as hardware-agnostic storage, and breaking the bond from proprietary, external-controller-based (ECB) storage hardware. The power of multicore Intel x86 processors, the use of software-based RAID or erasure coding, the use of flash and high-throughput networking have essentially eliminated most hardware-associated differentiation, transferring the value to storage software. New infrastructure SDS vendors are targeting a broad range of delivery models and workloads, including backup, archiving, big data analytics, high-performance computing and AI, supporting structured and unstructured data for virtual machines, containers and bare-metal workloads. Calculating the total cost of ownership (TCO) benefits of SDS involves comprehensive analysis of savings from both capital expenditure and operating expenditure, including administration, verification, deployment, and ongoing management, maintenance and support, as well as a potential improvement in business agility.
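As a rough illustration of that capex-plus-opex comparison, the sketch below uses entirely hypothetical figures; the cost categories and numbers are placeholders to be replaced with real quotes and staffing estimates, not Gartner data or a vendor's pricing.

```python
# Hypothetical TCO comparison sketch for the capex + opex analysis described
# above. All figures are illustrative placeholders.

def five_year_tco(hardware, software_licenses, annual_admin, annual_maintenance, years=5):
    """Total cost of ownership: acquisition capex plus recurring opex over the period."""
    capex = hardware + software_licenses
    opex = (annual_admin + annual_maintenance) * years
    return capex + opex

ecb_array = five_year_tco(hardware=500_000, software_licenses=0,
                          annual_admin=60_000, annual_maintenance=90_000)
sds_on_x86 = five_year_tco(hardware=250_000, software_licenses=120_000,
                           annual_admin=80_000, annual_maintenance=40_000)

savings = 1 - sds_on_x86 / ecb_array
print(f"Illustrative 5-year TCO savings from SDS: {savings:.0%}")
```

The useful output is not the specific percentage but the structure: SDS shifts spend from proprietary hardware and maintenance toward software licensing and in-house administration, so the comparison only holds if both columns are filled in honestly.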

User Advice: Infrastructure SDS is the delivery of data services and storage-array functionality on top of industry standard hardware. Enterprises choose a software-defined approach when they wish to accomplish some or all of the following goals:

■ Build a storage solution at a low acquisition price point on a commodity x86 platform.

■ Decouple storage software and hardware to standardize their data center hardware platforms.

■ Establish a scalable solution specifically geared toward modern workloads.

■ Build an agile, “infrastructure as code” architecture, enabling storage to be a part of a software-defined data center automation and orchestration framework that expands to the public cloud.

■ Take advantage of the latest innovations in storage hardware before they are supported in traditional ECB storage arrays.

Advice to end users:

■ Recognize that infrastructure SDS remains a nascent, but growing, deployment model that is primarily focused on web-scale deployment agility, but also has applicability in edge and public cloud deployments.

■ Identify use cases for SDS deployment by mapping IT strategy with the specific SDS solution business advantage framework.

■ Select SDS vendors that provide support for multiple deployment options, and offer tight hardware reference designs and flexible pricing models.

■ Deploy SDS as part of a cohesive software-defined infrastructure (SDI) design, with an emphasis on delivering uniform storage platforms across on-premises, public cloud and edge environments.

■ Recognize that SDS may involve substantial work in sizing the underlying hardware and building the total solution on your own versus a plug-and-play appliance.

■ Grade SDS products by their ability to be truly hardware-agnostic, API-driven, based on distributed architecture, and capable of supporting edge, core or public cloud deployments.

Business Impact: Infrastructure SDS is a hardware-agnostic platform. It breaks the dependency on proprietary storage hardware and lowers acquisition costs by utilizing the industry standard x86 server platform of the customer’s choice. Some Gartner customers report up to 40% TCO reduction with infrastructure SDS that comes from the use of x86 industry standard hardware and lower cost upgrades and maintenance. However, the real value of infrastructure SDS in the long term is increased flexibility and the ability to have common provisioning and data services tools regardless of data locality. I&O leaders that successfully deployed and benefited from infrastructure SDS have usually belonged to large enterprises or cloud service providers that pursued web-scale-like efficiency, flexibility and scalability, and viewed SDS as a critical enablement technology for their IT initiatives. I&O leaders should look at infrastructure SDS not as another storage product but as an investment in improving storage economics and providing data mobility including hybrid cloud storage integration.

Benefit Rating: Transformational

Market Penetration: 5% to 20% of target audience

Maturity: Adolescent

Sample Vendors: DataCore Software; IBM; NetApp; Nutanix; Red Hat; Scality; StorMagic; SUSE; VMware; WekaIO

Recommended Reading: “The Future of Software-Defined Storage in Data Center, Edge and Hybrid Cloud”

“Top Five Approaches to Hybrid Cloud Storage — An Analysis of Use Cases, Benefits and Limitations”

“Market Guide for Hybrid Cloud Storage”

“An I&O Leader’s Guide to Storage for Containerized Workloads”

“Competitive Landscape: Infrastructure Software-Defined Storage”

“Software-Defined Storage Enables Versatile Hybrid-Cloud Architectures”

Object Storage Analysis By: Raj Bala; Chandra Mukhyala

Definition: Object storage refers to a system that houses data in structures called “objects,” and serves hosts via APIs such as Amazon Simple Storage Service (Amazon S3). Conceptually, objects are similar to files, in that they are composed of content and metadata. Object storage uses a flat namespace, compared with treelike structures seen in file systems. Object storage products are available to be deployed as virtual appliances, managed hosting, purpose-built hardware appliances or storage software that can be installed on bare-metal servers.

Position and Adoption Speed Justification: The market for on-premises, deployed object storage platforms is not growing rapidly, particularly when compared with adjacent storage segments, such as hyperconverged integrated system (HCIS) and solid-state arrays (SSAs). However, the market for object storage is growing, albeit slowly, as enterprises seek petabyte-scale storage infrastructures at a lower total cost of ownership (TCO).

Hybrid cloud storage capabilities from emerging vendors and refreshed products from large storage portfolio vendors are expected to further stimulate adoption from end users, as mainstream enterprises seek seamless interaction between on-premises and public cloud infrastructure. Although cost containment of traditional storage area network/network-attached storage (SAN/NAS) infrastructure continues to be the key driver for object storage adoption, cloud- native use cases in industries such as media and entertainment, life sciences, the public sector and education/research, are spawning new investments.

User Advice: IT leaders that require highly scalable, self-healing and cost-effective storage platforms for unstructured data should evaluate the suitability of object storage products, but not when the primary use case requires processing or editing of file-based data. Most object storage vendors offer lackluster implementations of file protocols as the engineering effort is substantial. The common use cases that Gartner sees for object storage are archiving, content distribution, analytics and backup. When building on-premises object storage repositories, customers should evaluate the product’s API support for dominant public cloud providers, so that they can extend their workloads to a public cloud, if needed.

Amazon’s S3 has emerged as the dominant API over vendor-specific APIs and OpenStack Swift, which is in precipitous decline. Select object storage vendors that offer a wide choice of deployment (software-only versus packaged appliances versus managed hosting) and licensing models (perpetual versus subscription) that can provide flexibility and reduce TCO. These products are capable of huge scale in capacity and are better suited to high-bandwidth workloads than to transactional workloads that demand high input/output operations per second (IOPS) and low latency.
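As a quick way to verify S3 API compatibility during a proof of concept, a sketch like the following (using the boto3 library) can be pointed at an on-premises S3-compatible endpoint. The endpoint URL, credentials and bucket name below are placeholders, not real values.

```python
# Minimal sketch of exercising the S3 API against an S3-compatible on-premises
# object store (endpoint URL, credentials and bucket name are placeholders).
# Requires the boto3 package.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.internal",  # hypothetical on-prem endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Objects are flat key/value pairs: content plus metadata, no directory tree.
s3.put_object(Bucket="backups", Key="2020/07/db-dump.gz",
              Body=b"...", Metadata={"retention": "7y"})

obj = s3.get_object(Bucket="backups", Key="2020/07/db-dump.gz")
print(obj["ContentLength"], obj["Metadata"])
```

If the same script runs unchanged against both the on-premises product and a public cloud bucket (with only the endpoint and credentials swapped), that is a reasonable first signal that workloads can later be extended to the public cloud.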

Business Impact: Rapid growth in unstructured data (40% year over year) and the need to store and retrieve it in a cost-effective, automated manner will drive the growth of object storage. Enterprises often deploy object storage on-premises when looking to provide a private cloud infrastructure as a service (IaaS) experience in their own data centers. Object storage is well-suited to multitenant environments and requires no lengthy provisioning for new applications. There is growing interest in object storage from enterprise developers and DevOps team members looking for agile and programmable infrastructures that can be extended to the public cloud. Object storage software, deployed on commodity hardware, is emerging as a threat to external controller-based (ECB) storage hardware vendors in big data environments with heavy volume challenges.

Benefit Rating: High

Market Penetration: 5% to 20% of target audience

Maturity: Adolescent

Sample Vendors: Caringo; Cloudian; DDN; Dell EMC; Hitachi Vantara; IBM; MinIO; NetApp; Red Hat; Scality

Recommended Reading: “Magic Quadrant for Distributed File Systems and Object Storage”

“Critical Capabilities for Object Storage”

Persistent Memory DIMMs Analysis By: Alan Priestley

Definition: Persistent memory dual in-line memory modules (PM-DIMMs) are nonvolatile DIMMs that reside on the double data rate (DDR) DRAM memory channel but, unlike DRAM, are able to retain memory contents through a power failure. PM-DIMMs are also referred to as solid-state DIMMs. These devices integrate nonvolatile memory (either NAND flash or 3D XPoint) and a system controller chip.

Position and Adoption Speed Justification: DIMMs connect directly to a dedicated memory channel rather than a storage channel and do not face the data transfer bottlenecks of a traditional storage system. As a result, PM-DIMMs can achieve drastically lower latencies (at least 50% lower) than any existing solid-state storage solution and can be viable alternatives to DRAM memory, if the slower access speeds and reliability are acceptable.

The market’s adoption of PM-DIMMs has been hampered by the slow write performance and limited endurance of existing flash memory technologies. However, the introduction of the 3D XPoint nonvolatile memory technology, from Intel and Micron, brought substantial performance and reliability gains over traditional flash memory. Intel’s introduction of its Optane DC persistent memory DIMMs enables the use of PM-DIMM technology in data center servers (albeit those based on the latest generations of Intel’s Xeon Scalable Processors).

Use of any PM-DIMM requires some or all of the following: support by the host chipset, software optimization of the OS and applications, and optimization of the server hardware. Systems deploying PM-DIMMs will also require the installation of standard DRAM-based DIMMs to complement the PM-DIMMs. This will be necessary to provide the operating system and applications with an area of memory capable of sustaining frequent high-speed write accesses. To achieve greater adoption, support will be required across a wide range of server vendors, operating systems and applications. In addition, use cases for persistent memory technologies will need to spread beyond the extremely high-performance, high-bandwidth and ultra-low-latency applications for which they are attracting most interest today.
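For illustration of what that application-level optimization looks like, the sketch below shows one common "app direct" access pattern: memory-mapping a file on a DAX-mounted persistent memory filesystem. The mount path is an assumption, and production code would typically use a persistent memory library such as PMDK rather than raw mmap.

```python
# Sketch of "app direct" access to persistent memory exposed by the OS as a
# DAX-mounted filesystem (the mount point below is an assumption; production
# code would normally use a library such as PMDK rather than raw mmap).
import mmap
import os

PMEM_FILE = "/mnt/pmem0/appdata"        # hypothetical DAX-mounted pmem namespace
SIZE = 64 * 1024 * 1024                 # 64 MiB region

fd = os.open(PMEM_FILE, os.O_CREAT | os.O_RDWR)
os.ftruncate(fd, SIZE)

# Loads and stores go through the memory map rather than the block I/O stack;
# contents survive a power failure once flushed to the persistent media.
buf = mmap.mmap(fd, SIZE)
buf[0:5] = b"hello"
buf.flush()                             # ensure the stores reach persistence
buf.close()
os.close(fd)
```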

This technology has faced a number of challenges and has not yet reached maturity. Memory technologies such as 3D XPoint will replace the use of NAND flash on PM-DIMMs; however, these devices have only recently become commercially available. The benefit of 3D-XPoint-based PM-DIMMs is that they can achieve higher capacity than current DRAM DIMMs at a lower $/GB. While Intel has added support for this technology in its 2nd Gen Xeon Scalable Processor (SP), utilizing the persistence and exploiting the performance requires ecosystem support, especially from software vendors. For this reason, this technology is currently in the trough.

User Advice: IT professionals should analyze their workloads to determine software vendor support for PM-DIMMs and the performance and memory capacity demands. Major software vendors, such as SAP and Oracle, have recently implemented support for PM-DIMMs, and Google has also announced its intent to leverage PM-DIMMs in its cloud services. Since this technology is still nascent, users must assess the roadmaps of the major server and storage OEMs along with those of the SSD appliance vendors that will be launching DIMM-based storage systems. When evaluating PM-DIMM deployments, consideration should also be given to TCO comparisons between nonvolatile memory and DRAM, especially with the dynamic pricing fluctuations in the DRAM market.

In their evaluations, IT professionals should be aware that specific versions of servers and their firmware, applications, operating systems and drivers will be required to support PM-DIMMs. In addition, Intel’s 3D-XPoint-based PM-DIMMs are currently compatible only with servers that deploy the second-generation or later Xeon SPs.

Business Impact: This technology will improve overall system performance. The specific workloads expected to see early adoption are in the in-memory computing, virtualization, analytics, AI and HPC segments. There may also be an impact on traditional storage subsystems as applications are rearchitected to take advantage of large amounts of nonvolatile memory accessible as part of the main server system memory.

Benefit Rating: Moderate

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Dell EMC; Formulus Black; Google; Hewlett Packard Enterprise; Huawei; Intel; Lenovo; NetApp; Oracle; SAP

Recommended Reading: “Determining the Data Center Opportunity Created for 3D XPoint Persistent Memory”

“Top 10 Technologies That Will Drive the Future of Infrastructure and Operations”

“Predicts 2020: Semiconductor Technology in 2030”

“Vendor Rating: Intel”

“Forecast Analysis: NAND Flash, Worldwide, 1Q20 Update”

Storage-Class Memory SSDs Analysis By: Alan Priestley; Joseph Unsworth

Definition: Storage-class memory (SCM) solid-state drives (SSDs) are a new class of memory technology providing nonvolatile memory (byte- or block-addressable) with access speeds close to that of traditional DRAM-based memory modules. SCM refers to the application of emerging memory technology for storage applications, which differs from persistent memory that interfaces with the main DDR memory channels in a compute environment.

Position and Adoption Speed Justification: Increasing dataset sizes are driving demand for large-density memory systems. While DRAM density increases on a regular cadence, its cost per bit remains more than 10 times that of current nonvolatile memory technologies (such as NAND flash), and it cannot retain its contents without supercapacitors or battery backup.

While the introduction of SCM SSDs to storage arrays was announced by leading vendors over three years ago, it was only during 2H19 that SCM started to become generally available to customers. SCM has manifested itself in storage environments in three ways: as a small cache directly on the SSD, as cache in the storage array, and as its own tier of storage. All of these approaches complement existing flash/SSD technology, and while a 100% SCM SSD array is technically feasible today, it would be prohibitively expensive for all but the most extreme performance workloads.
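The caching approach can be pictured with the following toy model, which keeps a small set of recently used blocks in a fast SCM tier in front of a larger QLC capacity tier. It is a conceptual sketch only, not a representation of any array's caching logic.

```python
# Toy model of SCM used as a small, fast cache in front of a larger QLC flash
# tier (an illustration of the tiering described above, not a product design).
from collections import OrderedDict

class ScmCachedStore:
    def __init__(self, scm_capacity_blocks):
        self.scm = OrderedDict()      # small, very low latency tier (SCM)
        self.qlc = {}                 # large, cheaper, slower tier (QLC flash)
        self.capacity = scm_capacity_blocks

    def write(self, block_id, data):
        self.qlc[block_id] = data     # write-through to the capacity tier
        self._promote(block_id, data)

    def read(self, block_id):
        if block_id in self.scm:      # SCM hit: lowest latency path
            self.scm.move_to_end(block_id)
            return self.scm[block_id]
        data = self.qlc[block_id]     # miss: read from QLC and promote
        self._promote(block_id, data)
        return data

    def _promote(self, block_id, data):
        self.scm[block_id] = data
        self.scm.move_to_end(block_id)
        if len(self.scm) > self.capacity:
            self.scm.popitem(last=False)   # evict the least recently used block

store = ScmCachedStore(scm_capacity_blocks=2)
store.write(1, b"a"); store.write(2, b"b"); store.write(3, b"c")
print(store.read(3))   # served from SCM; block 1 now lives only on the QLC tier
```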

To date, SCM-based SSDs all use NVMe over PCIe for maximum throughput and low latency. Currently, the underlying technology has typically been Intel’s 3D XPoint, but STT-MRAM (from Everspin) has also been used, mostly as a low-density SSD cache. A new category of high-performance NAND flash, called fast NAND, has emerged to rival emerging memory technologies. Samsung has its Z-NAND technology and Toshiba has its XL-FLASH NAND technology, a modified SLC NAND flash memory that achieves greater performance and reliability, albeit at substantially higher cost than conventional NAND technologies.

User Advice: Given the superior attributes of SCM, the technology can provide sustained, consistently low latency and high bandwidth for extreme performance workloads where persistence and high availability are critical. SCM is also likely to be used with flash-based technology, such as QLC, where it can be used to complement the less performant and less reliable QLC flash technology to enhance overall system performance. I&O leaders must understand the application workload requirements and the return on investment with SCM-based solutions in order to justify the cost/performance premiums. Users must also assess the potential SCM performance impact of disaster recovery (DR) sync replication, which may negate SCM benefits. Furthermore, I&O leaders must have a full application infrastructure view before deploying the technology to ensure that there are no infrastructure and application bottlenecks.

Business Impact: SCM will be compelling for the very high performance application workloads such as big data analytics for real-time analytic processing, interactive and immersive applications and in-memory databases, AI/ML workloads and other workloads where performance premiums can be justified. However, these workloads will also see benefit from DIMM-based persistent memory installed in the processors’ main memory array, which may offer a compelling alternative to SCM for many workloads. Early adoption will be in key verticals such as finance, government, natural resources, biomedical and other select verticals where extreme performance is essential.

Benefit Rating: High

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Hewlett Packard Enterprise; IBM; Intel; Micron; NetApp; Pure Storage; VAST Data

Recommended Reading: “2019 Strategic Roadmap for Storage”

“Critical Capabilities for Solid-State Arrays”

“Market Guide for Hybrid Cloud Storage”

Distributed File Systems Analysis By: Julia Palmer; Chandra Mukhyala

Definition: Distributed file system storage uses a single parallel file system to cluster multiple storage nodes together, presenting a single namespace and storage pool to provide aggregated bandwidth for multiple hosts in parallel. Data is distributed over multiple nodes in the cluster to handle availability and data protection in a self-healing manner. A distributed file system can be expanded by adding new nodes, increasing cluster capacity and throughput in a linear manner.
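The aggregated-bandwidth idea can be illustrated with the toy sketch below, which splits a file into chunks and spreads them round-robin across storage nodes. The node names and chunk size are arbitrary, and real distributed file systems use far more sophisticated placement and protection schemes.

```python
# Toy illustration of the parallelism behind a distributed file system: a file
# is split into chunks that are spread round-robin across storage nodes, so
# reads can be served by many nodes at once (a sketch, not any vendor's layout).

def chunk(data, chunk_size):
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def place_chunks(chunks, nodes):
    """Return a placement map: chunk index -> node name (round-robin)."""
    return {i: nodes[i % len(nodes)] for i in range(len(chunks))}

nodes = ["node-a", "node-b", "node-c", "node-d"]
data = b"x" * 1_000_000
chunks = chunk(data, chunk_size=256 * 1024)
placement = place_chunks(chunks, nodes)

# Each node holds roughly 1/len(nodes) of the file, so aggregate read bandwidth
# scales with the number of nodes serving the client in parallel.
print({n: sum(1 for node in placement.values() if node == n) for n in nodes})
```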

Position and Adoption Speed Justification: Accelerated growth of existing unstructured datasets, as well as the introduction of new file-based workloads, is bringing distributed scale-out storage architectures to the forefront of IT infrastructure planning. Storage vendors are continuing to develop distributed file systems to address performance and scalability limitations in traditional, scale-up, network-attached storage (NAS) environments. This makes them suitable for batch and interactive processing, and other high-bandwidth workloads. Apart from academic high-performance computing (HPC) environments, vertical industries (such as oil and gas, financial services, media and entertainment, life sciences, etc.) are leading adopters of distributed file systems for applications that require highly scalable file bandwidth and capacity.

Beyond the HPC use case, rich media streaming, analytics, content distribution, collaboration, backup and archiving are other common use cases for distributed file systems. Built on a distributed metadata architecture, distributed file systems provide resilience at the software layer, and do not require proprietary high availability hardware. IT leaders are also considering distributed file systems to enable interoperability between on-premises and public cloud IaaS storage. This will enable new use cases that leverage public cloud computing and share application data across edge, core and cloud deployments. Some products in this market continue to be tied to the legacy roots from which they began. Moreover, they also carry the extraordinary burdens of complicated management and deployment overhead, while more modern product offerings focus on improved manageability and ease of use. Vendors are also increasingly starting to offer software-based deployment options in a capacity-based perpetual licensing model, or with subscription-based licensing, to stimulate market adoption.

User Advice: Distributed file systems have been around for decades, although vendor maturity varies widely. Users who need file-based products that enable them to pay as they grow in a highly dynamic environment, or who need high bandwidth or low latency for shared storage, should put distributed file systems on their shortlist. Most commercial and open-source products specialize in tackling specific use cases, but integration with application workflows may be lacking in several products. Select distributed file system storage products based on their interoperability with the ISV solutions that are dominant in your environment. Validate all performance claims with proof-of-concept deployments, given that performance varies greatly by protocol type and file sizes. Prioritize products with software-defined storage capabilities, versus file systems requiring an ECB array as the underlying storage. This software-defined approach will enable you to extend distributed file systems to public cloud and edge deployments. Shortlist vendors with the ability to run natively in the public cloud and that enable hybrid cloud storage deployments with bidirectional tiering, as this emerging paradigm is experiencing positive, early traction with enterprises.

Business Impact: Distributed file systems are based on scale-out architecture alternatives to traditional scale-up NAS architectures. Unlike NAS, they scale storage bandwidth and capacity more linearly, surpassing expensive monolithic frame storage arrays in this capability. The business impact of distributed file systems is most pronounced in environments in which applications or devices generate large amounts of unstructured data, and where the primary access is through file protocols. However, they will also have an increasing impact on traditional data centers that want to overcome the limitations of dual-controller NAS storage designs, as well as for use cases such as backup and archiving. Many of the file system products are being deployed as software-only products on top of industry-standard x86 server hardware, which has the potential for lower TCO compared with storage arrays. Many distributed file systems will have a significant impact on private cloud services, which require a highly scalable, resilient and elastic infrastructure. IT professionals keen to consolidate file servers or NAS file sprawl should consider using distributed file system storage products that offer operational simplicity and nearly linear scalability.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience

Maturity: Early mainstream

Sample Vendors: Cohesity; Dell EMC; Huawei; IBM; Inspur; Pure Storage; Qumulo; Red Hat; VAST Data; WekaIO

Recommended Reading: “Critical Capabilities for Distributed File Systems”

“Magic Quadrant for Distributed File Systems and Object Storage”

“The Future of Software-Defined Storage in Data Center, Edge and Hybrid Cloud”

“Prepare Your Storage and Data Management Strategy for the Impact of Artificial Intelligence Workloads”

Hyperconvergence

Analysis By: Philip Dawson; Jeffrey Hewitt

Definition: Hyperconvergence is scale-out software-integrated infrastructure designed for IT leaders seeking operational simplification. Hyperconvergence provides a building block approach to compute, network and storage on standard hardware under unified management. Hyperconvergence vendors build appliances using off-the-shelf infrastructure, engage with system vendors that package software as an appliance, or sell software for use in a reference architecture or certified server. Hyperconvergence may also be delivered as a service or in a public cloud.

Position and Adoption Speed Justification: Hyperconvergence solutions are maturing and adoption is increasing as organizations seek management simplicity. VMware vSAN utilization among VMware ESXi customers, and Storage Spaces Direct utilization among Microsoft Windows Server 2016 and 2019 Datacenter edition customers are on the rise. Nutanix, an early innovator in HCIS appliances, has largely shifted to a software revenue model and continues to increase the number of OEM relationships. Hyperconvergence vendors are achieving certification for more- demanding workloads, including Oracle and SAP, and end users are beginning to consider hyperconvergence as an alternative to integrated infrastructure systems for some workloads. Meanwhile, suppliers are expanding hybrid cloud deployment offerings. Larger clusters are now in use, and midsize organizations are beginning to consider hyperconvergence as the preferred alternative for on-premises infrastructure for block storage. Meanwhile, a growing number of hyperconvergence suppliers are delivering scale-down solutions to address the needs of remote office/branch office (ROBO) and edge environments typically addressed by niche vendors.

User Advice: IT leaders should implement hyperconvergence when agility, modular growth and management simplicity are of greatest importance. The acquisition cost of hyperconvergence may be higher and the resource utilization rate lower than for three-tier architectures, but management efficiency is often superior. Hyperconvergence requires alignment of compute and storage refresh cycles; consolidation of budgets, operations and capacity planning roles; and retraining for organizations still operating separate silos of compute, storage and networking. Adopt for mission-critical workloads only after developing knowledge with lower-risk deployments, such as test and development. Workload-specific proofs of concept are an important step in meeting the performance needs of applications.

Consider the impact on DR and networking. Test under a variety of failure scenarios, as solutions vary greatly in performance under failure, their time to return to a fully protected state and the number of failures they can tolerate. Consider nonappliance options to enable scale-down optimization of resources for high-volume edge deployments. In product evaluations, consider the ability to independently scale storage and compute, retraining costs, and the ability to avoid additional operating system, application, database software and hypervisor license costs. In large deployments, plan for centralized management of multiple smaller clusters. For data center deployments, ensure that clusters are sufficiently large to meet performance and availability requirements during single and double node failures. While servers are perceived as commodities, they differ greatly in terms of power, cooling and floor space requirements, and performance, so evaluate hyperconvergence software on a variety of hardware platforms for the lowest total cost of ownership and best performance.

Business Impact: The business impact of hyperconvergence is greatest in dynamic organizations with short business planning cycles and long IT planning cycles. Hyperconvergence enables IT leaders to be responsive to new business requirements in a modular, small-increment fashion, avoiding the big-increment upgrades typically found in three-tier infrastructure architectures. Hyperconvergence provides simplified management that decreases the pressure to hire hard-to-find specialists. It will, over time, lead to lower operating costs, especially as hyperconvergence supports a greater share of the compute and storage requirements of the data center. For large organizations, hyperconverged deployments will remain another silo to manage. Hyperconvergence is of particular value to midsize enterprises that can standardize on hyperconvergence, and to the remote sites of large organizations that need cloudlike management efficiency with on-premises edge infrastructure. As more vendors support public cloud deployments, hyperconvergence will also be a stepping stone toward public cloud agility.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience

Maturity: Mature mainstream

Sample Vendors: Cisco Systems; Dell Technologies; Hewlett Packard Enterprise (HPE); Huawei; Microsoft; Nutanix; Pivot3; Red Hat; Scale Computing; VMware

Recommended Reading: “Magic Quadrant for Hyperconverged Infrastructure”

“Critical Capabilities for Hyperconverged Infrastructure”

“Toolkit: Sample RFP for Hyperconverged Infrastructure”

“The Road to Intelligent Infrastructure and Beyond”

“Use Hyperconverged Infrastructure to Free Staff for Public Cloud Management”

Climbing the Slope

Integrated Backup Appliances Analysis By: Chandra Mukhyala; Michael Hoeck

Definition: An integrated backup appliance is an all-in-one backup software and hardware solution that combines the functions of a backup application server, media server (if applicable) and backup target device. The appliance is typically preconfigured and fine-tuned to cater to the capabilities of the onboard backup software. It is a more simplified and easier-to-deploy backup solution than the traditional approach of separate software and hardware installations, but lacks flexibility on hardware choices and, in some cases, scalability.

Position and Adoption Speed Justification: Integrated backup appliances have been around for many years without much fanfare. The current hype is driven by existing large backup software vendors that have started packaging their software in an appliance, and by innovative emerging vendors offering all-in-one solutions. The momentum of integrated backup appliances is driven by the desire to simplify the setup and management of the backup infrastructure, because complexity is a leading challenge when it comes to backup management. Overall, integrated backup appliances have resonated well with many small and midsize enterprise customers that are attracted by the one-stop-shop support experience and tight integration between software and hardware. Vendors delivering appliances using the scale-out approach remove some of the scalability concerns and target midsize to large enterprise customers.

Within the integrated backup appliance market, the former clear segmentation by backup repository limitations has vanished, with all vendors adding cloud target or tiering capabilities.

There are two types of integrated backup appliances:

■ Stand-alone backup appliances: These appliances come with a fixed capacity and performance, and some vendors allow adding additional disk storage to increase capacity. Performance can be increased by upgrading CPU or RAM but cannot be scaled beyond the performance of a single appliance. Most appliances allow tiering to a public cloud, which can provide additional capacity outside the appliance.

■ Scale-out backup appliances: Scale-out appliances use a distributed architecture to scale both backup performance and capacity. These systems typically start as a four-node cluster and can be expanded by adding nodes to increase both performance and capacity. Some vendors allow running additional applications, such as analytics and security scanners, including third-party applications that leverage the underlying backup data.

User Advice: We recommend:

■ Organizations should first evaluate backup software functions to ensure that their business requirements are met, before deciding about acquiring an integrated backup appliance or a software-only solution.

■ Once a specific backup software product is chosen, deploying an appliance with that software will simplify operational processes and address any compatibility issues and functionality gaps between backup software-only products and deduplication backup target appliances.

■ Customers should keep in mind that integrated appliances can also act as a lock-in for the duration of the useful life of the hardware.

■ If customers prefer deploying backup software-only products to gain hardware flexibility, they should carefully consider which back-end storage to choose — be it a generic disk array/network-attached storage (NAS) or deduplication backup target appliances.

Business Impact: Integrated backup appliances ride the current trend of converged infrastructure and offer tight integration between software and hardware, simplify the initial purchase and configuration process, and provide the one-vendor support experience with no finger-pointing risks.

On the downside, an integrated backup appliance tends to lack the flexibility and heterogeneous hardware support offered by backup software-only solutions, which is often needed by large, complex environments.

Benefit Rating: Moderate

Market Penetration: 20% to 50% of target audience

Maturity: Early mainstream

Sample Vendors: Arcserve; Barracuda; Cohesity; Commvault; Dell EMC; Rubrik; Unitrends; Veritas Technologies

Recommended Reading: “Magic Quadrant for Data Center Backup and Recovery Solutions”

“Critical Capabilities for Data Center Backup and Recovery Solutions”

Erasure Coding Analysis By: Chandra Mukhyala; Raj Bala

Definition: Erasure coding is used to rebuild data lost to disk or other hardware failures. Erasure coding splits data blocks into “n” smaller chunks and then adds “k” chunks of encoded data, in such a way that the original data can be reconstructed from any n of the resulting “n+k” chunks; that is, up to k chunks can be lost without losing data. Erasure coding can be performed at a disk or storage node level, and the storage nodes can be spread across geographical locations. The advantage of erasure coding protection over traditional RAID is that it allows for a greater number of simultaneous storage device failures.
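The arithmetic can be illustrated with a toy single-parity example (k = 1) and a simple overhead comparison against replication. Production systems use Reed-Solomon or similar codes to tolerate larger values of k, so this sketch is conceptual only.

```python
# Minimal illustration of the n + k idea using single parity (k = 1), plus the
# capacity overhead comparison with replication. Real systems use Reed-Solomon
# or similar codes to tolerate k > 1 simultaneous failures.
from functools import reduce

def xor_parity(chunks):
    # XOR all chunks together byte by byte
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), chunks)

data_chunks = [b"AAAA", b"BBBB", b"CCCC"]        # n = 3 data chunks
parity = xor_parity(data_chunks)                  # k = 1 coded chunk

# Lose any one chunk: it is recoverable by XORing the surviving chunks.
lost_index = 1
survivors = [c for i, c in enumerate(data_chunks) if i != lost_index] + [parity]
recovered = xor_parity(survivors)
assert recovered == data_chunks[lost_index]

# Capacity overhead: a 10+4 erasure code stores 1.4x the raw data and survives
# 4 failures, versus 3x for three-way replication surviving 2 failures.
print("10+4 overhead:", (10 + 4) / 10, "  3-way replication overhead:", 3.0)
```

The overhead comparison at the end is the core business argument: wider codes protect against more simultaneous failures than replication while consuming far less raw capacity.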

Position and Adoption Speed Justification: Hard-disk drive (HDD) capacity is growing faster than HDD data rates. The result is ever-longer rebuild times that increase the probability of experiencing subsequent disk failures before the rebuild has completed. This results in the need for protection schemes that continue to protect data even in the presence of failures (greater fault tolerance) and a focus on reducing rebuild times. Erasure coding algorithms take advantage of inexpensive and rapidly increasing compute power to write blocks of data as systems of equations. These algorithms then transform these systems of equations back into blocks of data during read operations. Modern flash-centric storage arrays that implement coalesced or aggregated writes minimize the performance impact of erasure coding. Allowing the user to specify the number of failures that can be tolerated before data integrity can no longer be guaranteed enables users to trade off data protection overhead (costs) against mean time between data loss (MTBDL). Erasure coding is typically used in large-scale or web-scale systems where failures of multiple storage drives, or even entire storage nodes, are common and the system is expected to withstand such failures.

User Advice: Have vendors profile the performance/throughput of their storage systems supporting your workloads using the various protection schemes that they support with various storage efficiency features (such as compression and deduplication or autotiering) turned on and off to better understand performance-overhead trade-offs. Confirm that the choice of protection scheme does not limit the use of other value-added features. Request minimum/average/maximum rebuild times to size the likely rebuild window of vulnerability in a storage system supporting your production workloads. Cap microprocessor consumption at 75% of available cycles to ensure that the system’s ability to meet service-level objectives is not compromised during rebuilds and microcode updates. Give extra credit to vendors willing to guarantee response times (latency), MTBDL and rebuild times. If erasure coding does not meet your performance requirements, consider traditional RAID or replication as an alternative protection mechanism.

Business Impact: Advanced data protection schemes, like erasure coding, enable deployment of large-scale distributed storage systems while lowering capex. This is possible because erasure coding can withstand the simultaneous failure of multiple disks, which is common in large-scale systems. In addition, the high-speed rebuilding of lost data using erasure coding enables the use of high-capacity drives, which reduces storage capacity costs.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience

Maturity: Early mainstream

Sample Vendors: Caringo; Cloudian; DDN; Dell EMC; IBM; NetApp; Panasas; Scality

Recommended Reading: “Magic Quadrant for Data Center Backup and Recovery Solutions”

“Critical Capabilities for Data Center Backup and Recovery Solutions”

Public Cloud Storage Analysis By: Raj Bala

Definition: Public cloud storage is infrastructure as a service (IaaS) that provides block, file and/or object storage services delivered through protocols. The services are stand-alone, but are often used with compute and other IaaS products. They are priced based on capacity, data transfer and/or the number of requests. The services provide on-demand storage and are self-provisioned. Stored data exists in a multitenant environment, and users access that data through block, network and Representational State Transfer (REST) protocols.
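As a reminder of how those pricing dimensions combine, the sketch below adds up a hypothetical monthly bill. Every rate in it is a placeholder, not any provider's actual price list.

```python
# Illustrative monthly cost model for the pricing dimensions named above
# (capacity, data transfer and requests). All rates are placeholders, not any
# provider's actual prices.

def monthly_object_storage_cost(stored_gb, egress_gb, requests,
                                gb_month_rate=0.023,      # hypothetical $/GB-month
                                egress_rate=0.09,         # hypothetical $/GB egress
                                per_1k_requests=0.005):   # hypothetical $/1,000 requests
    return (stored_gb * gb_month_rate
            + egress_gb * egress_rate
            + (requests / 1000) * per_1k_requests)

print(f"${monthly_object_storage_cost(stored_gb=50_000, egress_gb=2_000, requests=10_000_000):,.2f}")
```

A model like this is mainly useful for comparing providers on the dimension that dominates a given workload: capacity-heavy archives, egress-heavy content distribution and request-heavy analytics pipelines can rank the same providers quite differently.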

Position and Adoption Speed Justification: Public cloud storage is a critical part of most workloads that use public cloud IaaS, even if it’s often invisible to end users. In fact, the default volume type used for virtual machines (VMs) on some providers is solid-state drive (SSD)-based block storage. Unstructured data is frequently stored in object storage services for high-scale, low-cost requirements; however, end users are often unaware of the underlying storage type being used. The market for public cloud storage is becoming more visible to end users, as cloud providers are beginning to offer more-traditional enterprise brands with the data management capabilities of storage systems found on-premises.

User Advice: Do not choose a public cloud storage provider based simply on cost or on your enterprise’s relationship with the provider. The lowest-cost providers may not have the scale and operational capabilities required to become viable businesses that are sustainable over the long term. Moreover, these providers are also unlikely to have the engineering capabilities to innovate at the rapid pace set by the leaders in this market. Upheaval in this market warrants significant consideration of the risks if organizations choose a provider that is not one of the hyperscale vendors, such as Alibaba Cloud, Amazon Web Services (AWS), Google and Microsoft. Many of today’s Tier 2 public cloud storage offerings may not exist in the same form tomorrow, if they exist at all.

Use public cloud storage services when deploying applications in public cloud IaaS environments, particularly those workloads focused on analytics. Match workload characteristics and cost requirements to a provider with equally suitable services.

Business Impact: Public cloud storage services are part of the bedrock that underpins public cloud IaaS. Recent advances in performance, as they relate to these storage services, have enabled enterprises to use cloud IaaS for mission-critical workloads, in addition to new, Mode-2-style applications. The security advances enable enterprises to use public cloud storage services and experience the agility aspects of a utility model, yet retain complete control from an encryption perspective.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience

Maturity: Mature mainstream

Sample Vendors: Alibaba Cloud; Amazon Web Services (AWS); Backblaze; Google; IBM; Microsoft; Oracle; Tencent Cloud; Wasabi

Recommended Reading: “Magic Quadrant for Cloud Infrastructure as a Service, Worldwide”

Enterprise Information Archiving Analysis By: Michael Hoeck

Definition: Enterprise information archiving (EIA) solutions provide tools for capturing data into a distributed or centralized repository for organizations to simplify information governance requirements across unstructured and structured data sources. EIA supports multiple data sources, including email, collaboration, IM, file, social media, and voice. EIA expedites access to archived data in the repository using metadata and full-text indexing. EIA tools support operational efficiency, compliance, retention management, e-discovery and end-user access.

Position and Adoption Speed Justification: The number of vendors offering EIA solutions has stabilized in recent years. Organizational compliance and regulatory requirements drive the retention of unstructured data with SaaS-based archiving solutions as the repository of choice. Archiving is becoming mainstream for meeting compliance and e-discovery needs for organizations, and adoption has spread beyond heavily regulated industries.

Email archive remains the primary use case; however, most organizations are looking to vendors with support for multiple communication types. Support for multiple content types is standard for most EIA products using natively developed connectors or integrated third-party components. Requirements for managing collaboration platform data (e.g., MS Teams, Slack and Facebook Workplace) are growing. In financial services, the need to capture and supervise voice content has risen as a key requirement. Support for the capture of social media has become a requirement in regulated industries, including financial services and public government.

User Advice: As requirements to store, search and discover data grow, companies are implementing an EIA solution, often starting with email as the first managed content type. Many organizations are migrating to cloud email and productivity solutions, such as Microsoft 365 and Google G Suite. When migrating, associated compliance and regulatory retention requirements must be considered. Evaluate EIA solutions by assessing user experience, use-case requirements and the content sources to be archived. EIA use cases are advancing to include records management, in-place data management, analytics and classification abilities.

Organizations must ensure contractually that they have an appropriately priced solution, including support for applicable content sources, e-discovery, supervision and data exporting requirements. Migrating personal stores to the archive should be part of the deployment of an archive system. The migration of legacy email archives, including into and out of a hosted solution, can be complex and expensive, and it should be scoped during the selection phase. The cost of these services remains an inhibitor to switching vendors in some cases. EIA vendors have attempted to reduce this obstacle by incorporating these services into their own offerings.

In SaaS-archiving contracts, organizations need to include an exit strategy that minimizes costs and to understand that they own the data, not the SaaS providers. Additionally, focus attention on service-level agreements that obligate SaaS vendors to support the defined performance levels as the archive grows. When determining costs versus benefits for SaaS archiving, include the soft expenses associated with on-premises solutions for personnel and IT-involved discovery requests.

Business Impact: EIA delivers improved data access to end users and enables a timely response to audits, legal and other business requests for historical information. Email and e-discovery remain the predominant content type and use case, but additional communication-based data sources and files, and their classification and retention management, are gaining interest as EIA capabilities. In addition, more organizations are seeking to create a holistic information governance strategy, including analytics of all data.

EIA is an important part of e-discovery. Legal hold, retention management, search and export features are used to meet discovery and compliance requirements. Supervision tools for sampling and proactively reviewing messages are available with many EIA products. To meet the requirements of mobile workers, EIA offers a way for organizations to keep data compliant in an archive, while providing access via mobile devices.

SaaS-based message data archiving is typically priced on a per-user, per-month basis, often with no storage overages.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience

Maturity: Early mainstream

Sample Vendors: Archive360; Barracuda; Global Relay; Micro Focus; Microsoft; Mimecast; Proofpoint; Smarsh; Veritas Technologies; ZL Technologies

Recommended Reading: “Design a Business-Driven Archive Strategy”

“Magic Quadrant for Enterprise Information Archiving”

“Critical Capabilities for Enterprise Information Archiving”

Entering the Plateau

Continuous Data Protection Analysis By: Santhosh Rao; Jerry Rozeman

Definition: Continuous data protection is an approach to continuously, or nearly continuously, capture and transmit changes to applications, files and/or blocks of data. Depending on solution architecture, real-time changes are journaled or replicated to a local or remote storage target. This capability provides options for more granular recovery point objectives and is used for backup/recovery, disaster recovery and data migration use cases. Some CDP solutions can be configured to capture changes continuously (true CDP) or at scheduled times (near CDP).

Position and Adoption Speed Justification: The difference between near CDP and regular backup is that backup is typically performed once to no more than a few (typically four) times a day, whereas near CDP is often done every few minutes or hours, providing many more recovery options and minimizing potential data loss. Several products also provide the ability to heterogeneously replicate and migrate data between two different types of storage devices, allowing for potential cost savings for disaster recovery solutions and data or cloud migrations. Checkpoints of consistent states are used to enable rapid recovery to known good states (such as before a patch was applied to an application or the last time a database was reorganized). This ensures the application consistency of the data and minimizes the number of log transactions that must be applied.

True-CDP and near-CDP capabilities are increasingly integrated with backup software capabilities, but can still be packaged as server-based software, as network-based appliances or as part of a storage controller snapshot implementation. The delineation between frequent snapshots (one to four per hour or less granularity) and near CDP is not crisp. Administrators often implement snapshots and CDP solutions in a near-CDP manner to strike a balance between resource utilization and improved recoverability.
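Conceptually, a CDP journal can be pictured as below: every write is recorded with a timestamp so that state can be rebuilt to an arbitrary recovery point. This is a toy sketch of the idea, not a product architecture, and the block names and timestamps are invented for illustration.

```python
# Toy change journal illustrating how CDP-style recovery points work: every
# write is journaled with a timestamp, so state can be rebuilt to any point in
# time (a conceptual sketch, not a product architecture).
import bisect

class ChangeJournal:
    def __init__(self):
        self.entries = []                      # (timestamp, block_id, data), time-ordered

    def record(self, timestamp, block_id, data):
        self.entries.append((timestamp, block_id, data))

    def restore_to(self, point_in_time):
        """Replay journaled writes up to the requested recovery point."""
        state = {}
        idx = bisect.bisect_right([t for t, _, _ in self.entries], point_in_time)
        for _, block_id, data in self.entries[:idx]:
            state[block_id] = data
        return state

journal = ChangeJournal()
journal.record(100, "blk1", b"v1")
journal.record(200, "blk1", b"v2-corrupted-by-ransomware")

# Recover to just before the bad write: data loss is seconds, not a full backup cycle.
print(journal.restore_to(150))                 # {'blk1': b'v1'}
```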

User Advice: Consider CDP for critical data where regular snapshots and/or backups do not enable meeting the required RPOs. Gartner has observed that true-CDP implementations are often used for files, email and laptop data, but adoption for replication and recovery of VMs, databases and other applications is a common use case too. Ensure that the bandwidth requirements of CDP are addressed before implementing the solution.

Business Impact: CDP can dramatically change the way data is protected, decreasing backup and recovery times, as well as reducing the amount of lost data, and can provide additional recovery points. Compared with traditional backup, where the data lost in a restore situation can represent nearly 24 hours of changes, data loss can be reduced to hours, minutes or even seconds with CDP. The integration of live-mounting capabilities with CDP technology can further shorten RTOs. CDP can be an effective countermeasure against ransomware.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience

Maturity: Mature mainstream

Sample Vendors: Actifio; Arcserve; Commvault; DataCore Software; Dell EMC; Rubrik; Veeam; Veritas; Zerto

Recommended Reading: “Magic Quadrant for Data Center Backup and Recovery Solutions”

“Critical Capabilities for Data Center Backup and Recovery Solutions”

Appendixes

Figure 3. Hype Cycle for Storage and Data Protection Technologies, 2019

Hype Cycle Phases, Benefit Ratings and Maturity Levels

Table 1: Hype Cycle Phases

Innovation Trigger: A breakthrough, public demonstration, product launch or other event generates significant press and industry interest.

Peak of Inflated Expectations: During this phase of overenthusiasm and unrealistic projections, a flurry of well-publicized activity by technology leaders results in some successes, but more failures, as the technology is pushed to its limits. The only enterprises making money are conference organizers and magazine publishers.

Trough of Disillusionment: Because the technology does not live up to its overinflated expectations, it rapidly becomes unfashionable. Media interest wanes, except for a few cautionary tales.

Slope of Enlightenment: Focused experimentation and solid hard work by an increasingly diverse range of organizations lead to a true understanding of the technology’s applicability, risks and benefits. Commercial off-the-shelf methodologies and tools ease the development process.

Plateau of Productivity: The real-world benefits of the technology are demonstrated and accepted. Tools and methodologies are increasingly stable as they enter their second and third generations. Growing numbers of organizations feel comfortable with the reduced level of risk; the rapid growth phase of adoption begins. Approximately 20% of the technology’s target audience has adopted or is adopting the technology as it enters this phase.

Years to Mainstream Adoption: The time required for the technology to reach the Plateau of Productivity.

Source: Gartner (July 2020)

Table 2: Benefit Ratings

Transformational: Enables new ways of doing business across industries that will result in major shifts in industry dynamics.

High: Enables new ways of performing horizontal or vertical processes that will result in significantly increased revenue or cost savings for an enterprise.

Moderate: Provides incremental improvements to established processes that will result in increased revenue or cost savings for an enterprise.

Low: Slightly improves processes (for example, improved user experience) that will be difficult to translate into increased revenue or cost savings.

Source: Gartner (July 2020)

Table 3: Maturity Levels

Embryonic: In labs. Products/Vendors: None.

Emerging: Commercialization by vendors; pilots and deployments by industry leaders. Products/Vendors: First generation; high price; much customization.

Adolescent: Maturing technology capabilities and process understanding; uptake beyond early adopters. Products/Vendors: Second generation; less customization.

Early mainstream: Proven technology; vendors, technology and adoption rapidly evolving. Products/Vendors: Third generation; more out-of-box methodologies.

Mature mainstream: Robust technology; not much evolution in vendors or technology. Products/Vendors: Several dominant vendors.

Legacy: Not appropriate for new developments; cost of migration constrains replacement. Products/Vendors: Maintenance revenue focus.

Obsolete: Rarely used. Products/Vendors: Used/resale market only.

Source: Gartner (July 2020)

Evidence

End-user inquiries

Vendor briefings

Document Revision History

Hype Cycle for Storage and Data Protection Technologies, 2019 - 11 July 2019

Hype Cycle for Storage Technologies, 2018 - 13 July 2018

Hype Cycle for Storage Technologies, 2017 - 19 July 2017

Hype Cycle for Storage Technologies, 2016 - 5 July 2016

Hype Cycle for Storage Technologies, 2015 - 13 July 2015

Hype Cycle for Storage Technologies, 2014 - 23 July 2014

Hype Cycle for Storage Technologies, 2013 - 24 July 2013

Hype Cycle for Storage Technologies, 2012 - 5 July 2012

Hype Cycle for Storage Technologies, 2011 - 26 July 2011

Hype Cycle for Storage Technologies, 2010 - 13 July 2010

Hype Cycle for Storage Hardware Technologies, 2009 - 17 July 2009

Hype Cycle for Storage Hardware Technologies, 2008 - 11 June 2008

Hype Cycle for Storage Hardware Technologies, 2007 - 19 June 2007

Hype Cycle for Storage Hardware Technologies, 2006 - 24 August 2006

Recommended by the Author

Understanding Gartner’s Hype Cycles
2019 Strategic Roadmap for Storage
The Future of Software-Defined Storage in Data Center, Edge and Hybrid Cloud
Magic Quadrant for Primary Storage
Magic Quadrant for Distributed File Systems and Object Storage
Magic Quadrant for Hyperconverged Infrastructure
Prepare Your Storage and Data Management Strategy for the Impact of Artificial Intelligence Workloads
Magic Quadrant for Data Center Backup and Recovery Solutions
An I&O Leader’s Guide to Storage for Containerized Workloads
Market Guide for Hybrid Cloud Storage
Tool: RFP for Primary Storage and Distributed File Systems and Object Storage

Recommended For You

Critical Capabilities for Object Storage
Market Guide for Servers
Critical Capabilities for Solid-State Arrays
2020 Strategic Roadmap for Storage
Gartner Peer Insights ‘Voice of the Customer’: General-Purpose Disk Arrays

© 2020 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. and its affiliates. This publication may not be reproduced or distributed in any form without Gartner's prior written permission. It consists of the opinions of Gartner's research organization, which should not be construed as statements of fact. While the information contained in this publication has been obtained from sources believed to be reliable, Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner research may address legal and financial issues, Gartner does not provide legal or investment advice and its research should not be construed or used as such. Your access and use of this publication are governed by Gartner’s Usage Policy. Gartner prides itself on its reputation for independence and objectivity. Its research is produced independently by its research organization without input or influence from any third party. For further information, see "Guiding Principles on Independence and Objectivity."


