Migrating to In-Memory Computing for Modern, Cloud-Native Workloads
WebSphere eXtreme Scale and Hazelcast In-Memory Computing Platform for IBM Cloud® Paks

Introduction

Historically, single-instance, multi-user applications were sufficient to meet most commercial workloads. With data access latency being a small fraction of the overall execution time, it was not a primary concern. Over time, new requirements from applications like e-commerce, global supply chains and automated business processes demanded a much higher level of computation and drove the development of distributed processing using clustered groups of applications.

A distributed application architecture scales horizontally, taking advantage of ever-faster processors and networking technology, but with it, data synchronization and coordinated access become a separate, complex problem. Centralized storage area network (SAN) systems became common, but as computational speeds continued to advance, the latencies of disk-based storage and retrieval quickly became a significant bottleneck. In-memory data storage systems were developed to provide distributed, fault-tolerant, all-in-memory data access, which is orders of magnitude faster than disk-based systems.

Moving to cloud

Mobile technology, ubiquitous networks, automated “smart” devices and myriad other new technologies have changed the way users and applications interact with each other. These technologies are constantly connected and always on, sending infinite streams of unique data events and expecting a real-time response. These massively parallel, high-speed data interaction requirements are fueling the migration of applications to cloud-native architectures.

Cloud-based application installations, using virtualization technology, offer the ability to scale environments dynamically to meet demand peaks and valleys, matching computation costs to workload requirements. These dynamic environments further drive the need for in-memory data storage with extremely fast, distributed, scalable data sharing. Cloud instances come and go as load dictates, so immediate access to data upon instance creation and no data loss upon instance removal are paramount. Independently scaled, in-memory data storage meets all of these needs and enables additional processing capabilities as well.

Data replication and synchronization between cloud locations enable globally distributed cloud environments to function in active/active configurations for load balancing and fault tolerance. These capabilities require the fastest transfer speeds possible. With advanced networking technology and memory-based data storage, previously impossible transfer throughput with extremely low latency can be achieved.

As organizations begin application modernization projects and migrations from on-premises installations to cloud architectures, the infrastructure must also evolve to support cloud-native capabilities. IBM Cloud Pak® solutions are software stacks that help businesses deploy enterprise systems into any cloud environment. IBM Cloud Paks are built on Red Hat® OpenShift®, Kubernetes, containers and supporting software to run enterprise applications.
Migrating to cloud-native in-memory computing

IBM WebSphere® eXtreme Scale is an elastic, scalable in-memory data grid designed for on-premises environments; it does not support cloud environments that use Kubernetes and containers. To support the migration of applications to the cloud, organizations will need to move to a solution that supports cloud-native environments and meets application performance expectations.

Hazelcast In-Memory Computing Platform

Hazelcast In-Memory Computing Platform for IBM Cloud Paks is a fast, flexible and cloud-native distributed application platform that is certified for deployment with IBM Cloud Paks. It consists of two components:

• Hazelcast In-Memory Data Grid (IMDG) for in-memory data storage and data-local distributed processing
• Hazelcast Jet for high-throughput, low-latency stream processing

The platform supports application modernization projects with capabilities designed for cloud deployments. Organizations can expand innovation for new applications through advanced features and capabilities not found in other in-memory storage technology. Containerization and orchestration management are fully supported on many platforms, including Red Hat OpenShift.

At a high level, Hazelcast and eXtreme Scale both use clusters of servers with partitions (shards) of memory distributed across all the cluster members (nodes). With both approaches, each node is the primary server for the data contained in the partitions it owns, with backup replicas created on other nodes for fault tolerance. Failure detection and recovery are automatic. Each system scales dynamically by adding or removing nodes from the cluster.

The Hazelcast solution from IBM starts with the distributed data grid, Hazelcast IMDG, and adds Hazelcast Jet, a state-of-the-art, real-time stream processing engine. Jet natively leverages all of the capabilities of the Hazelcast in-memory storage layer, adding advanced stream processing capabilities to meet today’s newest workloads. For example, built-in time-window management correlates event streams for continuous value aggregations and calculations. Streams can be merged with other streams and enriched with data from other sources. And events can be transformed from one form to another, among other capabilities, all in real time.

Machine learning (ML) has also brought capabilities that were previously impossible for automated computer processing systems. Operationalizing this technology in production, however, can be extremely complex and time consuming. Hazelcast solves this problem by enabling ML models to run directly within the stream processing engine with “inference runners.” These runners allow Java, Python and C/C++ ML models to run in real time, taking advantage of all the in-memory, distributed capabilities of the Hazelcast solution. New versions of ML models can be loaded to replace older versions without downtime.

In summary, the Hazelcast In-Memory Computing Platform for IBM Cloud Paks combines an innovative in-memory data storage layer with its third-generation stream processing engine to create an advanced in-memory processing platform.
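To make the shared cluster model concrete, the sketch below starts a single embedded Hazelcast IMDG member with one synchronous backup per partition. It is a minimal illustration rather than a reference configuration; the cluster name ("demo-cluster") and map name ("orders") are assumptions for the example, and one backup is also Hazelcast's default.

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

public class GridMemberSketch {
    public static void main(String[] args) {
        Config config = new Config();
        config.setClusterName("demo-cluster"); // illustrative cluster name

        // Keep one synchronous backup of every partition this member owns,
        // so its data survives the loss of a single node.
        MapConfig mapConfig = new MapConfig("orders");
        mapConfig.setBackupCount(1);
        config.addMapConfig(mapConfig);

        // Starting the instance joins (or forms) the cluster; partitions and
        // their backups are rebalanced automatically as members come and go.
        HazelcastInstance member = Hazelcast.newHazelcastInstance(config);

        IMap<String, Double> orders = member.getMap("orders");
        orders.put("order-1001", 249.99);
        System.out.println("order-1001 total: " + orders.get("order-1001"));
    }
}
```

Running the same program on additional hosts or pods simply adds members to the cluster, which is the elastic scaling and automatic failure recovery behavior described above.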
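The time-windowed stream processing attributed to Hazelcast Jet can likewise be sketched as a small pipeline. The example below is a hedged illustration that assumes the Jet 4.x pipeline API and its built-in test source; a production job would typically read from Kafka, an IMap event journal or another real source.

```java
import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.aggregate.AggregateOperations;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.WindowDefinition;
import com.hazelcast.jet.pipeline.test.TestSources;

public class WindowedPipelineSketch {
    public static void main(String[] args) {
        Pipeline pipeline = Pipeline.create();

        pipeline.readFrom(TestSources.itemStream(10))      // ~10 test events per second
                .withNativeTimestamps(0)                    // use the events' own timestamps
                .window(WindowDefinition.tumbling(1_000))   // correlate events into 1-second windows
                .aggregate(AggregateOperations.counting())  // continuously count events per window
                .writeTo(Sinks.logger());

        // Submit the job to an embedded Jet member; a streaming job runs until cancelled.
        JetInstance jet = Jet.newJetInstance();
        jet.newJob(pipeline).join();
    }
}
```

Stages such as merge, hashJoin and mapUsingIMap cover the stream-merging and enrichment capabilities mentioned above, and the inference runners described earlier plug into this same pipeline model.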
Technology comparison: Hazelcast and eXtreme Scale

Operational features

Table 1 lists supported cluster-wide operational features that contribute to ease of use, fault tolerance, scaling and flexibility for both platforms.

Feature | Hazelcast 4.2 | eXtreme Scale 8.6
Topology | App-embedded, client/server | WebSphere client/server
Installation platform | Bare metal, virtualized, containers, Kubernetes, cloud | Bare metal, virtualized
Cluster replication | • | •
Elastic | • | •
100 GB per instance | • | •
Fault tolerance | • | •
Disk persistence | • | •
DB persistence | • | •
Certified for IBM Cloud Paks | • | •
Docker | • |
Kubernetes | • |
Eureka | • |
Apache jclouds | • |
Cloud native | • |
OpenShift | • |
Split-brain protection | • |
Quorum | • | •
Serialization | Multiple built-in and third party | Java and XDF
JSON support | • |

Table 1. Operational features comparison for Hazelcast 4.2 and eXtreme Scale 8.6

Distributed structures

The CAP theorem states simply that a distributed system must be able to tolerate a network partition (P) and can be either available (A) or consistent (C), but not both (i.e., AP or CP). eXtreme Scale and traditional Hazelcast structures are all of type AP. Hazelcast recently implemented a CP subsystem using the Raft protocol to offer optional CP distributed data structures; these are noted in Table 3, Distributed structures (CP).

Structure | Hazelcast 4.2 | eXtreme Scale 8.6
Map | • | •
Multi-Map | • |
Replicated Map | • | •
Set | • | •
List | • | •
Queue | • | •
Reliable Topic | • | •
Topic | • | •
Ring Buffer | • | •
Flake ID Generator | • | •
CRDT PN Counter | • |

Table 2. AP-style distributed data structures supported by Hazelcast 4.2 and eXtreme Scale 8.6

Distributed structures (CP)

CP-style distributed data structures, as noted earlier, are guaranteed to maintain data consistency across their copies on a configurable number of grid nodes. This approach is necessary for some use cases that require certain guarantees, such as uniqueness or monotonic progression, among others. See Table 3 for supported CP-style data structures.

Structure | Hazelcast 4.2 | eXtreme Scale 8.6
Fenced Lock / Semaphore | • |
Atomic Long | • |
Count Down Latch | • |
Atomic Reference | • |

Table 3. CP-style distributed data structures supported by Hazelcast 4.2 and eXtreme Scale 8.6

Distributed computation

Distributed, in-memory data storage allows various calculations to be performed across data sets held in the storage layer. For example, aggregating […] parallel, distributed nature of the storage layer for performance. Additionally, event-driven computations, such as continuous query and
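As a sketch of the distributed computation and continuous-query ideas in this section, the example below runs an aggregation over entries held in the grid and registers a predicate-filtered entry listener. It assumes Hazelcast 4.x embedded usage; the trades map, the Trade class and its fields are illustrative, not taken from the source.

```java
import com.hazelcast.aggregation.Aggregators;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;
import com.hazelcast.map.listener.EntryAddedListener;
import com.hazelcast.query.Predicates;

import java.io.Serializable;

public class DistributedComputeSketch {

    // Illustrative value type; any serializable object with queryable fields works.
    public static class Trade implements Serializable {
        public String symbol;
        public double price;

        public Trade(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Trade> trades = hz.getMap("trades");

        trades.put("t1", new Trade("IBM", 135.20));
        trades.put("t2", new Trade("IBM", 136.05));
        trades.put("t3", new Trade("RHT", 187.50));

        // The aggregation executes on the members that own each partition,
        // so the work is parallel and data-local; only partial results move
        // across the network.
        double avgPrice = trades.aggregate(Aggregators.doubleAvg("price"));
        System.out.println("Average price: " + avgPrice);

        // Event-driven computation: a predicate-filtered listener is notified
        // only when a newly added entry matches, a continuous-query style of
        // keeping downstream logic current in real time.
        trades.addEntryListener(
                (EntryAddedListener<String, Trade>) event ->
                        System.out.println("Large trade: " + event.getValue().symbol),
                Predicates.greaterThan("price", 150.0),
                true);

        trades.put("t4", new Trade("IBM", 151.75));

        Thread.sleep(1000); // give the asynchronous listener time to fire
        hz.shutdown();
    }
}
```

Because each member computes a partial result over the partitions it owns and only those partial results are combined, the storage layer effectively doubles as a parallel compute layer.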
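The CP-style structures listed in Table 3 are exposed through Hazelcast's CP subsystem. The sketch below is a minimal, hedged example: the lock and counter names are illustrative, and with the default configuration (no CP members declared) Hazelcast runs these structures in a development-only mode without Raft guarantees, so a single-member demo still executes; production clusters should configure at least three CP members.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.cp.CPSubsystem;
import com.hazelcast.cp.IAtomicLong;
import com.hazelcast.cp.lock.FencedLock;

public class CpStructuresSketch {
    public static void main(String[] args) {
        // In production, set CPSubsystemConfig#setCPMemberCount(3) or higher so these
        // structures are backed by Raft; left unconfigured, they run in dev/unsafe mode.
        HazelcastInstance member = Hazelcast.newHazelcastInstance();
        CPSubsystem cp = member.getCPSubsystem();

        // Atomic long: cluster-wide, monotonic sequence generation.
        IAtomicLong sequence = cp.getAtomicLong("order-sequence"); // illustrative name
        long next = sequence.incrementAndGet();

        // Fenced lock: mutual exclusion across the whole cluster.
        FencedLock lock = cp.getLock("inventory-lock"); // illustrative name
        lock.lock();
        try {
            System.out.println("Processing order " + next + " under a cluster-wide lock");
        } finally {
            lock.unlock();
        }

        member.shutdown();
    }
}
```

The other structures in Table 3 are obtained the same way, for example via getCountDownLatch and getAtomicReference on the CP subsystem.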