
Tech Insights: In-Memory Computing

Office of Technology Strategies (TS) / Architecture, Strategy, and Design (ASD) V3 Issue 10 October 2016

The TS office within OI&T’s ASD interacts not only with the Enterprise Architecture pillar offices, but also with multiple external vendors, stakeholders within OI&T, and strategic offices across the enterprise. TS works closely with IT and business owners to capture business rules and provide technical guidance as it relates to Sharing across the enterprise, specifically for interagency operability.

Introduction

Whether virtualized in the cloud or located on physical server banks, every enterprise must decide how it will store data to achieve its business goals. One solution is the enterprise adoption of In-Memory Computing (IMC). In this Tech Insight, you will learn about the benefits that IMC brings to an enterprise; two common platforms that support IMC; and how IMC strengthens the computing environment by utilizing the primary memory of a computer, Random-Access Memory (RAM).

In-Memory Computing (IMC)

IMC is a data processing operation that utilizes a computer’s Random-Access Memory (RAM) to process data. RAM is the main memory component of computers, servers, and Internet of Things (IoT) computing devices that holds data temporarily while the computer is in use. RAM is often referred to as primary memory, to distinguish it from secondary storage devices, such as a hard drive, that store data permanently when you save it. RAM operates at high speed because the data is quickly accessible. Thus, IMC’s utilization of RAM for data processing offers users a fast, efficient, and cost-effective means to meet data processing needs.

Depending on the scale of operations and the business needs supporting its use, IMC can be incorporated into an existing enterprise architecture (EA) as Platform-as-a-Service (PaaS), a model that provides a platform for creating Web applications over the Internet; or IMC can be newly installed as Infrastructure-as-a-Service (IaaS), which provides virtual resources over the Internet, such as virtual server space and network connections. Both methods grant an enterprise the capability of conducting data processing in an environment that is outside the data repository’s disk storage.

In contrast, data that is stored on a disk inside a centralized data repository will require significantly more time to access, query, and retrieve for processing than data stored on an IMC platform. The comparatively slower processing time and lessened processing capabilities of disk storage result in unnecessarily expended time and resources. This comparison holds true regardless of whether the data is stored on a physical server or a virtualized cloud server.

Benefits of IMC

By performing data processing operations faster and more efficiently using IMC, enterprises can conserve two critical resources: capital and labor time (i.e., productivity).

The resources conserved are compounded as an enterprise’s data collection increases in scale. For larger, more geographically dispersed enterprises with numerous data-collecting users and devices, there often appears a need for dedicated and accessible data storage on a physical or virtual server. Relying on the disk memory of these servers to process data results in untimely and resource-intensive operations from both a labor and productivity standpoint.

For such enterprises, IMC ensures there is a cost-effective and efficient means to process data. A user only needs to pull the data set they are analyzing into the IMC platform or infrastructure to utilize the RAM for data processing. These data sets can be up to several terabytes in size. Because very large data sets can be placed onto an IMC platform or infrastructure, users can conduct extensive data processing. Most often, a user would utilize IMC to process real-time data, or to conduct predictive analytics, such as tracking web metrics for user-facing Internet pages for user behavior reports and content downloads.
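The contrast between disk-bound and in-memory processing can be illustrated with a minimal Python sketch. It is not a real IMC platform: the file name metrics.csv, the sample web-metrics rows, and all function names are hypothetical and chosen only for illustration. The sketch assumes a small CSV file standing in for disk storage and a plain dictionary standing in for data held in RAM.

```python
import csv
import os
import tempfile


def write_sample_metrics(path, rows):
    """Write (page, downloads) rows to a CSV file that stands in for disk storage."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["page", "downloads"])
        writer.writerows(rows)


def total_downloads_from_disk(path, page):
    """Disk-bound query: re-read and re-parse the whole file on every call."""
    total = 0
    with open(path, newline="") as f:
        for record in csv.DictReader(f):
            if record["page"] == page:
                total += int(record["downloads"])
    return total


def load_into_memory(path):
    """Pull the data set into RAM once, aggregated by page for fast lookups."""
    totals = {}
    with open(path, newline="") as f:
        for record in csv.DictReader(f):
            page = record["page"]
            totals[page] = totals.get(page, 0) + int(record["downloads"])
    return totals


# Hypothetical web-metrics sample (assumption, not data from the article).
sample = [("home", 3), ("docs", 5), ("home", 4)]
path = os.path.join(tempfile.mkdtemp(), "metrics.csv")
write_sample_metrics(path, sample)

in_memory = load_into_memory(path)   # one pass over the disk file
home_total = in_memory["home"]       # later queries are served from RAM
```

Every call to total_downloads_from_disk goes back to the file, while repeated lookups against in_memory touch only RAM after the initial load. That is the pattern the section describes, at toy scale.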

Middleware

Middleware is a term that classifies the software that acts as an intermediate bridging layer between user applications and an operating system. Developers most often use this term to refer to the software that connects independent computers together in a cumulative network, known as a distributed network. In the case of IMC, middleware connects the enterprise data storage infrastructure with a user’s data processing software applications.

In-Memory Data Grids (IMDGs) represent one form of middleware application that supports IMC. IMDGs support data processing through distributed computing by incorporating each computer’s RAM, in addition to a virtual cloud or server network. The RAM from each of these servers and devices works together to provide a platform that supports IMC. Therefore, IMDGs do not require significant infrastructure investments to support IMC, as they can be built over the existing EA. Enterprises can even procure commercial off-the-shelf (COTS) IMDG products from vendors for minimal installation and operating costs. One example of a COTS product is Apache’s Hadoop open-source software.

A second middleware platform that enables the enterprise adoption of IMC is the In-Memory Database Management System (IMDBMS). IMDBMSs are management systems designed to create and operate platforms to support IMC. There are several IMDBMS COTS products available, including SAP HANA, that combine the function of a data repository with application services that support the development and deployment of data processing applications. These applications utilize the RAM inside the computing devices across an enterprise’s entire computing environment. With built-in data processing applications, these COTS products deliver user-friendly data processing capabilities to users across enterprises. Like IMDGs, IMDBMSs require minimal installation and operating costs.

RAM

Although there are numerous distinctions between the two middleware platforms, each relies on utilizing the computer or server’s RAM to process enterprise data. Further, both middleware platforms utilize RAM storage features and hardware to achieve efficient and timely data processing.

RAM data storage is volatile, meaning the stored data is only maintained until there is a disruption in power. When the user powers off their computer, the data stored on their RAM is automatically deleted. This frees the space of their RAM for another session. As discussed earlier, data is only temporarily stored on a user’s RAM, which allows the user to rely on RAM to conduct repeated in-session, fast-paced IMC data processing operations.

As a piece of hardware, RAM is an integrated circuit of silicon transistors and capacitors, organized rationally in rows and columns into a grid of memory cells that allows for easy storage, access, processing, and deletion. Each memory cell is composed of a transistor and a capacitor, operating in unison. In this two-dimensional grid, a single bit of data is stored at each intersection, or address, of a row and a column, similar to an Excel spreadsheet. Regardless of where a bit of data’s address falls in sequence on the grid, a computer can retrieve or change it. In this way, data is easily processed, queried, and transformed.

Both the rational organization and volatile properties of RAM enable the hardware to continuously provide fast and efficient data processing.

Figure 1: Traditional Computing v. In-Memory Computing

Conclusion

Although IMC has been in use for over a decade, it has recently thrived as a result of reductions in RAM cost, coupled with simultaneous increases in RAM capacity. IMC offers a strong alternative to traditional data storage and data processing, as it continues to provide a cost-effective and efficient means to process data.

More technology topics are covered in the Office of Technology Strategies’ Tech Insights and Enterprise Design Patterns. If you have any questions, don’t hesitate to ask TS for assistance.
