In-Memory Computing (IMC)


Tech Insights: In-Memory Computing
Office of Technology Strategies (TS) / Architecture, Strategy, and Design (ASD)
V3 Issue 10, October 2016

The TS office within OI&T's ASD interacts not only with the Enterprise Architecture pillar offices, but also with multiple external vendors, stakeholders within OI&T, and strategic offices across the enterprise. TS works closely with IT and business owners to capture business rules and provide technical guidance as it relates to Data Sharing across the enterprise, specifically for interagency operability.

Introduction

Whether virtualized in the cloud or located on physical server banks, every enterprise must decide how it will store data to achieve its business goals. One solution is the enterprise adoption of In-Memory Computing (IMC). In this Tech Insight, you will learn about the benefits that IMC brings to an enterprise; two common platforms that support IMC; and how IMC strengthens the computing environment by utilizing the primary memory of a computer, Random Access Memory (RAM).

In-Memory Computing (IMC)

IMC is a data processing operation that uses a computer's Random Access Memory (RAM) to process data. RAM is the main memory component of computers, servers, and Internet of Things (IoT) computing devices, and it holds data temporarily while the computer is in use. RAM is often referred to as primary memory, to distinguish it from secondary storage devices, such as hard drives, that store data permanently when you save it. RAM operates at high speed because the data is quickly accessible. Thus, IMC's use of RAM for data processing offers users a fast, efficient, and cost-effective means to meet data processing needs.

Depending on the scale of operations and the business needs supporting its use, IMC can be incorporated into an existing enterprise architecture (EA) as Platform-as-a-Service (PaaS), a cloud computing model that provides a platform for creating Web applications over the Internet; or IMC can be newly installed as Infrastructure-as-a-Service (IaaS), which provides virtual computer resources over the Internet, such as virtual server space and network connections. Both methods let an enterprise process data in an environment outside the data repository's disk storage.

In contrast, data stored on a disk inside a centralized data repository requires significantly more time to access, query, and retrieve for processing than data stored on an IMC platform. The comparatively slower processing time and reduced processing capabilities of disk storage result in unnecessarily expended time and resources. This comparison holds true regardless of whether the data is stored on a physical server or a virtualized cloud server.

Benefits of IMC

By performing data processing operations faster and more efficiently using IMC, enterprises can conserve two critical resources: capital and labor time (i.e., productivity).

The resources conserved compound as an enterprise's data collection increases in scale. Larger, more geographically dispersed enterprises with numerous data-collecting users and devices often need dedicated and accessible data storage on a physical or virtual server. Relying on the disk storage of these servers to process data leads to slow and resource-intensive operations from both a labor and productivity standpoint.

For such enterprises, IMC provides a cost-effective and efficient means to process data. A user only needs to pull the data set they are analyzing into the IMC platform or infrastructure to use the RAM for data processing. These data sets can be up to several terabytes in size. Because very large data sets can be placed onto an IMC platform or infrastructure, users can conduct extensive data processing. Most often, a user would use IMC to process real-time data, or to conduct predictive analytics, such as tracking web metrics for user-facing Internet pages to produce reports on user behavior and content downloads.

Middleware

Middleware is a term for software that acts as an intermediate bridging layer between user applications and an operating system. Developers most often use this term to refer to the software that connects independent computers together in a cumulative network, known as a distributed network. In the case of IMC, middleware connects the enterprise data storage infrastructure with a user's data processing software applications.

In-Memory Data Grids (IMDGs) represent one form of middleware application that supports IMC. IMDGs support data processing through distributed computing by incorporating each computer's RAM, in addition to a virtual cloud or server network. The RAM from each of these servers and devices works together to provide a platform that supports IMC. Therefore, IMDGs do not require significant infrastructure investments to support IMC, as they can be built over the existing EA. Enterprises can even procure commercial off-the-shelf (COTS) IMDG products from vendors for minimal installation and operating costs. One example of a COTS product is Apache's Hadoop open-source software.

A second middleware platform that enables the enterprise adoption of IMC is the In-Memory Database Management System (IMDBMS). IMDBMSs are database management systems designed to create and operate platforms that support IMC. Several IMDBMS COTS products are available, including SAP HANA. These products combine the function of a data repository with application services that support the development and deployment of data processing applications, which in turn use the RAM inside the computing devices across an enterprise's entire computing environment. With built-in data processing applications, these COTS products deliver user-friendly data processing capabilities to users across enterprises. Like IMDGs, IMDBMSs require minimal installation and operating costs.

RAM

Although there are numerous distinctions between the two middleware platforms, each relies on the computer's or server's RAM to process enterprise data. Further, both middleware platforms use RAM storage features and hardware to achieve efficient and timely data processing.

RAM data storage is volatile, meaning the stored data is only maintained until there is a disruption in power. When the user powers off their computer, the data stored in its RAM is automatically deleted. This frees the RAM for another session. As discussed earlier, data is only temporarily stored in a user's RAM, which allows the user to rely on RAM to conduct repeated in-session, fast-paced IMC data processing operations.

As a piece of hardware, RAM is an integrated circuit of silicon transistors and capacitors, organized rationally in rows and columns into a grid of memory cells that allows for easy storage, access, processing, and deletion. Each memory cell consists of a transistor and a capacitor, operating in unison. In this two-dimensional grid, a single bit of data is stored at each intersection, or address, of a row and a column, similar to a cell in an Excel spreadsheet. Regardless of where a bit's address falls in the grid, a computer can retrieve or change it. In this way, data is easily processed, queried, and transformed.

Both the rational organization and volatile properties of RAM enable the hardware to continuously provide fast and efficient data processing.

Figure 1: Traditional Computing v. In-Memory Computing

Conclusion

Although IMC has been in use for over a decade, it has thrived recently as a result of reductions in RAM cost, coupled with simultaneous increases in RAM capacity. IMC offers a strong alternative to traditional data storage and data processing, as it continues to provide a cost-effective and efficient means to process data.

Read more technology topics in the Office of Technology Strategies' Tech Insights and Enterprise Design Patterns. If you have any questions about data transformation, don't hesitate to ask TS for assistance.
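
The idea behind an In-Memory Data Grid, in which the RAM of several machines works together as one processing platform, can be illustrated with a toy sketch in Python. Everything here is an assumption made for illustration: the class name ToyDataGrid, the use of a plain dictionary to stand in for each machine's RAM, the CRC32-based choice of node, and the sample web-metric figures. It is not how any particular IMDG or IMDBMS product works.

```python
import zlib


class ToyDataGrid:
    """Hypothetical toy grid: each 'node' is a dict standing in for one machine's RAM."""

    def __init__(self, node_count):
        if node_count < 1:
            raise ValueError("node_count must be at least 1")
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # crc32 gives the same result on every run, unlike Python's
        # built-in hash() for strings, so a key always maps to the same node.
        index = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        # Store the value in the RAM (dict) of the node that owns this key.
        self._node_for(key)[key] = value

    def get(self, key):
        # Look the key up on its owning node; returns None if absent.
        return self._node_for(key).get(key)

    def total(self):
        # Each node sums its own partition, then the partial sums are combined.
        return sum(sum(node.values()) for node in self.nodes)

    def power_off(self):
        # RAM is volatile: losing power clears every node's contents.
        for node in self.nodes:
            node.clear()


# Illustrative data only: content downloads per web page.
grid = ToyDataGrid(node_count=3)
for page, downloads in [("home", 120), ("reports", 45), ("forms", 30)]:
    grid.put(page, downloads)
```

With this sample data, grid.get("reports") returns 45 and grid.total() returns 195. After grid.power_off(), every lookup returns None, which mirrors the volatility of RAM described above.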